The bubbles in VR, cryptocurrency and machine learning are all part of the parallel computing bubble

Nope, talking about the WHO using ML on epidemiological data for rollouts of vaccines, field workers, and other resources throughout the developing world.

Don’t really care for snake oil, but real academics and organizations are using ML and data science for good.

2 Likes

Doesn’t change my point. If you don’t like the Voodoo Graphics as an example, GPUs have been around in industry for almost 40 years. Silicon Graphics introduced the IRIS 1000 in 1984.

2 Likes

That wouldn’t make sense, even as a counterfactual. At a new node, the big players aim to recoup all their capital and R&D investments in the first couple of years after launch in order to stay in business. Once it’s no longer the newest node, anything else you can get out of it is gravy, since the investment has already been recovered. That’s most of why older hardware is cheap, and it’s due more to the economics of the current model than to any particular technology.

1 Like

There’s a reason the z14 and z15 mainframe chips were 5.2 GHz…

I ask the following in all seriousness! How would a shrink from 7nm to 5nm allow for an 80% increase in transistor density per die? That math doesn’t make sense to me, although admittedly, I haven’t traditionally been much of a math guy! :grimacing:

Transistor density per unit area scales as the inverse square of linear feature size. If the linear dimensions of features shrink to 71% of what they had been, their area shrinks to 51% of what it had been. There are overheads, so transistor density doesn’t quite double, but 80+% sounds reasonable.
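
Spelling out the arithmetic (this treats the node names as literal linear dimensions, which, as noted further down, they really aren’t anymore):

```latex
\text{density} \propto \frac{1}{L^{2}}
\qquad
\frac{\rho_{5\,\mathrm{nm}}}{\rho_{7\,\mathrm{nm}}}
  = \left(\frac{7}{5}\right)^{2} \approx 1.96
```

So an ideal shrink would give ~96% more transistors per unit area, and real-world overheads pull that back toward the quoted 80+%.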

3 Likes

There are plenty of reasons to question ML, especially its conflation with AI (which annoys me to no end) and the ridiculous level of marketing (and market) hyperbole that surrounds it.

But shit, it has also led to astounding improvements in things like machine translation, search engine utility, and plenty of other areas. We are talking about some of the more impressive inventions humanity has ever mustered, IMHO. Game-changing stuff! Always question new tech and the claims surrounding it. But giving in to total cynicism about something as powerful as ML seems equally off the mark.

1 Like

Thank you, makes sense — I figured it probably had to do with a squaring, at least intuitively. :pray:

What I tend to say is, ‘machine learning is what I do when I don’t have physics.’ That isn’t to say that it’s useless, but it points out key limitations. ML generates models without insight. If you do already have insight into a mechanism, simply throwing ML at the problem throws that insight away. And ML is notoriously bad at explaining its results.

As it turns out, most of the problems I work on professionally are ones where I do have physics. I wind up doing things that look like ML in that they take a bunch of data and come out with a predictive model, but they work by fitting the parameters of a mechanistic model. The results almost always agree with the real world much better than ML, and when they don’t, there’s usually an insight to be had about a previously unsuspected phenomenon.
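
As a toy sketch of the difference (the model and data here are hypothetical, not from my actual work): rather than handing raw data to a generic learner, you fix the functional form from the physics and fit only its parameters.

```python
import numpy as np
from scipy.optimize import curve_fit

# Suppose the physics says the observable decays exponentially toward an
# offset: y(t) = A * exp(-t / tau) + c. The mechanism fixes the form;
# only A, tau and c are unknown.
def mechanistic_model(t, A, tau, c):
    return A * np.exp(-t / tau) + c

# Hypothetical noisy measurements of the process.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)
y = mechanistic_model(t, 2.0, 3.0, 0.5) + rng.normal(0.0, 0.05, t.size)

# Fit the three physical parameters. Unlike a generic ML model, the fitted
# values (e.g. the time constant tau) are directly interpretable, and a
# systematically bad fit points at a missing mechanism.
(A_fit, tau_fit, c_fit), _ = curve_fit(mechanistic_model, t, y, p0=(1.0, 1.0, 0.0))
print(f"A = {A_fit:.2f}, tau = {tau_fit:.2f}, c = {c_fit:.2f}")
```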

2 Likes

There just aren’t a lot of super-scalar problems that can’t be accelerated by improved algorithms or special purpose hardware on an attached processor. The only one mentioned that makes sense is emulation, and even emulation could be moved to a co-processor if it becomes important enough. This has happened with graphics, de/compression, signal processing, error correction and a host of other frequently used algorithms.

If we had much faster scalar processors than we do now, I’m guessing we would have loaded them with hardware for faster context switching to emulate parallel processing. Imagine a 100GHz clock speed, 1,000 threads and a 10 picosecond context switch. (Obviously, these specs would be better balanced. I’m just waving my hands here.) We’d still be offloading stuff to specialized processors because even with superfast switching and a boodle of threads, a GPU would render an image in less time and with less heat and energy.
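
Putting rough numbers on that hand-wave (every figure here is just the made-up spec from the paragraph above):

```python
# Back-of-envelope for the imaginary machine: 100 GHz clock, 1,000 threads,
# 10 ps context switch, and an assumed 100-cycle time slice per thread.
clock_hz = 100e9                # one cycle every 10 ps
cycle_s = 1.0 / clock_hz
switch_s = 10e-12               # a context switch costs about one cycle
threads = 1000
slice_cycles = 100

slice_s = slice_cycles * cycle_s + switch_s
round_robin_s = threads * slice_s             # time to visit every thread once
effective_hz = slice_cycles / round_robin_s   # effective per-thread clock

print(f"switch overhead per slice: {switch_s / slice_s:.1%}")    # ~1.0%
print(f"each thread sees roughly {effective_hz / 1e6:.0f} MHz")  # ~99 MHz
```

So even the dream machine amounts to a thousand ~100 MHz virtual cores, which is part of why the GPU would still win on rendering throughput.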

2 Likes
  1. There are two dimensions, so an actual shrink from 7nm to 5nm in each dimension would take the area of a gate from 49 square nm to 25.

  2. Nodes are pretty much marketing speak now, and haven’t literally represented the gate width for quite a while.

  2b. Different manufacturers have different transistor densities for the “same” node. E.g. Intel’s 10nm node has roughly equivalent transistor density to TSMC’s 7nm node.

3 Likes

Thank you, this is very interesting! I didn’t realize that different chip fabricators are essentially using different frames of reference when describing the sizes of their processes, such that, as you said, Intel’s 10nm is close to TSMC’s 7nm process.

Is it safe to (roughly) assume that for 3D NAND, density would scale as a cube when the transistors shrink?

No, because the number of vertical layers in 3D NAND is limited by the etching aspect ratio, so it will probably max out somewhere between 50 and 300 layers. For density to scale as a cube law, you’d need millions of layers.

Also, when you scale down on a planar wafer, there are more process steps to make it smaller, but you get the full density benefit by running each process step once, since each step is run on the entire wafer. With 3D NAND, many steps have to be repeated for each layer of NAND cells, so you don’t get the same kind of scaling benefit.
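
A crude toy model of that difference (all the step counts below are invented for illustration; only the shape of the trends matters):

```python
# Why stacking scales differently from shrinking, in cost-per-bit terms.
def planar_cost_per_bit(shrink, base_steps=500.0):
    # A planar shrink is paid for once per wafer: each patterning step
    # covers the whole wafer, so density rises as 1/shrink^2 while the
    # step count stays roughly flat.
    density = 1.0 / shrink**2
    return base_steps / density

def stacked_cost_per_bit(layers, base_steps=500.0, steps_per_layer=10.0):
    # 3D NAND repeats a block of deposition/etch steps for every layer,
    # so the step count grows linearly with the bit density.
    return (base_steps + layers * steps_per_layer) / layers

for shrink in (1.0, 0.7, 0.5):
    print(f"planar shrink {shrink}: cost/bit ~ {planar_cost_per_bit(shrink):.1f}")
for layers in (32, 64, 128, 256):
    print(f"{layers} layers: cost/bit ~ {stacked_cost_per_bit(layers):.1f}")
```

The stacked cost per bit flattens out toward the per-layer step cost, whereas a planar shrink (when you can still get one) keeps cutting it quadratically.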

1 Like

This is true – ultimately, we will get to a point where for any realistic purpose, you just can’t get a faster processor – but I suspect we’ve already reached a point where if something can be computed at all (and isn’t NP-hard), what we have is sufficient.

It’s kinda like closet space. However much you have, you’ll fill it up and want more; but ultimately you’re not going to solve that problem just by getting bigger and bigger closets, unless you want to end up on that hoarders show. And if you really do need acres of storage space for some reason, then you’re going to have to get into the warehouse business (which is server farms in this analogy).

I did think about mentioning that, because there are problems relating to control that depend on raw scalar processing speed. Like, if you want to make a robot arm that responds fast enough to pluck bullets out of the air, you need to do a lot of sequential calculations very fast. But processors can already do a great deal of calculation in a microsecond (and there are some impressive demos along these lines), so I think you’d be hard pressed ever to make a real system where the bottleneck is processing rather than electrical or mechanical limits. From a Pentium’s point of view, the physical world is barely even moving.
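
To put numbers on “barely even moving” (rough figures: a rifle bullet at ~900 m/s, a thoroughly ordinary 3 GHz core):

```python
# How far does a bullet travel in one CPU clock cycle?
bullet_speed_m_s = 900.0
clock_hz = 3e9
cycle_s = 1.0 / clock_hz

per_cycle_m = bullet_speed_m_s * cycle_s
print(f"bullet moves {per_cycle_m * 1e9:.0f} nm per clock cycle")  # ~300 nm
# The bullet crosses one millimetre while the core executes over 3,000
# cycles (times however many instructions it retires per cycle).
```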

2 Likes

Not even remotely close, in the case of scientific computation or industrial simulation.

Most simulation problems are inherently parallelisable, which is why HPC has for years been driven by cluster size rather than chip performance. If chip makers halted all their R&D today, it wouldn’t make much difference to the roadmap for developing faster supercomputers.
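
A quick Amdahl’s-law sketch of why cluster size dominates once the serial fraction is small (the 0.1% serial fraction is an assumption for illustration):

```python
# Amdahl's law: speedup(n) = 1 / (s + (1 - s) / n), serial fraction s.
def speedup(n_nodes, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_nodes)

s = 0.001  # assume the simulation is 99.9% parallelisable
for n in (10, 100, 1000, 10000):
    print(f"{n:>5} nodes -> speedup {speedup(n, s):7.1f}")
# ~9.9, ~91, ~500, ~909: for well-parallelised codes, adding nodes beats
# waiting for faster chips, until the serial fraction starts to bite.
```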

My point with the closet space analogy was that being able to find a use for something is not the same as needing that thing.

1 Like

I work in a field where running a single simulation can easily cost hundreds of thousands of dollars in CPU time, so there are a lot of cases where you can accurately characterize a behavior of the system, and devise an algorithm that accurately describes it, but you can’t use that algorithm in the simulation because you can’t figure out a way to make it run fast enough.

Even though the problem is almost entirely parallelizable, you are still very strongly resource-limited in the kinds of computations you can do. Just because something is theoretically possible doesn’t mean it’s feasible to pay for the datacenter you’d need to do it, or even to pay for the power to run that cluster.

2 Likes

There’s a difference between trivially parallelisable algorithms (divide a program into n chunks, send each chunk off to m cores, assemble the result) and more complex algorithms where each chunk depends on its p nearest neighbors.
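
A 1D toy sketch of that second case (real codes do this across nodes with MPI halo exchanges; this just shows the data dependency):

```python
import numpy as np

data = np.arange(16, dtype=float)
chunks = np.split(data, 4)       # pretend each chunk lives on its own core

# Trivially parallel: each chunk is processed with zero communication.
squared = [c**2 for c in chunks]

# Neighbour-dependent (a 3-point stencil): each output value needs the
# points on either side, so every chunk must first receive one "halo"
# element from each neighbouring chunk before it can compute its edges.
def stencil(left_halo, chunk, right_halo):
    padded = np.concatenate(([left_halo], chunk, [right_halo]))
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

smoothed = []
for i, c in enumerate(chunks):
    left = chunks[i - 1][-1] if i > 0 else c[0]                  # from left neighbour
    right = chunks[i + 1][0] if i < len(chunks) - 1 else c[-1]   # from right neighbour
    smoothed.append(stencil(left, c, right))
```

That per-step exchange is the communication cost that keeps the “divide, scatter, gather” picture from being the whole story.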

2 Likes

True, but you run into problems when the accelerated hardware is for a problem that isn’t executed often enough to recoup the money you’d invest in chip production, or when the number of bespoke coprocessors you wind up attaching overcomplicates your general-purpose hardware.

1 Like

Right, but faster processors won’t translate directly into cheaper computing. Say I have a cluster of 1000 nodes, and I’m charging you $100 for 500 “CPU hours”. If I build a new cluster which can do your job in 250 CPU hours, it doesn’t mean I’m now going to charge you $50, because then my original cluster isn’t making money, and the new cluster isn’t paying for itself. In fact, if turnaround time wasn’t an issue, I’d charge based on the total number of instructions executed, and processor speed wouldn’t affect the cost to you at all.
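
The arithmetic, with the made-up numbers above:

```python
# Billing by CPU-hour vs billing by work done, for the hypothetical clusters.
price = 100.0
old_cpu_hours = 500.0
new_cpu_hours = 250.0                    # new cluster is 2x faster per core

rate = price / old_cpu_hours             # $0.20 per CPU-hour
print(f"same hourly rate on the new cluster: ${rate * new_cpu_hours:.0f}")  # $50

# Billed per unit of work (instructions executed), the job costs the same
# on either cluster; faster chips change the turnaround time, not the bill.
print(f"billed by work done: ${price:.0f} either way")
```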

Of course, turnaround time does matter a bit – getting a result in a day is worth more than getting it in a year – but I don’t know that people would pay that much to turn a 10-day job into a 9-day job; either way you’re going to have to find other stuff to do while you wait. Besides, I could do that by adding more slow CPUs anyway.

(I know I’m glossing over MTBF, energy usage, floor space etc., but those things could cut either way)

I’ve thought about this on a small scale, because a modest DIY render farm would make parts of my work easier and/or better. But it’s such a black hole – there’s no limit to how much I could invest and still end up waiting on some urgent revision – that it feels more like gambling than a business decision. Sometimes I have to tell a client that something is beyond me, but if I had a 64-node render farm, that could easily still happen, whereas I can often manage with just the 4 cores. So we’re well into diminishing-returns territory with regard to hardware performance.

(If you’re simulating an airframe rather than some frivolous artwork, maybe you can’t say “no”, but you can say “I’m afraid that will take six months and cost $350,000”, which never works when I try it)