Am I the only one who is shocked that a 1978 computer, even a supercomputer (but still built with the technology of the time), was 1/4 the speed of a Raspberry Pi? The Pi, if you look at the big picture of computing, is a very fast computer. For comparison: you can run a 1-billion-parameter LLM on a Raspberry Pi at decent speed. This means the Cray could run it, even if slowly. That's incredible.
But seriously, while it might have managed speed-wise, it definitely lacked the memory, no? And anyone who had tried to train it wouldn't be finished today.
But it would make a fun backwards sci-fi story: imagine a time traveller who brought the '80s an LLM from today. What would the world say and do with that slow oracle?
I can't find it now, but I wrote a bit of fanfic where Ken Thompson sent an LLM back in time (referencing the "Love, Ken" UNIX tapes he would send out) to save humanity. He was always a bit ahead of his time.
Does that mean just a multi-gig file? What is INSIDE the LLM that would be of value? How does one speak to an LLM with '80s tech, and what could one glean from it?
I don’t think they’d have any trouble with the math, it’s just a bunch of regressions and matvecs, right?
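A matvec is nothing a 1978 numericist would blink at. A toy sketch of the core operation (plain Python, no libraries, all names made up for illustration):
[CODE]
# The inner loop of LLM inference is dominated by matrix-vector
# products like this; a 1978 FORTRAN programmer (or the Cray's
# vector units) would have no conceptual trouble with it.

def matvec(W, x):
    """y = W @ x, the step repeated billions of times per token."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

# Toy example: a 3x3 weight matrix applied to a hidden-state vector.
W = [[0.1, 0.2, 0.3],
     [0.4, 0.5, 0.6],
     [0.7, 0.8, 0.9]]
x = [1.0, 2.0, 3.0]
print(matvec(W, x))  # approx. [1.4, 3.2, 5.0]
[/CODE]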
I think the process of collecting and storing all the data would be more mind blowing to them—of course they were at the beginning of Moore’s law, so they could see the trajectory if they looked for it, but it is one thing to stand on the coast with waves lapping at your ankles and imagine how the ocean gets deeper as you keep going and another to get chucked out of a helicopter in the middle of the Pacific.
In a way that someone in the 80s could understand?
An LLM is a very highly compressed store of knowledge combined with an advanced parser that understands questions in plain English. A consequence of the compression is that sometimes the answers lose some accuracy, which is a deliberate trade-off to make it work at all.
My story plot would certainly include the LLM (a coefficients file, as today) + the code to run it. So '80s humans could run it on the Cray, ask it questions, and get answers (after some time :D).
The LLM could itself explain what it is... (if there weren't more important questions to ask; contention would ensue).
An LLM is a lossy compression of the internet.
We could provide it in a form that is directly executable on '80s computers, though GPT-4 tries to convince me that this is practically impossible and that the reduced model would be much weaker (somebody doesn't want to be sent to the '80s ;)
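For illustration, here's roughly what a "reduced" model could mean: naive 8-bit quantization of the weights, shrinking memory 4x at some cost in accuracy (a toy sketch only; real quantization schemes are more careful, and all names here are made up):
[CODE]
# Naive symmetric int8 quantization: store each float32 weight as one
# byte plus a shared scale factor, a 4x memory reduction. Real schemes
# (per-channel scales, etc.) are more sophisticated; this is a sketch.

def quantize_int8(weights):
    """Map floats onto [-127, 127] integers plus one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

w = [0.03, -1.20, 0.57, 0.002]
q, s = quantize_int8(w)
print(q)                 # [3, -127, 60, 0]
print(dequantize(q, s))  # approximately the original weights
[/CODE]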
"1/4 the speed of a Pi" applies to the original (slow) 2012 Pi which is unable to run LLM as fast as you think. However the 2020 Pi 400 (equivalent to Pi 4), which can run the LLM workload, is about 100 times faster than the Cray 1:
"Raspberry Pi ARM CPUs - The comment above was for the 2012 Pi 1. In 2020, the Pi 400 average Livermore Loops, Linpack and Whetstone MFLOPS reached 78.8, 49.5 and 95.5 times faster than the Cray 1." http://www.roylongbottom.org.uk/Cray%201%20Supercomputer%20P...
A Pi 4 can infer ~0.8 tokens/sec with some of the more optimized configs (per https://www.dfrobot.com/blog-13498.html). So the Cray would have needed ~2 minutes per token, i.e. ~2.5 hours to generate one sentence... if, hypothetically, it had enough RAM (it didn't).
In 1978 RAM cost about $25k per megabyte (https://jcmit.net/memoryprice.htm). Assuming you needed 4GB for inference, RAM would have cost $100M in 1978 dollars, or $470M in today's dollars.
For comparison, the Cray cost $7M in 1978, which is $32M in today's dollars. So on top of buying a Cray, you would have had to spend 14 times that amount building a custom 4GB RAM extension, somehow hooked up to the Cray, to finally be able to generate one sentence every 2.5 hours...
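Sanity-checking the arithmetic (Python; the 0.8 tok/s, 100x slowdown, 4GB, $25k/MB, and implied ~4.7x inflation figures are all taken from the comments above):
[CODE]
# Back-of-envelope check of the figures above.
pi4_tokens_per_sec = 0.8        # optimized LLM inference on a Pi 4
cray_slowdown = 100             # Pi 400/Pi 4 is ~100x a Cray 1
sec_per_token = cray_slowdown / pi4_tokens_per_sec
print(f"{sec_per_token:.0f} s/token (~{sec_per_token / 60:.0f} min)")  # 125 s (~2 min)

ram_mb = 4 * 1024               # assume 4GB needed for inference
usd_per_mb_1978 = 25_000        # jcmit.net 1978 memory price
ram_cost_1978 = ram_mb * usd_per_mb_1978
print(f"${ram_cost_1978 / 1e6:.0f}M in 1978 dollars")  # ~$102M

inflation = 4.7                 # implied by the $100M -> $470M conversion
print(f"${ram_cost_1978 * inflation / 1e6:.0f}M today")  # ~$480M (the $470M above uses the rounded $100M)
print(f"RAM vs $7M Cray: {ram_cost_1978 / 7e6:.0f}x")    # ~15x, i.e. the "14 times" above
[/CODE]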
But even if the RAM to do LLM inference had been available in 1978, training the model would have been impossible, as training requires vastly more compute than inference.
That was on a 700 MHz Raspberry Pi 1. On an 1800 MHz Raspberry Pi 400 with NEON SIMD, the difference was another order of magnitude.
[QUOTE]
Comparison - The three 700 MHz Pi 1 main measurements (Loops, Linpack and Whetstone) were 55, 42 and 94 MFLOPS, with the four gains over Cray 1 being 8.8 times for MHz and 4.6, 1.6, 15.7 times for MFLOPS.
The 2020 1800 MHz Pi 400 provided 819, 1147 and 498 MFLOPS, with MHz speed gains of 23 times and 69, 42 and 83 times for MFLOPS. With more advanced SIMD options, the 64 bit compilation produced Cray 1 MFLOPS gains of 78.8, 49.5 and 95.5 times.[/QUOTE]
Remarkable, yes. Shocking, no. Exponential growth was something experienced in the computer industry for decades, and people were quite normalized to it.
This guy Time Travels. (check his hands, he likely has extra fingers)
But... let's look at the availability of DATA in the '80s...
Frankly, this is how hacking/phreaking was invented.
Dumpster-diving for line-printer discards to understand what their systems did.
(This is an actual story; people were bin-dipping (AT&T?) dumpsters and finding exploits (social or electronic) in the discarded line-printer output.)
Brian Roemmele says they've been dumpster diving for decades, salvaging huge collections of microfilm/microfiche thrown out by libraries, research institutions, etc.
The supercomputers of the time were heavily optimized for floating-point operations (IIRC), so while the FPU performance might be comparable, I'm not sure a Cray could be used as a "1/4-speed Pi" for general computing tasks like running Linux.
I used a Cray around 1998 (at the Pittsburgh Supercomputing Center, IIRC) and it was super fast on very particular tasks. Specifically, there was a type of processing pipeline that, once you had it set up, would produce a stream of calculations very quickly.
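That's classic vector pipelining; the canonical kernel is SAXPY (y = a*x + y), where once the pipeline is full the machine streams out one result per clock. A toy Python sketch of the operation itself (not the hardware, obviously):
[CODE]
# SAXPY: the archetypal vectorizable loop Crays were built for.
# The hardware wins by keeping its multiply/add pipeline full,
# one element streaming through per cycle.
def saxpy(a, x, y):
    return [a * xi + yi for xi, yi in zip(x, y)]

print(saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # [12.0, 24.0, 36.0]
[/CODE]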
I wonder: is the Raspberry Pi faster on all tasks, or is there some type of computation where the old Cray is still competitive?
I suspect the Cray is "competitive", for some value of "doesn't absolutely stink", on things that were designed for it.
But you can emulate a Cray on an FPGA (https://www.chrisfenton.com/homebrew-cray-1a/), so I suspect that while it could still do "real work", you could also beat the pants off it by writing your code to run on modern GPUs.
The shocking thing is that every contemporary PC and handheld device would have placed on the TOP500 list in the '90s, yet they're still burdened with slow software when doing basic operations.