
The description of DeepSeek reminds me of my experience in networking in the late 80s - early 90s.

Back then a really big motivator for Asynchronous Transfer Mode (ATM) and fiber-to-the-home was the promise of video on demand, which was a huge market in comparison to the Internet of the day. Just about all the work in this area ignored the potential of advanced video coding algorithms, and assumed that broadcast TV-quality video would require about 50x more bandwidth than today's SD Netflix videos, and 6x more than 4K.

What made video on the Internet possible wasn't a faster Internet, although the 10-20x increase every decade certainly helped - it was smarter algorithms that used orders of magnitude less bandwidth. In the case of AI, GPUs keep getting faster, but it's going to take a hell of a long time to achieve a 10x improvement in performance per cm^2 of silicon. Vastly improved training/inference algorithms may or may not be possible (DeepSeek seems to indicate the answer is "may") but there's no physical limit preventing them from being discovered, and the disruption when someone invents a new algorithm can be nearly immediate.




Another aspect that reinforces your point is that the ATM push (and subsequent downfall) was not just bandwidth-motivated but also motivated by a belief that ATM's QoS guarantees were necessary. But it turned out that software improvements, notably MPLS to handle QoS, were all that was needed.


Nah, it's mostly just buffering :-)

Plus the cell phone industry paved the way for VOIP by getting everyone used to really, really crappy voice quality. Generations of Bell Labs and Bellcore engineers would rather have resigned than be subjected to what's considered acceptable voice quality nowadays...


I've noticed this when talking on the phone with someone with a significant accent.

1. it takes considerable work on my part to understand it on a cell phone

2. it's much easier on POTS

3. it's not a problem on VOIP

4. no issues in person

With all the amazing advances in cell phones, the voice quality of cellular is stuck in the '90s.


I generally travel to Europe, and it baffles me why I can't use VoLTE there (maybe my roaming doesn't allow it) and have to fall back to 3G for voice calls.

At home, I use VoLTE and the sound is almost impeccable, very high quality, but in the places I roam to, what I get is FM quality 3G sound.

It's not that the cellular network is incapable of that sound quality; I just don't get to experience it except in my home country. Interesting, indeed.


In which countries?

3G networks in many European countries were shut off in 2022-2024. The few remaining ones will go too over the next couple of years.

VoLTE is 5G, common throughout Europe. However the handset manufacturer may need to qualify each handset model with local carriers before they will connect using VoLTE. As I understand the situation, Google for instance has only qualified Pixel phones for 5G in 19 of 170-odd countries. So 5G features like VoLTE may not be available in all countries. This is very handset/country/carrier-dependent.


> VoLTE is 5G

Technically, on 5G you have "VoNR"[0], whereas VoLTE is over 4G.

[0] https://en.wikipedia.org/wiki/Voice_over_NR


Which AFAIK is only a thing in China. Most networks outside it are still stuck on NSA 5G, let alone VoNR.


VoLTE can very well be 5G now, and it can vary from country to country, but my first memory of VoLTE is that it originally started with LTE/4G networks.

https://en.wikipedia.org/wiki/Voice_over_LTE


Yes, I think most video on the Internet is HLS and similar approaches, which are about as far from the ATM circuit-switching approach as it gets. For those unfamiliar, HLS is pretty much breaking the video into chunks to download over plain HTTP.
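A minimal sketch of that idea in Python, with a hypothetical playlist URL: the client fetches the plain-text .m3u8 playlist over HTTP, then downloads each listed segment as an ordinary HTTP object and appends it to its buffer.

  from urllib.parse import urljoin
  from urllib.request import urlopen

  PLAYLIST_URL = "https://example.com/video/index.m3u8"  # hypothetical URL for illustration

  playlist = urlopen(PLAYLIST_URL).read().decode("utf-8")
  # In an .m3u8 playlist, non-blank lines that don't start with '#' are segment URIs.
  segments = [ln for ln in playlist.splitlines() if ln and not ln.startswith("#")]

  buffer = bytearray()
  for uri in segments:
      chunk = urlopen(urljoin(PLAYLIST_URL, uri)).read()  # just a plain HTTP GET
      buffer.extend(chunk)  # a real player would decode and schedule playback from here

No special network support is needed beyond ordinary HTTP, which is the point.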


Yes, but that's entirely orthogonal to the "coding" algorithms being used and which are specifically responsible for the improvement that GP was describing.

HLS is really just a way to empower the client with the ownership of the playback logic. Let the client handle forward buffering, retries, stream selection, etc.


>> Plus the cell phone industry paved the way for VOIP by getting everyone used to really, really crappy voice quality

What accounts for this difference? Is there something inherently worse about the nature of cell phone infrastructure over land-line use?

I'm totally naive on such subjects.

I'm just old enough to remember landlines being widespread, but nearly all of my phone calls have been via cell since the mid 00s, so I can't judge quality differences given the time that's passed.


Because at some point, someone decided that 8 kbps makes for an acceptable audio stream per subscriber. And at first, being able to call anyone anywhere, even with this awful quality, was novel enough that people would accept it. And most people did, until the carriers decided they could allocate a little more with VoLTE, if it works on your phone in your area.


> Because at some point, someone decided that 8 kbps makes for an acceptable audio stream per subscriber.

Has it not been like this for a very long time? I was under the impression that "voice frequency" being defined as up to 4 kHz was a very old standard - after all, (long-distance) phone calls have always been multiplexed through coaxial or microwave links. And it follows that 8kbps is all you need to losslessly digitally sample that.

I assumed it was jitter and such that led to the lower quality of VoIP/cellular, but that's a total guess. Along with maybe compression algorithms that try to squeeze the stream even tighter than 8kbps? But I wouldn't have figured it was the 8kHz sample rate at fault, right?


Sure, if you stop after "nobody's vocal cords make noises above 4 kHz in normal conversation", but the rumbling of the vocal cords isn't the entire audio data which is present in person. Clicks of the tongue and smacking of the lips make much higher frequencies, and higher sample rates capture the timbre/shape of the soundwave instead of rounding it down to a smooth sine wave. Discord defaults to 64 kbps, but you can push it up to 96 kbps or 128 kbps with a Nitro membership, and it's not hard to hear an improvement with the higher bitrates. And if you've ever used Bluetooth audio, you know the difference in quality between the bidirectional call profile and the unidirectional music profile, and have wished for the bandwidth of the music profile with the low latency of the call profile.


> Sure, if you stop after "nobody's vocal cords make noises above 4 kHz in normal conversation"

Huh? What? That's not even remotely true.

If you read your comment out loud, the very first sound you'd make would have almost all of its energy concentrated between 4 and 10 kHz.

Human vocal cords constantly hit up to around 10 kHz, though auditory distinctiveness is more concentrated below 4 kHz. It is unevenly distributed though, with sounds like <s> and <sh> being (infamously) severely degraded by a 4 kHz cut-off.


AMR (adaptive multi-rate audio codec) can get down to 4.75 kbit/s when there's low bandwidth available, which is typically what people complain about as being terrible quality.

The speech codecs are complex and fascinating, very different from just doing a frequency filter and compressing.

The base is linear predictive coding, which encodes the voice based on a simple model of the human mouth and throat. Huge compression, but it sounds terrible. Then you take the error between the original signal and the LPC-encoded signal; this residual waveform is compressed heavily but more conventionally, and transmitted along with the LPC parameters.

Phones also layer on voice activity detection: when you aren't talking, the system just transmits noise parameters and the other end hears some tailored white noise. As phone calls typically have one person speaking at a time and there are frequent pauses in speech, this is a huge win. But it also makes mistakes, especially in noisy environments (like call centers; voice calls are their business, why are they so bad?). When this happens the system becomes unintelligible, because it isn't even trying to encode the voice.
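A toy sketch of the LPC idea (not any real codec): predict each sample as a linear combination of the previous p samples, and keep only the much smaller prediction error as the residual.

  import numpy as np

  def lpc_residual(x, p=10):
      # Row t of X holds the previous p samples [x[t-1], ..., x[t-p]].
      X = np.column_stack([x[p - k - 1 : len(x) - k - 1] for k in range(p)])
      y = x[p:]
      coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit of the predictor
      residual = y - X @ coeffs                       # what's left after prediction
      return coeffs, residual

  # A decaying "voiced" tone plus a little noise, sampled at 8 kHz.
  t = np.arange(8000) / 8000.0
  x = np.sin(2 * np.pi * 200 * t) * np.exp(-3 * t) + 0.01 * np.random.randn(t.size)
  _, residual = lpc_residual(x)
  print(np.std(x), np.std(residual))  # the residual carries far less energy than the signal

Real speech codecs fit the predictor on short frames (typically via Levinson-Durbin on the autocorrelation) and spend most of their bits on cleverly coding that residual, but the shape of the trick is the same.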


The 8 kHz samples were encoded with relatively low-complexity PCM (G.711) at 8 bits per sample. That gets you to a 64 kbps data channel rate. This was the standard for "toll quality" audio. Not 8 kbps.

The 8 kbps rates on cellular come from the more complicated (relative to G.711) AMR-NB encoding. AMR supports voice rates from about 5 to 12 kbps, with a typical rate of 8 kbps. There's a lot more pre- and post-processing of the input signal and more involved encoding. There's a bit more voice information dropped by the encoder.

Part of the quality problem even today with VoLTE is that different carriers support different profiles, and calls between carriers will often drop down to the lowest common codec, which is usually AMR-NB. There are higher-bitrate and better codecs available in the standard, but they're implemented differently by different carriers for shitty cellular carrier reasons.
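A quick back-of-the-envelope check of those numbers:

  # G.711: 8,000 samples/s x 8 bits/sample = the classic 64 kbps "toll quality" channel.
  print(8_000 * 8)  # 64000 bits per second

  # AMR-NB operating modes (kbps) -- the "about 5 to 12 kbps" range mentioned above.
  amr_nb_kbps = [4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2]
  print(min(amr_nb_kbps), max(amr_nb_kbps))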


> The 8 kHz samples were encoded with relatively low-complexity PCM (G.711) at 8 bits per sample. That gets you to a 64 kbps data channel rate. This was the standard for "toll quality" audio. Not 8 kbps.

I'm a moron, thanks. I think I got the sample rate mixed up with the bitrate. Appreciate you clearing that up - and the other info!


And memory. In the heyday of ATM (late 90s) a few megabytes was quite expensive for a set-top box, so you couldn't buffer many seconds of compressed video.

Also, the phone companies had a pathological aversion to understanding Moore's law, because it suggested they'd have to charge half as much for bandwidth every 18 months. Long distance rates had gone down more like 50%/decade, and even that was too fast.


Love those analogies. This is one of the main reasons I love Hacker News / Reddit. Honest, golden experiences.


I worked on a network that used a protocol very similar to ATM (actually it was the first Iridium satellite network). An internet based on ATM would have been amazing. You’re basically guaranteeing a virtual switched circuit, instead of the packets we have today. The horror of packet switching is all the buffering it needs, since it doesn’t guarantee circuits.

Bandwidth is one thing, but the real benefit is that ATM also guaranteed minimal latencies. You could now shave off another 20-100ms of latency for your FaceTime calls, which is subtle but game changing. Just instant-on high def video communications, as if it were on closed circuits to the next room.

For the same reasons, the AI analogy could benefit from both huge processing as well as stronger algorithms.


> You’re basically guaranteeing a virtual switched circuit

Which means you need state (and the overhead that goes with it) for each connection within the network. That's horribly inefficient, and precisely the reason packet-switching won.

> An internet based on ATM would have been amazing.

No, we'd most likely be paying by the socket connection (as somebody has to pay for that state keeping overhead), which sounds horrible.

> You could now shave off another 20-100ms of latency for your FaceTime calls, which is subtle but game changing.

Maybe on congested Wi-Fi (where even circuit switching would struggle) or poorly managed networks (including shitty ISP-supplied routers suffering from horrendous bufferbloat). Definitely not on the majority of networks I've used in the past years.

> The horror of packet switching is all the buffering it needs [...]

The ideal buffer size is exactly the bandwidth-delay product. That's really not a concern these days anymore. If anything, buffers are much too large, causing unnecessary latency; that's where bufferbloat-aware scheduling comes in.
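To put a number on that, with illustrative figures: the bandwidth-delay product of a 100 Mbps path with a 40 ms round-trip time is only about half a megabyte.

  bandwidth_bps = 100e6        # 100 Mbps link
  rtt_s = 0.040                # 40 ms round-trip time
  bdp_bytes = bandwidth_bps * rtt_s / 8
  print(bdp_bytes / 1e6)       # 0.5 -> ~0.5 MB of buffer is enough to keep the link full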


The cost for interactive video would be a requirement of 10x bandwidth, basically to cover idle time. Not efficient but not impossible, and definitely wouldn’t change ISP business models.

The latency benefit would outweigh the cost. Just absolutely instant video interaction.


It is fascinating to think that before digital circuits phone calls were accomplished by an end-to-end electrical connection between the handsets. What luxury that must have been! If only those ancestors of ours had modems and computers to use those excellent connections for low-latency gaming... :-)


Einstein would like to have a word…

And for the little bit of impact queueing latency has (if done well, i.e. no bufferbloat), I doubt anyone would notice the difference, honestly.


You’re arguing for a reduction in quality in internet services. People do notice those things. It’s like claiming people don’t care about slimmer iPhones. They do.


Man, I saw a presentation on Iridium when I was at Motorola in the early 90s, maybe 92? Not a marketing presentation - one where an engineer was talking, and had done their own slides.

What I recall is that it was at a time when Internet folks had made enormous advances in understanding congestion behavior in computer networks, and other folks (e.g. my division of Motorola) had put a lot of time into understanding the limited burstiness you get with silence suppression for packetized voice, and these folks knew nothing about it.


I remember my professor saying how the fixed packet size in ATM (53 bytes) was a committee compromise. North America wanted 64 bytes, Europe wanted 32 bytes. The committee chose around the midway point.


The 53-byte cell is what results from the exact compromise of 48 bytes for the payload size, plus the 5-byte header.
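Spelling out the arithmetic:

  payload = (32 + 64) // 2   # 48 bytes: the midpoint between the two proposals
  header = 5                 # the ATM cell header
  print(payload + header)    # 53-byte cells
  print(header / 53)         # ~0.094 -> roughly 9.4% fixed overhead, the so-called "cell tax"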


> ... guaranteed minimal latencies. You could now shave off another 20-100ms of latency for your FaceTime calls...

I already do this. But I cheat - I use a good router (OpenWrt One) that has built-in controls for Bufferbloat. See [How OpenWrt Vanquishes Bufferbloat](https://forum.openwrt.org/t/how-openwrt-vanquishes-bufferblo...)


> The horror of packet switching is all the buffering it needs, since it doesn’t guarantee circuits.

You don't actually need all that much buffering.

Buffer bloat is actually a big problem with conventional TCP. See eg https://news.ycombinator.com/item?id=14298576


Doesn’t your point about video compression tech support Nvidia’s bull case?

Better video compression led to an explosion in video consumption on the Internet, leading to much more revenue for companies like Comcast, Google, T-Mobile, Verizon, etc.

More efficient LLMs lead to much more AI usage. Nvidia, TSMC, etc will benefit.


No - because this eliminates entirely or shifts the majority of work from GPU to CPU - and Nvidia does not sell CPUs.

If the AI market gets 10x bigger, and GPU work gets 50% smaller (which is still 5x larger than today) - but Nvidia is priced on 40% growth for the next ten years (28x larger) - there is a price mismatch.

It is theoretically possible for a massive reduction in GPU usage or shift from GPU to CPU to benefit Nvidia if that causes the market to grow enough - but it seems unlikely.

Also, I believe (someone please correct if wrong) DeepSeek is claiming a 95% overall reduction in GPU usage compared to traditional methods (not the 50% in the example above).

If true, that is a death knell for Nvidia's growth story after the current contracts end.
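Spelling out the arithmetic behind those scenarios:

  print(10 * 0.5)             # 5.0  -> a 10x market with half the GPU work per unit is still 5x today
  print(round(1.4 ** 10, 1))  # 28.9 -> what "40% growth for ten years" compounds to
  print(10 * 0.05)            # 0.5  -> at a 95% reduction, even a 10x market needs only half today's GPU work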


I can see close to zero possibility that the majority of the work will be shifted to the CPU. Anything a CPU can do can just be done better with specialised GPU hardware.


Then why do we have powerful CPUs instead of a bunch of specialized hardware? It's because the value of a CPU is in its versatility and ubiquity. If a CPU can do a thing good enough, then most programs/computers will do that thing on a CPU instead of having the increased complexity and cost of a GPU, even if a GPU would do it better.


We have both? Modern computing devices like smart phones use SoCs with integrated GPUs. GPUs aren't really specialized hardware, either, they are general purpose hardware useful in many scenarios (built for graphics originally but clearly useful in other domains including AI).


People have been saying the exact same thing about other workloads for years, and always been wrong. Mostly claiming custom chips or FPGAs will beat out general purpose CPUs.


> Anything a CPU can do can just be done better

Nope. Anything inherently serial is better off on the CPU due to caching and its architecture.

Many things that are highly parallelizable are getting GPU-enabled. Games and ML are GPU by default, but many things are migrating to CUDA.

You need both for cheap, high performance computing. They are different workloads.


Yes, I was too hasty in my response. I should have been more specific that I mean ML/AI type tasks. I see no way that we end up on general purpose CPUs for this.


The graphics in games are GPU by default. But the game logic itself is seldom run on the GPU, as far as I can tell.


In terms of inference (and training) of AI models, sure, most things that a CPU core can do would be done cheaper per unit of performance on either typical GPU or NPU cores.


On desktop, CPU decoding is passable, but it's still better to have a graphics card for 4K. On mobile, you definitely want to stick to codecs like H.264/HEVC/AV1 that are supported by your phone's decoder chips.

CPU chipsets have borrowed video decoder units and SSE instructions from GPU-land, but the idea that video decoding is a generic CPU task now is not really true.

Now maybe every computer will come with an integrated NPU and it won't be made by Nvidia, although so far integrated GPUs haven't supplanted discrete ones.

I tend to think today's state-of-the-art models are ... not very bright, so it might be a bit premature to say "640B parameters ought to be enough for anybody" or that people won't pay more for high-end dedicated hardware.


> Now maybe every computer will come with an integrated NPU and it won't be made by Nvidia, although so far integrated GPUs haven't supplanted discrete ones.

Depends on what form factor you are looking at. The majority of computers these days are smart phones, and they are dominated by systems-on-a-chip.


That's just factually wrong, DeepSeek is still terribly slow on CPUs. There's nothing different about how it works numerically.


  No - because this eliminates entirely or shifts the majority of work from GPU to CPU - and Nvidia does not sell CPUs.
I'm not even sure how to reply to this. GPUs are fundamentally much more efficient for AI inference than CPUs.


I think SIMD is not so much better than SIMT for solved problems as a level in claiming a problem as solved.


What do you think GPUs are? Basically SIMD ASICs.


That's also what AVX is, but with a conservative number of threads. If you really understand your problem, I don't see why you would need 32 threads of much smaller data size, or why you would want that far away from your CPU.

Whether your new coprocessor or instructions look more like a GPU or something else doesn't really matter if we are done squinting and calling it graphics like problems and/or claiming it needs a lot more than a middle class PC.


It lead to more revenue for the industry as a whole. But not necessarily for the individual companies that bubbled the hardest: Cisco stock is still to this day lower than it was at peak in 2000, to point to a significant company that sold actual physical infra products necessary for the internet and still around and profitable to this day. (Some companies that bubbled did quite well, AMZN is like 75x from where it was in 2000. But that's a totally different company that captured an enormous amount of value from AWS that was not visible to the market in 2000, so it makes sense.)

If stock market-cap is (roughly) the market's aggregated best guess of future profits integrated over all time, discounted back to the present at some (the market's best guess of the future?) rate, then increasing uncertainty about the predicted profits 5-10 years from now can have enormous influence on the stock. Does NVDA have an AWS within it now?


>It lead to more revenue for the industry as a whole. But not necessarily for the individual companies that bubbled the hardest: Cisco stock is still to this day lower than it was at peak in 2000, to point to a significant company that sold actual physical infra products necessary for the internet and still around and profitable to this day. (Some companies that bubbled did quite well, AMZN is like 75x from where it was in 2000. But that's a totally different company that captured an enormous amount of value from AWS that was not visible to the market in 2000, so it makes sense.)

Cisco in 1994: $3.

Cisco after dotcom bubble: $13.

So is Nvidia's stock price closer to 1994 or 2001?


I agree that advancements like DeepSeek, like transformer models before it, are just going to end up increasing demand.

It’s very shortsighted to think we’re going to need fewer chips because the algorithms got better. The system became more efficient, which causes induced demand.


It will increase the total volume demanded, but not necessarily the amount of value that companies like NVidia can capture.

Most likely, consumer surplus has gone up.


More demand for what, chatbots? ai slop? buggy code?



No, it doesn't.

Not only are 10-100x changes disruptive, but the players who don't adopt them quickly are going to be the ones who continue to buy huge amounts of hardware to pursue old approaches, and it's hard for incumbent vendors to avoid catering to their needs, up until it's too late.

When everyone gets up off the ground after the play is over, Nvidia might still be holding the ball but it might just as easily be someone else.


If you normalize Nvidia's gross margin and take competitors into account, sure. But its current high margin is driven by Big Tech FOMO. Do keep in mind that going from a 90% margin (10x cost) to a 50% margin (2x cost) is a 5x price reduction.
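The margin math, spelled out (gross margin = (price - cost) / price, so price = cost / (1 - margin)):

  def price_multiple_of_cost(gross_margin):
      return 1 / (1 - gross_margin)   # price as a multiple of cost

  print(price_multiple_of_cost(0.90))                                  # 10.0x cost
  print(price_multiple_of_cost(0.50))                                  # 2.0x cost
  print(price_multiple_of_cost(0.90) / price_multiple_of_cost(0.50))   # 5.0 -> a 5x price reduction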


So why would DeepSeek decrease FOMO? It should increase it if anything.


Because DeepSeek demonstrates that loads of compute isn't necessary for high-performing models, and so we won't need as much and as powerful of hardware as was previously thought, which is what Nivida's valuation is based on?


That's assuming there isn't demand for more powerful models, there's still plenty of room for improvement from the current generation. We didn't stop at GPT-3 level models when that was achieved.


It improves TSMC's case. Paying Nvidia would be like paying Cray for every smartphone that is faster than a supercomputer of old.


Yes, over the long haul, probably. But as far as individual investors go they might not like that Nvidia.

Anyone currently invested is presumably in because they like the insanely high profit margin, and this is apt to quash that. There is now much less reason to give your first born to get your hands on their wares. Comcast, Google, T-Mobile, Verizon, etc., and especially those not named Google, have nothingburger margins in comparison.

If you are interested in what they can do with volume, then there is still a lot of potential. They may even be more profitable on that end than a margin play could ever hope for. But that interest is probably not from the same person who currently owns the stock, it being a change in territory, and there is apt to be a lot of instability as stock changes hands from the one group to the next.


> Anyone currently invested is presumably in because they like the insanely high profit margin, [...]

I'm invested in Nvidia because it's part of the index that my ETF is tracking. I have no clue what their profit margins are.


> I'm invested in Nvidia [...] my ETF

That would be an unusual situation for an ETF. An ETF does not usually extend ownership of the underlying investment portfolio. An ETF normally offers investors the opportunity to invest in the ETF itself. The ETF is what you would be invested in. Your concern as an investor in an ETF would only be with the properties of the ETF, it being what you are invested in, and this seems to be true in your case as well given how you describe it.

Are you certain you are invested in Nvidia? The outcome of the ETF may depend on Nvidia, but it may also depend on how a butterfly in Africa happens to flap its wings. You aren't, by any common definition found within this type of context, invested in that butterfly.


Technically, all the Nvidia stock (and virtually all stocks in the US) are owned by Cede and Co. So Nvidia has only one investor.[0] There's several layers of indirection between your Robinhood portfolio and the actual Nvidia shares, even if Robinhood mentions NVDA as a position in your portfolio.

The ETF is just one more layer of indirection. You might like to read https://en.wikipedia.org/wiki/Exchange-traded_fund#Arbitrage... to see how ETFs are connected to the underlying assets.

You will find that the connection between ETFs and the underlying assets in the index is much more like the connection between your Robinhood portfolio and Nvidia, than the connection between butterflies and thunderstorms.

[0] At least for its stocks. Its bonds are probably held in different but equally weird ways.


> Technically, all the Nvidia stock (and virtually all stocks in the US) are owned by Cede and Co.

Technically, but they extend ownership. An ETF is a different type of abstraction. Which you already know because you spoke about that abstraction in your original comment, so why play stupid now?


I have no clue what you mean by 'extend ownership', and it's supposed to be different from what ETFs are doing.

An ETF typically holds the underlying assets, and you own a part of the ETF.


It seems even more stark. The current and projected energy costs for AI are staggering. At the same time, I think it has been MS that has been publishing papers on smaller LLMs (so-called small language models) that are more targeted and still achieve a fairly high "accuracy rate."

Didn't TSMC say that SamA came for a visit and said they needed $7T in investment to keep up with the pending demand?

This stuff is all super cool and fun to play with. I'm not a naysayer, but it almost feels like these current models are "bubble sort," and who knows how it will look if the "quicksort" for them gets invented.


>but there's no physical limit preventing them from being discovered, and the disruption when someone invents a new algorithm can be nearly immediate.

The rise of the net is Jevons paradox fulfilled. The orders of magnitude less bandwidth needed per cat video drove much more than that in overall growth in demand for said videos. During the dotcom bubble's collapse, bandwidth use kept going up.

Even if there is a near-term bear case for NVDA (dotcom bubble/bust), history indicates a bull case for the sector overall and related investments such as utilities (the entire history of the tech sector from 1995 to today).


Another example: people like to cite how the people who really made money in the CA gold rush were selling picks and shovels.

That only lasted so long. Then it was heavy machinery (hydraulics, excavators, etc)


I always like the "look" of high-bitrate MPEG-2 video. Download HD Japanese TV content from 2005-2010 and it still looks really good.


I love algorithms as much as the next guy, but not really.

DCT was developed in 1972 and has a compression ratio of 100:1.

H.264 compresses 2000:1.

And standard resolution (480p) is ~1/30th the resolution of 4k.

---

I.e. Standard resolution with DCT is smaller than 4k with H.264.

Even high-definition (720p) with DCT is only twice the bandwidth of 4k H.264.

Modern compression has allowed us to add a bunch more pixels, but it was hardly a requirement for internet video.
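Checking that arithmetic with rough pixel counts and the quoted ratios (assuming the same bits per raw pixel throughout):

  px_480p = 640 * 480
  px_720p = 1280 * 720
  px_4k = 3840 * 2160

  # Relative compressed size ~ pixels / compression ratio.
  sd_dct = px_480p / 100       # 480p at 100:1
  hd_dct = px_720p / 100       # 720p at 100:1
  uhd_h264 = px_4k / 2000      # 4K at 2000:1

  print(sd_dct / uhd_h264)     # ~0.74 -> 480p at 100:1 is indeed smaller than 4K H.264
  print(hd_dct / uhd_h264)     # ~2.2  -> 720p at 100:1 is roughly twice 4K H.264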


The web didn't go from streaming 480p straight to 4k. There were a couple of intermediate jumps in pixel count that were enabled in large part by better compression. Notably, there was a time period where it was important to ensure your computer had hardware support for H.264 decode, because it was taxing on low-power CPUs to do at 1080p and you weren't going to get streamed 1080p content in any simpler, less efficient codec.


Right.

Modern compression algorithms had been developed, but for some of that time they weren't even computationally feasible to run.


DCT is not an algorithm at all, it’s a mathematical transform.

It doesn’t have a compression ratio.


Correct. DCT maps N real numbers to N real numbers. It reorganizes the data to make it more amenable to compression, but DCT itself doesn't do any compression.

The real compression comes from quantization and entropy coding (Huffman coding, arithmetic coding, etc.).
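A toy 8x8 example of that point in NumPy/SciPy: the DCT alone is perfectly invertible, and the lossy step is quantization, which zeroes out most coefficients so the entropy coder has far less to encode. (The quantizer step here is made up for illustration.)

  import numpy as np
  from scipy.fft import dctn, idctn

  # A smooth 8x8 image patch; smooth data is where the DCT's energy compaction shows up.
  block = np.add.outer(np.arange(8.0), np.arange(8.0)) * 16

  coeffs = dctn(block, norm="ortho")
  print(np.allclose(idctn(coeffs, norm="ortho"), block))   # True: the transform itself is lossless

  q = 50.0                                                 # crude uniform quantizer step (illustrative)
  quantized = np.round(coeffs / q)
  print(np.count_nonzero(quantized), "of 64 coefficients survive quantization")

  reconstructed = idctn(quantized * q, norm="ortho")       # lossy, but close to the original block
  print(np.max(np.abs(reconstructed - block)))             # small error relative to the 0-224 pixel range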


> DCT compression, also known as block compression, compresses data in sets of discrete DCT blocks.[3] DCT blocks sizes including 8x8 pixels for the standard DCT, and varied integer DCT sizes between 4x4 and 32x32 pixels.[1][4] The DCT has a strong energy compaction property,[5][6] capable of achieving high quality at high data compression ratios.[7][8] However, blocky compression artifacts can appear when heavy DCT compression is applied.

https://en.wikipedia.org/wiki/Discrete_cosine_transform


Exactly, it’s not an algorithm, it’s one mechanism used in many (most?) compression algorithms.

Therefore, it has no compression ratio, and it doesn’t make sense to compare it to other algorithms.


I'm sure it helped, but yeah, not only e2e bandwidth but also the total network throughput increased by vast orders of magnitude.


Yes, that is a very apt analogy!



