I'm surprised to hear all packets are processed in userspace... If one is doing ...

hackernudes · 2025-05-09T05:02:20 1746766940

> Drawing on existing research [3], our preliminary analysis of these programs and configurations suggests that the network stack architecture is somewhat similar to DPDK [4], mainly relying on a user-space C++ program to bypass the kernel for handling network packets.

The way it usually works is that the initial packets of a flow are handled in software, but once the endpoints are established the traffic flows through hardware. Sometimes certain traffic patterns are always handled in software. The software could be a patched kernel or an XDP-style kernel bypass.

Source: worked peripherally on an Intel Puma cable modem router/gateway that used DPDK or something like it. So I'm not 100% sure, but it is an educated guess.
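To make the slow-path/fast-path split concrete, here is a toy sketch in Python. It is only an illustration of the general idea, not how the Puma or Starlink firmware works: the `Gateway` class, the `offload_table` dict standing in for a hardware flow table, and the trivial port-selection rule are all made up for the example.

```python
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    """5-tuple identifying a flow (hypothetical, for illustration)."""
    src: str
    dst: str
    sport: int
    dport: int
    proto: str


class Gateway:
    def __init__(self) -> None:
        # Stands in for the hardware flow table that software programs.
        self.offload_table: Dict[FlowKey, str] = {}
        self.slow_path_hits = 0
        self.fast_path_hits = 0

    def slow_path(self, key: FlowKey) -> str:
        """Software handling: make the full decision, then install the flow."""
        self.slow_path_hits += 1
        # Placeholder policy: anything not addressed to the LAN goes out the WAN.
        out_port = "lan" if key.dst.startswith("192.168.") else "wan"
        self.offload_table[key] = out_port
        return out_port

    def handle(self, key: FlowKey) -> str:
        """Known flows take the fast path; unknown flows fall back to software."""
        out_port = self.offload_table.get(key)
        if out_port is not None:
            self.fast_path_hits += 1
            return out_port
        return self.slow_path(key)


if __name__ == "__main__":
    gw = Gateway()
    key = FlowKey("192.168.1.10", "203.0.113.5", 40000, 443, "tcp")
    for _ in range(5):
        gw.handle(key)
    print(gw.slow_path_hits, gw.fast_path_hits)
```

Only the first packet of the flow reaches `slow_path`; the remaining four are served from the table, which is the part real hardware would do without involving the CPU.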

dilyevsky · 2025-05-09T07:34:12 1746776052

> I'm surprised to hear all packets are processed in userspace...

Specifically for forwarding, a DPDK-style approach can be faster because it incurs fewer buffer copies.

Starlink only does 25-200 Mbps, and average packets are about 7-8x larger than the 100-byte UDP packets quoted, so at most you're doing ~36,000 PPS, which is pretty manageable even at 1 GHz.
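The arithmetic behind that estimate can be checked with a few lines of Python. This is a back-of-the-envelope sketch: it assumes the "7-8x" figure is relative to 100-byte packets (so 700-800 bytes on average), ignores framing overhead, and the function names are just for illustration.

```python
def packets_per_second(rate_bps: float, packet_bytes: int) -> float:
    """Packets per second needed to fill rate_bps with packets of the given size."""
    return rate_bps / (packet_bytes * 8)


def cycles_per_packet(clock_hz: float, rate_bps: float, packet_bytes: int) -> float:
    """CPU cycles available per packet on a single core running at clock_hz."""
    return clock_hz * packet_bytes * 8 / rate_bps


if __name__ == "__main__":
    for size in (100, 700, 800, 1500):
        pps = packets_per_second(200e6, size)
        budget = cycles_per_packet(1e9, 200e6, size)
        print(f"{size:>5} B: {pps:>9.0f} pps, {budget:>8.0f} cycles/packet")
```

At 200 Mbps and 700-byte packets this comes to about 35,700 packets per second, which leaves roughly 28,000 cycles per packet on a 1 GHz core.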

riehwvfbk · 2025-05-09T05:03:03 1746766983

Why would it be any less efficient than processing the packets in the kernel? There's a way to map the hardware queues into userspace (the article talks about the system being DPDK-like). At that point why does it matter that the polling code isn't in the kernel?

londons_explore · 2025-05-09T06:12:30 1746771150

Most hardware above 100 Mbps has hardware offload, i.e. the hardware is told which packets to send where, and software doesn't touch individual packets (except rare ones like ping).

riehwvfbk · 2025-05-09T18:10:30 1746814230

Yes, but you can have GSO and GRO in combination with userspace protocol processing.

Hikikomori · 2025-05-09T15:33:21 1746804801

Not really. You can easily find CPU-only routers for that speed in every product segment, from home routers to enterprise, and you could even many years ago.

rapsey · 2025-05-09T05:50:20 1746769820

> which is 100 byte UDP packets

100 bytes?? Starlink has a regular 1500-byte MTU.

tuetuopay · 2025-05-09T08:08:47 1746778127

In networking, it is the norm to measure performance in packets per second, and therefore with small packets. Unless you're performing DPI or encryption, routers only use the headers to make routing decisions, so whether the payload is 10 bytes or 1000 bytes does not matter: the processing cost will be identical. Only the hardware bandwidth matters for large packets, though this is rarely the issue (I've hit DDR4 limits once using XDP, and fixed it by adding another stick of memory).
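A small Python sketch of why payload size does not enter the forwarding decision: the lookup below reads only the destination address from the IPv4 header. The routing table, the next-hop labels, and the `make_packet` helper are invented for the example, and a real router would use a trie or TCAM rather than a linear scan.

```python
import ipaddress
import struct

# Hypothetical routing table: (prefix, next-hop label).
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/8"), "lan"),
    (ipaddress.ip_network("10.1.0.0/16"), "lab"),
    (ipaddress.ip_network("0.0.0.0/0"), "uplink"),
]


def next_hop(packet: bytes) -> str:
    """Longest-prefix match on the IPv4 destination address."""
    # The destination address is bytes 16..19 of the header; the payload is never read.
    dst = ipaddress.ip_address(packet[16:20])
    matches = [(net.prefixlen, hop) for net, hop in ROUTES if dst in net]
    return max(matches)[1]


def make_packet(dst: str, payload_len: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left at 0) plus a zeroed payload."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + payload_len, 0, 0, 64, 17, 0,
        ipaddress.ip_address("192.0.2.1").packed,
        ipaddress.ip_address(dst).packed,
    )
    return header + bytes(payload_len)


if __name__ == "__main__":
    for payload_len in (10, 1000):
        print(payload_len, next_hop(make_packet("10.1.2.3", payload_len)))
```

Both packet sizes take exactly the same path through `next_hop`, so the per-packet cost is the same; only moving the bytes around scales with size.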

Tepix · 2025-05-09T07:48:01 1746776881

With RTP traffic you often have lots of small packets.

_joel · 2025-05-09T11:32:49 1746790369

Doing it in userspace avoids another memcpy, so it's much faster.

jcalvinowens · 2025-05-09T23:46:22 1746834382

Modern Linux can do zero-copy network I/O through the kernel, although it requires some effort.
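As one small illustration, the oldest and simplest case is `sendfile(2)`, which Python exposes through `socket.sendfile()`. This sketch only covers sending a file over a stream socket; newer mechanisms such as `MSG_ZEROCOPY` need more setup and are not shown. The helper name `send_file_zero_copy` is made up for the example.

```python
import socket


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send a whole file over a connected, blocking stream socket.

    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where available, so the kernel
        # moves the file data to the socket without copying it through a
        # userspace buffer. On platforms without it, Python falls back to send().
        return sock.sendfile(f)
```

Usage is the same as a normal send: connect the socket, then call `send_file_zero_copy(sock, "/path/to/file")`.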