
I'm surprised to hear all packets are processed in userspace...

If one is doing 1 Gbps of traffic made up of 100-byte UDP packets, that's roughly a million packets per second you're gonna need to process.

A 1 GHz CPU then only gets about 1000 cycles to process each one...

Very doable, but certainly not easy unless your engineers like hand coding assembly and having to think about every lookup table trick in the book...
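
The back-of-the-envelope budget above can be checked with a short sketch. It assumes the 100 bytes are the full on-wire packet size with no framing overhead (an assumption, the comment does not say), and the function names are made up for illustration. Under that assumption the exact figures are 1.25 million packets per second and 800 cycles each, the same ballpark as the rounded numbers.

```python
def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Packets per second needed to saturate a link with fixed-size packets.

    Assumes packet_bytes is the total on-wire size (no extra framing overhead).
    """
    return link_bps / (packet_bytes * 8)


def cycles_per_packet(cpu_hz: float, pps: float) -> float:
    """Average CPU cycles available per packet on a single core."""
    return cpu_hz / pps


pps = packets_per_second(1e9, 100)      # 1 Gbps of 100-byte packets
budget = cycles_per_packet(1e9, pps)    # single 1 GHz core
print(f"{pps:,.0f} packets/s, {budget:,.0f} cycles per packet")
# prints "1,250,000 packets/s, 800 cycles per packet"
```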



> Drawing on existing research [3], our preliminary analysis of these programs and configurations suggests that the network stack architecture is somewhat similar to DPDK [4], mainly relying on a user-space C++ program to bypass the kernel for handling network packets.

The way it usually works is that the initial packets are handled in software, but once the endpoints are established the traffic flows through hardware. Sometimes certain patterns are always handled in software. The software could be a patched kernel or an XDP-style kernel bypass.

Source: worked peripherally on an Intel Puma cable modem router/gateway that used DPDK or something like it. So I'm not 100% sure, but it is an educated guess.


> I'm surprised to hear all packets are processed in userspace...

Specifically for forwarding, a DPDK-style approach can be faster because it incurs fewer buffer copies.

Starlink only does 25-200 Mbps and average packets are like 7-8x larger, so at most you're doing ~36,000 PPS, which is pretty manageable even on 1 GHz.
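
A rough check of that estimate, assuming "7-8x larger" means 700 to 800 bytes per packet (relative to the 100-byte packets upthread) and a single 1 GHz core. The helper name is hypothetical.

```python
def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Packets per second for a given throughput and fixed packet size."""
    return link_bps / (packet_bytes * 8)


CPU_HZ = 1e9  # assumed single 1 GHz core

for mbps in (25, 200):
    for size in (700, 800):
        pps = packets_per_second(mbps * 1e6, size)
        print(f"{mbps:>3} Mbps, {size} B packets: "
              f"{pps:>7,.0f} pps, {CPU_HZ / pps:>9,.0f} cycles/packet")
```

The worst case in this range (200 Mbps at 700 bytes) comes out to about 35,700 packets per second, or roughly 28,000 cycles per packet.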


Why would it be any less efficient than processing the packets in the kernel? There's a way to map the hardware queues into userspace (the article talks about the system being DPDK-like). At that point why does it matter that the polling code isn't in the kernel?


Most hardware >100 Mbps has hardware offload, i.e. the hardware is told which packets to send where, and software doesn't touch individual packets (except rare packets like ping).


Yes, but you can have GSO and GRO in combination with userspace protocol processing.


Not really. You can easily find CPU-only routers at that speed in every product segment, from home routers to enterprise, and you could even many years ago.


> which is 100 byte UDP packets

100 byte?? Starlink has a regular 1500-byte MTU.


In networking, it is the norm to measure performance in packets per second, and so with small packets. Unless you're performing DPI or encryption, routers only use the headers to make routing decisions, so whether the payload is 10 bytes or 1000 bytes does not matter: the processing cost will be identical. Only the hardware bandwidth will matter for large packets, though this is rarely the issue (I've hit DDR4 limits once using XDP, and fixed it by adding another stick of memory).
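
To illustrate the header-only point, here is a toy sketch of a forwarding decision on a raw IPv4 packet. The routing table, port names, and helper functions are all hypothetical, and a real router would use a trie or hardware lookup rather than a linear scan. The lookup reads only bytes 16 to 19 of the header (the destination address), so it does the same work for a 10-byte and a 1000-byte payload.

```python
import ipaddress
import struct

# Hypothetical routing table: (prefix, output port).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),
    (ipaddress.ip_network("10.0.0.0/8"), "eth1"),
    (ipaddress.ip_network("10.1.0.0/16"), "eth2"),
]


def next_hop(packet: bytes) -> str:
    """Longest-prefix match on the IPv4 destination address.

    Only header bytes 16..19 are read; the payload is never touched.
    """
    dst = ipaddress.ip_address(packet[16:20])
    matches = [(net.prefixlen, port) for net, port in ROUTES if dst in net]
    return max(matches)[1]


def make_packet(dst: str, payload_len: int) -> bytes:
    """Build a minimal IPv4 packet (20-byte header, zeroed payload) for the demo."""
    header = bytearray(20)
    header[0] = 0x45                                      # version 4, IHL 5
    struct.pack_into("!H", header, 2, 20 + payload_len)   # total length
    header[16:20] = ipaddress.ip_address(dst).packed      # destination address
    return bytes(header) + bytes(payload_len)


small = make_packet("10.1.2.3", 10)
large = make_packet("10.1.2.3", 1000)
print(next_hop(small), next_hop(large))  # prints "eth2 eth2"
```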


With RTP traffic you often have lots of small packets.


Doing it in userspace avoids another memcpy, so it's much faster.


Modern Linux can do zero-copy network I/O through the kernel, although it requires some effort.



