> I’ve found UDP to be great for latency but pretty awful for throughput.
UDP/multicast can provide excellent throughput. It's the de facto standard for market data on all major financial exchanges. For example, the OPRA feed (a consolidated market data feed of all options trading) can easily burst to ~17Gbps. Typically there is an "A" feed and a "B" feed for redundancy, so now you're talking about ~34Gbps of data entering your network for this particular feed.
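Subscribing to a feed like this is just an IGMP multicast join at the socket level. A minimal sketch in Python (the group and port below are placeholders, not real OPRA assignments, which come from the exchange's distribution documentation):

```python
import socket
import struct

def join_feed(group: str, port: int, iface: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket and join a multicast group on the given interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # IP_ADD_MEMBERSHIP takes the group address and the local interface.
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock

# Typical usage (blocks waiting for feed traffic; group/port are hypothetical):
#   feed_a = join_feed("233.54.12.1", 18000)
#   packet, _ = feed_a.recvfrom(65535)
#   ... decode sequence numbers, arbitrate A against B, etc.
```

In practice you'd run one of these per feed (A and B) and arbitrate on sequence numbers, taking whichever copy of a packet arrives first.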
Also, when network engineers do stress testing with iperf we typically use UDP to avoid issues with TCP/overhead.
That’s interesting. And I’m sure they have some very knowledgeable people working for them who may (or will) know things I don’t.
That being said, it wouldn’t surprise me if they were pushing 17G of UDP on 100G transports, probably with some pretty high-end/expensive network hardware with huge buffers. I.e. you can do it if you’ve got the money, but I bet TCP would still have better raw throughput.
Yep, 100G switches are common nowadays since the cost has come down so much, and you can easily carve a port into 4x10G, 4x25G, or 40G. In financial trading you tend to avoid switches with huge buffers, as that comes at a huge cost in latency. For example, 2 megabytes of buffer is 1.68ms of latency on a 10G switch, which is an eon in trading. Most opt for cut-through switches with shallow buffers measured in hundreds of nanoseconds. If you want to get really crazy, there are L1 switches that can do 5ns.
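The 1.68ms figure is just the serialization delay of a full buffer: the time a packet can spend waiting behind buffered bytes before it gets onto the wire. As a quick sanity check:

```python
def buffer_delay_ms(buffer_bytes: int, link_gbps: float) -> float:
    """Worst-case queuing delay of a full buffer draining onto one link."""
    bits = buffer_bytes * 8
    return bits / (link_gbps * 1e9) * 1e3  # seconds -> milliseconds

# 2 MiB of buffer ahead of you on a 10G port:
print(buffer_delay_ms(2 * 1024 * 1024, 10))  # ~1.678 ms
```

The same 2 MiB on a 100G port is still ~168µs, which is why deep-buffer switches are a non-starter for latency-sensitive trading regardless of link speed.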
That is a really good point that I hadn’t considered. Presumably this comes at the risk of dropped packets if the upstream link becomes saturated? Does one just size the links accordingly to avoid that?
Basically yes, but the links themselves are controlled by the exchanges (and tied in to your general contract for market access).
In general UDP is not a problem in the space because of overprovisioning. Think "algorithms are for people who don't know how to buy more RAM", but with a financial industry budget behind it.
It’s actually pretty easy to monitor the throughput with the right tools. The network capture appliance I use can measure microbursts at 1ms time intervals. With low-latency/cut-through switches there are limited buffers by design. You are certain to drop packets if you are trying to subscribe to a feed that can burst to 17Gbps on a 10Gbps port.
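You can see why drops are guaranteed with a bit of arithmetic: over any microburst window, the bytes arriving above line rate either fit in the shallow buffer or get dropped. A rough back-of-the-envelope (the 64 KiB buffer is an illustrative shallow-buffer figure, not a specific switch's spec):

```python
def excess_bytes(burst_gbps: float, port_gbps: float,
                 window_ms: float, buffer_bytes: int) -> float:
    """Bytes that overflow (are dropped) during one microburst window."""
    arrived = burst_gbps * 1e9 / 8 * (window_ms / 1e3)  # bytes in
    drained = port_gbps * 1e9 / 8 * (window_ms / 1e3)   # bytes out
    return max(0.0, arrived - drained - buffer_bytes)

# A 17Gbps burst into a 10Gbps port for 1ms, with 64 KiB of buffer:
print(excess_bytes(17, 10, 1, 64 * 1024))  # ~810 KB dropped in a single millisecond
```

The burst only has to exceed the port rate for long enough to fill the buffer; after that, every excess byte is a drop.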
Market data typically comes from the same RP per exchange in most cases. Some exchanges split them by product type. Typically there’s one or two ingress points (two for redundancy) into your network at a cross connect in a data center.
Have you tried to get inline timestamping going on those fancy modern NICs that support PTP? Orders of magnitude cheaper than new ingress ports on that appliance whose name starts with a "C", and also _really_ cool to have more perspectives on the network than switch monitor sessions.