I also got the impression from the one other page of autogenerated code briefly visible in the video that the firewall logic is also full of regular expressions. And yet, it completes its analysis in roughly a millisecond or less.
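Just to make that concrete, here's a minimal sketch (my own, not CloudFlare's code) of what a regex-rule pass over a request might look like in Lua. The rule IDs and patterns are invented, and plain Lua patterns stand in for the PCRE-style regexes a real WAF would presumably use:

```lua
-- Hypothetical sketch: check a request's query string against a table of
-- signature patterns and flag the first match. In production the rule set
-- would be compiled once and reused across requests.
local rules = {
  { id = "sqli-001", pattern = "union%s+select" },  -- made-up SQLi signature
  { id = "xss-001",  pattern = "<script" },         -- made-up XSS signature
}

local function check_request(query)
  local q = query:lower()
  for _, rule in ipairs(rules) do
    if q:find(rule.pattern) then
      return rule.id            -- matched a signature: flag/block the request
    end
  end
  return nil                    -- no rule matched: let the request through
end

print(check_request("/search?q=1 UNION SELECT password FROM users"))
--> sqli-001
```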
It's almost a crime I haven't learned LuaJIT yet. :P
That said, looking at the slideshare link, you can see that Lua as a language has some pretty annoying syntax, so I've sorta been shying away from it.
> CloudFlare's firewall system uses LuaJIT to analyse approximately 4 million GET requests per second.
While I agree with your premise (LuaJIT is a pretty awesome project), this example means nothing in terms of performance.
— "analyse" doesn't give an indication of the workload -- it could vary from "matching against a precompiled regexp" to running advanced statistical models.
— 4 millions requests per second does not mean anything without knowing the size of the deployment. You can always throw more hardware at a "trivially" parallelizable problem.
— Throughput is only one dimension on the "speed" continuum, latency (and especially tail latency) is also very important. Some languages have very good throughput and fail hard at tail latency (Java, Go until recently). Some others have lower throughput but more predictable latency.
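To illustrate the tail-latency point, here's a toy example with invented numbers: a service can look great at the median while the 99th percentile tells a very different story.

```lua
-- Toy illustration of tail latency (numbers invented): a service where most
-- requests are fast but a small fraction hit long pauses.
local function percentile(samples, p)
  local sorted = {}
  for i = 1, #samples do sorted[i] = samples[i] end
  table.sort(sorted)
  return sorted[math.ceil(p * #sorted)]
end

local latencies_ms = {}
for i = 1, 1000 do latencies_ms[i] = 1 end      -- most requests: 1 ms
for i = 981, 1000 do latencies_ms[i] = 250 end  -- ~2% hit a 250 ms pause

print("p50:", percentile(latencies_ms, 0.50), "ms")  --> 1   (median looks fine)
print("p99:", percentile(latencies_ms, 0.99), "ms")  --> 250 (the tail hurts)
```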
You make incredibly fair points, and I don't have the data to answer what you've raised. That could be down to a lack of research on my part or to CF not having released it; I'd guess a 20/80 not-researched/not-available split.
My understanding is that the WAF detects trivial stuff like SQL injection and XSS, and the video also mentions it detected ShellShock and so forth (yeah, this was a while ago).
"Thousands of regular expressions" is mentioned verbally in the video around ~11:25. That's where I got that from.
There are some bits mentioned about the hardware around 19:30:
- E5-2630v3 (16 cores, 32 w/ HT which is turned off), 128GB RAM, 10Gbps
- Custom built by Quanta: 4 nodes per 2U (each node is like half-width and 1U high), redundant 2xPSU in the middle
- 40 machines run Kafka storing 400TB of log data (ingest @ 15Gbps), 50TB disk per node
- 5 machines run CitusDB (sharded PostgreSQL) to power realtime graph generation, 12TB SSD per node
- 100+ machines for realtime analytics running Go (and presumably LuaJIT) - LuaJIT seems to be the realtime "hmm, interesting" flagging system, while Go is used for in-depth intelligent analysis that's (I assume) slightly behind realtime.
I'm curious to understand what tail latency is, and very interested to learn more about language latency and performance in general. What subjects would you recommend I learn about / read? (I would describe myself as interested from a hobbyist-intermediate standpoint, but from a non-academic POV; I unfortunately get dismally lost with formula-heavy presentations, as math is not a strong subject for me.)
> While I agree with your premise (LuaJIT is a pretty awesome project), this example means nothing in terms of performance.
Yup, without knowing the deployment details etc., this number doesn't really mean much.
Perhaps you have heard of the snabb-switch project? It does line-rate 10Gbps forwarding of 64-byte packets on a single x86 core. The entire thing is written in Lua, including the userland 'drivers' for Intel 10G cards :)
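For a sense of scale, here's the quick back-of-envelope arithmetic behind that claim, assuming standard Ethernet framing overhead (8 bytes preamble + 12 bytes inter-frame gap per minimum-size frame):

```lua
-- Back-of-envelope for "line rate 10Gbps with 64-byte packets":
-- each minimum-size frame occupies 64B payload + 8B preamble + 12B
-- inter-frame gap = 84 bytes on the wire.
local link_bps   = 10e9
local wire_bytes = 64 + 8 + 12
local pps        = link_bps / (wire_bytes * 8)
local ns_per_pkt = 1e9 / pps

print(string.format("%.2f Mpps, ~%.0f ns per packet", pps / 1e6, ns_per_pkt))
--> 14.88 Mpps, ~67 ns per packet: the budget the Lua code has to hit per core
```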
From a 'user' perspective you are still writing Lua code; LuaJIT would be an optimizing compiler doing its magic, no? Get the code from here: https://github.com/snabbco/snabb