And if you're using FreeBSD with a fast NIC, look into the pfilctl command and use a "head" input hook into your NIC driver's ethernet input handling routine. This lets the firewall's input checks happen early, before the packet has been received into the network stack, while it's still sitting in DMA-mapped, driver-owned buffers. If the firewall decides a packet should be dropped, the NIC can recycle the buffer with minimal overhead (it skips the DMA unmap, the buffer free, ethernet input into the IP stack, the replacement buffer allocation, and the DMA map).
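A minimal sketch of the knobs involved, assuming ipfw and an iflib NIC; the hook and head names below are from my setup, so list yours first:

    # see which filtering points (heads) and filters (hooks) exist
    pfilctl heads
    pfilctl hooks
    # attach ipfw's link-layer hook as an input filter directly on the
    # NIC's ethernet head, so drops happen in driver-owned buffers
    pfilctl link -i ipfw:default-link ixl0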
This allows for very fast dropping of DoS flood attacks. I've tested this using ipfw, screening and dropping up to several tens of millions of packets on 2014-era Xeons with minimal impact to traffic serving. I wrote a paper about this for a conference that was cancelled due to COVID; I really need to re-submit it someplace.
This works for Mellanox, Chelsio, and NICs that use iflib (ix, ixl, igb, bnxt, and more that I'm forgetting).
I'm picturing that scene a few episodes into the soul art anime. I barely remember it, but the main character was so overpowered that he just stood there absorbing a full barrage from a much lower-ranked opponent flooding him with punches and kicks; the main character's health regenerated so quickly that the attack had no effect.
In 2014 I was working on a hardware appliance for a company that has something to do with packet capture, and I found an Intel driver that implemented a ring buffer on a 1-gigabit Ethernet adapter. It allowed me to capture at line rate without dropping a single packet over the course of hours; prior to this, the adapter was barely able to capture 90% of the packets. I remember marveling at the design that must have gone into it, and I'm marveling here too at your description of this performance improvement.
But you don't have to cobble together a bunch of arcane iptables commands and then combine bpf and other userland tools … when you can just use the clean syntax of PF. Especially for home use, that's a clear win.
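As an illustration of that cleanliness, a near-complete home-gateway ruleset in pf.conf is only a handful of lines (the interface name and allowed port here are just examples):

    # /etc/pf.conf - minimal home gateway sketch
    ext_if = "em0"
    set skip on lo
    block in all                             # default-deny inbound
    pass out on $ext_if keep state           # allow and track outbound
    pass in on $ext_if proto tcp to port 22  # e.g. let ssh in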
I've used both extensively and I find eBPF+iptables (and sometimes nft) significantly more flexible and easier to use in the real world (not just simple examples) than PF. shrug
I have -- I let the OpenBSD firewalls take care of it :P
Seriously though it's something I need to get familiar with, I do still have plenty of Linux boxes that face the public Internet and are currently dependent on iptables/ip6tables rulesets. The problem is I'm currently masking that pain with Ansible.
There is a definite lack of a declarative tool that glues it all together.
Typical hardware switches and routers just have one config syntax (sometimes expanded by includes/macros, but still) to control every part of the networking stack.
So you can configure an interface and set its VLANs all in one place, instead of creating a dozen ethX.Y devices, then creating a bunch of brY bridges, and then attaching the interfaces to them.
On Linux, instead, you'd be using the iproute2 set of tools to configure interfaces and static routing, iptables for IP ACLs, and ebtables for ethernet ACLs (or now nftables, I guess), with no tool to apply or revert the changes all at once.
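To make that concrete, here's roughly what one logical change (a bridged VLAN plus its ACLs) looks like spread across three separate tools; the interface names are made up:

    # iproute2: create the VLAN and bridge, attach them
    ip link add link eth0 name eth0.100 type vlan id 100
    ip link add br100 type bridge
    ip link set eth0.100 master br100
    ip link set eth0.100 up
    ip link set br100 up
    # iptables: the IP-level ACL lives in a second tool...
    iptables -A FORWARD -i br100 -p tcp --dport 22 -j ACCEPT
    # ebtables: ...and the ethernet-level ACL in a third
    ebtables -A FORWARD -i eth0.100 -p ARP -j ACCEPT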
Many have tried doing that, but IMO I haven't seen anything good.
Many also try to "simplify" iptables, and all it ends up doing is annoying me, because I know which iptables commands I need to run but now have to translate them back into a "higher-level" config syntax. One exception is ferm ( http://ferm.foo-projects.org/ ), because it keeps the iptables-like keywords and just expands on them; but it is iptables-only and kind of superseded by nftables syntax anyway.
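For anyone who hasn't seen ferm, a sketch of why it doesn't trigger that annoyance: the keywords are still iptables', it just adds nesting and lists (the rules themselves are illustrative):

    # ferm config - reads like iptables, minus the repetition
    table filter {
        chain INPUT {
            policy DROP;
            mod state state (ESTABLISHED RELATED) ACCEPT;
            proto tcp dport (ssh http https) ACCEPT;
        }
    }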
iptables/ebtables is deprecated even in RHEL. People are free to keep putting off the transition to nftables, but complaining about problems with iptables a decade after its replacement appeared is a bit silly.
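And to be fair to nftables, the transition target is at least a single self-contained syntax; a basic input policy as a sketch (not a drop-in ruleset):

    # /etc/nftables.conf - basic default-deny input policy
    table inet filter {
        chain input {
            type filter hook input priority 0; policy drop;
            iif lo accept
            ct state established,related accept
            tcp dport { 22, 80, 443 } accept
        }
    }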
For a bit of context regarding some of the "unfixed" issues in FreeBSD, some of it comes down to different security contexts. In OpenBSD, pf is pretty much root-only, and root -> kernel escalation isn't a big deal. There's a bug, it gets fixed, but no hoopla. Easy to miss if you're not paying close attention. Maybe it's fixed incidentally as part of a refactor.
But FreeBSD also has jails/vnet, which makes root -> kernel escalation a lot more spicy. Eventually the bug gets found, some years later, although not really the result of negligence.
Michael W. Lucas is such a good author, and I just wanted to give a shout-out: "TLS Mastery" opened my eyes to the world of TLS / SSL certs in an understandable way that used interactive examples on Linux.
I'm sure I read in a comment here recently that the latest macOS is using a recent OpenBSD pf? As quite a heavy user of FreeBSD pf, I wonder if anyone knows more details on that?
I haven't looked in a while, but an update would be nice. When I looked at it in the last couple of years, many things in the networking stack were unchanged since the late 90s/early 2000s, so macOS didn't have SYN flood protection built in. And while the macOS pf had SYN flood stuff, it only works if the macOS host is strictly a firewall; using the SYN protection for traffic where macOS itself is an endpoint results in no connectivity.
Pulling in a more recent pf from either OpenBSD or current FreeBSD would be welcome. (And you know, a recent TCP stack would be nice too, although they've added things like MPTCP that they'd need to port forward by 20+ years.)
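For reference, the feature being discussed is (I believe) pf's synproxy, which has pf itself complete the TCP handshake before the protected host ever sees the SYN; in pf.conf it's a single keyword ($ext_if and the port are placeholders):

    # pf answers the handshake and only opens the real
    # connection once the client completes it
    pass in on $ext_if proto tcp to port 80 synproxy state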
I'm just super surprised it's there at all considering no rules are defined by the OS, and nobody uses it anymore because all the firewall vendors moved over to system extensions.
DragonFly and FreeBSD have radically different ideas on how multithreading should work. (Understatement of the century right there!)
FWIW, the threading in the network stack is one of the original major divergences between OpenBSD and FreeBSD PF. FreeBSD's PF works within the context of FreeBSD's threaded network stack; FreeBSD PF is not "single threaded" in the sense that phrase used to carry.
I would really like to have the modern PF syntax/structure from OpenBSD in FreeBSD though. The unified rdr/nat/pass/block mechanism in OpenBSD is so much cleaner and nicer to work with.
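For those who haven't seen the difference, a rough sketch (macros and addresses are illustrative): FreeBSD's pf still uses the old separate translation rules, while OpenBSD folded nat/rdr into ordinary match/pass rules in 4.7:

    # FreeBSD pf (pre-4.7 syntax): translation is its own section
    nat on $ext_if from $lan_net to any -> ($ext_if)
    rdr on $ext_if proto tcp to port 8080 -> 10.0.0.5 port 80

    # OpenBSD pf since 4.7: nat-to/rdr-to on regular rules
    match out on $ext_if from $lan_net nat-to ($ext_if)
    pass in on $ext_if proto tcp to port 8080 rdr-to 10.0.0.5 port 80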
DragonFly and FreeBSD have radically different ideas on how multithreading should work. (Understatement of the century right there!)
To elaborate for people who don't know the history: disagreements about how multithreading should work are why Matt got booted from the FreeBSD project and started DragonFly.
It's not that simple. Disagreements are one thing, but the real reason was the conflict and drama that happened when things didn't go his way. The threading approach just happened to be a really good trigger for those conflicts.
Conflicting personalities make resolving technical disagreements so much harder.
Playing well together is obviously important (and I wasn't involved and have no other insight), but do you know if the poor showing of FreeBSD 5.x corresponded with the disagreements?
FreeBSD 5.x was a very stressful time -- bringing full SMP to the entire kernel was a massive technical challenge, and it didn't help that a lot of the people who were expecting to work on it lost their jobs during the dot com crash.
5.x was a poor vintage because making the kernel SMP was hard, and social issues became problematic because SMP was hard, but Matt's departure was neither the result nor the cause of FreeBSD 5.x having issues.
Assuming "the DragonflyBSD approach" is the same as when it forked 20 years ago: FreeBSD's approach was definitely the right one, for two reasons.
First, adding locking is fundamentally easier to get right than rewriting everything into a message-passing model. It took ~5 years to get SMP right in FreeBSD (from when 4.0 was released to when 6.0 was released), but it would have taken double that to get things equally stable and performant with the more intrusive redesign Matt wanted to go with.
Second, pretty much everyone else went with the same locking approach in the end -- which has resulted in CPUs being optimized for that. Lock latencies dropped dramatically with microarchitectural changes between ~2005 and ~2015.
As a theoretical concept, I love the idea behind Dragonfly, but it simply wasn't practical in the real world.
I'm not an OpenBSD user but it sounds like this changed recently in OpenBSD-land as well, because the "Will multiple processors help?" FAQ entry disappeared some time between January and March 2023: https://web.archive.org/web/20230112170731/https://www.openb...
I didn't know this, but it makes sense. Either way, for most use cases (homelabs on residential internet speeds), either FreeBSD's or OpenBSD's PF will perform just swimmingly.
Multiqueue NICs do great work on networking loads when you have one core per NIC queue, and you can eliminate, or at least reduce, cross-core communication for most of the work.
It's not complex firewalling, but I did some HAProxy stuff in TCP mode, and the throughput available when running without queue alignment was minuscule compared to the throughput when properly aligned. Firewalling has the benefit that queue alignment should happen automagically, because of where it runs. If you're doing a lot of processing in userspace it makes much less of a difference, but on a very lightweight application there was no point in using more cores than NIC queues, because cross-core communication was too slow.
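On Linux, the alignment itself is only a couple of commands (the interface and IRQ numbers below are illustrative; you'd read the real IRQs out of /proc/interrupts and stop irqbalance first):

    # give the NIC one queue per core
    ethtool -L eth0 combined "$(nproc)"
    # pin each queue's IRQ to its own core, 1:1
    echo 1 > /proc/irq/123/smp_affinity   # queue 0 -> CPU0 (mask 0x1)
    echo 2 > /proc/irq/124/smp_affinity   # queue 1 -> CPU1 (mask 0x2)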