Hacker News | drv's comments

It's cool to see more systems code written in Rust! I also previously worked on SPDK, so it was neat to see it being chosen as a point of comparison.

However, I was waiting for the touted memory safety to be mentioned beyond the introduction, but it never really came up again. I was hoping for the paper to make a stronger argument for memory-safe languages like Rust, something like "our driver did not have bugs X, Y, and Z, which were found in other drivers, because the compiler caught them".

Additionally, in a userspace device driver that is given control of a piece of hardware that can do DMA, like an NVMe controller, the most critical memory safety feature is an IOMMU, which the driver covered by the paper does not enable. No amount of memory safety in the driver code itself matters when the hardware can be programmed to read or write anywhere in the physical address space, including memory belonging to other processes or even the kernel, all from totally "safe" (in Rust semantics) code.

While the driver from the paper may certainly have a "simplified API and less code", I don't expect much of this to be related to the implementation language. It's comparing a clean-sheet minimal design to a project that has been around for a while and has had additional features incrementally added to it over time, which inevitably makes the older codebase larger and more complex. This doesn't seem like a particularly surprising result or an endorsement of a particular language, though it perhaps does indicate that it would be useful to start from scratch now and again just to see what the minimum viable system can look like. I certainly would have liked to rewrite it in Rust, but that wasn't really feasible. :)

In any case, it's great to see proof that a Rust driver can have comparable performance to one written in C, since it will hopefully encourage new code to be written in a nicer language than C. I definitely don't miss having to deal with manual memory management and chasing down use-after-frees now that I write Rust instead of C.

(As a side note, I'd encourage anyone thinking of using a userspace storage driver on Linux to check out io_uring first before going all in; if io_uring had existed before SPDK, I don't know that SPDK would have been written, given that io_uring gets you most of the way there performance-wise and integrates nicely with the rest of the kernel. A userspace driver has its uses, but I would consider it to be a last resort after exhausting all other options, since you have to reinvent all of the other functionality normally provided by the kernel like I/O scheduling, filesystems, encryption, etc., not just the NVMe driver itself. That is, assuming the io_uring security issues get resolved over time, and I expect they will.)


Comparing clean-sheet designs to legacy, bug-patched, security-focused implementations is pretty common for early-days Rustaceans. Most of the touted simplicity and compile speed are lost now that all the easy problems have been solved by an over-general crate that solves way more than you need it to. The language isn't going to save you from ecosystem bloat, and it isn't going to magically handle all security problems, especially those that occur at design time rather than at compile time or runtime.

But for those who want to get a handle on how Rust might be used for something other than yet another crypto project or a toy WebAssembly app, TFA is exactly what the doctor ordered.


Because writing a linked list by hand for the 1000th time is definitely safer than importing it from a crate that already implements many collections... Not.


I'm not saying we shouldn't use crates. I'm agreeing that maybe we still have to be cautious about them, and that the "early days", when we were doing hand-coded stuff and saying "See how easy this is? Why was this hard in C?", are long gone, for the very reasons you implied with your sarcasm.


There is even NVMe passthrough support via io_uring, so it's still possible to send custom NVMe commands when using io_uring instead of a userspace driver: https://www.usenix.org/system/files/fast24-joshi.pdf

Normal block I/O use cases don't really need NVMe io_uring passthrough, but it addresses the more exotic cases and is available in mainline Linux. And NVMe passthrough might eke out a little more performance.


Very interesting comment. By I/O scheduling, you mean across multiple processes (i.e. multiplexing the device)?


An I/O scheduler was probably a bad example, since you might not need or want one for fast NVMe devices anyway. But yes, schedulers help ensure that limited resources (storage device bandwidth or IOPS) are shared fairly between multiple users/processes, and they can also reorder requests to improve batching. This matters more on spinning disks with seek latency, where a strategy of delaying a little bit to sort requests can save more time on seeks than it spends on the delay plus CPU overhead.
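The reordering idea can be sketched as a simple elevator-style sort of a pending queue (the `Request` type and `schedule` function here are hypothetical, just to illustrate the principle):

```rust
// A pending I/O request, identified by its starting sector on disk.
#[derive(Debug, PartialEq, Eq)]
struct Request {
    sector: u64,
}

// Reorder pending requests by sector so a spinning disk can service them
// in one sweep instead of seeking back and forth (a simplified elevator pass).
fn schedule(mut pending: Vec<Request>) -> Vec<Request> {
    pending.sort_by_key(|r| r.sector);
    pending
}

fn main() {
    let pending = vec![
        Request { sector: 900 },
        Request { sector: 10 },
        Request { sector: 450 },
    ];
    let ordered = schedule(pending);
    let sectors: Vec<u64> = ordered.iter().map(|r| r.sector).collect();
    assert_eq!(sectors, vec![10, 450, 900]); // one monotonic sweep
}
```

Real schedulers (e.g. Linux's mq-deadline or BFQ) layer fairness and anti-starvation policies on top of this basic idea.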

The more general point is that if you need any of the many features of a general-purpose OS kernel, a full userspace driver may not be a very good fit, since you will end up reinventing a lot of wheels. Cases where it could be a good fit would be things like database backends or dedicated block storage appliances, situations where the OS would just get in the way and where it's viable to dedicate a whole storage device (or several) and a whole CPU (or several) to one task.


The assembly version is using the packed version of the FMA instruction (that's what the "P" in the mnemonic stands for), but as far as I can tell, it's only using one of the packed values, whereas the instruction can calculate two (AVX) or four (AVX2) FMA operations at once. It might be possible to get some speedup by rearranging the calculation so it can use the full width of the vector registers - at first glance, at least the two sides of the division should be possible to calculate in parallel with half as many FMA instructions.
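The kind of rearrangement suggested above can be illustrated with scalar FMAs (a hypothetical sketch, not the article's code): the numerator and denominator of a rational approximation are independent Horner chains, so a compiler or hand-written assembly could evaluate them in separate lanes of a single packed FMA instruction.

```rust
// Evaluate a polynomial by Horner's rule using fused multiply-add;
// coeffs[i] is the coefficient of x^i.
fn horner(coeffs: &[f32], x: f32) -> f32 {
    coeffs.iter().rev().fold(0.0f32, |acc, &c| acc.mul_add(x, c))
}

fn main() {
    let x = 2.0f32;
    // Two independent FMA chains: nothing in `num` depends on `den`, so
    // the two sequences could occupy two lanes of one packed vfmadd.
    let num = horner(&[1.0, 2.0, 3.0], x); // 3x^2 + 2x + 1 = 17
    let den = horner(&[4.0, 0.0, 1.0], x); // x^2 + 4 = 8
    assert_eq!(num / den, 17.0 / 8.0);
}
```

With a packed instruction, each step of both chains costs one FMA instead of two, halving the instruction count on the critical path.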


nextAfter is probably also including the denormals (an additional 2^23 values near 0).


Yep:

  1,065,353,216 - 1,056,964,609 = 8,388,607 = 2^23 - 1
because the all-zero pattern (zero itself) is already counted.
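This arithmetic can be checked directly from the IEEE 754 bit patterns (a standalone sketch, not code from the thread):

```rust
fn main() {
    // 1.0f32 has bit pattern 0x3F80_0000 = 1,065,353,216, so there are
    // exactly that many non-negative f32 values strictly below 1.0
    // (denormals and zero included).
    assert_eq!(1.0f32.to_bits(), 1_065_353_216);

    // Positive denormals: exponent bits all zero, mantissa nonzero,
    // giving 2^23 - 1 values (the all-zero pattern is zero, not a denormal).
    let positive_denormals = (1u32 << 23) - 1;
    assert_eq!(positive_denormals, 8_388_607);

    // The gap between the two counts in the thread is exactly that many.
    assert_eq!(1_065_353_216 - 1_056_964_609, positive_denormals);
}
```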


And the author explicitly stated he was excluding the denormals.


One possible reason is that storing a bool separately from the index makes it difficult to update the producer or consumer index atomically. With the implementations that store the full producer and consumer states as a single word each, only single-word atomic operations are necessary to build a lock-free ring. Storing the bool as the high bit of the index would also suffice.
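A minimal sketch of that high-bit packing (names and the ring size are hypothetical), showing how one single-word atomic store publishes the index and the flag together:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

const FLAG: u32 = 1 << 31;        // e.g. a "wrapped" bit for full/empty detection
const INDEX_MASK: u32 = FLAG - 1; // low 31 bits hold the ring index

struct ProducerState {
    head: AtomicU32, // flag + index packed into one word
}

impl ProducerState {
    // Advance the producer index by one slot; on wrap-around, toggle the
    // flag and reset the index in the same single-word atomic store.
    fn advance(&self, ring_size: u32) {
        let cur = self.head.load(Ordering::Relaxed);
        let (flag, idx) = (cur & FLAG, cur & INDEX_MASK);
        let next = if idx + 1 == ring_size {
            flag ^ FLAG // index wraps to 0, flag toggles
        } else {
            flag | (idx + 1)
        };
        self.head.store(next, Ordering::Release); // index + flag, atomically
    }
}

fn main() {
    let p = ProducerState { head: AtomicU32::new(0) };
    for _ in 0..4 {
        p.advance(4);
    }
    let v = p.head.load(Ordering::Relaxed);
    assert_eq!(v & INDEX_MASK, 0); // index wrapped back to 0
    assert_ne!(v & FLAG, 0);       // flag toggled exactly once
}
```

A consumer comparing its own packed word against the producer's can distinguish full from empty without a separate bool, and never observes an index/flag pair mid-update.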


It's presumably not standard Host Memory Buffer, since the spec says "The controller shall function properly without host memory resources."


2001, not 2011. Time flies. :)


(Assuming you mean MailChimp)

I don't know anything about how MailChimp operates, but a quick search turns up this blog post about how to set up SPF records [1].

From there, you can get a list of which IPs MailChimp authorizes as a sender; following the SPF include directive, you can see they specify two IPv4 ranges, both of which are in class C space, so it seems unlikely that MailChimp has their own class A for SMTP senders.

[1]: https://blog.mailchimp.com/senderid-authentication-for-your-...


It has a C API, but everything under the hood is C++. The source is actually public now: https://github.com/Microsoft/Windows-Driver-Frameworks


Microsoft has been shipping static analysis tools with the Windows DDK for a long time (originally PREfast[1], now Static Driver Verifier[2]). I believe the static analysis is even integrated with Visual Studio now.

[1]: http://research.microsoft.com/en-us/news/features/prefast.as... [2]: https://msdn.microsoft.com/en-us/library/windows/hardware/ff...


In practice, this is probably correct (Microsoft cares a lot about backward compatibility, and many programs depend on MSVCRT.DLL).

However, the official word from Microsoft is that MSVCRT.DLL is only intended for operating system components to use, not user applications. For example, see Raymond Chen's blog on the subject. [1]

[1] https://blogs.msdn.microsoft.com/oldnewthing/20140411-00/?p=...


The problem is that their official word changed.

As that link itself says, MSVCRT.DLL was the C library for several versions of their official C compiler. That is, it had always been intended for both operating system components and user applications. A quick Wikipedia search tells me that this was true until 2002, which is when MSVC 7 was released.

Wikipedia also tells me that MinGW (which is probably the main source of programs linking to MSVCRT.DLL nowadays) was first released in 1998, so their use of MSVCRT.DLL as the global C library was correct. It's not MinGW's fault that Microsoft changed their mind.

