Linux exposes system information like processes and devices in the pseudo file s...

josephg · on May 20, 2023

Yeah, I know how it works in Linux. I just hate it. Polling and writing a parser for a pseudo-file always seems like a super roundabout, inefficient, brittle way to send information from kernel to userspace.

Think about it - the kernel stores some data in a nice, neat struct. And the kernel knows exactly when that struct is modified. Userspace wants that data, and wants to know when the data changes. So every second, userspace "reads" a pseudo file. The kernel packs the struct's contents into a big string with some sloppy, hand-written text format and passes it to userspace. Then in userspace you need to write a brittle parser for whatever the kernel gives you, knowing that different versions of the kernel will give you slightly differently formatted strings.

When the data changes in the kernel, the application has no way of being notified. It just polls and parses in a loop. It's both inefficient (when there are no changes, it's all wasted effort) and laggy (when changes happen, it can take your application up to the polling interval to find out about them).
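To make the poll-and-parse loop concrete, here is a minimal sketch in Python. It assumes a "Key: value" text layout like the one /proc/meminfo uses; the function names (`parse_kv`, `diff`, `poll_changes`) are made up for illustration, and the reader is passed in as a callable so the loop isn't tied to any particular file.

```python
import time
from typing import Callable, Dict, Iterator, Optional, Tuple

Change = Tuple[Optional[str], Optional[str]]  # (old value, new value)


def parse_kv(text: str) -> Dict[str, str]:
    """Parse 'Key:   value' lines; silently skip lines without a colon."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def diff(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, Change]:
    """Return every key whose value differs between two snapshots."""
    keys = old.keys() | new.keys()
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


def poll_changes(read: Callable[[], str],
                 interval_s: float = 1.0,
                 rounds: int = 3) -> Iterator[Dict[str, Change]]:
    """Re-read and re-parse the whole text every interval, yielding only the diffs."""
    previous = parse_kv(read())
    for _ in range(rounds):
        time.sleep(interval_s)      # a change can go unnoticed for up to this long
        current = parse_kv(read())  # full re-parse even when nothing changed
        changed = diff(previous, current)
        if changed:
            yield changed
        previous = current
```

On Linux you could pass a `read` that opens and reads /proc/meminfo. Every round pays for a full read and parse, and the application has to compute the diff itself, which is the wasted effort described above.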

For things like filesystem watching, there are special APIs applications can consume. But those APIs are different for every kind of data the kernel manages. Devices, mount points, USB connectivity, filesystem watching, CPU usage, thermal sensors, open network connections and on and on. All of this stuff has totally different kernel-level APIs for "watching" changes. (If you're lucky - a lot of it can only be polled). It's a big fat mess. I want to try replacing all of that with a nice clean API for fetching & subscribing to any kind of data. So, the kernel has a standard API for fetching (maybe it returns JSON). And a standard API for subscribing to changes - such that your application is notified when a change occurs, and the change is sent in a standard, easily digestible patch format.
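A rough in-process sketch of what that fetch & subscribe shape could look like, written in Python purely as a toy. Nothing here is a real kernel API: `StateStore`, its methods, and the paths are all hypothetical, and the patch objects only borrow the `add` / `replace` / `remove` operation names from JSON Patch (RFC 6902) rather than implementing it.

```python
from typing import Any, Callable, Dict, List

Patch = Dict[str, Any]


class StateStore:
    """Hypothetical broker: one fetch call, one subscribe call, for any kind of data."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._subscribers: Dict[str, List[Callable[[Patch], None]]] = {}

    def fetch(self, prefix: str = "") -> Dict[str, Any]:
        """Return a snapshot of every entry whose path starts with `prefix`."""
        return {k: v for k, v in self._data.items() if k.startswith(prefix)}

    def subscribe(self, prefix: str,
                  callback: Callable[[Patch], None]) -> Callable[[], None]:
        """Call `callback` with a patch for every change under `prefix`."""
        self._subscribers.setdefault(prefix, []).append(callback)

        def unsubscribe() -> None:
            self._subscribers[prefix].remove(callback)

        return unsubscribe

    def set(self, path: str, value: Any) -> None:
        if path in self._data and self._data[path] == value:
            return  # nothing changed, so nobody is woken up
        op = "replace" if path in self._data else "add"
        self._data[path] = value
        self._publish({"op": op, "path": path, "value": value})

    def remove(self, path: str) -> None:
        if path in self._data:
            del self._data[path]
            self._publish({"op": "remove", "path": path})

    def _publish(self, patch: Patch) -> None:
        for prefix, callbacks in self._subscribers.items():
            if patch["path"].startswith(prefix):
                for callback in list(callbacks):
                    callback(patch)
```

A subscriber to `"/thermal/"` would receive `{"op": "add", "path": "/thermal/cpu0", "value": 41}` the moment that value is first set, and nothing at all while the value stays the same. The same two calls would cover devices, mounts, sensors and so on, which is the whole point of the idea.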

And ideally, eventually, I'd like applications to be able to share their own changes back with the kernel in the same form & format. And maybe, share data objects between applications like that - with the kernel acting as a broker for sending patches back and forth, and syncing them with the filesystem. I think that would be really neat and useful for all sorts of applications.

But I want to prototype it all first in something that's not Linux. I'm just not sure where to start. I don't really want to write an OS from scratch just to try out this generic state passing idea.

grepfru_it · on May 20, 2023

In Unix everything is a file. Linux is not Unix but tries to be like Unix. Thus all files represent a resource on the system. It’s hard to separate this from POSIX requirements. Lose POSIX compatibility and you end up with a half-implemented system like Windows. And then you must recreate the entire application ecosystem yourself or you end up with this funky BeOS/Haiku style POSIX compatibility layer that excludes a good subset of applications that already exist.

PS /proc isn’t a file system, it is an API (that can be accessed through the file system)

nerdponx · on May 20, 2023

How do you access the contents of /proc programmatically?

nurettin · on May 20, 2023

read/select loop? Or am I too outdated?

ww520 · on May 20, 2023

cat, fopen, fread, fwrite, etc.

nerdponx · on May 21, 2023

Right, so it's an API, but it's only usable through the file system, and it's a pretty bad API. You still need to parse the crappy ad hoc format instead of getting a struct from the kernel.
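To illustrate the difference, here is a small Python sketch that decodes the same three fields from a fixed binary record and from a text blob. Both formats are assumptions made up for the comparison: the binary layout (`<IQI`) is hypothetical, and the text only loosely imitates the style of /proc/[pid]/status.

```python
import struct
from typing import Dict

# Hypothetical fixed layout: little-endian uint32 pid, uint64 rss_kb, uint32 threads.
PROC_RECORD = struct.Struct("<IQI")


def decode_binary(buf: bytes) -> Dict[str, int]:
    """Decode the record in one step; field order and width are fixed by the layout."""
    pid, rss_kb, threads = PROC_RECORD.unpack(buf)
    return {"pid": pid, "rss_kb": rss_kb, "threads": threads}


def decode_text(text: str) -> Dict[str, int]:
    """Decode the same fields from 'Key:<whitespace>value [unit]' lines."""
    raw: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            raw[key.strip()] = value.strip()
    # Every field needs its own knowledge of names, units and spacing.
    return {
        "pid": int(raw["Pid"]),
        "rss_kb": int(raw["VmRSS"].split()[0]),  # drop the trailing "kB"
        "threads": int(raw["Threads"]),
    }
```

The text decoder breaks if a key is renamed or a unit changes, while the binary one breaks if the layout changes, so a versioned or self-describing format would still be needed in practice.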

grepfru_it · on May 21, 2023

Yep. So the trade-off is then app compatibility or a radical new interface.

josephg · on May 21, 2023

If it were in Linux it would probably make sense to keep both interfaces, for the time being. Existing apps can use the current interface, and newer apps can opt in to a fancier interface / api.

ww520 · on May 20, 2023

I agree with you mostly. In some cases, the performance cost of delivering the events might not be acceptable. One minor nitpick: pollable endpoints in /proc mean that a program can call select() or epoll_wait() on the file descriptor. Those calls can block until something changes, so there is no need to re-poll at intervals.
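As a sketch of the blocking approach, the Python below waits on /proc/mounts, which proc(5) documents as pollable: a mount or unmount is reported as a priority event (POLLPRI) to poll(2). It is Linux-only, and the function name `wait_for_mount_change` is just an illustrative choice. Many other /proc files are not pollable in this way.

```python
import os
import select
from typing import Optional


def wait_for_mount_change(timeout_ms: Optional[int] = None,
                          path: str = "/proc/mounts") -> bool:
    """Sleep until the mount table changes; return False if the timeout expires first.

    With timeout_ms=None the call blocks indefinitely.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        poller = select.poll()
        # Only ask for the "exceptional" events that signal a mount table change,
        # not POLLIN, which would report the file as readable right away.
        poller.register(fd, select.POLLPRI | select.POLLERR)
        events = poller.poll(timeout_ms)
        return bool(events)
    finally:
        os.close(fd)
```

The process uses no CPU while it waits. The wake-up only says that something changed, though: the program still has to re-read and re-parse the file to find out what. A long-running watcher would also want to keep one descriptor open across waits, rather than reopening it per call as this sketch does.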

Windows has WMI, and other systems have SNMP MIBs, as a uniform interface for accessing system resources. Those never got anywhere.

josephg · on May 21, 2023

I don’t see why performance would be worse. Presumably, the application would register an event handler to get change events. And optionally block until an event arrives just like you can do now.

My expectation is performance would be better because you don’t need to parse the data as a string.

And you also avoid the ABA problem this way. If you wait for a file in procfs to change, by the time you read it, it may have changed to something else - and you’ve lost data.
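A tiny Python simulation of that lost-update effect, with made-up state names. `history` stands for every state the kernel-side value actually went through. A poller only sees every `every`-th state, while a subscriber (assuming one event is delivered per change) sees each transition.

```python
from typing import List, Tuple

Transition = Tuple[str, str]


def changes_seen_by_poller(history: List[str], every: int) -> List[Transition]:
    """Transitions visible when only every `every`-th state is sampled."""
    samples = history[::every]
    return [(a, b) for a, b in zip(samples, samples[1:]) if a != b]


def changes_seen_by_subscriber(history: List[str]) -> List[Transition]:
    """Transitions visible when every change is delivered as an event."""
    return [(a, b) for a, b in zip(history, history[1:]) if a != b]


# A link that drops and comes back between two samples (A -> B -> A).
history = ["up", "down", "up"]
print(changes_seen_by_poller(history, every=2))  # []
print(changes_seen_by_subscriber(history))       # [('up', 'down'), ('down', 'up')]
```

The poller compares "up" with "up" and concludes nothing happened, while the subscriber sees both the drop and the recovery.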