Linux Hardening Guide (madaidans-insecurities.github.io)
460 points by FlyMoreRockets on Dec 31, 2020 | 295 comments



As a *nix sysadmin since before Google was a thing, I find articles like this frustrating and arguably dangerous.

They lay out a set of "rules" the author has collated as a dogmatic doctrine that only a fool would not follow. But they provide no "why", take absolutely no context into account, and talk only about their perceived upsides.

Anyone reading this should take each point as "here's a possible thing you should go and research yourself", because there are consequences to most of these rules.

I'm not saying anything they've written is wrong (though there is a lot of unsubstantiated opinion there), just that you should learn what any of these changes truly do before implementing them AND that your particular context is really important.

Edit:

There is a disclaimer at the top which mustn't be forgotten while reading.

> DISCLAIMER: Do not attempt to apply anything in this article if you do not know exactly what you are doing. This guide is focused purely on security and privacy, not performance, usability, or anything else.

I do appreciate people putting work into producing this sort of content, but I think the article could be improved by phrasing the various steps as suggestions and perhaps linking out to more detailed documentation on the "why" elsewhere.


Yeah, that's to put it mildly.

This gem is right up there in the first page:

    Use a distribution with an init system other than systemd. systemd contains a lot of unnecessary attack surface; it attempts to do far more things than necessary and goes beyond what an init system should do. An init system should not need many lines of code to function properly.
Wow! To take such a controversial topic and present it as "this is the truth, just how it is", without even mentioning that it's just an opinion and that a lot of people hold the completely opposite one, both for good reasons. That just makes me distrust this guide, because it is so opinionated and subjective.


Systemd is a security nightmare, a huge amount of code, taking over various network services ... poorly. Doesn't seem controversial at all to me. Gives me the heebie jeebies to think of systemd talking to the network before I even have the client firewall up.

Generally, if security is your top priority, I'd disable as much of systemd as possible. For example, find a good local resolver (like unbound) and disable systemd's implementation.


I'd like to read a guide on hardening systemd. Systemd is really convenient and I hadn't realized it poses a large attack surface. A good list with possible attack vectors and how to mitigate them would be epic.

Is it really an established idea in the security community that systemd is a security nightmare?


>Is it really an established idea in the security community that systemd is a security nightmare?

It's not. The presence of CVEs in software is a sign of maturity, not insecurity. It's frustrating to see people use CVE numbers to point at some larger problem. The key element is to realize that security issues are just bugs, especially in the context of C software. What you should be cautious about is upstreams that don't handle security issues well. Systemd is not one of these upstreams.


Hehe... around 2002 I worked in the IT support division of a telco (South America.) At some time I suggested the use of ssh to login to our farm of Digital/Compaq/HP servers (I believe it was not included in the base OS.)

The IT Security Officer never contacted me, but convinced the management about the insecurity of SSH by showing a big list of CVEs. As a result we were instructed to continue using... telnet.


Haha yeah, because telnet never suffered overflows during telnet options processing.


> It's not. The presence of CVEs in software is a sign of maturity, not insecurity.

It really depends on both the complexity and scope of use of the project, and on whether the rate of CVEs is decreasing.

Flash had CVEs regularly for years. I would say it was getting more mature, but the rate they were coming out and the fact that they just kept coming indicated to me not that the project was maturing (at least not at a rate I was comfortable with), but that either the programmers were inept or, more likely, that earlier design decisions led to security that was very hard to reason about and extremely hard to harden before exploits were discovered. Neither conclusion made me want to use that software.

OpenSSL fell into a similar situation a couple years back. Seeing all the CVEs that were coming out, one could reason "this is just a mature project", but the truth (exposed by numerous people doing audits) was that the code base and developer process were a dumpster fire.

> The key element is to realize that security issues are just bugs, especially in the context of C software.

They are, but not all people and projects produce the same quality and quantity of bugs.

All this is probably how you already see it, just not quite spelled out in your prior comment. I just thought it was important to make that distinction obvious. :)

P.S. This is somewhat divorced from the context of whether systemd is a good code base or not. I didn't intend for this to be an implicit assessment of systemd, I'm not making any judgements on it here.


All security issues are not just bugs.

Design is not a bug. Some things just aren’t designed to meet security goals. Telnet is plaintext; in most environments that’s a pretty big security issue. That’s not a bug in the code, it’s just not designed to protect the data from tampering, eavesdropping and hijacking. It just can’t operate any other way.

Configuration errors are security issues, but they are not bugs. Users can set things up insecurely.

Human beings present their own security issues, and they are definitely not bugs you can code away.

The biggest myth about software security is that it’s all just bugs. This leads to after-the-fact thinking (we’ll just patch it), and a huge blind spot to the fact that security isn’t something you can just build; it’s an entire process that goes way beyond just code.


I think this is mostly an issue of overloaded terms. There are security design considerations, and security issues. Telnet being plaintext is not a security issue for telnet, it's a security issue for those using telnet for something it's unsuited for. HTTP being unencrypted is not a security issue for the HTTP protocol, or an application that wants to support that (a browser), but it may be for an application that makes requests over HTTP instead of HTTPS when those requests require some level of privacy.

If an application has a design goal to be secure in some aspect, but the design they chose doesn't accomplish that, then the design itself is a bug and needs to be fixed (or they need to change their design goals). Buggy designs exist, they're the designs that don't fulfill the desired purpose.

All security issues in the context of a project which intends to provide security in that aspect are bugs.


No way man, I'd much rather use this other tossed together init system that somebody started as a pet project. No CVEs!


By that reasoning, Flash would be safe enough to stay around, seeing how many CVEs it had (1078 since the first CVE for it was published Dec 31, 2005).

It’ll be gone tomorrow though, or January 12, or later in 2021.


I don't think flash proves anything. It's an extreme example after all.


How would you modify your argument rather than dismiss the evidence against it?

For example, could you propose an equation using the number of CVEs and their severity along with the rough number of known users, showing that some curve or values would indicate maturity rather than risk?

Some data for Flash[1] CVEs are available at CVE Details. Possibly you could find other products you consider safe and see how they compare.

I’d be curious if an analysis would show correlation of what you’d consider greater risk with a year that Adobe started outsourcing to a particular company or decided to EOL it; I’m not proposing that quality went down due to one of those things or that they outsourced, but analyzing the data could be useful.

[1]- https://www.cvedetails.com/vulnerability-list/vendor_id-53/p...


Flash predates systemd by around 5-6 years. Flash has 1000+ CVEs. Systemd has fewer than 20. I don't think pointing at Flash proves anything.


If you believe the security of systemd is strong, and that CVEs are a sign of use and maturity, then you’re saying ~2 CVEs per year on average is compatible with good security.

I understand you believe this, but I don’t see how it’s rational.


> Is it really an established idea in the security community that systemd is a security nightmare?

Here's a list of systemd-specific CVEs over the years: https://www.cvedetails.com/product/38088/Freedesktop-Systemd...

Note that these apply to all programs in systemd distribution (not sure how many of these are specific to systemd init program).

It seems that a lot of the criticism is due to systemd doing more than a classic init process does, but some of it seems unjustified since it is not an "all or nothing" package.

For example, this specific CVE affected systemd-resolved (systemd's DNS daemon), but IIRC Debian was not affected since it did not use systemd-resolved by default: https://www.cvedetails.com/cve/CVE-2017-15908/

So it is perfectly fine for a distribution to use the systemd init system but use other network daemons.


>>It seems that a lot of criticism is due systemd doing more than a classic init process does, but some of it seem unjustified since you it is not an "all or nothing" package.

I hate this argument because, in practice, it is an "all-or-nothing" package from a user standpoint. Packages have a terrible habit of setting dependencies on one or two of the "modules," which wind up bringing the whole thing along with them. Sure, you can "disable" it after the fact, but then you're left with a system that will break in fun and exciting ways when something a package you requested wants isn't available.

If you don't want systemd, your only option at this point is also-ran distros.


systemd is extremely modular if you’re the one compiling it (i.e. you’re a distro-builder) but not as an end-user since it’s used as the plumbing for your distro. And since distros aren’t usually in the business of telling users not to use features they ship everything because someone might want it.


The fact that systemd bundles a lot of functionality (which is different from being a monolith) means it has a disadvantage in the terrible metric of "CVE counts".

Now if only the systemd-hating crowd would take a look at how many CVEs affected their DHCP client, nscd, nss and other resolvers, various PAM modules, syslog/klogd, ad-hoc init scripts writing insecurely to /tmp, ntpd, etc.


> I'd like to read a guide on hardening systemd.

Here is a great guide detailing some of the security features available to systemd units from earlier this year (2020):

https://www.ctrl.blog/entry/systemd-service-hardening.html


Systemd by itself comes with this wonderful analysis tool.

    systemd-analyze security
    systemd-analyze security <service>

There it details what capabilities, and restrictions are in place for a given systemd service. From there you can edit the service to restrict even further as required if you want.
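
As a rough sketch of that last step (not something taken from the linked guide): the snippet below writes a drop-in with a handful of sandboxing directives for a hypothetical `foobar.service`. The unit name, the file name and the choice of directives are assumptions for illustration. It writes to a scratch directory unless `DROPIN_DIR` is set, so nothing on the live system changes.

```shell
#!/bin/sh
# Sketch only: "foobar.service" is a hypothetical unit, and the directives
# below are a generic starting set, not a vetted policy for any real service.
set -eu

# Defaults to a scratch directory. To apply for real, set
# DROPIN_DIR=/etc/systemd/system/foobar.service.d and run as root.
DROPIN_DIR="${DROPIN_DIR:-$(mktemp -d)}"
mkdir -p "$DROPIN_DIR"

cat > "$DROPIN_DIR/hardening.conf" <<'EOF'
[Service]
# Block privilege escalation through setuid/setgid binaries.
NoNewPrivileges=yes
# Mount the file system hierarchy read-only for this service.
ProtectSystem=strict
# Hide /home, /root and /run/user from the service.
ProtectHome=yes
# Give the service its own private /tmp and /var/tmp.
PrivateTmp=yes
EOF

echo "wrote $DROPIN_DIR/hardening.conf"
# After installing it for real:
#   systemctl daemon-reload && systemctl restart foobar.service
#   systemd-analyze security foobar.service
```

Re-running `systemd-analyze security foobar.service` afterwards should show which of the listed exposures changed. A service that needs to write outside its own state directories will probably need `ReadWritePaths=` as well once `ProtectSystem=strict` is set.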


Completely agree with this. One of the biggest risks is using journald if you ask me. It makes logs fairly impenetrable if you have a forensic situation to deal with.

Now there did need to be a better solution in the init system space but systemd was possibly the worst answer that could be mustered for it.

Systemd is merely what you get when Redhat owns the development community, not the best technical solution.


> It makes logs fairly impenetrable if you have a forensic situation to deal with.

You mean with its sealing which can detect corruption, common export formats, the on-disk format clearly documented (https://www.freedesktop.org/wiki/Software/systemd/journal-fi...), and independent readers like https://fluentbit.io/documentation/0.14/input/systemd.html ?

What exactly is impenetrable here?


Well it’s more what you do with a node that has been isolated or is suspect. Mounting the file system on a scratch node for analysis and then looking at the logs is a pain in the ass when you have journald in the way.

As for forwarding, there is likely a disparity between the last log written to disk and the last log received by the aggregator. What happened in that window is always interesting and always on disk on the node, unless the attacker has nuked it, which neatly leads to...

Sealing and tamper protection on logs is absolutely no use as any competent attacker will just destroy them outright. Learned that from years of Windows NT. You can possibly recover them by analysing disk blocks, though that is even more difficult if you have journald in there again.

I prefer rsyslog, plain text and time marking. Much easier to alert on suspicious things when the mark stops arriving.

Log journaling has few practical benefits. In fact I’m sceptical of any, from experience.


> Mounting the file system on a scratch node for analysis and then looking at the logs is a pain in the ass when you have journald in the way.

What's hard about:

    journalctl -D /mounted/path/var/log/journal ...usual options
> Sealing and tamper protection on logs is absolutely no use as any competent attacker will just destroy them outright.

But that's the point - as long as you've been monitoring the signatures, you know which one's been removed/changed.


That’s fine until it doesn’t work or the journal is corrupt or you’re pulling the journal off the file system as mentioned further down the post I made. I’ve been there and the entire toolset fell to pieces in front of me.

Plain text is orders of magnitude easier and safer to deal with, and easily recoverable even if only partially, which is extremely difficult with journald.


I mean the journald format isn’t magic or anything and 99% of what it stores is plain-text. The only issue with “corruption” is that the tooling is bad at handling it. Which is a valid complaint. But in an alternative universe people would be complaining about plain-text because less crashed on any utf decoding issue.

Basically every journald using distro immediately forwards the logs to syslog so you can just pretend it doesn’t exist and call it a day. I don’t because journald’s metadata and filtering is super powerful but nothing is stopping you from just grepping like usual.


it's a bit odd to defend the new kid on the block, journald, and then go and link to fluentbit.

I feel like we're missing the elephant in the room, which is that almost everyone is going to forward logs over the network completely bypassing journald/journalctl altogether. At which point you have to ask: who is journald for? In the server environment, you're going to send everything to Splunk or whatever.

Which just leaves desktop users to deal with the headache that is journald. Desktop/workstation users don't care about tamper detection, corruption, or any of that nonsense. If there is corruption going on, I doubt they care that their boot logs are invalid. I'm 100% sure they care more that their 3D render that took 30 hours to render is now corrupt, or some other asset or document.

And in the rare case they do need to dive into the logs, they aren't going to want to re-learn journalctl again.

journald is a solution to a 1990s problem that no longer exists.


I linked to fluentbit only as an independent journal reader, not for what it does as a service.

To be clear, I wasn't defending the usefulness of sealing and signing. Just the fact that not only is it independently readable, but you can also find out about corruption if it matters to you for the forensic purposes raised.

Journalctl does provide at least one significant feature though - on a properly configured system "journalctl -u foobar" gives you the logs of foobar. No more chasing which file they live in, no more stdout goes here, stderr there, and logs get split in that special way over there. This is great for desktop users.


This is a pretty good description of the problems for sure. Thanks for writing it up.


there was Upstart


[flagged]


You are entitled to your own opinions, but not to your own facts.


Can you provide some links about that? Or an explanation?

I am genuinely interested.


This was the 2014 vote of the Debian Technical Committee:

https://lwn.net/Articles/585504/

Of the 8 voting members, at the time Steve Langasek and Colin Watson were currently employed and Ian Jackson had been previously employed by Canonical.

None of the 8 voting members were employed by Red Hat at the time.

Here's a comment with more research into affiliations at the time: https://lwn.net/Articles/585026/


Thanks!


When I run systemd-analyze security systemd-networkd it seems to have set quite a lot of capabilities/restrictions.

It seems there were quite a lot of restrictions in place to specifically minimize the attack surface of that service.

That is unlike a significant number of other services running on my system.


The problem is that a modern Linux system has a microservices architecture. Systemd tries to address it as such, but it's not enough.


why stop there - also use a system that doesn't have IME/PSP, supports coreboot.


Don't forget that if you run the wrong command, you can brick your entire computer: https://github.com/systemd/systemd/issues/2402


The CVE trend for systemd is worrisome as well:

https://www.cvedetails.com/product/38088/Freedesktop-Systemd...

Disappointing that an init system is on the network at all.


That really isn't systemd's fault. Software needs to manage the efi variables, so systemd mounts efivarfs writable.


What software actually needs to repeatedly write to these variables? Systemd itself does, but there was absolutely a world before having this option on by default. Grub and such can just unmount after they are done.


What do you mean it talks to the network? Can you give some reference?


Sure, try https://en.wikipedia.org/wiki/Systemd

By default systemd includes journald, which rather controversially (as mentioned in a comment on this thread) picked binary logging (harder for humans), is much less robust in the face of a small corruption, and ignored encryption. I believe it doesn't support encryption (by default). This replaces more robust solutions (by default) like syslog-ng and rsyslog that have long histories of battle tested real world usage.

resolved replaces the local DNS resolvers, and at least for a while ignored DNSSEC. This is particularly bothersome since it is such a security sensitive daemon. Sure you can disable it and pick unbound, but it's not default.

timesyncd is a time daemon, replacing NTP, another security sensitive app. Not that NTP is a paragon of security, but various projects have arisen to improve things.

networkd is a replacement for DHCP and similar for IPv6. Again not nearly as nice or secure as existing solutions.

So basically decades of developer work, security audits, and competition among network services have been ditched in favor of what is now part of systemd, with its typically cavalier attitude towards security.

There are good parts of systemd; it filled a need. But this swallowing of security sensitive network services really bugs me. Being able to be compromised so early in the process and the tight integration go against the Unix philosophy of using minimalist modular software. Normally I'd run unbound as a non-root user with access to read few directories, and write to even fewer, DNSSEC enabled, and logging via syslog (human readable). By default on an Ubuntu 20.04 (and 18.04 before that) I'd have issues with DNS failures... till I replaced the systemd resolver. I'd mention it to other admins and they would shrug and mention that systemd sucks.


The security track record of those existing systems you refer to:

- bind
- isc DHCP
- dnsmasq
- ntpd
- radvd
- rsyslog
- syslog-ng

The name enumeration alone should ring bells.

There's always more that can be done, but https://github.com/systemd/systemd/tree/master/src/fuzz contains more than most of the aforementioned combined.

As for how you run your alternative services as non root, you may wish to learn about what the contents of this file mean: https://github.com/systemd/systemd/blob/master/units/systemd... or this one: https://github.com/systemd/systemd/blob/master/units/systemd...

Can you point to a commonly used initrc that comes even remotely close?

You should also read https://systemd.io/JOURNAL_FILE_FORMAT/ and read up on NetworkManager, which is what Ubuntu uses.

By all means bash away (pun intended), but I keep seeing these points go uncontested and they're not very well founded.


I claim init systems shouldn't open network connections.

Sure, bind has a terrible security record, and that was part of the reason why people started writing more secure replacements like unbound.

NTP has a terrible security record, and that's part of the reason people started writing more small secure replacements like NTPsec and chrony.

Similarly, sendmail's security issues resulted in improvements like postfix, which hasn't been swallowed into systemd yet (mostly kidding).

Linux often has multiple implementations for a given service and ease of use, performance, security and related allows them to compete. This is a sign of a healthy ecosystem and generally I think it's working well.

However, systemd, by rolling this functionality in, subverts that system, and the vast majority of systems will just accept the defaults. It also makes systemd huge and complex; last I looked, systemd had somewhere around 5% as many lines of code as the entire Linux kernel, which I find scary.


systemd the init system does not open network connections. systemd PID1 does not talk to the network. `networkd` is a separate binary.

Second, systemd does not mandate using any of these components apart from `journald` and `logind`. You can pipe journald into any syslog daemon of your choice, there's a config option to do so. If you've got issues with logind I can't help you there. I don't know what it was intended to replace (consolekit, I think?) but I do remember it was badly maintained.
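
For reference, the option meant here is most likely `ForwardToSyslog=` in `journald.conf`. A minimal sketch follows, assuming a drop-in file name of my own choosing and a syslog daemon that listens on the socket journald forwards to; it writes to a scratch directory unless `CONF_DIR` is set.

```shell
#!/bin/sh
# Sketch only: the drop-in file name is arbitrary, and whether your syslog
# daemon picks up forwarded messages depends on how the distro configured it.
set -eu

# Defaults to a scratch directory. To apply for real, set
# CONF_DIR=/etc/systemd/journald.conf.d and run as root.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/forward-to-syslog.conf" <<'EOF'
[Journal]
# Hand every message journald receives to the traditional syslog socket.
ForwardToSyslog=yes
EOF

echo "wrote $CONF_DIR/forward-to-syslog.conf"
# After installing it for real:
#   systemctl restart systemd-journald
```

Some setups have the syslog daemon read the journal directly instead of relying on forwarding, so check what your distro ships before changing this.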

Third, a vast majority of deployments do _not_ accept defaults. I know that both Red Hat and Debian go for third party network managers (networkmanager and Debian's ifup/ifdown stuff) and rsyslogd was there on a default install of both Debian Buster and CentOS 8 iirc.

Finally, I repeat, systemd is a monorepo which contains many programs. Sure, you can argue about how they're tightly bound, but I can point you towards FreeBSD/OpenBSD if you'd want them to be broken up into separate repos to be more "unix-y", and if you look at systemd PID1 it's a fairly small binary which doesn't seem to offer many security holes.


In the end, the biggest mistakes the systemd project made were on the marketing and strategy side.

Had they removed the "systemd-" prefix from all the side binaries, marketed them as independent projects on the website, made them usable without systemd and maybe put them in separate repositories, the systemd project would only have had to explain "yes, we rewrote ntp, dns, etc., why not?"

Instead they received complaints about "bloatware", while rewrites of industry-standard tools by unknown/junior people are often acclaimed on HN :)

Even better, they could have integrated chronyd or others by creating "systemd integration standards" and submitting patches to these projects to gain community support, permit an easy switch from one implementation to another, and let the users choose. Though it's already easy to use something other than timesyncd, on CentOS at least, thanks to the systemd project and distro maintainers too :)


You can't have any stable integration standard when systemd's scope is ever increasing, or paper it over with marketing. Even stuff like home directories now Must Be Improved.

It isn't comparable with the typical rewrite of an industry-standard tool by newbies.


Thank you for the follow-up. You have made some good points, but I couldn't find any explanation for what you said earlier: systemd talking to the network before I even have the client firewall up.


Worth noting that these are optional. You can forward logs to your preferred syslog daemon. You can turn off resolved, timesyncd, networkd and use something else instead of them.

It's not ideal when looking at the whole environment, but specifically for our services - we've got the choice.
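
A minimal sketch of what turning those off could look like, assuming the units are installed under their usual upstream names. It only prints the commands rather than running them, and `print_disable_commands` is a helper name made up for this example.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them. The unit names
# are the usual upstream ones; adjust the list to what your distro ships.
set -eu

OPTIONAL_UNITS="systemd-resolved.service systemd-timesyncd.service systemd-networkd.service"

# Print one "systemctl disable --now" command per optional unit.
print_disable_commands() {
    for unit in $OPTIONAL_UNITS; do
        echo "systemctl disable --now $unit"
    done
}

print_disable_commands
# Review the output and run the commands you actually want as root.
# Some of these daemons have companion socket units that may need the
# same treatment, and a replacement (e.g. unbound for DNS) should be
# working before the original is switched off.
```

On systems where `/etc/resolv.conf` points at the systemd-resolved stub, disabling resolved without putting a replacement in place will likely break name resolution, so the order matters.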


Systemd's scope is increasing over time, most recently with handling system /home directories. Seems very possible that the ability to disable features in the future versions of systemd will be removed because they do not have the required integration into systemd's view of the world.

In any case it's the default behavior of systems that should be secure and systemd is the exact opposite of that.


Socket activation (by its very nature) is systemd listening to the network on your program's behalf. If nothing else.


I see, I've heard about it but I couldn't find any reference that systemd starts listening on sockets before the firewall service is started. Even if it did, with systemd you can set up your firewall service to start before the eth link goes up.


You can't typically start the firewall before the eth link goes up, since rules that contain network interfaces might not work before those are created (e.g. in case of bridge or bonding interfaces). Systemd's "start everything at once, let the dependencies sort out automatically" is a major regression for server systems. With sysv-init you could just pick the right ordering and be done, everything was stable for every boot after that.


You can, using the directives "Wants=network-pre.target" & "Before=network-pre.target". This will make sure the service is up before any network interface is configured.

Source: https://www.freedesktop.org/wiki/Software/systemd/NetworkTar...
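
To make those two directives concrete, here is a sketch of a firewall unit using them. The unit name `my-firewall.service`, the `nft` binary path and the ruleset path are assumptions for illustration; many distros already ship an nftables or iptables unit that does the equivalent. It writes to a scratch directory unless `UNIT_DIR` is set.

```shell
#!/bin/sh
# Sketch only: "my-firewall.service", /usr/sbin/nft and /etc/nftables.conf
# are placeholders; check what your distro already provides first.
set -eu

# Defaults to a scratch directory. To apply for real, set
# UNIT_DIR=/etc/systemd/system and run as root.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
mkdir -p "$UNIT_DIR"

cat > "$UNIT_DIR/my-firewall.service" <<'EOF'
[Unit]
Description=Load firewall rules before any network interface is configured
# Pull in the passive target and order this unit ahead of it.
Wants=network-pre.target
Before=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/nft -f /etc/nftables.conf

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $UNIT_DIR/my-firewall.service"
# After installing it for real:
#   systemctl daemon-reload && systemctl enable my-firewall.service
```

This only orders the rules ahead of network management services that themselves declare `After=network-pre.target`; anything that brings interfaces up without that ordering is not covered.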


Not quite. First, this doesn't apply to any interfaces that are created by system services like VPN tunnels. Second, you would usually want and need network-online.target, because firewall config often is applied after network-pre only. Third, dynamic stuff like VLAN interfaces also only appear after network-pre and often only after network-online. So no cookies for you. This kind of non-attention to details is actually what makes systemd bad: Everything looks like the author's laptop, if it doesn't, tough luck.


Edit: Sorry, I just realised what you mean. I don't know about iptables, but I've got nftables set to come up before the network is set up and it works perfectly fine without enp3s0 or wg0 having come up. The rules are set and then the interfaces come up.

Original comment follows:

I'm... not sure I understand. VPNs would only work after your physical interfaces are up, right? So if you want your firewall rules to be applied before a VPN interface comes up then you'd be perfectly fine with network-pre. From the page I linked in the comment you replied to:

"network-pre.target is a target that may be used to order services before any network interface is configured. Its primary purpose is for usage with firewall services that want to establish a firewall before any network interface is up. It's a passive unit: you cannot start it directly and it is not pulled in by the the network management service, but by the service that wants to run before it. Network management services hence should set After=network-pre.target, but avoid any Wants=network-pre.target or even Requires=network-pre.target. Services that want to be run before the network is configured should place Before=network-pre.target and also set Wants=network-pre.target to pull it in. This way, unless there's actually a service that needs to be ordered before the network is up the target is not pulled in, hence avoiding any unnecessary synchronization point."

So ideally you'd run your VPN service after network-online.target has been reached, which would certainly be after the firewall rules have been applied.

The great thing about persistent interface names is that if you know the names of the interfaces which are going to be coming up, then you can set up a firewall before any of them come up. Which takes care of dynamic VLAN interfaces. You can use IP addresses or blocks to refer to them in your firewall config, am I right?

If you think I'm unable to grasp your situation, can you give me a more detailed example?


> Systemd's "start everything at once, let the dependencies sort out automatically" is a major regression for server systems. With sysv-init you could just pick the right ordering and be done, everything was stable for every boot after that.

This is a big point. All of the systemd systems I've been exposed to in the wild have non-deterministic inits, i.e. if it takes longer than average to boot up once (A problem I had a few times), starting it again doesn't replicate said problem. If the network craps itself when booting sometimes, that problem is variable so it's impossible to tell if you've fixed the problem, and it was what you suspected, or if it's something else and you only think that you've fixed it.

On the other hand, my alpine system consistently boots up. If it errors once, it will error consistently enough to allow tracing of the cause, but regardless, it has never had a problem on boot.


iptables and nftables both have ways to declare rules for interfaces that come and go. Identifying dynamic interfaces is done by string comparison instead of by index (for static interfaces).
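
For nftables, the distinction is between `iif` (index, resolved when the ruleset is loaded) and `iifname` (string, compared per packet). A small sketch, with the table layout, the `wg0` interface and the port chosen purely for illustration:

```shell
#!/bin/sh
# Sketch only: the ruleset is a toy example, not a complete firewall.
set -eu

# Defaults to a scratch file; nothing is loaded into the kernel here.
RULES_FILE="${RULES_FILE:-$(mktemp)}"

cat > "$RULES_FILE" <<'EOF'
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # iif resolves "lo" to an interface index when the rules are
        # loaded, so the interface has to exist at that point.
        iif "lo" accept

        ct state established,related accept

        # iifname compares the name as a string for each packet, so "wg0"
        # may appear later (or come and go) without reloading the rules.
        iifname "wg0" tcp dport 22 accept
    }
}
EOF

echo "wrote $RULES_FILE"
# Syntax-check without applying (needs nft installed and usually root):
#   nft -c -f "$RULES_FILE"
```

This is why a ruleset written with `iifname` can be loaded before the interfaces it mentions exist, at some per-packet cost compared with the index match.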


Which doesn't always help you if you are actually using one of the make-firewalling-easier-daemons (which I personally would advise against). Those often only support very basic features, but are pushed heavily by the commercial distros.


> Systemd is a security nightmare

I have yet to see solid evidence that it is any bigger of a nightmare than the mess of bash scripts that came before, but ok.

> taking over various network services ... poorly.

The networking features of systemd are entirely optional. If you don't know how to disable them if you don't want them, the right course of action is not to blame systemd.

People who grew up on SysV init like to shit on systemd as it erases their existing knowledge, but to pretend that SysV init wasn't a total, inconsistent mess or that anyone was maintaining consolekit before logind came around is distorting the truth, to put it mildly.


Indeed, I read the following:

“systemd contains a lot of unnecessary attack surface; it attempts to do far more things than necessary and goes beyond what an init system should do.”

I’d want the author to explain this further. Perhaps they could tell the admin of the system how to reduce the attack surface, detail what services it should not be providing (and why), what services it provides that are worse than the alternative, and how to turn them off or harden them.

“An init system should not need many lines of code to function properly.”

That’s a ridiculous statement. Lines of code is an absurd measurement to determine the security of anything.


Systemd provides a lot of network functionality in systemd-networkd, journald, timesyncd, etc. that is remote attack surface. All the systemd "cloud of daemons" is tightly coupled by dbus interfaces that enable an attacker to move from one exploited system service to the next. Even if the attacker doesn't manage to find an exploit in another system service, DoS is easily possible because the DBUS interfaces are quite fragile. Even as a benevolent admin it is easily possible to get the system into a state where e.g. clean shutdown is no longer possible because systemctl doesn't want to talk to systemd any longer and you cannot fix that. systemd-udevd also has race conditions galore, so sending any message to it in the wrong order relative to another one will kill the system, maybe even open exploit vectors. At the very least I would, for hardening, recommend not using any network-facing systemd functionality.

And lines of code are not ridiculous, they are the best first-order estimate available. Of course an actual inspection of the code is better for a comparison, but that is a huge task. sloccount is quick and easy.
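
As a rough illustration of what such a first-order estimate looks like, here is a naive counter in Python. This is not how sloccount itself works (it is language-aware); the function name `count_sloc` and the comment prefixes are assumptions made up for this sketch.

```python
def count_sloc(source, comment_prefixes=("#", "//")):
    """Count lines that are neither blank nor whole-line comments.

    Hypothetical helper: real tools handle block comments, strings, etc.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        if stripped.startswith(comment_prefixes):
            continue  # whole-line comment
        count += 1
    return count


sample = """\
// a whole-line comment
int main(void) {

    return 0;  // trailing comments still count as code
}
"""
print(count_sloc(sample))
```

A count like this says nothing about code quality; it only gives a quick sense of how much there is to audit.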


> Systemd provides a lot of network functionality in systemd-networkd, journald, timesyncd, etc. that is remote attack surface.

Everything systemd does is something the OS was already doing before, just using a bunch of independent-upstream-project daemons glued together with RC scripts. That attack-surface existed before. That LOC-count existed before. It just wasn't monolithic under one project/org, and so wasn't being perceived as a problem.

The various interactions between daemons also weren't getting audited during upstream-project security audits, because all the interaction potential only existed in the downstream distribution, rather than being part of systemd's monolith + integration test-suite.

> DBUS interfaces are quite fragile

What, and what it replaced (arbitrary ug+w sockets in /var/run, speaking to custom one-off RPC-listener ABIs for each daemon) wasn't fragile?

IMHO, all that's changed is that there's more uniformity, meaning a single set of tools can be used to test+harden all such RPC interfaces, and security flaws can be fixed in one place (DBUS itself) rather than requiring theory-level descriptions of vulnerabilities to be communicated to each project, whereupon they're hopefully translated to separate one-off patches for the affected daemon.

> systemd-udevd also has race conditions galore, so sending any message to it in the wrong order relative to another one will kill the system, maybe even open exploit vectors

100% true of what came before it. Hell, some of my Linux systems from the 90s and early 2000s would lock up hard just because I `modprobe`d a webcam driver when the webcam was already USB-connected.


> What, and what it replaced (arbitrary ug+w sockets in /var/run, speaking to custom one-off RPC-listener ABIs for each daemon) wasn't fragile?

Sockets like initctl are far less fragile. You don't need a dbus-daemon that you can't restart if you can't talk to systemd. Your parser needs to recognise single-character commands instead of a complex rpc syntax plus an xml parser or javascript vm for filtering. You can't honestly believe DBUS isn't miles from the robustness of initctl.

> IMHO, all that's changed is that there's more uniformity, meaning a single set of tools can be used to test+harden all such RPC interfaces, and security flaws can be fixed in one place (DBUS itself) rather than requiring theory-level descriptions of vulnerabilities to be communicated to each project, whereupon they're hopefully translated to separate one-off patches for the affected daemon.

Instead of easy-to-audit socket permissions you get a complex mess of xml and javascript filtering rules. This isn't progress in terms of security, it's quite the opposite. Pounds of additional attack surface and additional obfuscation hindering defense.
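
To show what "easy to audit" means for plain socket permissions, here is a minimal Python sketch that lists Unix sockets under a directory together with their mode bits. The helper name `list_sockets` and the choice of `/run` as the directory to scan are assumptions for illustration only.

```python
import os
import stat


def list_sockets(root):
    """Return (path, octal mode string) for every Unix socket under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entry vanished or is not accessible
            if stat.S_ISSOCK(st.st_mode):
                found.append((path, oct(stat.S_IMODE(st.st_mode))))
    return found


if __name__ == "__main__":
    # /run is an assumed location; adjust for the system being audited.
    for path, mode in list_sockets("/run"):
        print(mode, path)
```

Anything group- or world-writable in that listing is a candidate for a closer look.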


What systemd gives is structure + consistency; service definitions are declarative and the knowledge is portable across distros.

To me that is the big benefit of systemd over sysV init.

> Pounds of additional attack surface

Again, show me a major in-the-wild exploit that abuses these supposed 'pounds of attack surface'. As was said before, systemd doesn't actually have that much new surface that wasn't there before. It's just that before, it was not maintained in any sort of organized manner but was scattered across tens of different projects in various states of no longer being actively maintained. As I said, consolekit is a perfect example.

systemd is actually pretty modular. You don't have to run the components that are not useful to you thus limiting the attack surface greatly, if you're actually concerned about it.

I know some people have an allergic reaction to Poettering, but the guy's willing to touch some of the rotten parts of the ecosystem few others dare.


> People who grew up on SysV init like to shit on systemd as it erases their existing knowledge...

Painting people who don't like systemd as old, beardy creatures afraid of learning new stuff is oversimplifying the situation, if not outright misleading.

My personal opposition about systemd is about binary logs and going against "doing one thing, well" principle. I could rant all day long about the things I don't like about systemd and praise its useful features for a similar amount of time, but I'd never bring up "I have to learn everything from the ground up again!" as an issue.

Software evolves, everything evolves. We need to adapt regardless of whether we are system administrators of a big farm or a user of a measly Linux system.


> My personal opposition about systemd is about binary logs

syslog-ng supports the systemd journal natively so you'd never see a difference from before if you don't want to.

> going against "doing one thing, well" principle

The principle is dogma that has held back Unix imo. It's useful for simple CLIs but that is about it. By your definition the kernel itself goes against the principle.

systemd is a systems manager, it takes care of it during its entire lifecycle, keeping home directories portable is one thing it can do for you that is entirely optional but very useful, managing containers is another. These are however separate components - homed and systemd-nspawn respectively. It is not all rolled into a single 'systemd', this is a misrepresentation.

systemd is very modular. People like to paint it as a monolith because it lives in a monorepo for the most part, but it is not actually built as one and individual components can be disabled or swapped out.


> syslog-ng supports the systemd journal natively so you'd never see a difference from before if you don't want to.

We use rsyslog with systemd and it works too, however it's just another level of abstraction so, I'm not fond of piling stuff over and over. Binary logs' and journals' usefulness can be also debated. It of course brings some advantages to the table but, it doesn't enable anything groundbreaking for me.

> The principle is dogma that has held back Unix imo. It's useful for simple CLIs but that is about it. By your definition the kernel itself goes against the principle.

Actually I don't consider it dogma, because it has some very important results and corollaries. This principle greatly reduces secondary complexity (glue logic, complexity required to do many things with one code base, etc.), hence the software can be relentlessly optimized. grep, find, sed, awk, cut, head, tail, less are all extremely performant tools. I develop high performance scientific applications, and making a tool accomplish more than a couple of different functions either reduces performance (generalized algorithms) or increases size a lot (specialized functions for everything). Its refactoring, verification, and maintenance get complicated and expensive fast.

UNIX pipes and small optimized tools are much more useful beyond simple CLI commands and small scripts. Since the tools are fast, low memory footprint performance behemoths, you can build very fast and reliable machinery with them. You can even do data mining with them [0]. We have some impressive stuff running under the hood in our cluster which only use standard GNU utilities.

Also, tools like rsync and rdiff can lift much more than their size. They're proverbial ants of data transfer and they also do one thing well. Same thing can be said for vi and nano too for text editing. Similarly, modern tools like Atom still shudder and die with a single big file while using extreme amounts of memory. OTOH, vi doesn't even sneeze with files thrice as large.

Kernel doesn't breach this principle either. It provides an interface between the hardware and software & lives in its own space; that's it.

> systemd is a systems manager, it takes care of it during its entire lifecycle, keeping home directories portable is one thing it can do for you that is entirely optional but very useful, managing containers is another. These are however separate components - homed and systemd-nspawn respectively. It is not all rolled into a single 'systemd', this is a misrepresentation.

The problem with systemd is not purely technical. Also, I want to be clear that I'm not a die-hard systemd critic. The technical side can be summarized as "Hey! This thing is complicated, developed very fast and can bring some (stability and security) problems back during its teething. Please be careful". The other side is social and can be summarized as "Hey! You're developing this, but you're not listening to us and pushing your opinions down to us using your power. This is dangerous in every aspect. We can develop this together to something better but, you don't listen to us.".

Also I want to remind that, parallel sysV-init was pretty fast machinery and was very manageable.

I'm aware that systemd is modular. I'm not using more than half of it right now and I'm happy this way. One of the problems with systemd is opaqueness. When systemd overtakes a part of the system with its module, it's very hard to discover it. Disabling the module and using an older or alternative approach is also not straightforward sometimes. As a result, systemd feels like an overzealous octopus with no indication of its intention. Holding everything it can reach with a death-grip and requiring some serious strength to pry it off.

With better communication and a better feedback loop, we'd be in a very different place. Maybe systemd would be the same systemd but, without the unproductive rock-throwing, flaming & shouting.

Linux is evolving like every other software or anything in life, but some ground principles are much more valuable than they seem. Throwing them away and labeling them as dogma just because they're old is considered harmful and dangerous.

[0]: https://adamdrake.com/command-line-tools-can-be-235x-faster-...


> I want to remind that, parallel sysV-init was pretty fast machinery and was very manageable.

That does nothing for the fact that SysV init was a pile of shell scripts inconsistent across distros and even individual packages. I'd take systemd's declarative, portable service definitions any day over that mess.
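
To make "declarative" concrete: because a unit file is plain key/value data, generic tooling can read it without executing anything. Below is a minimal sketch using Python's `configparser`. The unit contents and the helper name `parse_unit` are made up for illustration, and `configparser` is only an approximation of systemd's own parser (for instance, it keeps just the last value of a repeated key).

```python
import configparser

# Hypothetical unit file, written only for this example.
EXAMPLE_UNIT = """\
[Unit]
Description=Example daemon

[Service]
ExecStart=/usr/bin/example --foreground
Restart=on-failure

[Install]
WantedBy=multi-user.target
"""


def parse_unit(text):
    """Parse unit-file text into {section: {key: value}}."""
    # interpolation=None keeps '%' specifiers such as %i untouched;
    # strict=False tolerates repeated keys (the last one wins here).
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # preserve key case (ExecStart, not execstart)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


unit = parse_unit(EXAMPLE_UNIT)
print(unit["Service"]["Restart"])
```

Doing the same for an init script would mean interpreting arbitrary shell.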

SysV init did not take care of consolekit being an unmaintained mess either. systemd did.

Besides, systemd can recognize that I had plugged in a specific kind of device that needs this sort of setup and do that for me automatically. Hence the 'manager' part. None of its supposed 'competitors' can do that themselves.

> Kernel doesn't breach this principle either. It provides an interface between the hardware and software & lives in its own space; that's it.

I don't know, eBPF for example is an entire complex beast of its own that could be argued to violate that, well beyond what's 'required'.

If you take the view that it's fine as long as the system has a single defined job (for the kernel: interfacing with hardware and exposing APIs to userspace), then systemd has a definition like that too. It manages the system's dynamic resources during its lifetime. That's a 'single' job. It's not if you decompose it, but the kernel is in a similar situation at that point.

> When systemd overtakes a part of the system with its module, it's very hard to discover it.

systemd does not do that, sounds like you're blaming bad distro defaults (optional components being enabled by default) on systemd.


> That does nothing for the fact that SysV init was a pile of shell scripts inconsistent across distros and even individual packages.

I've never seen the inconsistencies, sorry. All the service files I've written were greatly portable around what I've used so far.

> SysV init did not take care of consolekit being an unmaintained mess either. systemd did.

No other service should hide the shortcomings of other services. This sounds like WoW's extreme tricks or Kubernetes' dockershim layer to keep compatibility. Now, doing this is wrong.

> Besides, systemd can recognize that I had plugged in a specific kind of device that needs this sort of setup and do that for me automatically. Hence the 'manager' part. None of its supposed 'competitors' can do that themselves.

That's the point. If that task can be handed over to another daemon (like udev) for setting it up, why does systemd assume that setting it up is its job? I'm not trying to say that doing this is flat out wrong, but when asked why, getting "we're doing it, because why not?" as an answer leaves a bad taste in the mouth.

> systemd does not do that, sounds like you're blaming bad distro defaults (optional components being enabled by default) on systemd.

I think I failed to convey what I tried to say there. I wanted to say that there's no easy way to see whether a feature is managed by systemd or not, and since systemd tries to "self-heal" things sometimes, system management becomes a tug-of-war until one understands that systemd is managing that. If there was an easy way to understand that, it'd be an easier path.

All in all, I personally don't oppose what systemd brings to the table and see the inspiration from macOS' launchd. However, the main thing I strongly oppose and criticize is the blind egg-throwing to other init systems, the overzealous protection of systemd and avoiding discussing its potential shortcomings and problems altogether.

I want to reiterate that I'm not against change and evolution, I'm against forcing it with a stance of "we, only we know the best. now shut-up and use it!".


> I've never seen the inconsistencies,

I've seen plenty to convince me that having a full shell scripting language and not a very good one at that as your service definition language is a terrible idea.

> No other service should hide the shortcomings of other services.

I don't think you get it. It is not hiding the shortcomings of other services. It is implementing its own service because the other one was a pile of unmaintained crap. But I guess that kind of security risk is OK as long as it's a separate repo? What a weird dogma.

> That's the point. If that task can be handed over to another daemon (like udev) for setting it up, why does systemd assume that setting it up is its job?

It does hand it over to udev, but someone needs to notify udev stuff is happening. systemd has easy access to that information so it does the job. Also, you do realize that the people who maintain udev are the same who maintain systemd right? Again, because it was a hard job and nobody else, including other init systems, were willing to pick up the slack.

> what I tried to say there. I wanted to say that there's no easy way to see whether a feature is managed by systemd or not

I understand that but imo that should be on the distro to communicate. On Arch for example, the wiki always gives a clear indication of whether a thing like DNS can and is managed by a systemd component or not.

A `systemctl component list --status=enabled` would perhaps be a nice addition, agreed.
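
The `systemctl component list` subcommand above is only a wish, but `systemctl list-unit-files --state=enabled` already exists and gets part of the way there. A minimal Python sketch wrapping it follows; the helper names and the sample output are assumptions for illustration, and the exact column layout may differ between systemd versions.

```python
import subprocess


def parse_enabled_units(output):
    """Extract unit names whose state column reads 'enabled'."""
    units = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "enabled":
            units.append(fields[0])
    return units


def enabled_units():
    """Ask systemctl for enabled unit files (requires a systemd host)."""
    result = subprocess.run(
        ["systemctl", "list-unit-files", "--state=enabled",
         "--no-legend", "--no-pager"],
        capture_output=True, text=True, check=True,
    )
    return parse_enabled_units(result.stdout)


# Made-up sample output, so the parser can be tried without systemd.
sample = """\
systemd-networkd.service   enabled  enabled
systemd-resolved.service   enabled  enabled
systemd-timesyncd.service  disabled enabled
"""
print(parse_enabled_units(sample))
```

Filtering the result for names starting with `systemd-` would give a quick view of which optional systemd components a distro has switched on.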

> I strongly oppose and criticize is the blind egg-throwing to other init systems, the overzealous protection of systemd and avoiding discussing its potential shortcomings and problems altogether.

It's funny you say that because as a very early adopter of systemd I saw things up close and it was the exact opposite. People were literally threatening Lennart Poettering with violence for writing a (good imo) piece of free software. That he decided to stay on the scene despite that is honestly remarkable.

But as for other init systems, I don't doubt there are some good ones (sysV init wasn't it), but the main problem is that all the others seem only interested in the init part and are happy to depend on rotten parts of the ecosystem. Unlike systemd, none of them were saying 'ok, consolekit is unmaintained, let's pick up the slack on session management' (maybe in a separate repo if you insist) or 'oh, udev is struggling, let's pick up the slack'. No, they were like 'I am going to do init and hope these other pieces don't come crashing down like a pile of bricks'.

Also, I've started using systemd-homed to make my setup portable across my machines. None of the supposed competitors of systemd offer that either.

I like to compare it with PulseAudio; people suggest to me all the time how OSS, ALSA etc. are 'good enough' etc. but I have a fully FLOSS wireless audio house setup right now thanks to PulseAudio. None of the supposed 'competitors' offer that, so they're not competitors for me (PipeWire eventually might, by literally reimplementing all of Pulse).


bayindirh is raising valid points of criticism, which I think you can tackle without trying to raise a stink.

The point about scope e.g. isn't "weird dogma" imo, but very valid, because it has security implications for an init system. The points you raise about unmaintained alternatives are also valid, but don't invalidate his criticism.

I understand you feel that systemd has been treated unfairly before, but I suggest you shouldn't try to match the level of spite and throw it the other way, especially when someone is very nicely bringing up what they are worried about.


> I've seen plenty to convince me that having a full shell scripting language and not a very good one at that as your service definition language is a terrible idea.

If we look enough, we can find convincing examples about Python being very backwards and brainfuck being the best language in the world. I had a friend who tried to solve every problem in OCaml. Another one thought that Lisp is the one and only way to do anything with computers. If there's one important thing I've learned over the years, it's the harmful nature of being fixated to ideas. Again, I want to re-iterate that I'm no enemy of systemd. I use it everyday on a 1000+ server fleet. I also used sysV-init on the same fleet. Both got the job done, albeit differently.

> I don't think you get it. It is not hiding the shortcomings of other services. It is implementing its own service because the other one was a pile of unmaintained crap.

Yep, I didn't get it. I missed when systemd reimplemented/replaced consolekit. I thought it shadowed it and hid its shortcomings via intervention, my bad, honestly. When systemd started to get adopted fast, I was working literally underground in a time sensitive project, isolated from the outside world. I spent ~2 years without being able to track this stuff closely.

> It does hand it over to udev, but someone needs to notify udev stuff is happening.

I thought there were external mechanisms which notified udev reliably. I remember triggering it for a USB flash drive related event, intertwined with PAM, some network services and other inputs.

> Also, you do realize that the people who maintain udev are the same who maintain systemd right?

Nope, I was buried underground, cut-off from outside world. Again, it's one of the parts I've probably missed.

> Again, because it was a hard job and nobody else, including other init systems, were willing to pick up the slack.

Other init systems didn't notify udev because they thought that it wasn't their job and that another daemon had to do it, because other so-called competitors only care about services. Hardware, hostname, time, resolver, containers, home folders, and the kitchen sink shall have their own daemons, they probably thought.

> It's funny you say that because as a very early adopter of systemd I saw things up close and it was the exact opposite. People were literally threatening Lennart Poettering with violence for writing a (good imo) piece of free software. That he decided to stay on the scene despite that is honestly remarkable.

I'd never threaten anyone about the software they write or the way they behave. Again, at that time I was underground so I had no chance to see it. I also don't support anyone doing this to anyone, for any reason. However, if one of my fellow developers got banned from submitting patches to kernel because they're not fixing their own mistakes and kicking everyone like a spoiled goat, I'll definitely have a good talk with him about it. Using fame to blame others is not the good way to do it.

> But as for other init systems, I don't doubt there's some good ones, sysV init wasn't it, but the main problem is that all the others seem only interested in the init part and...

Pardus' "Mudur" was similar to systemd in many aspects; however, it was so integrated that it was not possible to install it part by part. It was faster than both upstart and parallel sysV-init could ever be. IIRC Ubuntu wanted to use it, but adopting it was too disruptive and they gave up the effort.

> Also, I've started using systemd-homed to make my setup portable across my machines.

If that fits the bill, that's nice. I don't need portable homes across my systems (all of them are intentionally setup differently due to reasons and preferences) and wouldn't use it for now. If I decide to use it, I'll enable it but, I'm fine with the old method now.

> I like to compare it with PulseAudio; people suggest to me all the time how OSS, ALSA etc. are 'good enough' etc. but I have a fully FLOSS wireless audio house setup right now

PulseAudio brought some needed things to Linux audio, but it needed a lot of hammering in the head to bring it to this level. The initial implementation was a bit boneheaded in behavior. Its automatic upmixing of stereo streams to multichannel still doesn't make sense in most cases. So it needs more sensible defaults from the source tree.

Also, you use the word competitor in a rather aggressive and provocative way. I don't know whether that's your intention but, we're just conversing here. I'm not angry or anything, and I'm not trying to sabotage the projects you like. I'm just telling the things I see as wrong and trying to make some constructive criticism. I may be failing at doing it in an excellent way, but I'm in no way hostile to you, or the projects in general.


You know? There were init systems before systemd, working differently than 'with bash scripts', and working very well. They still are available, still working very well, still being developed, still adapted, still lean and mean.

This dichotomy systemd vs. SysV init is simply false, and only applies to the fortgeschrittene Fertigfutterfresser who are slaves of shrink wrapped anything ready to run.

Like obese McDonaldians, trying to discuss gourmet-food :-)


Were these other systems willing to pick up the slack in maintaining consolekit or coming up with their own solution? Didn't look like it. What about udev? Why did systemd devs have to pick the slack on that too?

Is it perhaps because the other init systems are only willing to touch a very narrow definition of init while depending on rotten, unmaintained pieces of the ecosystem like consolekit without being able/willing to maintain it themselves?

Or perhaps the plan is to let systemd folks do that while shitting on it so we can cook up a 600 LoC 'minimal init' while outsourcing a whole lot to systemd maintainers while touting our minimalism?


Have you ever heard of mdev? As described in "mdev like a boss"? Or its cousin mdevd from https://skarnet.org/software/ ?

Personally, I wouldn't shit on systemd, because it already stinks like hell, not a nice place to dump :-)


> Have you ever heard of mdev? As described in "mdev like a boss"? Or its cousin mdevd from https://skarnet.org/software/ ?

I have. I also have heard it runs into trouble with the likes of GNOME/KDE.

Is this seriously your 'competitor'? I did have a good belly laugh at this one, thanks!


I feel this way about current Gnome/KDE, and the like.

So no real disadvantage for my cozy caveman setup.

Anyways, laughing is good.

Even if it sounds terminally insane.


A caveman may well find that while he was hiding there the outside world no longer dumps in random places.


Is that really the case? I had visions about some place on some far away coast where exactly that seems to be the case.

Sankt Franziskus, or something like that.


You mean to tell me that there’s no such thing as “superior programming practices”? /s

I’d take a heavily used and tested system over one programmed “with superior practices”, like obsoleting rarely used code, any day of the week.

Yes, heartbleed was bad, but those types of bugs are rare, and once made public patches come fast. You don’t make your system more secure by exchanging the rare occasional general bug with obscure and unknown bugs (with unknown reach).

A lot of people thinking about security tend to fall for the availability bias: they want to protect themselves from the bugs that make news, while ignoring the ones that are removed day by day in a changelog of some minor utility they are using.


I'd say it depends.

> heartbleed was bad, but those types of bugs are rare

Not really. There's a constant stream of security vulnerabilities arising from buffer-overflows and memory-management bugs, even in the most high-profile C/C++ codebases like the Linux kernel and Chromium. C and C++ are minefields for undefined behaviour, and many security vulnerabilities can be tracked back to instances of undefined behaviour. Rewriting in a safe language, like SPARK or the safe subset of Rust, would close the door on these vulnerabilities.

That's not to say I'd uncritically jump aboard a safe-Rust replacement for OpenSSL. As you say, there's much value in mature and battle-tested code, and Rust's safety guarantees wouldn't guarantee you a bug-free SSL library, they only guarantee the absence of certain kinds of errors. Even a fully formally verified implementation in SPARK could still have side-channel vulnerabilities (e.g. timing issues).


Bugs are frequent, but most of them have limited scope; bugs that break everything from computers to smart toasters are rare.

As a C programmer myself I’m well aware of how big the mine field is, and I’m a big proponent for validation in the Rust style, but that was not the point.

This guide advocates for replacements for programs and libraries written in C++, with most of those replacements being written also in C++, but with “superior programming practices”, even when those replacements are very rarely used in most environments.


Even there it depends. 'Programming practices' is vague. Even C can be tamed, at great expense, using formal methods techniques. [0][1][2][3] Adoption of such methods can give a solid assurance of the lack of UB, like use of a safe language. Weaker measures, like adopting MISRA C, don't provide such strong assurances (although they can eliminate certain categories of errors), and as you indicate, their real value is a bit more subjective. Mandating a bad programming style could actively make things worse.

[0] https://trust-in-soft.com/

[1] https://www.eschertech.com/products/perfect_developer.php

[2] https://github.com/microsoft/Armada

[3] https://www.microsoft.com/en-us/research/project/vcc-a-verif...


> Rewriting in a safe language, like SPARK or the safe subset of Rust, would close the door on these vulnerabilities.

Unfortunately this is easier said than done, as evidenced by the fact that it is said a lot, and never done.


Agree. Not something I know a lot about but it seems to be a significant undertaking. I figure a production-grade implementation in safe Rust is more likely than a verified implementation in SPARK.

I don't know how serious the rustls implementation is. Nice to see it makes no use of Rust's unsafe features.

https://github.com/ctz/rustls


I stopped reading after that. I understand the author has good reasons from his perspective. I'm also realizing the need for opinionated tech solutions in order to create a cohesive system. However, the author's opinion is not moving in a direction I want to follow.


Luckily (or perhaps unluckily for you because you stopped reading) there are plenty of systemd hardening tips in the article. It's quite strange to stop reading after an init system opinion but these init system religions are nothing if not strange.


It's not a religion, from my POV at least.

One of the things I try to do with my tech choices is watch my delta from mainstream options, where I expect most development and bug fixes happen. Given that systemd has been adopted by the bigger linux distributions I see more value in using it and have the big vendors fix any problems than trying to use niche solutions.

I'm surprised that the author did have systemd hardening suggestions; I didn't expect that. I'm tempted to go back and check them.


Yet if you had to put a server on an unsecured network, his advice would be sound.

You would spend less time responding to CVEs and more time home playing with your kids (or trying to make some)


The lack of a CVE does not imply the lack of a security vulnerability.


This guide is indeed opinionated and subjective and in many ways misses the point but the "don't use systemd" advice is among the on point ones.

Still, this is a minor issue for the majority of users. Like most of this guide.


When I see stuff like that I usually just stop reading and move on, knowing it will eventually get deleted as I delete history over 6 months old in my browser


..."this is the truth, just how it is"

It is. Anything else is brainwashing/newspeak/doublethink. Deal with it...


It's controversial only to incompetent admins, who should not be managing secure systems anyway.


My first mistake as a green "jack of all trades" admin was getting scared and feeling like I had to apply a "Hardening guide". Wasted so much time and energy because I didn't know how to determine what was a priority, both when it came to business needs and security. Guides like this presume you have generous amounts of time to put into a Linux hardening project with a questionable ROI when the person looking at it might be a solo overwhelmed admin in a security insensitive line of work.

Fact of the matter is that you can't take a broad checklist approach to security because you can't secure everything; you have to have an idea of where your biggest risks are and focus on those. If the computers in question aren't dealing with PII which one has some moral duty to protect, security risks can be outweighed by other business risks, and going on a security crusade can INCREASE business risk. Sometimes things that might be security risks are mitigated by other aspects of IT security or non-IT controls, making them very low priority. Sometimes hardening certain things will have an especially adverse, not-worth-it effect on certain users/applications (which is why these hardening settings aren't defaults).

Maybe worry about phishing attacks, that your authentication strategy makes sense, that your firewall is intact before worrying about arcane aspects of Linux Hardening. A guide like this is fairly useless in the real world without some discussion of the risk/benefits of hardening something and some mechanism to prioritize. To me this guide serves as an interesting starting point to constructing a practical hardening strategy, but I would advise against following it blindly. If you don't know what you're doing I would trust that Linux has fairly sane defaults and focus elsewhere to start with. Configuring your systems in non-standard ways is a Pandora's box of headaches.


A lot of this hardening is based on a fairly obsolete Unix view of the world. E.g. preventing users from seeing other users' PIDs, changing /etc/securetty, non-root X server... In real life, nowadays, a single system is likely to be running a single application, and once that's owned there's very little of the rest of the system to take over.

Owned a Web app with access to user accounts? That's enough to steal all the credit cards flowing through it, and spy/scam all the users in it. Becoming root, or peeking into other users, is usually unimportant because there's nothing special happening there.

I guess Kubernetes is the one thing that sort-of brings back these concerns, but even that has its own more modern particularities (e.g. the guide doesn't even mention cgroups).


I mean hell, in a corporate network where you might want to look at hardening the configuration you roll out to your machines, you're more likely to have a hardware firewall and a VPN requirement before you even get to your "squishy" Linux box.


Yes, and there's a clear mix of threat models (for instance, sometimes the guide seems to be emphasizing Whonix-style user anonymity, but other times something else entirely).

I see some very interesting things to go learn about in this guide, but as you note, it's not really presented that way. I would find it more useful if it either picked a more specific threat model and stuck to it, or suggested a particular system that has already made opinionated choices on some of these protection mechanisms (maybe like Whonix, Tails, or Qubes), or invited a discussion of "for each of these options, why isn't it already the default everywhere? did some people find the associated threats too exotic, unlikely, or irrelevant? are there harsh tradeoffs here?".

One example for me is the ptrace restrictions. It's clearly true that attackers can use ptrace to attach to a running process to steal secrets (if you are relying on processes as a security boundary, this feature will tend to pierce it), which was the main motivation for the kernel having an option to restrict it. On the other hand, if you're a software developer or security researcher dealing with binary executables, ptrace is very helpful to you and you may be irritated if you turn it off. That might be a sign that there should be different OS default profiles for "R&D" and for "production server" or "dedicated administrative workstation" (or maybe not), and perhaps if you don't know what ptrace is you should default to disabling it, but it still strikes me the wrong way not to see any context about this.


I agree with this and would add that some of the hardening they are suggesting will make debugging problems much harder. This article assumes that if you are doing this in a corporate or datacenter environment, all of your engineers have vast experience. Even then, some of the tweaks mentioned will require making changes and rebooting to debug some problems.

Not only should people read up on and understand the implications of each change, but also do "battle testing" before doing this in production. That means going beyond functional testing you might get from a CI/CD environment and instead letting this bake in a staging environment for a few months first. It needs to be an environment with real traffic, real workloads, but low service level agreement and low revenue impact for an extended period of time.

As an aside, I can tell they have not tested some of the things they are suggesting. The first obvious one that pops out is the changes to /proc in fstab. That won't do anything on 100% of Linux distros, as /proc gets mounted by the init process. That has to be done in an init script as a "remount". Also, the "ipv6.disable=1" in the kernel options is not advised. Do this in sysctl and in the network scripts, or some things may break depending on how your system is being used. The ipv6 module is used for more than ipv6. There are also some kernel options that change in kernel 5.x that this article does not mention. The ulimit settings will not apply to systemd processes and will not apply to root. They mention disabling debugfs, but not binfmt. If I had to pick a more dangerous thing to disable it would be binfmt, which brings in dangerous concepts from Windows 95/98. I need to stop reading this or I will spend all day writing notes on the gotchas.
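
A minimal sketch (Python, not from the guide) of one way to check whether a hidepid option on /proc actually took effect: parse the text of /proc/mounts and look at the options on the /proc entry. The function name `proc_hidepid` and both sample mount lines are made up for illustration; real output varies by kernel and distro.

```python
def proc_hidepid(mounts_text):
    """Return the hidepid value on the /proc mount, or None if it is not set."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == "/proc" and fields[2] == "proc":
            for option in fields[3].split(","):
                if option.startswith("hidepid="):
                    return option.split("=", 1)[1]
            return None
    return None


# Illustrative samples only, not captured from a real machine.
SAMPLE_HARDENED = "proc /proc proc rw,nosuid,nodev,noexec,relatime,hidepid=2 0 0\n"
SAMPLE_DEFAULT = "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"

print(proc_hidepid(SAMPLE_HARDENED))
print(proc_hidepid(SAMPLE_DEFAULT))
```

On a live system you would feed it the real file, e.g. `proc_hidepid(open("/proc/mounts").read())`. If that comes back as None after a reboot, the fstab entry probably did nothing and a remount is needed.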


> The ipv6 module is used for more than ipv6.

Would you tell me why the ipv6 module is used for more than ipv6? If you don't want something running, disabling it at the kernel level makes sense to me, so I am curious as to why that is not the case. If things don't work with it disabled in the kernel, but DO work with it disabled via sysctl, what is actually being disabled?


One example would be other kernel modules that require it as a dependency. So if you disable loading the ipv6 module, you can no longer load any module that references it, even if you would not have been using ipv6 in that module. The first one that pops into mind is the sctp kernel module. This also used to prevent loading the bonding module but I am not sure if that is still the case. You might try testing that.

Newer kernels have the ipv6 code built into the kernel, so the kernel boot option to disable ipv6 won't do anything regardless. The supported way to disable it is sysctl and the networking scripts.


Ahhh, that makes a lot of sense. Thank you for taking the time to answer my question!


To be fair I can't blame the author. The why part probably doesn't exist because he wrote it for himself, or he wrote it for people he knows who would be able to separate out what advice to follow for their use-case.

This is just an assumption on my part, but I feel like he just wrote down a list of what he has followed in the past. He didn't write down the "why" because he wrote it in the context of his own experiences.

Take that systemd portion: it looks very icky, but put the disclaimer in context. You'd want to go with something that has the least amount of attack surface. Systemd is a giant project, and a very useful one at that, and for context I am a firm supporter of systemd, but I can see why he'd put that in the guide, since his disclaimer says the guide isn't really about usability but about minimizing the attack surface.

In turn it got submitted to hackernews by someone who read their article, and unfortunately some people would treat it as gospel and turn off their brains, since as you mentioned there's no "Why?", but it makes sense to not include it if he didn't really write it for the masses but just to document something for himself and his friends/colleagues.


We need to be better at being suspicious about any information, and do our own research and experimentation. But who has the time for that in this day and age, when information is expected to be received by push notification and most people can't comprehend messages longer than an SMS?


Can confirm. So many people asked him to write a hardening guide he just ended up doing it.


A lot of what they talk about also conflates security with anonymity. While it may be good advice for connecting to coffee shop WiFi on your laptop, for a secure network, being able to tell who performed an action is less of a security hazard and more like a security requirement.


The problem I find is that those who know this tend to not spend the time going into detail on what you should actually do.


Before Google was a thing, UNIX security was already something to worry about, hence why Tru64 was created, or HP-UX introduced vaults.


and just before that cavemen invented shift rotation to defend the entrance to their dwelling


Your disclaimer applies to every article ever. I agree 100% but there’s nothing really specific to this article that makes me worry more. It's just common sense.


The recommendation for using LibreSSL is even worse. The fork started off okay, but is now far behind and totally useless.


Microsoft is using LibreSSL in the OpenSSH Windows port - got any more info?


Do you have a source for that?



Do you know of any resource to read up on this?



Maybe Simonjgreen can help improve the article? Seems like a relevant topic to me. Glad people put in the work and share on this subject.


Some omissions IMO: never let people SSH in with a password, and for the love of god, stop leaving private SSH keys on servers. Private SSH keys should never leave your physical proximity. They should be on your laptop, or desktop, or your yubikey, but never on servers. Also SSH agent forwarding is the devil. Don't do it.

Lots of really braindead advice in there too, like disabling RDRAND because there might be a backdoor.. come on.. even Linus knows that's obvious bullshit.[1]

[1] https://nakedsecurity.sophos.com/2013/09/11/rudest-man-in-li...


> never let people SSH in with a password

This seems to be a recent cargo cult rooted in the fact that people tend to choose weak passwords and/or reuse them, but there is technically nothing wrong with it, and the OpenSSH sshd defaults (MaxStartups) make bruteforcing reasonable username/password combinations infeasible.

A private key is just a very long password that is stored on disk, and if there is no password on the key file, it is stored there in plaintext and can be used by every other application (e.g. a browser).


Much of security improvements these days are not just enabling better security, but making them easy and default.

Sure it's possible to: have a strong password, use it only for a single machine, keep it in a good password safe, never reuse it anywhere, have the discipline to not make it short/easy to type/easy to remember, change it every time a server is compromised, etc....

Or you could get all that for free with public keys. Yes a private key is like a very long password, but the important part is that it never leaves the client side and it's safe to use on every computer you want access to without the hassle of remembering one password per server.

If you access many machines, how will you know when one is compromised? How will you prevent an attacker from logging in as you?

Public keys really are the better way to handle user auth, less of a minefield for regular users.


> If you access many machines, how will you know when one is compromised? How will you prevent an attacker from logging in as you?

Even better is to use SSH certificates. That way you don't have to deal with authorized (usually permanent) keys.

Once the SSH CA is installed in the host, the client can generate temporary asymmetric keys and sign them with the CA key before every connection.

There are a few ways to set up this scenario, here's one using Vault:

https://www.vaultproject.io/docs/secrets/ssh/signed-ssh-cert...


Not sure that's any better in security terms. It has some ease of use and central management benefits but also some significant complexity (setup and maintenance of a CA).

My setups just used puppet to manage a authorized key directory on each machine (basically one line of code), assuming you have a working puppet setup of course.

I'd consider either approach significantly more secure than passwords which is a much worse approach.


> Public keys really are the better way to handle user auth, less of a minefield for regular users.

I'll go along with that, though I consider a secret that resides in my brain vastly more secure than one residing on my disk.


I 100% agree. But with ssh (ideally) the passphrase never leaves the computer you are physically touching.

With passwords you are sending it to remote computers where it could be compromised. Thus the standard practice of forcing all users to change their passwords when a server is compromised.


I've come to realize that if I don't use a password at least once per month, I will forget it.

...so, this isn't great advice either. Password vaults are important..... at which point, you can just use a complex unique password.


Okay, so use the 56 bits that you can store in your head as a passphrase to control the 256 bits (for ed25519) on your disk.
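
To put rough numbers on "bits you can store in your head", here is a small Python sketch. It assumes the passphrase is made of words picked uniformly and independently from a fixed list; the function names are made up, and the 7776-word list size is the usual dice-based one (five rolls of a six-sided die per word). Human-chosen passphrases will score lower than this model suggests.

```python
import math


def passphrase_entropy_bits(wordlist_size, word_count):
    """Entropy in bits of word_count words drawn uniformly and independently."""
    return word_count * math.log2(wordlist_size)


def words_needed(wordlist_size, target_bits):
    """Smallest number of words whose entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(wordlist_size))


DICE_LIST_SIZE = 6 ** 5  # 7776 words: five dice rolls select one word

print(round(passphrase_entropy_bits(DICE_LIST_SIZE, 4), 1))
print(words_needed(DICE_LIST_SIZE, 56))
```

Under those assumptions each word is worth a bit under 13 bits, so the 56-bit figure above corresponds to about five random words.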


It’s not only bad passwords you’re battling against but social engineering and poor administrative practices.

I’m comfortable with telling people to turn off password auth on their sshd’s and treat password auth as the exception rather than the rule.

Also, it’s generally a nicer user experience.


There is a very significant distinction - a private key does not (or should not) leave the user's computer. A private key is a very long password that is magically¹ never revealed to anyone during normal use; only its accessibility is proven.

If the user does not verify the remote host's fingerprint (vast majority doesn't), they could be sending the password to an attacker / honeypot or just wrong server / password. And if they reuse passwords, server can record passwords and someone could try them elsewhere.

If you're using passwords, it is very likely the password was copy-pasted to an inappropriate host (or passwords are common to many hosts). It is also very likely the first connection to a server from a given computer was made without verifying the remote host (server fingerprint). Users are likely to try and accept all kinds of things if the connection doesn't work, and I've even seen people disable host key verification so they wouldn't be bothered when a server is reinstalled / an IP is reused.

Local accessibility of private keys is a significant issue, but if a program can read .ssh/..., it can usually also alias ssh=store-password-and-ssh, so using a password doesn't help that much compared with its other issues.

If you're using passwords for yourself, it may be manageable, but I never give other people ssh login access using passwords.

¹ using asymmetric cryptography


> A private key is just a very long password that is stored on disk

Absolutely 100% not. You're missing the most important aspect of PKI based authentication, the asymmetry. With symmetric authentication technologies, like passwords, the only way a user can authenticate is to hand over to the server (which might be malicious) everything that the server needs to impersonate that user.

Practically, this means that when I root the Linux server in your environment, and start up 3snake, I'm going to start collecting the passwords of every person who logs in, and I can use those passwords to escalate privileges around the network. This is not even a little bit possible in an environment where SSH key authentication is enforced.
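
A toy Python sketch of that asymmetry, using textbook RSA with deliberately tiny numbers. This is an illustration only: the key is far too small to be secure, the function names are made up, and real SSH signs data bound to the session rather than a bare number. The point it shows is that the server only ever holds the public half and sees a signature over its own challenge, so a compromised server learns nothing it can replay elsewhere.

```python
# Toy RSA key built from p = 61 and q = 53. Illustration only, not secure.
N = 61 * 53   # public modulus (3233)
E = 17        # public exponent; (N, E) is all the server stores
D = 2753      # private exponent; never leaves the client


def client_sign(challenge):
    """Client proves possession of D by signing the server's challenge."""
    return pow(challenge, D, N)


def server_verify(challenge, signature):
    """Server checks the signature using only the public values N and E."""
    return pow(signature, E, N) == challenge % N


challenge = 1234                      # chosen by the server for this login
signature = client_sign(challenge)    # the only thing sent to the server

print(server_verify(challenge, signature))   # signature matches this challenge
print(server_verify(4321, signature))        # replay against a new challenge
```

With a password, the value sent over the wire is the secret itself. Here the captured signature is only good for the one challenge it was made for.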


That source you shared seems to show that there could be a problem with tainted RDRAND, since it is xored in at the end of the function, and the linked email thread has comments by "maintainer of record (me) for the Linux RNG" who "quit the project about two years ago precisely because Linus decided to include a patch from Intel to allow their unauditable RdRand to bypass the entropy pool over my strenuous objections.

From a quick skim of current sources, much of that has recently been rolled back"

At the very least, that looks like there might be something in it. I don't know crypto well enough to say for sure, but it looks like Linus needs to re-read rand.h from what that article says about the state of the comments and previous rolled back changes from Intel. It does make one wonder if Linus' arrogance and rudeness could be used as cover for a genuine backdoor.


It is xor'd because all entropy sources are xor'd. That's just the design of /dev/random.

And because of this, Intel or whoever would need to have weakened each and every entropy source you use to make anything of it.


That's not true, because the RDRAND entropy is mixed in last. So once you're under the assumption that RDRAND is nefarious, the microcode only needs to detect the rdrand-to-xor pattern to make the entire entropy pool predictable (for example: by setting the non-rdrand input to the xor operation to zero it could disable all other entropy sources).


A microcode backdoor capable of reading the existing entropy pool state is going to be a hell of a lot more powerful than a RDRAND backdoor, to the point of making a RDRAND backdoor worthless.


> braindead advice in there too, like disabling RDRAND because there might be a backdoor

Two sources of entropy xor'd will always be as strong as the best source. Done properly it's perfectly fine mixing rng sources you don't trust.


It is also important that the entropy sources are independent. For example, if RDRAND were defined in terms of /dev/xrandom, say by just flipping its bits, then doing

(secret bits) xor (bits from /dev/xrandom) xor (bits from RDRAND)

is very weak encryption. But of course, there is no reason to believe RDRAND is dependent on any of the usual random sources.


> (secret bits) xor (bits from /dev/xrandom) xor (bits from RDRAND)

> is very weak encryption.

It's as strong as the strongest source. This is easily proven mathematically.


The mathematical result I think you are referring to is: any uniformly distributed random variable xored with any other independent random variable yields a uniformly distributed random variable. On the other hand, if we take the strongest random source in the universe, say X, and xor X with itself we get:

X xor X = 0

The result is just a sequence of zeros, which is not "as strong as the strongest source". In practice, people may take lots of different "random" sources as input to their random generators to make them "more" random. But if you don't take care to check if they are independent, you may have a problem.
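
Both halves of this can be checked exactly on a small example. The Python sketch below uses 3-bit values and exact fractions; the helper name `xor_distribution` and the "biased" source are invented for illustration. It models independence by multiplying probabilities, and models the dependent case as a source that simply copies X.

```python
from fractions import Fraction
from itertools import product


def xor_distribution(px, py):
    """Exact distribution of X ^ Y for independent X ~ px and Y ~ py."""
    result = {}
    for (x, p), (y, q) in product(px.items(), py.items()):
        result[x ^ y] = result.get(x ^ y, Fraction(0)) + p * q
    return result


BITS = 3
uniform = {value: Fraction(1, 2 ** BITS) for value in range(2 ** BITS)}

# A deliberately terrible, untrusted source: it emits 0b101 nine times in ten.
biased = {0b101: Fraction(9, 10), 0b010: Fraction(1, 10)}

# Independent case: the bad source cannot spoil the uniform one.
independent = xor_distribution(uniform, biased)

# Dependent case: the second source is a copy of X, so X ^ X is always 0.
dependent = {}
for x, p in uniform.items():
    dependent[x ^ x] = dependent.get(x ^ x, Fraction(0)) + p

print(independent == uniform)
print(dependent)
```

So "as strong as the strongest source" holds for the independent case and collapses completely once one input is a function of the other.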


On the contrary, the above computes exactly the identity cipher: your "ciphertext" is your plaintext.

Mathematical results will lead you to surprising conclusions if you apply them in the wrong context!


Why do you think this is a common problem? I've never heard of anyone leaving private keys on the server. Usually the ssh-keygen is run locally on their machine and the public key is deployed on the server.


Oh, I've seen it plenty of times: when users want to SSH from server A to server B, they copy their private key from their workstation to server A.


A natural consequence of server admins disabling ssh agent forwarding.

All these “security truisms” tend to ignore the real world. When you remove features, people will work around it in a less secure way.


Oh boy... I need to step out into the wilderness more :-D


A result of people not understanding how public/private keys work. They just want it to work, and copy everything over to get what they want done.

These same people probably also joke about the marketing folks down the hall leaving post-it notes with their password on their monitor.

Security is the first casualty in the battle for convenience.


Urk. Wow.

The bigger issue is "Don't copy a private key. Ever. Period."

How would you even do this? Don't the various private key containers make this quite difficult?


It is just plain text in linux. Nothing hard to copy about it


What if you need to git ssh clone? Can you use your private key locally while you are currently connected to a server?

Genuine question, this is why I have a private key on the server in our office so I can clone our private repos.

Can you use your local private key then?


Yes, you can pretty easily use "SSH agent forwarding", which grand-parent comment calls "the devil", but really it's not that bad when used selectively, and a lot better than copying the private key to the server.


If you're cloning a specific application, the server could have its own key with read-only access to the repo. But if you're using that for deployment, you could (in AWS) grant the instance access to specific S3 paths and pass a specific artefact URL to be downloaded. (Or something similar for other environments)

But maybe you've got a different use case? I'd start with "why are you using git directly on a (production?) server?"


I'm gonna out myself because nobody is giving reasons

Why is it bad to git clone on a production server? Let's assume I'm not an idiot please, just misguided. This isn't reddit.

Let's also assume I'm using an ssh key that is only used for the purposes of pulling the repo, no write access, never reused (deploy keys as the git* services call them)

It's really hard to see these tips as anything but parroting someone else if nobody throws down a reason


I don't think you're outing yourself exactly; this is a scale-dependent question. People will give different answers because their experience is coloured by the size of the systems and organisations they have experience with.

At small scale it's fine. Plenty of useful and valuable stuff runs on a handful of servers that have consistent identities etc and people manage them interactively. Process lives in people's heads or (better) a wiki.

At large scale it's a completely bonkers thing to do. There are tons of nodes; it doesn't make sense to mutate just one unless something has gone badly wrong. Interactive login to a production system should be triggering an incident or at least linked to one. Really you shouldn't be mutating systems in any kind of "manual" way because that kind of change is supposed to be locked down to authorised deployment methods. The current state of the system has to be legible to other team members and other teams, and the easiest way to do that is by looking at what's gone through the deployment process, which is usually navigable via a dashboard.

In the middle, it's possible you could have a deployment system that relies on "git clone" with a key on the instance. That would be a little weird because git is not a great way to store or distribute build artefacts. Not crazy though - could make sense in some situations.


So you already protected most of it - no write access and scoped keys fix most of the issues.

Other potential problems:

- bad checkout location may mean unexpected content is available via .git paths on the web

- anyone with access to the server can copy the key and have external access to both the history of the project and all new commits - they can see the PRs with proposed security fixes before they get merged

- repository may contain domain names, credentials and other things which don't need to be deployed, but can be useful for the attacker doing recon

- potentially exposing information about customers if they got mentioned in the history

It's not terrible to use git directly. There are just ways you can deploy a little bit better if it's worth your time investment.


> Why is it bad to git clone on a production server? Let's assume I'm not an idiot please, just misguided. This isn't reddit.

This cargocult ritual is for those who manage farms of identical servers. If you have a single production server with no load balancing, it's fine.

But be careful, Git does not handle file permissions well by default. If you have a directory or file that needs special permissions (like a cron job script) then you should set `core.fileMode` to `false` in your git config.


> This cargocult ritual is for those who manage farms of identical servers.

Then it's not cargo-culting, since for farms of identical servers manually checking out a repo is going to cause you pain.


It _is_ cargo-culting when publications begin to recommend it as standard practice, without explaining the reasoning. Because at that point either the publisher is doing it without understanding why, or the reader will follow the ritual without understanding why.


If you are running a modern, proper devops environment, you shouldn't be git cloning anything on a production server. Nor should production servers use the same SSH keys for that (via whatever production login you configured to reach out externally) as the keys a sysadmin's local dev machine uses to SSH into the production server. And if you're running anything significant in production, that access hopefully goes through a proper IAM account or service account, with a command line tool authing you to a certain level of underlying VM/node access to production services in that RBAC role.

This is open to hot debate, but whatever the most secure production deployment environment is, production environments should be built as layered images, with all required installs, kept in a private company image repo. The company VPC deploys through a build server that pushes image tags to that private repo, and an ingress pulls them down only after running things like Clair, so that at the very least you have the latest library versions with the most recent patches on stable. You can then deploy replicas across many servers/nodes through an intermediary deployment architecture, whether k8s or Nomad (which also allows for VMs, not just containers), in which you can standardize and tightly specify seccomp profiles, AppArmor, and any custom kernel modules or what have you.

Anyone using the same server to git clone (messing around in a private development box) to deploy production services would be a no go for me.

Any gitclone should be automated from a jenkins or similar build where org repo keys are authed before pushing to a private company repo (which also required with)

The only thing more annoying than the comments scoffing at how basic the advice is, is the reflection of how basic the production deployment techniques of the scoffers are. Nothing in production should be so non-standardized as git cloning on a prod server ad hoc.

Ssh keys on a server seem an irrelevant situation to me in any production scale server I've worked in.


You'd be surprised. I worked at a startup and our entire deployment system was based on "git pulls" onto each production node, recompiling binaries, etc. Yes, it was crazy and also slow.


If you are able to use https instead of ssh, I find "deploy keys" quite handy for this scenario. Gitlab/Github provide these as essentially a temporary https password for selective read-only access to a repo or group of repos with an expiry date. The main downside I find is incompatibility with anything that needs dependencies and expects git over ssh... but then I try to avoid creating the scenario where repos must be built or assembled on each server.

What I would really prefer is to be able to use git over SSH with U2F, i.e. a hardware key, in place of a private SSH key. This should work the same from the server as from the client. U2F is already in OpenSSH, but I am not sure how long it will take before it is commonly available and added to git hosting services... I'm also not sure if the protocol will work through the terminal to an OpenSSH client on a server.

I already use hardware keys for OTP with SSH (yubico pam) which makes doing ssh between servers secure and easy without private keys, and without client software compatibility issues since it's just a keyboard-interactive mode as far as SSH is concerned... In fact if you are also hosting your own git service you could use this for git cloning over ssh right now and not bother waiting for U2F.


> Also SSH agent forwarding is the devil. Don't do it.

Care to elaborate? First time I hear this.


Others already explained why. Matrix.org's postmortem includes a real-world exploitation: https://matrix.org/blog/2019/05/08/post-mortem-and-remediati...


If you forward your SSH agent, anyone with privileges to your sockets can use the agent. So for example when you log into some production machine, a process waiting on that machine connects to your agent and has access to... everything. (If your agent caches the key password you won't even get notified.)


Assuming your local agent silently signs all requests. This attack doesn't work if e.g. your Yubikey is configured to require a physical tap for every authentication.


Using ssh-add -c when adding keys means it asks for user confirmation (but can still cache the passphrase) each time key use is requested. This can mitigate some of the risk here in the absence of other solutions.


This is a good theory and I mostly agree but if you disallow password logins, private keys on servers and agent forwarding - how do you copy things server to server without going through your slow and possibly remote local connection?


Magic wormhole and friends are pretty good at doing this securely, especially between servers that can talk to each other without relays.

See also croc


python3 -m http.server --bind 0.0.0.0 8080 is my favorite command


If we're talking about security, then exposing your files to anything with network access to your server sounds like a terrible, terrible idea.


I 100% endorse this method of securely moving files way way way above using things like agent forwarding and throwing keys on servers.

It's shocking to me how people overlook HTTP as a tool for passing files around.


You say that having private keys on servers is bad, and you say that using SSH agent forwarding "is the devil."

How, exactly, do you expect people to do things like use git or pivot from Server A to Server B over SSH?


HTTP is a thing.

Yeah, when pushing up files from a client to a server, or pulling down files, SSH makes sense, but for server to server, there's no need to use something with a full control channel for a simple file transfer.

As to your question on how to use jump hosts, that's what -J is for.

ssh -J ServerA ServerB


How would you automate a job where a server uploads something, say a backup, to another server, without a private key?


Standalone private key for a separate account that has write-only access to the backup area and little else?

In this case the key would act more like an access token for a specific operation that you're trying to perform, rather than as a key to a generally usable account.


Yes, that makes sense. I think grandparent could have been more precise. I assume he meant _personal_ private keys.


I think he's talking about

    $ ssh-keygen -t rsa
    $ mv .ssh/id_rsa.pub .ssh/authorized_keys
    $ cat .ssh/id_rsa
    $ exit
    logout


In theory, I know Genode and other capability-based systems could answer this question... but not in practice. I'm off on that tangent seeking answers. Thanks for the "hacking prompt".


Use HTTP, there's absolutely no reason to be using SSH if you're just transferring files around.


> Never let people SSH in with a password...

Why not add libcrack (constraints) and auto-expiration of passwords instead of locking people out completely?

Also adding fail2ban is a very effective solution to prevent brute forcing.

> and for the love of god, stop leaving private SSH keys on servers

Why not? Unencrypted private keys are dangerous, OK, but I fail to see the problem with a good password-protected private key?


So maybe if you add constraints and force rotation you can eventually get a secure-enough system with passwords. You'll still be using a symmetric key rather than asymmetric, which means that compromises can be made worse (I get root on one server -> I get user passwords as they log in -> I have access to everything that they have access to now), and you've invested a lot of effort to secure it rather than just `PasswordAuthentication no`, `ChallengeResponseAuthentication no`, done.
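
As a rough way to confirm those two settings are really in effect, here is a Python sketch that reads sshd_config text and reports the global values. It relies on sshd using the first value it obtains for each keyword and treating keywords case-insensitively. Everything else is a simplifying assumption: the function names are invented, Match blocks and Include files are ignored, the `keyword=value` form is not handled, and an unset option is treated as "yes".

```python
def global_sshd_options(config_text):
    """Return {keyword: value} for the global section of an sshd_config."""
    options = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            break  # conditional blocks follow; this sketch ignores them
        value = parts[1].strip() if len(parts) > 1 else ""
        # sshd uses the first value it obtains for a keyword, so keep that one.
        options.setdefault(keyword, value)
    return options


def password_logins_disabled(config_text):
    """True only if both password-style options are explicitly set to 'no'."""
    options = global_sshd_options(config_text)
    required = ("passwordauthentication", "challengeresponseauthentication")
    # Assumption: an unset option counts as "yes", the cautious reading.
    return all(options.get(name, "yes").lower() == "no" for name in required)


# Illustrative config text, not taken from a real host.
EXAMPLE = """\
# hardened settings
PasswordAuthentication no
ChallengeResponseAuthentication no
PasswordAuthentication yes
"""

print(password_logins_disabled(EXAMPLE))
```

The later `PasswordAuthentication yes` in the example is ignored because the first value wins, which is also why a stray earlier line in a real config can silently undo a hardening change appended at the bottom.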


Security is not a monolithic item with a single best practice. A horses for courses approach is better IMHO.

If you're exposing SSH to outer world, minimizing attack surface by removing users and/or passwords is good. OTOH, if you're not exposing these services outside, or putting them behind a VPN maybe with people connecting from unsecured locations, adding a password is always a good idea in my book.

A laptop may be stolen, a workstation may be breached if someone is not careful, and losing a key creates bigger consequences.

So maybe instead of a tug of war about which method is the silver bullet, we should discuss which one is better under which circumstances.


> Why not add libcrack (constraints)

If the user has a super secure password shared with a different, compromised service, libcrack will not detect that.

> and auto-expiration of passwords

Expiry results in passwords like: (prefix)Dec2020, (prefix)5, or cycling the last 2/3 entries. Keys can be relatively easily revoked and are guaranteed not guessable.


> If the user has a super secure password shared with a different, compromised service, libcrack will not detect that.

There's a module[0] for that (TM).

> Expiry results in passwords like: (prefix)Dec2020, (prefix)5

libcrack can enforce similarity and rotation checks too [1].

> or cycling the last 2/3 entries.

There's also another module[2] just for that.

> Keys can be relatively easily revoked and are guaranteed not guessable.

You're right but, a good password policy and infrastructure is not a simple straw man which can be set alight with a simple match. PAM can be a bit hard to understand but, once understood, it's pretty easy to create complex rules and flows.

If you get your workstation compromised somehow, you can always lose your keys. Then you need another password on top of your key to keep it encrypted.

At the end of the day, a good security policy is required. Keys, passwords, fingerprints and other identifiers are just tools. If you can design your defenses and moats well, you can secure your system with any method you want.

[0]: https://github.com/skx/pam_pwnd

[1]: http://www.linux-pam.org/Linux-PAM-html/sag-pam_cracklib.htm...

[2]: https://linux.die.net/man/8/pam_pwhistory
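
I haven't checked how pam_pwnd does it internally, but a breach lookup of this kind can be sketched with the k-anonymity range scheme of the Pwned Passwords API: only the first five hex characters of the SHA-1 are sent, and the response lists `SUFFIX:COUNT` lines to match locally. The function names are illustrative and the network request itself is left out:

```python
import hashlib


def split_sha1(password):
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest.

    Only the 5-character prefix would be sent to the range endpoint.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix, range_response):
    """Look up our suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0
```

The server never sees the password or its full hash, only the prefix.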


> libcrack can enforce similarity and rotation checks too [1].

How can it do that without the server storing plain text passwords?


Unless you're root, you enter your current password first for verification already?


Ah, ok, that makes a lot more sense. I was worried that the password history was stored somewhere.

> Unless you're root

I'm not sure I get this part, why does being root change things?


Normally PAM doesn’t ask for your current password during password changes if you’re root.

Also, root can change any user’s password without entering the current one.


No SSH keys, passwords, etc. Use Kerberos baby, SSO in!

SSH and Kerberos have been around for decades, but few companies have it implemented.

It completely removes the need for SSH keys and their related management, user/group management, etc., which is a huge security and administration win.


How do you do key exchange if there is no password login over SSH? (Here the helpdesk prints random passwords.)


You can bootstrap key exchange from a short-lived password. Not perfect but at least limits the attack time window.

Companies serious about security just have a trusted person hand out hardware with signed keys on it.


So is there a backup of the key on the hardware somewhere (in case it gets lost or destroyed)?


Once you have exchanged the keys nothing stops you from creating new ones. You can for example register an additional backup Yubikey.


Small scale: You generate a keypair and give the public key to whoever is setting up your account on the server.

Large scale: You generate a keypair and give the public key to Vault or whatever, which signs it with the CA that all servers know to trust.


> RDRAND

Did you read your own link?


This list is cool, but I think it would be more useful with some quick notes on side effects. I think all but the most hardcore Linux lovers won't have enough breadth and depth to understand the full ramifications of many of these config changes.

I'm no Linux expert but even having used it for a fair number of years, I only have a shallow understanding of most of the things that are tweaked here, if I've even heard of them at all. For example, it says "kernel.kexec_load_disabled=1", and points out the threat which I can understand, but I don't know what that's used for in the first place. Perhaps it isn't really ever used for anything, but it'd be nice to be told that, because my initial line of thinking is that these defaults must be defaults for a reason and that makes me hesitant to actually make any of the changes.


Most of the ideas are good but I think you need a pretty big team to sustain your systems if you're flipping all the security switches to non-default values.

It would be fine as a one-time activity but it creates an ongoing stream of work to deal with breakages, particularly during upgrades.

Your OS ends up in a configuration that no upstream devs have tested compatibility for and you end up shouldering that for each component of your workload.

The recommended security options (given you have opted to tweak all the things) change over time as well and that creates additional work.

I think for many (most?) teams choosing a minimalist OS where an upstream does this kind of maintenance (say COOS or BottleRocket in container world) will produce better real-world results.


Or just use FreeBSD if you really care about secure by default.


I did this, and 1 month later got bad emails about shutting down the server because of ntp vulnerability. Maybe it was just a new vulnerability found at that time, but then this brings me to the missing item of the text: how to keep up? What pages tell you about new found vulnerabilities, so you can mitigate.



Thanks. I am always puzzled why linux on server is more popular than BSD.


According to the author:

> Despite popular belief, OpenBSD's security is actually lacking in a lot of ways.

https://madaidans-insecurities.github.io/openbsd.html

It's probably not better on FreeBSD.


Isn't that OpenBSD?


The only thing OpenBSD secures you from is good performance


Lol. Yes I have observed that there is a massive performance hit.

But OpenBSD is packaged for, shipped for and boots nicely on obscure hardware. So I am still around.



This guide is concerning for the following reasons:

* blithely states that systemd is not secure, uses lines of code in an init system as a security measure and makes unsubstantiated claims about it.

* recommends musl and misleadingly states the number of CVEs as some sort of security metric, completely overlooking that glibc was created in 1987, as opposed to musl, which was released in 2011.

* ignores the fact that a lot of effort has gone into hardening openssl, and seems to think supporting OS/2 and VMS equates to bad code security

* seems to misunderstand the purpose of long-term support kernels, which rather ironically contradicts their first point about frozen updates... The author should do some basic reading here, which directly contradicts the statement that there are only two kinds of kernel releases, stable and LTS (there are actually four categories of releases, and they don’t necessarily happen for security reasons): https://www.kernel.org/category/releases.html

I am still reading through it, there are interesting points made but I’m definitely taking it with a grain of salt given the above!


> Linux is not a secure operating system. [Click link for an explanation by the same author]

> There is no strong sandboxing in the standard Linux desktop. [...] This is in contrast to other desktop operating systems such as macOS or Windows 10 [...]. Windows automatically sandboxes UWP applications and provides the Windows Sandbox utility for non-UWP applications.

Is this actually fair? Windows Sandbox is not that different from using docker, or even better a VM, is it? It's great that UWP apps are sandboxed, but they're a minority of the apps people use.

I don't get how Windows gets a point over Linux here. Running all or most apps in a sandbox without the user even noticing improves security, but neither OS really does that effectively. Windows Sandbox is cool, and Linux doesn't have something like that out of the box, but a VM is easy to install and create, no?


I saw that page too, and the author seems to, uhm, have very strong opinions.

The author has this to say about the Linux kernel:

> The kernel is also very lacking in security. It is a monolithic kernel, meaning it contains a colossal amount of code all within the most privileged part of the operating system. The kernel has huge attack surface and is constantly adding new and dangerous features.

And then goes on about how insecure eBPF and user namespaces are, before concluding with this:

> Other kernels such as the Windows and macOS kernels are somewhat similar too in that they are also large and bloated monolithic kernels with huge attack surface but they at least realise that these issues exist and take further steps to mitigate them.

This person is clearly bending over backwards in his quest to prove how insecure Linux is compared to the competition. Some of his points are true, but there are so many glaring inaccuracies that I don't even know where to begin. For instance, many links in that article are bogus. Multiple links that are supposed to showcase vulnerabilities in eBPF and user namespaces are just someone ranting about these features rather than actual bug reports. Links to a Debian commit disabling user namespaces are explained as a direct result of security bugs, when it's clear from the code comments that it was disabled only as a preventative measure because it was deemed premature by the maintainers. Actual eBPF vulnerabilities linked in the article require eBPF JIT, a feature disabled by default but not stated as such. And so much more...


> It's great that UWP apps are sandboxed, but they're a minority of the apps people use.

It's not just sandboxing of UWP apps and Windows Sandbox. Recent Windows 10 versions have a feature called 'Controlled folder access' where you can designate certain folders as being protected (besides the Windows system folders). If any non-whitelisted application attempts to access these folders, you get a notification and its access is blocked by default. You can then whitelist the application if you want to grant it access.

It's more coarse-grained than fully sandboxed applications, but it's a nice extra layer of defense for applications that do not use sandboxing by default with a portal-based mechanism to request access to certain files/folders.

There is more information here:

https://docs.microsoft.com/en-us/windows/security/threat-pro...

(This should be fairly easy to implement in Linux with a MAC framework like AppArmor/SELinux or perhaps eBPF. But it's not useful for most people until someone actually implements this in a user-friendly manner.)


Which are some of the ways that Android deviates from GNU/Linux.

SELinux, seccomp, eBPF, FORTIFY_SOURCE, HSAN,.... all enabled.


Indeed. Unfortunately it seems nobody in the traditional Linux ecosystem outside Red Hat (and perhaps Canonical's Snap) is really interested in meaningfully improving UI application sandboxing.

Flatpak is often met with outright hostility and then some ranting about how OpenBSD's pledge already solves all problems. Sure, Flatpak is not perfect yet and too many Flatpaks still have to use filesystem=home (which is usually caused by the application or toolkit not being friendly to sandboxing). But the Flatpak folks at least try to gradually improve security.

Disclaimer: I mostly run Linux on the desktop. But application security is definitely a blind spot for the Linux community. Many people still seem to think that UID 0 is the ultimate goal. We are mostly protected by the fact that Linux has ~1% of the desktop and is therefore not an interesting target yet.


I am fully with you, although I lost hope on the Desktop GNU/Linux, so just end up using it on a surviving travel netbook.

However I do use Ubuntu with AppArmor, Snap and whatever else that Canonical is trying to do to improve the situation.


Device Guard and some other Defender features I'm forgetting the name of are pretty cool, I think they're even in 10 Home now. I don't think all Store apps are sandboxed now, either.

I don't think it's fair to compare Linux without AppArmor or SELinux, really. Someone more versed in both may be able to provide more info. If I was as worried as the author and had a risk model to back it up I'd probably run QUBES or OpenBSD or both.


Yeah, he lost me when he compared Linux file access to MS-DOS. Really now? Even 1995 Linux was nothing like MS-DOS. Sure, POSIX is nothing to be admired. But it had some security for fucks sake. Users generally can't see other users' files without an exploit. Whereas MS-DOS... doesn't have users. Which... I mean, what the fuck are we even talking about now?


IMO UWP sandboxing is not useful. I don’t have a single UWP application and I use my PC for a lot of tasks.


Which is why Win32 sandboxing is part of the Windows 10X feature set.


Windows sandboxing can also be applied to Win32, when delivered via MSIX packages, and since Reunion introduction one of the goals is to merge both sandbox models, as per Windows 10X roadmap.


While I did not read this properly yet, it seems like a good primer.

There is also a great set of ansible playbooks and roles that should cover this and more that is a good base for Linux servers: https://github.com/dev-sec/ansible-collection-hardening


I’m ehhh about the musl suggestion. I spent way too long finding problems that were related to musl.


When I found problems apparently relating to musl, it unfortunately mostly (like around 98%) ended up being an application relying on GNUisms instead of an actual musl bug.


That's not necessarily a bug in the application - only supporting glibc in an application is an absolutely sensible option.


I do get it when the software is only targeted at Linux* - but I encountered this with applications that are supposed to be cross-platform!

* GNU/Linux


> If possible, your timezone should be set to "UTC" and your locale and keymap to "US".

Is there any special reason why other values would be less secure?


> This guide is focused purely on security and privacy

Maybe for privacy to make you stand out less?


But then he does seem to go out of his way to customise everything else. Making his machine a unique, finger-printable, snowflake...


IP will leak your location. Using non-default time zone for this location means less privacy.


I've got mixed feelings about this page.

1. It does list some completely valid options which are good to know. All of them are interesting both from the perspective of learning about more Linux internals and actually locking down access where needed.

2. It mixes security and privacy. Some privacy things may be interesting, but going as far as removing your machine-id? What's the scenario here exactly?

3. It tells you what you can do, but often doesn't say why. What's the threat model? It bounces between things that could be useful for the desktop and for the server. If you're protecting yourself at home, what exactly does blocking ICMP provide you, given it's likely both a flat network and you're registered in both upnp and mdns?

4. Some points I find really questionable: It says to avoid distros with systemd, however systemd was the first one to really bring service sandboxing to the masses. So many issues could be avoided if we used PrivateTmp years ago. Some points are really bad: Avoiding distros that freeze packages (I guess vs rolling distros) is not a trivial change and is not obviously more secure.

5. https://xkcd.com/1200/ - Sure you can put all of those extra options in kernel boot, the extra layers of service separation, spend time hiding identifiers and network options. Unless you're specifically targeted, nobody will ever try that. You'll be owned by some XSS which pulls your login cookie - and the list doesn't even mention Firefox tab containers which can separate that content. Or if someone's targeting developers, your SSH key will get extracted - and the post doesn't even mention hardware SSH keys.

Overall it reminds me of CIS policies. "Here's a CIS certified docker image. It has aide and tcpwrappers on it, because security."


I'm glad that this called out systemd in the very first section, because it told me that the rest of it probably wasn't going to actually talk about anything actually meaningful for security.

Edit: Oh god I hadn't even gotten to the end of that same section where it recommends Gentoo???? This article is written for people who read Cryptonomicon and took it a little too seriously.


Don't see anything wrong with Gentoo in their context.

The author says in the disclaimer: "This guide is focused purely on security and privacy, not performance, usability, or anything else." He's not wrong. Gentoo makes it easier to set compile flags while building your system. Say you want to disable pulseaudio support completely? You can get rid of it in anything that might link to it by setting it globally as a flag you want to avoid.

Sure the guide doesn't follow a threat model, but there's still some good advice in there. If someone follows the guide as dogmatic gospel, as a list of rules to follow at all cost, that's on them. If one is responsible for securing down their stack, maybe they should know better than following everything down to the bone as if it's some gospel.


Would you honestly say systemd is the most secure init system when writing your own "Linux Hardening Guide"?

That's a big call to make in such a context.


Yeah probably. Its support for easily configuring daemons to run under dynamic users, with private tmp, lesser capabilities, read only root, etc etc makes it quite attractive.
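
A rough sketch of what that looks like, assuming a drop-in file is generated from a small table of options. The directives are real systemd ones that map to the features listed above; the chosen values and the helper name are only illustrative:

```python
# Hardening directives for a [Service] section. An empty
# CapabilityBoundingSet= clears every capability for the service.
HARDENING = {
    "DynamicUser": "yes",        # run under a transient, unprivileged user
    "PrivateTmp": "yes",         # private /tmp and /var/tmp
    "ProtectSystem": "strict",   # mount the file system hierarchy read-only
    "NoNewPrivileges": "yes",    # setuid binaries cannot gain privileges
    "CapabilityBoundingSet": "",
}


def render_dropin(options):
    """Render the options as the text of a systemd drop-in file."""
    lines = ["[Service]"]
    lines += [f"{key}={value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


print(render_dropin(HARDENING), end="")
```

The output would go into a `.conf` file under the unit's `.service.d/` directory. Whether a given daemon still works with all of these set is something to test per service.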

Even more attractive if you’re writing a blog post about hardening Linux against ill-specified threats. Writing a bunch of config files to “lock down” random daemons seems like it’d be right up this guy’s alley.


Would you say Gentoo is the most secure distribution to use when writing yours?


No, I wouldn't. Qubes likely has the most security features out of the box without configuration for the average person trying to lock down their computer. BSD and Arch-flavours certainly have plenty going for them though and I wouldn't speak badly of their intent+outcomes.


systemd has a strong track record of finding and fixing security issues and getting the CVE published. This is important if you want to consider "secure software" and getting things resolved in a timely manner.

People focus on the attack surface, but most of it is local exploits where the threat model is "someone has shell on your box".


Are you saying that you can’t harden systemd and reduce its attack surface?


whats wrong with Gentoo, my guy ?


Absolutely nothing at all, but it is not a 'more secure choice' than an LTS distribution of a mainline distro and while this guide can't really decide if it's targeting local or server usecases, I can't imagine a worse existence than being responsible for a fleet of servers running Gentoo.


While I agree that Gentoo is probably not significantly more secure than most LTS distros,

> I can't imagine a worse existence than being responsible for a fleet of servers running Gentoo.

You lack imagination:) NetBSD, AIX, LFS, Arch Linux...


Systemd is evil. What alternatives are there which were designed with security and speed in mind? I would prefer something simple instead of a lot of features.


Inanimate things without preprogrammed social behaviours are neither good nor evil. There are lots of alternative inits and you can use all the same namespacing tricks yourself with unshare and ip. But you'll have to maintain them for each service yourself and that's A LOT of work.


Some of the recommended steps like removing machine-id to prevent fingerprinting are especially funny when the rest of the article perfectly describes how to get your system in a completely uniquely fingerprintable state via obscure knobs most people don't touch.

As you've mentioned, it's a hardening "guide" without specifying a threat model, so its value is very limited anyway.


CIS policies are a regulatory and government requirement for many industries. How does one combat these wasteful hardening activities?

https://aws.amazon.com/marketplace/pp/Center-for-Internet-Se...


There's a saying when you walk the Camino De Santiago that you shouldn't worry about your next meal or bed because the Camino will provide.

I notice it is also true about HN.

Yesterday I was researching about Linux security and was disappointed by an eBook on the subject. Today HN has provided :)

Thanks HN, see you next year.


Did I miss it? Such an important option was not mentioned:

NoNewPrivileges=yes

"A new system.conf setting NoNewPrivileges= is now available which may be used to turn off acquisition of new privileges system-wide (i.e. set Linux' PR_SET_NO_NEW_PRIVS for PID 1 itself, and thus also for all its children). Note that turning this option on means setuid binaries and file system capabilities lose their special powers. While turning on this option is a big step towards a more secure system, doing so is likely to break numerous pre-existing UNIX tools, in particular su and sudo."

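The quoted setting applies the flag to PID 1. As a smaller illustration, the same kernel flag can be set for a single process through `prctl(2)`. This is a Linux-only sketch using ctypes; the constants are the values from `<linux/prctl.h>` and the function name is made up here:

```python
import ctypes
import os

PR_SET_NO_NEW_PRIVS = 38  # from <linux/prctl.h>
PR_GET_NO_NEW_PRIVS = 39


def set_no_new_privs():
    """Set no_new_privs on the calling process and return the flag value.

    The flag is inherited by children and cannot be cleared again.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    zero = ctypes.c_ulong(0)
    if libc.prctl(PR_SET_NO_NEW_PRIVS, ctypes.c_ulong(1), zero, zero, zero) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    # Read the flag back to confirm it is now set.
    return libc.prctl(PR_GET_NO_NEW_PRIVS, zero, zero, zero, zero)
```

Anything exec'd after this call, such as `sudo`, can no longer gain privileges through setuid bits or file capabilities, which is the breakage the release note warns about.
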

there is no threat model described here afaics. seems pretty shallow imho


Right... but then if the threat is remote access, the author would have to remove nonsense like systemd making your box insecure (racy init shellscripts being the pinnacle of security), and old platform support that you don't build in openssl being "abhorrent security practices".


Exactly this. If your threat model needs you to implement all of this then also use TEMPEST hardware to really lock things down. No use in hardening everything when an attacker can read your screen from a safe distance with some SDR-gear and a suitable antenna.


I think everyone with valid criticisms of this should file a GitHub issue. I'm definitely planning to, because of these things:

- Lots of discussion on X11 security issues without any mention of wayland

- Not on the Linux page, but they recommend iOS as a secure OS, which is total bullshit given how many failures we've seen with serious bugs/vulns put into production. I can't even remember how many times I've read about bugs in Safari, Whatsapp or some other app that can be chained to get kernel-level privileges. Remember the Jeff Bezos hack?

- No discussion of threat models

- Focusing on academic/technical arguments and not looking at real-world malware ecosystems/exploits (or: why there is orders of magnitude more malware for Windows than Linux)

- Memory safe languages - Linux is totally exploring a way to use rust for parts of the kernel, and Windows is still probably 99% C/C++

I'm all the more confused by this guy since he's a whonix developer, this almost sounds like a Microsoft employee based on how little scrutiny he applies to Windows...

https://github.com/madaidans-insecurities/madaidans-insecuri...


> Configure a cron job or init script to update your system daily.

He recommends daily automated updates. This is quite dangerous from my personal experience in various distributions. An update might break your system or might put it into a vulnerable state. I very much prefer regular manual updates, as it allows me to read release notes, bug reports and manual user intervention instructions before update.


> I very much prefer regular manual updates, as it allows me to read release notes, bug reports and manual user intervention instructions before update.

Do you have the time for this though?

> 198 packages need to be updated


But, not having the time to review 198 updates shouldn’t justify automatically permitting the updates.

I agree there isn’t time to review them, unless you have plenty of staff and time (no one does)... but the recent Solarwinds debacle was a supply chain attack, and automatic updates allowed an exploit to be propagated to many companies and agencies.

I would ask myself if I need that many packages. And raise the red flag with management that maintaining proper security is a challenge under such circumstances.


How would reading release notes help in the case of the Solarwinds attack?


Sure, for most (including myself) it's impossible to review every little update. But the bug tracker of your distro is a good indicator if some of the most recent updates created a bunch of new tickets, especially those that are considered to be critical. In case of Arch Linux it is strongly recommended to read the latest update news on archlinux.org, before update.

In addition to that, you get some experience over time of which package updates caused the most trouble in the past, and you can take additional care. For example, I know that on my desktop the combo of displaylink and xorg does break from time to time with updates, therefore I carefully look out for issues regarding xorg updates for displaylink users before I do the xorg update.


Your service should be able to cope with a failure. Just ensure your updates don’t run at the same time on all your services.

I have two proxy servers which auto update. One in the morning, one in the evening. If the morning update breaks, it flags up the problem, and we can then fix before the evening one updates.
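
A small sketch of that staggering idea, with hypothetical host names and update hours: redundant hosts get different slots, and a later host only updates if the hosts scheduled before it are still healthy.

```python
def assign_update_hours(hosts, hours=(6, 18)):
    """Spread hosts round-robin across update hours (24h clock)."""
    return {host: hours[i % len(hours)] for i, host in enumerate(sorted(hosts))}


def may_update(host, schedule, healthy):
    """Allow an update only if every host scheduled earlier is healthy.

    `healthy` maps host name to the latest health-check result; a missing
    entry counts as unhealthy.
    """
    earlier = [h for h, hour in schedule.items() if hour < schedule[host]]
    return all(healthy.get(h, False) for h in earlier)


schedule = assign_update_hours(["proxy-b", "proxy-a"])
print(schedule)
print(may_update("proxy-b", schedule, {"proxy-a": True}))
```

How the health result is obtained (monitoring, a smoke test after reboot) is left open here.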


Okay, I have a server running the database (postgres) that backs my user-facing apps. How exactly shall I run kernel/glibc/systemd/postgres updates without customer-facing outages?

EDIT: To be clear, I'm mostly arguing that you're wrong, but if you have a solution then that'd be great, because I dislike the difficulty in patching these things.


Been a while since I’ve needed to run a critical database, but the options were generally a master-slave with replication, or an active-active cluster


Yes, we're currently doing primary-replica but that still requires downtime while we do a failover.


I remember building a system about 15 years ago which automatically failed over in under a second - the application connected to the most recent one and set up replication to the other one. If the master went down then the slave was promoted to master and when the original master came back it was configured as slave.

Looks like https://www.citusdata.com/blog/2019/05/30/introducing-pg-aut... will sort out a Postgres master master for you.

I certainly don’t have auto updates on all my servers (wouldn’t be good if a tv program going out to millions of people suddenly vanished — there’s always a glitch when it fails over), I do have it on public facing servers though, so in your case I’d have it on the web servers but not on the database.


Some people only have one server for a particular use case.

And servers might not always break in predictable/alertable ways.


If your service is only provided by one machine then you have to accept it will occasionally be out of use.

If the only monitoring of the service you have are users telling you, then that’s not a problem you can avoid by not updating.


> If your service is only provided by one machine then you have to accept it will occasionally be out of use.

But, I can minimize possible downtimes, by learning from other downtime reports on a bug tracker.


The blog advises against using OpenSSL, in favour of LibreSSL. At the same time, right now the devs at Gentoo are having a heated discussion about discontinuing LibreSSL support.


Question is, why not run Lynis for example and research the output it gives you? It follows along the same lines of no password login for ssh, no x11 forwarding, etc.


> fixes for most security vulnerabilities are not able to be backported to LTS kernels

This is nonsense.


I like this, very good reference material.

But as the disclaimer notes, this should not be followed blindly. It's sort of aimed at a niche audience because imho you should know what everything in that guide does, or at least know how to look it up, before applying it.

Also I feel that the overhead added by not using systemd creates a lot of attack surface too. So that's a difficult trade off to consider.

This is a rabbit hole of a debate but I'm interested in comparing the security trade offs of using for example RHEL with systemd where you get properly packaged SElinux policy for all your packages, and using Gentoo without systemd where you get no SElinux policy packaged at all and you have to write and maintain a bunch of init scripts.

(And before anyone chimes in, I know RHEL has had to make compromises in their SElinux packaging but at least it's there and you have the option to build upon it)

I personally prefer having the systemd+selinux option over any alternative. Because ease of administration is also a factor that influences security in the long run.


This is a nice collection of options that you have, but you should evaluate each of them very carefully before actually implementing them.

Switch libc to musl? You need to talk to the application developers first to see if their applications work on musl. This step alone requires some extensive testing; I'm sure some of the other steps do as well.


For average joe using ubuntu vps, what is the safest guide in this list that can be applied?


Awww, does anyone remember the Linux Administrator's Security Guide (https://seifried.org/lasg/)?

It was such a valuable resource in 2001.


There's no mention of snaps in the sandboxing section. This is a shame, I'd like to hear the author's analysis of its sandboxing.

As far as I'm aware, it is stricter on providing permissions to snaps than flatpak, in that classic confinement and special plugs require store approval. It has the same issue that most snaps provide access to the "home" plug for trivial access to much user data, but dotfiles are excluded so there is no trivial exploit through .bashrc, or reading of .config data, for example.


If the things the author is talking about are true, why are they not set to hardened by default?

Why is ptrace enabled by default, rather than disabled? Why is /proc visible to any other process? Why aren’t the ASLR bits already set to 32?

This of course leads to the question, why is there even a way to change this and why don’t we live in an opt-out world? This reminds me of that whole Apple vs. Facebook discussion again.
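
To see what a given machine actually defaults to, the relevant sysctls can be read from `/proc/sys`. A minimal sketch, assuming a Linux host; the two keys are the ptrace and mmap ASLR knobs, and the helper names are illustrative:

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Translate a dotted sysctl name into its /proc/sys path."""
    return Path(root) / name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if unavailable."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        # Key missing (e.g. Yama not built in) or not readable.
        return None


for key in ("kernel.yama.ptrace_scope", "vm.mmap_rnd_bits"):
    print(key, "=", read_sysctl(key))
```

The values differ between distributions, so the output is worth checking before assuming anything about the defaults.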


> Why is ptrace enabled by default, rather than disabled?

because it's useful to be able to attach to a running process with gdb


If you want to use it you can unlock it. Why is it unlocked by default?


Because unlocking it requires root and users who want to submit a crash report may not have root privileges on the system.


>You have likely heard of regular protections such as Position Independent Executables, Stack Smashing Protector, immediate binding, read-only relocations and FORTIFY_SOURCE but this section will not go over those as they have already been widely adopted.

Is Debian PIE by default now? Last time I checked, this was not included in their hardening patch to GCC, yet this was well over a year ago at this point.


PIE has been enabled on most popular architectures since 2016:

https://sources.debian.org/src/gcc-6/6.3.0-18+deb9u1/debian/...


This is meme security advice. Anti-systemd, anti-glibc, anti-openssl. This is security by internet fad.

As for the rest of the advice, it doesn't explain any of the tradeoffs, making it worthless. The kernel devs and distribution maintainers are not idiots. If you can't explain why it's on, then you don't understand it enough to turn it off.


This guide should be titled as “Compendium of Linux Hardening”.

It is not for the average system admin nor targeting a specific system usage model such as desktop workstation, embedded, or router/gateway/switch.

Given that, this guide is a very useful summary, which of course only seasoned security developers and admins can use.

The rest of you can tread lightly.


Throwing this into the mix as another source on basically the same topic https://www.ncsc.gov.uk/collection/end-user-device-security/...


Something that might help this article is to group the recommendations into a set of conceptual guidelines for hardening any system:

1. What ports are open?

1a. Can any of them be closed?

2. What services are running?

2a. Can any of them be stopped?

3. What permissions do those services have?

3a. Can any of them be reduced?

... and so on
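
The first question can be sketched with a plain TCP connect check. This only covers TCP ports on one address and is no substitute for `ss` or a real scanner; the function name and timeout are illustrative:

```python
import socket


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an errno value otherwise.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


print(open_tcp_ports("127.0.0.1", [22, 80, 443]))
```

Each port that shows up leads to the follow-up question: which service owns it, and can it be stopped or restricted?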


Chrome OS does most of these things by default, at the cost of trusting Google.


Container optimised OS (used for GKE on google cloud) is built from chromeOS sources, so enjoys the same protections.


Does anyone know how "hardened" various AWS/Azure/GCP linux distros are? Also, what about those popular base docker images? How "hardened" are they?


Reminds me of CIS benchmarks and their hardened images https://youtu.be/71ek_fm3TNs


If you want a battle tested hardening profile, use DISA’s STIG.


As a newcomer, how many of these tips would you suggest for a Fedora/Ubuntu/<whatever mainstream distro> PC meant for everyday computing?


Close to none. A list for normal users:

- don't delay updates

- use Firefox tab containers to isolate browsing contexts so a random page can't mess with your gmail session (not sure if Chrome has something similar)

- have backups and check they work

- use 2fa where possible

- if you use SSH, move your key to a hardware token


Also:

- set up full disk encryption

- set a boot up (BIOS) password

- for ssh: disable root ssh access, and disable ssh with a password

- lock screen when away from computer, auto-lock when away


Some of the `su` restrictions and sandboxing could be useful, but most of this list is overly pedantic and honestly for my part I would not recommend it, it would be hell to maintain and is unnecessary for a desktop user. Just operate a firewall which does not allow inbound access, and only run programs you trust (e.g. from the distribution repository, from a developer you trust, auditable code, etc.)


Let's put it this way, most of them are enabled by default on Android.


This sounds like an advert for alpine Linux.

That said, the take away I think for most users is: software side good (selinux, firewall, sandbox).

Kernel stuff tread with care.


These are my absolute favorite resources. Succinct and to the point with lots of links to follow if you want to dive in for more.


There’s still no usable userspace firewall on Linux. There’s OpenSnitch but it’s so far from being any good.


How do FreeBSD, OpenBSD, and NetBSD compare to Linux (Gentoo) for security?


Is this really where we are on Hacker News? The brightest minds on the internet can't even agree on systemd or other systems, let alone come to a sound conclusion? No wonder tech is so fragmented.



