Every Sufficiently Advanced Configuration Language Is Wrong (matt-rickard.com)
42 points by rckrd on June 24, 2022 | 32 comments



A thousand times, yes. I've wanted to write this same article. Thanks for saving me the time!

The industry is going to great lengths to avoid writing configuration in any ubiquitous imperative programming language. We're seeing the proliferation of hyper-specialized, clunky declarative languages with sub-par tooling and package ecosystems. In what world are templates acceptable code? I don't mean to pick on anything specific, but this[0] is the most recent example I've come across, and it's far from the most unreadable one.

[0]: https://github.com/traefik/traefik-helm-chart/blob/master/tr...


Is it? Terraform has been around for ages.


I think for a lot of situations the key is to separate the language from the actual application.

So for example the application should always take a JSON blob, or some other simple format.

However, the user will likely want to generate their configuration from something with the ability to control complexity, like Jsonnet, CUE, or just Python.
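A minimal sketch of that split, assuming a made-up service list: the generator is ordinary Python, and the application only ever parses a JSON blob. The names `make_service`, `generate_config`, and `load_config` are hypothetical and only there for illustration.

```python
import json

# Generator side (hypothetical): plain Python controls the repetition.
def make_service(name, port, replicas=1):
    return {"name": name, "port": port, "replicas": replicas}

def generate_config():
    services = [make_service(f"worker-{i}", 9000 + i) for i in range(3)]
    services.append(make_service("api", 8080, replicas=2))
    return {"version": 1, "services": services}

# Application side (hypothetical): it only ever sees a JSON document.
def load_config(blob):
    config = json.loads(blob)
    return {svc["name"]: svc for svc in config["services"]}

blob = json.dumps(generate_config(), indent=2, sort_keys=True)
services = load_config(blob)
print(services["api"]["replicas"])  # prints 2
```

The application never imports the generator, so the generator could be swapped for Jsonnet or CUE without touching it.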


The problem with configuration expressed as code is that it is not automatically auditable. With simple ini or JSON files, the auditing tool can just parse them, or, even worse, grep them, and find out whether a particular option has a certain legally mandated value. This also applies to configuration files generated via templating: you just feed the end result to the auditing tool. With code, the auditing tool would need to run the code, and this is a no-no from a security perspective.


If you need this, you can split your application into two parts: one creates the objects, which are then serialised to a 'configuration file format' that can be audited and then used by the 'real' application.

IMHO this kind of audit is useless: even if an option has the expected value, what guarantee do you have that the option is used as expected? Well, you have to audit the code!


I think it's better in any case to pull the config values from your actual live resources in some sort of read-only view.

This is the only way to be sure you're auditing what you think you're auditing, and it doesn't matter what kind of provisioning tool you've used.


Configuration languages typically produce a JSON/protobuf/Thrift/etc. serialization (if designed well, that is), so just audit that.
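A rough sketch of auditing the serialized output rather than the code that produced it. The policy keys and values in `REQUIRED` are invented for the example, and `audit` is a hypothetical helper, not any real tool's API.

```python
import json

# Assumed policy for illustration: dotted path -> required value.
REQUIRED = {"tls.enabled": True, "logging.retention_days": 90}

def lookup(doc, dotted_path):
    # Walk nested dicts one key at a time.
    node = doc
    for key in dotted_path.split("."):
        node = node[key]
    return node

def audit(serialized):
    # Only parses the generated JSON; the generating code never runs here.
    doc = json.loads(serialized)
    violations = []
    for path, expected in REQUIRED.items():
        try:
            actual = lookup(doc, path)
        except (KeyError, TypeError):
            violations.append(f"{path}: missing")
            continue
        if actual != expected:
            violations.append(f"{path}: expected {expected!r}, found {actual!r}")
    return violations

generated = json.dumps({"tls": {"enabled": True}, "logging": {"retention_days": 30}})
print(audit(generated))  # prints ['logging.retention_days: expected 90, found 30']
```

This only checks that the values are present in the output; it says nothing about whether the application actually honors them.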


I wish Ansible was just a Python API. Past a certain point it would be much simpler.

"It's supposed to be state and not a procedure" is BS in my experience. When you need to do IFs, load vars, and do things in order anyway, it stops being a state declaration.


Check out pyinfra: https://pyinfra.com/

I've been using it for a couple of personal projects, and I really like it.


I've tried to move large Ansible playbooks to pyinfra, and there was no benefit in the end. pyinfra is also "state"-oriented, and for complex cases with logic you have to perform some acrobatics. IMHO the switch resulted in the same mess as the Ansible playbooks.

Note: I am a Python fanatic with 15 years of experience.


What do you mean by "state-oriented"?

Also: complex cases will likely result in complex code. Do you have a viable alternative for this?


https://docs.pyinfra.com/en/2.x/getting-started.html#state-d...

> Do you have a viable alternative for this?

My imperative 1k-line Fabric deploy scripts in the past let me encapsulate complexity because I had remote state on hand.


That is exactly what it is, though. The Ansible CLI is just a client of that API. You can easily interface with the library itself in Python.


AFAIK Ansible core operates in terms of tasks and plays. The only API it provides is how to construct a list of tasks and how to execute it. One has no remote state/vars at the construction time and has to use "when" or shell logic at execution time as in YAML.


I feel the same way. I feel that I am basically writing shell scripts, but in YAML (and with a nice library of idempotent functions). I think Python or shell would be superior. And if your tool is inferior to shell scripts, you are really doing something wrong.



It surprises me that the author didn’t mention Lua. More than just being a beautiful and simple language, it has been used for config-as-code in a variety of applications for years.


Was about to write this. Lua makes much more sense to me than typescript for this purpose.


I miss systems that use lex, bison, or their successors to define an appropriate configuration file format.

There ought to be some way to export the data structure definitions those things parse to, and then generate go/python/whatever bindings with IDE support for people that want to programmatically manipulate the config files.


Isn’t this what protocol buffers do?


Protocol buffers use code generation, but the format is not human readable. Also, the format is less expressive than a full grammar, and, in many cases completely unchecked. (It will often "successfully" deserialize data written for the wrong protobuf struct type, which produces random hex gibberish.)

Something like ~/.ssh/config is closer to what I am talking about.


~/.ssh/config has some wildcard support, but I've been in situations where that wasn't enough, and I had to generate the files using Python with templates and variables. It got ugly pretty fast.
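For a sense of what that kind of generation looks like, here is a small sketch using `string.Template` from the standard library. The host names, the `deploy` user, and the `render_ssh_config` helper are all made up for the example.

```python
from string import Template

# One Host block per machine; Host, HostName and User are ssh_config keywords.
HOST_BLOCK = Template("Host $alias\n    HostName $hostname\n    User $user\n")

def render_ssh_config(hosts, user):
    # hosts is a list of (alias, hostname) pairs.
    blocks = [HOST_BLOCK.substitute(alias=a, hostname=h, user=user) for a, h in hosts]
    return "\n".join(blocks)

# Hypothetical inventory: two web servers on an internal domain.
hosts = [(f"web{i}", f"web{i}.example.internal") for i in range(1, 3)]
config_text = render_ssh_config(hosts, user="deploy")
print(config_text)
```

It stays readable at this size; the ugliness tends to show up once per-host exceptions and conditionals creep into the template.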


Further, state-managing configuration languages lack sufficient hierarchy. Terraform providers are pretty nice but they aren't first class objects (resources) in the language and so dependencies between providers can fail to apply on creation (Kubernetes resources depending on a cluster created by managed resources is a common one) and there are some other weird edge-cases like dependency cycles that arise when removing or changing resources across provider version upgrades. Provider aliases help but aren't a complete solution. Similarly, providers for additional state management tools (such as Helm for k8s template management) don't have sufficient visibility into the state management of the encapsulated tool. What configuration languages need is arbitrarily deep nesting of state management with a well-defined interface to inspect and diff it from a higher level.

Most imperative programming languages aren't quite a natural fit for declarative management of resources because the general loop of how the tools work is functional evaluation of the declarative statements to instantiate the desired states of all managed resources, a reader to get the current states of managed resources, a dependency solver to help order the plans, a differ to detect and show what will need to be changed, and finally the applicative layer to modify/create/delete managed resources by executing plans against APIs. This is a pretty functional model at every step (except perhaps plan execution, but in theory all executions should be modeled as edges on state machines of valid possible configurations and not as a spaghetti of if/then http/bash invocations), and so while it's certainly possible to implement configuration in imperative languages, I think it will almost always fit functional languages a bit better.

I haven't implemented anything better than the current crop of tools, so take this with a pinch of salt I guess.
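To make the differ step concrete, here is a toy sketch written as a pure function over two dictionaries. It is not how Terraform or any other tool does it; the resource names and specs are invented, and the dependency solver and the apply layer are left out entirely.

```python
def plan(desired, current):
    """Pure function: compare two name -> spec mappings and return actions."""
    actions = []
    # Resources that are declared but do not exist yet.
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("create", name, desired[name]))
    # Resources that exist but whose spec has drifted from the declaration.
    for name in sorted(desired.keys() & current.keys()):
        if desired[name] != current[name]:
            actions.append(("update", name, desired[name]))
    # Resources that exist but are no longer declared.
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("delete", name, None))
    return actions

# Hypothetical states for illustration.
desired = {"db": {"size": "large"}, "cache": {"size": "small"}}
current = {"db": {"size": "small"}, "queue": {"size": "small"}}
for action in plan(desired, current):
    print(action)
```

Because `plan` has no side effects, the same function can back both a dry-run display and the real apply step.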


I wrote a configuration language for CTOS back in the day. It ran on a multi-processor box and was responsible for launching services in a coordinated fashion across the processors during boot.

Instead of a language, it was a schema for configuration statements. You edited the schema to add new launch commands, then added launch commands to the script.

It was pretty esoteric - used all three kinds of braces to express one-of, required, optional etc. The only person I knew who changed it other than me was Andrew Thurber (the one at Cisco). He was a new hire at the time, and I was startled to find he'd just read the schema-parser code and figured it out.

Anyway not sure how that approach fits into the OP's objections.


With languages like Python, code itself can become configuration.

Declaring constants, lists, dicts, etc. can serve as basic config in the form of an importable Python module.

Advanced config can be functions or other constructs around code that the program packages into a module at runtime.
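A small sketch of both levels, assuming the config source arrives as a string (in practice it would more likely be a `settings.py` file that gets imported). The setting names and the `load_settings` helper are hypothetical.

```python
import types

# Stand-in for the contents of a config file written in Python.
CONFIG_SOURCE = '''
DEBUG = False
ALLOWED_HOSTS = ["example.com", "www.example.com"]
LIMITS = {"max_connections": 100, "timeout_seconds": 30}

def timeout_for(host):
    # Advanced config: a function instead of a static value.
    return LIMITS["timeout_seconds"] * (2 if host.startswith("www.") else 1)
'''

def load_settings(source, name="settings"):
    # Package the config code into a module object at runtime.
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module

settings = load_settings(CONFIG_SOURCE)
print(settings.LIMITS["max_connections"], settings.timeout_for("www.example.com"))  # prints 100 60
```

Note that `exec` runs whatever the config contains, which is the auditing and security concern raised earlier in the thread.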


I use protocol buffers for configuration definitions and prototxt for the actual configurations. For me it checks all the boxes: typed, complex object support, readable, and you don't have to worry about parsers.


Protobuf has terrible complex object support: you can't nest maps.


Fair! For configurations I never managed to go that complex :)


I think the biggest offenders among configuration languages are the ones with unintuitive multi-tiering of the same settings for different scopes, each with a different order of precedence.

I am looking at you, Postfix `main.cf` and ISC Bind9 `named.conf`.

Damn near impossible to audit by machine.


Any thoughts on HashiCorp Configuration Language (HCL)? I've found it pretty intuitive and sufficiently powerful when needs be.


The author does not take JsonSchema into account.


I think he mentions it in the beginning - "It falls apart when you try to do more: ... Type-check or schema validate", but I don't feel the falling apart in cases when I used JsonSchema.



