Why JSON Isn’t A Good Configuration Language (2018)

betwixthewires · on June 8, 2022

Seems I'm in the minority here, but I agree with the author on his main point.

JSON is fantastic for organizing and passing data around, storing it, even for configuration files where the user doesn't need to edit the raw configuration. If you are like me and write a lot of tools that run in the background, lack a UI and where the only thing a user will be doing is editing a configuration file and running it, you're asking for trouble with JSON. Curly brackets and nesting and lack of basic formatting are bad for UX, and while it's a fantastic compromise for machine and human readability, it focuses more on machine readability and the human readability is just enough for you to look at it. It is not easy for a human to parse with their eyes.

I go with TOML for my configuration files most of the time. I am not a fan of YAML or XML. I don't think those offer any real benefit over JSON, and both have more downsides than JSON. TOML is a breeze to read and understand just looking at it, and is analogous to JSON as far as machine readability to the point that it's trivial to convert to it when you need to.

I do use JSON a lot, even for configuration files, but only when I'm storing a configuration in a flat file that the user should never have to see.

kcartlidge · on June 8, 2022

I'd agree. JSON is user-readable, but not really user-friendly and user-editable.

I'm okay with YAML but TBH, and I know this makes me totally uncool, for config that only nests to a single level I quite like INI files. I sometimes cheat slightly and take account of the ordering of entries in a section but generally speaking it's easy to read, easy to write, and very easy to parse.

dan_hawkins · on June 8, 2022

Also: no comments in JSON. This makes self-documented config files difficult to create and maintain.

angarg12 · on June 8, 2022

We use tons of configuration at work and this has come up time and time again. The biggest pain is when inevitably you need to "parameterize" your JSON files. The most common route I've seen is to turn your configs into templates e.g. Jinja and go from there. Welcome to hell.

My solution? Write a DSL with Typescript, and "compile" your configs down to JSON.

Define the structure of your configs as TS types. Write functions with free variables to work as templates, or define snippets as composable fragments. Writing the end result to a JSON file results in transparency (what you see is what you get) and compatibility.

unrealhoang · on June 8, 2022

jsonnet[1] might suit your use-case better, it was created to do exactly that.

[1]https://jsonnet.org

whacked_new · on June 8, 2022

jsonnet is my go-to language for anything related to configuration, after having tried json, yaml, TS, edn, and tasting dhall and toml. It addresses all problems in the article and more.

the composition model strikes a good balance between data extensibility / language expressiveness / ease of use.

the generated json leads to easy-to-understand and portable data, and if you write jsonschemas from jsonnet, tools like json-schema-to-typescript [1] make it easy to import a consistent interface, and almost every language has a reasonably up-to-date validation library.

[1] https://github.com/bcherny/json-schema-to-typescript

Matheus28 · on June 8, 2022

Not OP. Looks interesting, but I'd bet that typescript's type system would enforce constraints a lot better

whacked_new · on June 8, 2022

in my experience, for things like static configs, jsonnet has been superior to TS for constraints, if you pair the configs with a json schema (also generated from jsonnet, of course).

TS's types are easier to use at write-time, but json schema includes a lot of batteries that save a lot of time when you need them, but become annoying in TS, like patternProperties in dictionaries, length constraints in keys. Of course, there are situations like key1 XOR key2 where you will need custom logic one way or another, but given json schema's evolution over the years, I think it's pretty solid. An added benefit is that if you stay in json schemas, you're almost guaranteed a validator exists in $other_language, it _almost_ feels like a first-class construct everywhere.

Matheus28 · on June 9, 2022

That's a valid point. I think that TS might also lack constraints for object keys and values (like matching them against a regex).

After thinking for a while, I think that json schema might be the best approach to it (possibly generated from jsonnet if complicated enough).

AtlasBarfed · on June 8, 2022

Configuration is a rabbit hole that starts with a single innocent file (while it is not total bikeshedding arguing about the serialization format, it is just the tip of the iceberg), and descends to many complications:

https://www.blogger.com/blog/post/edit/9049876254685009339/1...

The author of the post is already signalling descent into config hell, they call JSON a language, which it TECHNICALLY is, but it shows their strong desire for more power and features and a descent into config hell.

campers · on June 8, 2022

+1 for TypeScript for config. I've been using Pulumi with TS to create a simple DSL to define all our GitLab config and alerting policies in GCP. It makes it easy to have concise definitions and quickly refactor the DSL with confidence.

Having a proper type system removes so many possibilities of inevitable accidental configuration mistakes. One of our other types of deployment projects has a yaml configuration. One time a section had the wrong tab indenting, took a little while to debug through why a table wasn't being provisioned.

OliverGilan · on June 8, 2022

I’d love to learn more about how you do this. Do you have any helpful resources that I can look at to understand the process of creating a DSL with TS + Pulumi?

campers · on June 8, 2022

Calling it a DSL is maybe bit of a stretch, its just creating a concise abstraction above the provided API.

So instead of having 20 lines of code for every alert policy like at https://www.pulumi.com/registry/packages/gcp/api-docs/monito... you create a function that generates all the standardised alerts from concise strongly typed config such as

    cloudSqlAlerts({
        instanceId: 'sql-main',
        highCpu: {
            percent: 70,
            duration: _5min
        },
        highMemory: {
            percent: 80,
            duration: _5min
        },
        highDisk: {
            percent: 80,
            duration: _10min
        },
        deadlockDetection: true
    }, 
    projectNotificationChannels)

eurasiantiger · on June 8, 2022

Just create types for all objects in your configuration as well as the configuration object itself.

The ”DSL” is the typings, and the rest is nothing more than a simple TS program which calls JSON.stringify() with the type-checked config object and writes the resulting string to disk.

valbaca · on June 8, 2022

Sounds like what AWS CDK basically does: uses TypeScript to write JSON (Cloudformation).

disintegrator · on June 8, 2022

CUE has been an absolute joy to work with on configuration problems. It’s a typed language which is something that’s been a massive gap in the config language space when looking at yaml, toml, json, etc...

https://cuelang.org/

cwp · on June 8, 2022

I'll echo this: CUE provides a lot of expressive power without being a Turing-complete language. That's a sweet spot for configuration—you can DRY things up and catch errors early, without introducing nondeterministic behavior.

betwixthewires · on June 8, 2022

Why would you need typing in a configuration file? I would think a configuration file would be specific to your program and any interpretation of data would be handled by your program.

disintegrator · on June 8, 2022

My initial comment was very vague and I'm glad to see others have replied to fill in the gaps. If your service/system has a sufficiently large configuration space, like say, Kubernetes, then a typed configuration language can greatly improve the development experience by spotting errors early on. You get a very fast feedback loop that you set the wrong value for some config option before attempting to deploy the change. Different services will give you appropriate feedback about wrong config options but that is sometimes a few steps removed from your development environment e.g. You might find out after you push a PR and CI/CD fails or after you merge the PR even.

The type system also has great second order benefits like allowing us to build an language server protocol implementation for CUE that has rich diagnostics, auto-complete, jump to definition, rename symbol features. Something that cannot be done _as well_ effectively in untyped languages.

I'm still scratching the surface. CUE does more than add types to config and I would encourage you to dig into it if you have spare time. By virtue of providing a great type system, it also manages to reduce boilerplate as yet another second order benefit. Boilerplate reduction is something where tools like Jsonnet attack as a primary goal but over enough time you're back at having seas of untyped config and indirection that are hard to navigate or contribute to and you're back at square one.

planckscnst · on June 8, 2022

CUE's type system is also its validation system. It's extremely flexible. You can define custom types and your own constraints on those types. It allows for disjunctive constraints (e.g. this value must be an integer > 0 or the string value "none"). The types compose well together. I'd highly recommend reading through the docs - they're great.

verdverm · on June 8, 2022

One of the benefits is that the checking is handled by CUE, so every program does not have to do validation checks like regexp matching, int bounds, required keys, default values, transforms...

Also, imports and dependency management for config will be big

arccy · on June 8, 2022

if your config is longer and semi repetitive, you can use it to template things out while still having a strong schema at every step

linkdd · on June 7, 2022

> Write your own

No.

No no no.

Definitely no.

I wish for a standardized configuration format so that I don't have to learn a new format for each software. At least JSON, TOML (and INI) and YAML are widely used enough so there is little chances that you don't know them.

ComradePhil · on June 7, 2022

They suggest "Writing your own" after they suggest the formats you mention and they clearly do not recommend it if anything else works for you:

> If for some reason a key-value configuration format doesn’t meet your needs, and you can’t use a scripting language due to performance or size constraints, then it might be appropriate to write your own configuration format. But if you find yourself in this scenario, think long and hard before making a choice that will not only require you to write and maintain a parser but also require your users to become familiar with yet another configuration format.

linkdd · on June 7, 2022

The question I must ask then is:

When does a key-value configuration format not meet your needs? Using a scripting language is still about setting key/values.

No, I claim that writing your own is *never* an appropriate choice.

So "think long and hard and then say no".

giaour · on June 8, 2022

Sometimes you need to support something more flexible than JSON but just can't hand your (non-dev) users all the footguns that come with a Turing complete configuration language. "Key/Value" makes the problem sound simple, but maybe you're writing a rules engine, and the values are expressions that will be evaluated at runtime. It's uncommon, but the need isn't impossible to imagine.

I'm personally of the opinion that using a scripting language is the worst option. Going that route is giving up on most forms of static analysis, since you can't inspect configuration values without executing untrusted code.

lmm · on June 8, 2022

> Sometimes you need to support something more flexible than JSON but just can't hand your (non-dev) users all the footguns that come with a Turing complete configuration language. "Key/Value" makes the problem sound simple, but maybe you're writing a rules engine, and the values are expressions that will be evaluated at runtime. It's uncommon, but the need isn't impossible to imagine.

> I'm personally of the opinion that using a scripting language is the worst option. Going that route is giving up on most forms of static analysis, since you can't inspect configuration values without executing untrusted code.

Sounds like the original case of the "inner-platform effect". A "custom configuration file" format for a "rules engine" is a scripting language - you will almost certainly make it accidentally Turing complete even if you were trying not to - and it's one that's even less susceptible to analysis than a standard scripting language.

The least-bad solution is an "inner" DSL in your language, IME, with lots of support code to make it as nice as possible (admittedly my experience is mainly in languages that make this easy) - that way you get tooling for free and you can leverage existing static analysis tools. Yes, your users will find ways to shoot themselves in the foot, but they were always going to.

giaour · on June 8, 2022

> A "custom configuration file" format for a "rules engine" is a scripting language

I see what you mean, but if a project is defining its own configuration language, it can impose as many restrictions as it wants on what expressions are allowed. There are some great examples of non-Turing complete configuration DSLs out in the world, though the most successful examples seem to be used by application frameworks (protocol buffers, GraphQL schema language, Thrift IDL, Avro IDL, Smithy) and infrastructure-as-code projects (Terraform HCL, Azure Bicep).

At a previous job, there were a couple popular "inner DSL" projects, but one was written from the beginning to eventually evaluate to a JSON document, and the other was rewritten to do so (rather than be interpreted and acted upon iteratively) because TypeScript and Ruby scripts could just embed too much arbitrary complexity to be reasonably analyzed in toto by humans once projects reached a certain size.

lmm · on June 9, 2022

> it can impose as many restrictions as it wants on what expressions are allowed

Sure, but it's very easy to accidentally be Turing complete. You added loops? Boom. You added some kind of alias / extract-common-value feature? Boom. Etc.. Of course you can forcibly restrict the implementation (I wrote a Y combinator in one of them and found that the interpreter would detect recursion at runtime and refuse to process it) but at that point you're making your language inconsistent which is even worse.

> protocol buffers, GraphQL schema language, Thrift IDL, Avro IDL, Smithy

Those aren't "rules engine"s though. Once you get to the point of having expressions in your language, you slide down the slippery slope pretty quickly, IME.

giaour · on June 9, 2022

> Those aren't "rules engine"s though

Yeah, I don't know why I picked rules engines as an example of things that are hard to configure with JSON. My day job is working on an infrastructure-as-code DSL, and I previously worked on an RPC framework IDL. Both of those projects are replacing an alternative in a traditional serialization language (JSON and XML, respectively) where users spent a lot of their time fighting against JSON or XML gotchas.

The IaC language supports expressions but not named expressions, so we've been able to avoid Turing completeness so far.

lmm · on June 16, 2022

That may not be enough; you wouldn't be the first supposedly-non-Turing-complete language where it's actually possible to express a Y combinator and use that to embed arbitrary computations.

linkdd · on June 8, 2022

Expressions that will be evaluated at runtime are just strings, no?

  {"rules": [
    "x > 1",
    "x % 2 = 1"
  ]}

So you will have an "expression parser" to parser your rules, validate them, and execute them. But at least you do not have the edge cases of everything else.

_gabe_ · on June 8, 2022

At this point you're already writing your own parser to parse the expressions from the strings. Why not cut out the json completely and make it way more readable?

  rules = [
    x + 1,
    x % 2 == 1
  ]

giaour · on June 8, 2022

Those are simple expressions (no multiline blocks, no named symbols, no expressions that call other expressions), and you're still forced to treat them as opaque strings when you use a serialization language for configuration.

That may be the best approach for a given project! But if you're using configuration to define an API contract, cloud infrastructure, or a query, you might be better off using a purpose-built DSL like protocol buffers, HCL, or SQL, respectively. That approach can let you define a greater level of expressivity than is allowed in a serialization language like JSON or XML without letting config authors write scripts of arbitrary computational complexity (like they would be able to with config in a scripting language).

kevin_thibedeau · on June 8, 2022

This sort of hack with control logic was done in XML with Ant. It was a bad idea then and it still is.

benibela · on June 8, 2022

I also thought writing my own format would be a bad idea

So I choose a standardized configuration format: JSON

Now I store my config in JSON, and wrote a JSON parser

Unfortunately my JSON parser threw an exception when parsing floating point numbers, and my app did not start anymore.

I also wrote an XML parser. In hindsight, it would have been much better to store my config as XML, since there are no floating point numbers. Parsing floats is incredible complicated.

Akronymus · on June 8, 2022

For me personally, I'd love for S-Expressions to be a standard configuration language. Sadly, it is a chicken and egg problem.

_ajoj · on June 7, 2022

Write your own is a bad option. I have tried it and found that you end up spending lots of time maintaining a parser.

I have found some success in my org using JSON with JSON Schema [1]. Combine with a json schema aware IDE like VS Code it solves the documentation problem.

[1] https://json-schema.org/

gnulinux · on June 7, 2022

But article literally makes the same point you're making. They only put "make your own" as the last resort option. Here:

> If for some reason a key-value configuration format doesn’t meet your needs, and you can’t use a scripting language due to performance or size constraints, then it might be appropriate to write your own configuration format. But if you find yourself in this scenario, think long and hard before making a choice that will not only require you to write and maintain a parser but also require your users to become familiar with yet another configuration format.

Gravyness · on June 7, 2022

> With so many better options for configuration languages, there’s no good reason to use JSON

I have been surprised by how silly that conclusion is, even though it's from 2018.

JSON is so bad it is on the standard library of all major languages, natively supported by php, python, nodejs, javascript, ruby and many others, unlike YAML, TOML, FADFH or whatever solution someone comes up to a problem that is essentially solved.

If you want perfection then by all means reinvent the wheel. But otherwise, don't be an edgy teen, just go with what the entire industry tells you that works.

nomel · on June 8, 2022

> JSON is so bad it is on the standard library of all major languages

This is like claiming that HTTP is a good way for people to type messages to each other, because it's in the standard library for all major languages. Existing in a standard language doesn't mean it was meant to be typed directly by humans. The lack of comments was an intentional design choice [1] to make sure its intent, a data exchange format, was understood.

1. https://news.ycombinator.com/item?id=3912149

valbaca · on June 8, 2022

> unlike YAML, TOML, FADFH or whatever solution

TOML and YAML are each supported by every language you listed.

https://github.com/toml-lang/toml/wiki

https://yaml.org/

hillcrestenigma · on June 8, 2022

They aren't supported by the standard libraries of those languages though. It is still a major advantage of JSON.

lmm · on June 8, 2022

Does that matter nowadays? Unless you're writing a small script in a language with terrible dependency management like Python, you're going to have a bunch of non-standard-library dependencies, one more doesn't make much difference.

linkdd · on June 8, 2022

> with terrible dependency management like Python

Ah yes, because writing a requirements.txt is complicated, writing a pyproject.toml is complicated, or distributing your app with pyinstaller is complicated.

The Python ecosystem has many options that solve dependency management, this is what makes the standard evolve (like the PEP about pyproject.toml which is now a standard). This is healthy, and installing Python deps, even native has never been a problem since the wheels package have been standardized.

Can we stop with those obsolete claims that Python packaging is a mess? The package format barely changed in the last 20 years, and every solutions rely on setuptools and/or pip.

lmm · on June 8, 2022

> Ah yes, because writing a requirements.txt is complicated, writing a pyproject.toml is complicated, or distributing your app with pyinstaller is complicated.

Writing it isn't complicated. Getting deterministic behaviour out of it is. (Yes, there's freeze which is better than nothing, but that doesn't help if you're actually developing a project; you can freeze your transitive dependencies, but sooner or later you'll want to upgrade something that depends on something you froze, and so then you have to unfreeze everything and hope that nothing else breaks).

> The Python ecosystem has many options that solve dependency management, this is what makes the standard evolve (like the PEP about pyproject.toml which is now a standard).

The Python ecosystem has many options because they all suck. Every few years someone writes a new one that claims to fix the problems, but they still haven't managed to catch up with where Maven was 20 years ago. (Indeed I'd argue they've actually gone backwards in some ways, e.g. including pip in newer versions of Python).

> Can we stop with those obsolete claims that Python packaging is a mess? The package format barely changed in the last 20 years, and every solutions rely on setuptools and/or pip.

The claims are not obsolete, and the fact that things have barely changed in 20 years is a big part of the problem.

linkdd · on June 8, 2022

> Getting deterministic behaviour out of it is.

As you said, there is `pip freeze`, but also the Pipfile.lock for pipenv and the poetry.lock for poetry.

> but sooner or later you'll want to upgrade something that depends on something you froze, and so then you have to unfreeze everything and hope that nothing else breaks

Same for Rust with its Cargo.lock, same for JS with it's package-json.lock or yarn.lock, same for everything that lets you freeze your dependencies.

EDIT: That's also what test suites and CI/CD are for.

> The Python ecosystem has many options because they all suck.

They let me write software, manage dependencies and virtualenvs, there is even pdm which supports the recent PEP for __pypackages__ (equivalent of node_modules) instead of a virtualenv.

Can you clarify how every single one of them suck?

> they still haven't managed to catch up with where Maven was 20 years ago

What does Maven do to prevent the problem of upgrading frozen dependencies?

> I'd argue they've actually gone backwards in some ways, e.g. including pip in newer versions of Python

Care to clarify?

lmm · on June 8, 2022

> What does Maven do to prevent the problem of upgrading frozen dependencies?

You don't need to freeze your dependencies in the first place, because dependency resolution is deterministic. (In the case where you want to upgrade a transitive dependency without upgrading your direct dependency, you specificy it explicitly; if you want the latest available version, or the latest available version with the same minor number or whatever, you can run a command to populate your project file with that, but it's an explicit, visible operation).

> EDIT: That's also what test suites and CI/CD are for.

That's an admission that the dependency management isn't doing its job.

> Can you clarify how every single one of them suck?

My point is that the huge proliferation of options is not a sign that they're good, but rather the opposite. Ecosystems that have decent dependency management don't feel the need to write a new system every couple of years. Maybe PDM is finally the one that doesn't suck, but at some point after six or seven tries that all suck (and yet were always widely trumpeted as "python packaging is fixed now") my confidence is pretty low.

> > I'd argue they've actually gone backwards in some ways, e.g. including pip in newer versions of Python

> Care to clarify?

Sure. I want a single entry point that I can install on my system (via my package manager) and run to build any Python project, when realistically I'm going to be working with a bunch of projects that each have separate sets of dependencies and each require separate Python versions. Putting pip in the standard library means a) half my python installs have pip and half don't b) I have several different versions of pip installed at any given time. And so it entrenches the complexity where I have to have a separate tool like virtualenv for managing which Python version I'm using for each project as well as pip. (Which wouldn't be so bad if accidentally running pip for one project while your shell was in the virtualenv for another didn't fuck up that virtualenv, potentially permanently...).

linkdd · on June 8, 2022

In C++ you get the great https://github.com/nlohmann/json

In Rust you have the amazing serde with serde_json but at this point, you can use toml which is also based on serde. I consider serde as being standard.

In C you have the lib jason which is very good.

In Elixir, I use the compile-time configuration (config/config.exs) with environment variables for production, but that's because in the end, my Elixir system is in a Docker container, running on Kubernetes, with a ConfigMap defining the environment variables, so in the end, it's YAML (or JSON).

tikhonj · on June 8, 2022

"Use the popular thing because it is popular."

The same industry was—substantially still is!—telling us to use XML everywhere. Let's not.

rascul · on June 8, 2022

> JSON is so bad it is on the standard library of all major languages, natively supported by php, python, nodejs, javascript, ruby and many others

Pretty sure that "all" isn't correct here. Or we have different lists of major languages.

arielcostas · on June 7, 2022

Why not use XML? Fairly similar to HTML, which almost everyone has probably seen/used at least once, and it's kinda easy to understand. It supports comments, attributes and elements and schemas (which help with validation, editor autocompletion and documentation), and is also widely supported in many languages.

billconan · on June 7, 2022

too verbose. not ergonomic.

wronglyprepaid · on June 8, 2022

XML is incredibly complicated to process correctly, the support for various features, like schema validation, is also rather limited.

arielcostas · on June 8, 2022

Most languages already include an XML in their standard library, you don't need to implement it yourself

8lall0 · on June 8, 2022

Hell no, horrible language, too verbose.

arielcostas · on June 8, 2022

I prefer XML's verbosity to a full disfunctional language like YAML where adding an extra space might break your config

rektide · on June 8, 2022

  {"this_article": "// would be better",
   "this_article": "// if it werent",
   "this_article": "// immediately",
   "this_article": "// right off the bat",
   "this_article": "// wrong",
   "this_article": "meh"}
  //=> {"this_article": "meh"}

Hacker spirit represent. Where there's a will, there's a way. (Death to all oppressors!)

I actually agree, JSON is kind of a pain in the ass to work with. It's shocking to think of a pre-'jq' world, given how saturated we all are with JSON, but we lived like that for a long time. (We also evidently didn't even have Firebug until 2006?[1] Holy shit!)

But is it a problem? I dunno. Is it worth doing something else? Ooof, scary proposition.

I think one of the missing questions we don't ask, don't know is: what tools help us fix our json fast. What tools can see, oh there's a comma missing between these two lines, and just fix it for us.

[1] https://en.wikipedia.org/wiki/Firebug_(software)

tikhonj · on June 8, 2022

As far as I can tell, you're relying on non-standard behavior. This is what [the standard][1] has to say about keys being unique:

> * The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs. These are all semantic considerations that may be defined by JSON processors or in specifications defining specific uses of JSON for data interchange.*

This means that different tools can—and, if experience is any guide, will—handle your example inconsistently. This can be especially troubling when JSON processing tools are used internally by some other system like a database, queue or server; most of the time these tools parse and serialize JSON without changing its meaning, but they can change or lose information when people rely on patterns like the one that you used.

I would honestly be surprised if there isn't at least one realistically used tool that interprets your example differently, like producing {"this_article": "// would be better"} instead.

[1]: https://www.ecma-international.org/publications-and-standard...

rektide · on June 8, 2022

I'd love to see the counter-evidence! Works on every browser, every server-side JS engine of note.

Edit: I believe a number of these non-determinism did get formally resolved in EcmaScript itself, which is yes, not the same, but is an active, alive, & notable specification with some bearing.

tikhonj · on June 8, 2022

First thing I tried was the Haskell Aeson library which keeps the first value of the key:

    λ> decode f :: Maybe Value
    Just (Object (fromList [("this_article",String "// would be better")]))

There's a more flexible jsonWith function that lets you control how duplicates (and, for that matter, field order) is handled, but it's not the default.

The handful of other tools I spot-checked all keep the last pair, so it seems like almost but not quite standard behavior.

I was curious about other programs and, searching around, found a discussion[1] about standardizing the behavior to "either reject objects with duplicate keys or keep the last value"; however, from a quick scan, that wasn't added to any of the standards documents I found. Apart from the standard I linked earlier there are a couple of JSON RFCs[2][3], but they basically say "different systems do different things and if you depend on the behavior, your JSON won't be universally interoperable"

> When the names within an object are not unique, the behavior of software that receives such an object is unpredictable. Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates.

> JSON parsing libraries have been observed to differ as to whether or not they make the ordering of object members visible to calling software. Implementations whose behavior does not depend on member ordering will be interoperable in the sense that they will not be affected by these differences.

If anything, the fact that it is almost but not quite standard behavior probably makes it worse—you can rely on it for quite a while until it unexpectedly breaks and the figuring out why it broke will be a real headache.

[1]: https://esdiscuss.org/topic/json-duplicate-keys

[2]: https://datatracker.ietf.org/doc/html/rfc7159

[3]: https://datatracker.ietf.org/doc/html/rfc8259

rektide · on June 8, 2022

I found the EcmaScript specification[1]:

> In the case where there are duplicate name Strings within an object, lexically preceding values for the same key shall be overwritten.

So in JavaScript, it indeed will always work. But yes, other languages & libraries are under no obligation to follow this behavior. As you clearly show, the JSON spec itself highlights it's displeasure with & the unpredictability of duplicate keys, while also noting the two prevailing behaviors: accept last, or fail.

[1] https://262.ecma-international.org/12.0/#sec-internalizejson...

dmitriid · on June 8, 2022

There are at least seven different JSON specifications. And no, the ecmascript one isn't the "canonical" one.

More details if you ever wanted them, here: https://seriot.ch/projects/parsing_json.html

rektide · on June 9, 2022

Please stop trolling my threads to piss on my statememts. You're being awful to me?

I dont think I implied at all that kson was canonical nor absolute. I think I agreed thay it was unclear .

dmitriid · on June 9, 2022

No. I'm not being awful. I'm being objective.

rektide · on June 9, 2022

You were cruel & wrong & malicious this time. Again. Added nothing I nor who I was discussing with hadn't already actively agreed to already. Meanly.

vsnf · on June 8, 2022

Go will serialize keys to maps in random order, meaning that unless you take special and extraordinary precautions while serializing your json structs, you will get keys that will not be where you expect them to be.

rektide · on June 8, 2022

This does not appear to be true[1][2].

[1] https://github.com/golang/go/issues/24415

[2] https://go.dev/play/p/nXLs0hd8Uj3

taeric · on June 8, 2022

I'm not even sure which I would want it to be. Alists are the closest this is to, so first wins. But putting it in a config usually appends going down.

valbaca · on June 8, 2022

You just highlighted one of the issues with JSON. You think you can write comments like that (hell, I've done it too) but it's both undefined behavior and at worst breaks when your config comes across something that expects a strict adherence to some schema some poor intern or sr. dev was saddled with writing.

phendrenad2 · on June 8, 2022

Sure, just throw away 50 years of syntax highlighting technology and code readability. <insert this is fine meme>

throw0101a · on June 8, 2022

How about ISC-style like for BIND?

* https://bind9.readthedocs.io/en/latest/reference.html#config...

* https://wiki.debian.org/Bind9#File_.2Fetc.2Fbind.2Fnamed.con...

* https://www.zytrax.com/books/dns/ch7/#overview

I've been experimenting with NSD (and Unbound) recently, and their configuration is JSON-like. There seem to be some 'arbitrary' attributes that are treated as "top-level":

> At the top level only server:, key:, pattern:, zone:, tls-auth:, and remote-control: are allowed. These are followed by their attributes or a new top-level keyword. The zone: attribute is followed by zone options. The server: attribute is followed by global options for the NSD server. […]

* https://nsd.docs.nlnetlabs.nl/en/latest/manpages/nsd.conf.ht...

It seems to be good/best practice to indent the sub-attributes of the 'top-level' attributes to know where the 'stanzas' for each top-level start and end. The white space has no significance.

While I like the functionality of NSD/Unbound, I lean toward liking the use of braces (curly brackets; {}) à la BIND to explicitly denote stanzas and statements, even if one also uses indentation for human consumption.

woojoo666 · on June 8, 2022

My favorite so far is StrictYaml [1]. It's a subset of yaml that directly addresses a lot of the article's concerns:

* supports comments

* flexible with good signal to noise (no need for so many brackets and commas)

* gets rid of a lot of YAML's complexity [2]

* explicitly typed for less ambiguities [3]

The main page also gives comparisons against HJSON, HOCON and TOML [4]

[1]: https://hitchdev.com/strictyaml/

[2]: https://hitchdev.com/strictyaml/features-removed/

[3]: https://hitchdev.com/strictyaml/why/implicit-typing-removed/

[4]: https://hitchdev.com/strictyaml/#why-strictyaml

Spivak · on June 8, 2022

StrictYAML throws out too much for my taste — tags, flow state, and anchors are all useful non-footgun features.

woojoo666 · on June 8, 2022

To me those are in the category of "nice to have", and the problem is that every developer has different preferences for these [1] [2]. But the main features of StrictYaml, like supporting comments and less syntactic noise, I think are pretty uncontroversial, and perhaps it's worth it to get people to switch over for those alone. It doesn't need to be perfect, it just needs to be a significant enough improvement over JSON, and I'd say those two features are more than enough

[1]: https://github.com/crdoconnor/strictyaml/issues/37

[2]: https://github.com/crdoconnor/strictyaml/issues/38

120bits · on June 7, 2022

I went from simple flat file with tab separated configs to xml to Protobuf and to JSON/TOML.

Every time we had issues, it was because we were trying to spin our OWN version of configuration file. It breaks backward compatibility and versioning. Code collaboration was horrible and devs were getting frustrated.

dub · on June 8, 2022

My wish for configuration languages is that as an industry we continue to adopt scriptable build systems like Bazel that make it easy to transform human-written configuration into machine-readable configuration during compile time.

Want comments in JSON? Spend 20 minutes refactoring a single BUILD rule and now humans can write JSON5 that's transformed into JSON

Want more flexibility? Have humans write cue or dhall or jsonnet

Wish you also had a copy of a subset of the same config data in YAML? Easy

Want to write a compile-time check in a programming language of your choice that a certain setting is never missing? Easy

Want all of this to work with reproducible builds across a variety of computers and distributed build farms? Easy

We don't have to let legacy build systems limit our imagination.

molyss · on June 8, 2022

I’m totally happy using the properties format from java. There’s comments, it’s easy to grep. Main missing part are value types and multiline (or maybe it’s there, I’m not sure). The parameter reference substitution would be nice to have too.

phendrenad2 · on June 8, 2022

This is why I hate the Javascript ecosystem with a burning passion, and consider every single person who has contributed to it to be the very portrait of supremely incompetence. Everyone in web dev spent 2010-2020 schlepping these intolerable JSON configuration files around, unable to add comments to document what various lines did, simply because the JS ecosystem developers couldn't be bothered to implement a YAML parser. Le sigh.

shepherdjerred · on June 8, 2022

Yikes.

ukoki · on June 8, 2022

JSON is a great configuration language:

* It's in the standard library for most languages so you can write tools that read and write it that have zero dependencies. This is great for portability and for use in offline environments.

* You can use the same JSON files as input for tools that expect JSON (eg Terraform .tfvars.json) and tools that that expect YAML (anything kubernetes related)

* You can easily see at a glance exactly what whitespace characters are in multiline strings like PEM certificate strings. This is useful to avoid LF vs CRLF line ending issues, and ensuring start and end whitespace is exactly what you want to make sure input strings are compatible with whatever is consuming the configuration.

The downside is that it's a little bit ugly compared to YAML, but editor auto-formatting and key-sorting plugins make that a minor issue.

luismedel · on June 8, 2022

IMHO points 1 and 2 are more community efforts rather than JSON's design strengths.

Regarding point 3, are you talking about using \n in single-line strings (the ones you can use in JSON)

I think that the main reason this debate exists is we are using a good data-format (which JSON is) to configure things, which it's not a use case JSON is designed for.

benibela · on June 8, 2022

All the text formats are bad and verbose

Everything is much more efficient with binary format like protobufs or flatbuffers. And if the format has a schema, the schema is the documentation, an d you do not need to have comment in the config file itself.

terrabitz · on June 8, 2022

As a transport I agree. But as a config file format? I'm a little skeptical.

The main benefit of a flat file format for configs is that it can be added to source control and edited by any plaintext editor. If I kept config as protobuf, I would need special tools to handle it, which could be annoying (e.g. if I'm remoted into a remote system). It would also make it more difficult to see diffs between different config versions stored in git.

hsbauauvhabzb · on June 8, 2022

I learned about json5 on news.ycombinator, which turned json config from an absolute pain to work with, to something more tolerable. Thank you kind stranger who recommended it.

Edit: one thing I think all json python libs suck at is error messages. A syntax error in a json tree shouldn’t require me to rip nodes out to a/b test a parse error. Sure it can’t correctly break down AST but it could at least try to provide something more effective than Exception(f”{line} {col} glhf!”)

princevegeta89 · on June 8, 2022

JSON is not a language it's just a format.

rfiat · on June 8, 2022

I disagree. It has a formal grammar so at least in a CS sense [0] it's a language. Why is JSON not a language but e.g. TOML or YAML are?

[0] https://en.wikipedia.org/wiki/Formal_language

aiddun · on June 8, 2022

I really like the solution Tailwind and some other JavaScript tools have taken to this where instead of a tailwind.json, there’s a tailwind.config.js, which is a plain ol JS file that exports a JS object. Allows for importing constants from other modules, scripting, comments, conditionals, etc

ryanthedev · on June 8, 2022

Idk. I feel like most things are YAML based or now some custom flavor (bicep, hcl, Nginx)

But that’s more from the IaaC world.

I guess .net application configuration might still be heavy on the JSON side.

I just feel like this isn’t as much of a problem anymore. There are so many ways to utilize alternatives.

wgj · on June 7, 2022

> Language

JSON is only a data _format_, not a language. I agree it's awful for maintaining configs.

YAML has issues with the more complex parts of its spec, but it's great for the stuff JSON gets used for.

scubbo · on June 8, 2022

What is the distinction between a data format and a data/configuration language? (It's clearly not a programming language, but that wasn't the claim)

kelnos · on June 8, 2022

I've started liking TOML quite a bit. I am so tired of getting bitten by YAML's weird parsing behaviors. I still sometimes forget to quote strings, and then end up having the string "no" get interpreted as a boolean "false", and then spend an embarrassing amount of time confused as to what's going on.

krapp · on June 8, 2022

(clears throat, taps microphone.)

LuaJIT. Or at least Lua.

Thank you for coming to my TED talk.

fb03 · on June 8, 2022

I actually really like RON[1]

[1] https://github.com/ron-rs/ron