XML cannot be parsed into nested maps/dictionaries/lists/arrays without guidance from a type or a restricted XML structure.
JSON can do that. It also maps pretty seamlessly to types/classes in most languages without annotations, attributes, or other serialization guides.
It also has explicit indicators for lists vs subdocuments vs values for keys, which XML does not. XML tags can repeat, can have subtags, and then there are tag attributes. A JSON document can also be a list, while an XML document must be a tree with a single root element.
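A minimal sketch of the point, in Python's standard library (the item/tag names are made up for illustration): a JSON parser recovers the full nested structure by itself, while an XML parser hands back generic elements and leaves the list-vs-value question to the caller.

```python
import json
import xml.etree.ElementTree as ET

# JSON: the parser alone recovers the full nested structure.
doc = json.loads('{"name": "hat", "tags": ["red", "wool"]}')
print(type(doc["tags"]))  # <class 'list'> -- no schema needed to know this

# XML: the same data parses into generic elements; whether <tag> is a
# repeating list or a single value is not knowable from the parse alone.
root = ET.fromstring('<item><name>hat</name><tag>red</tag><tag>wool</tag></item>')
tags = [t.text for t in root.findall("tag")]
print(tags)  # a list only because we *chose* to treat repeated tags that way
```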
XML may be acceptable for documents. But seeing as how XHTML was a complete dud, I doubt it is useful even for that.
And we didn't even need to get into the needless complexity of validation, namespaces, and other junk.
> And we didn't even need to get into the needless complexity of validation, namespaces, and other junk.
So, that’s why we’re adding all of this “junk” back into JSON? Transformers, XPath for JSON, validation, schemas, namespaces (JSON-LD, JSON prefixes): it’s all there.
History repeating itself (and here’s the important part) because this complexity is needed. Not every application will need every complication, but every complication is needed by some application.
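For a concrete case of the "history repeating" claim: JSON Schema reintroduces exactly the kind of validation XML Schema provided. A hypothetical fragment (field names invented for illustration):

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["id", "name"],
  "properties": {
    "id":   { "type": "integer" },
    "name": { "type": "string" }
  }
}
```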
No junk has been added into JSON - the specification hasn't changed to accommodate those features.
Unless you need to use the feature, you don't need to know anything about it, which is a huge benefit for the majority. XML almost encourages programmers to use unnecessary features.
When an application domain chooses to add a feature (say JSON-LD) then there are advantages to that mixture over XML. Where XML is better, it is often chosen instead.
That depends entirely on which parser you’re using. People have wanted comments so badly there are parsing libraries (and proposed revisions to JSON) that include comments. And sometimes those comments are used to provide processing directives.
> Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.
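A toy version of that pipe-through-JSMin idea in Python (the real JSMin also handles `/* */` comments; this naive sketch breaks if `//` appears inside a string value, so treat it as an illustration only):

```python
import json

def strip_line_comments(text: str) -> str:
    """Drop // comments before parsing. Naive: does NOT cope with
    '//' occurring inside string values (e.g. URLs)."""
    kept = []
    for line in text.splitlines():
        idx = line.find("//")
        kept.append(line if idx == -1 else line[:idx])
    return "\n".join(kept)

config_text = """
{
  // port the dev server listens on
  "port": 8080
}
"""
config = json.loads(strip_line_comments(config_text))
print(config["port"])  # 8080
```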
> XML cannot be parsed into nested maps/dictionaries/lists/arrays without guidance from a type or a restricted xml structure.
? Using XML without a schema is slightly worse than JSON because the content of each node is just "text". XML with schema is far more powerful, also because of a richer type-system. JSON dictionaries are most of the time used to encode structs, but for that you have `complexType` and `sequence` in the XML schema.
I've been using XML with strongly-typed schemas for serialization for the last couple of years and couldn't be happier. I have ~100 classes in the schema, yet I've needed a true dictionary like 2 or 3 times.
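For readers who haven't used XSD: the `complexType`/`sequence` construct mentioned above looks roughly like this hypothetical fragment, which plays the role a struct definition would in code:

```xml
<!-- illustrative schema fragment: a struct-like type via complexType/sequence -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Person">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="born" type="xs:date"/>
      <xs:element name="nickname" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```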
> And we didn't even need to get into the needless complexity of validation, namespaces, and other junk.
Validation is junk? Isn't it valuable to know that 1) if your schema requires a certain element, and 2) if the document has passed validation, then navigating to that element and parsing it according to its schema type won't throw a run-time exception?
Namespaces are junk? They serve the same purpose as in programming languages. How else would you put two elements of the same name but of different semantics (coming from different sources) into the same document? You can fake this in JSON "by convention", but in XML it's standardized.
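A small Python sketch of the same-name/different-semantics case (the `urn:example:*` namespace URIs and element names are invented): two elements both called `id`, kept unambiguous by namespace, which `xml.etree.ElementTree` exposes via Clark notation (`{uri}tag`).

```python
import xml.etree.ElementTree as ET

# Two elements both named "id", disambiguated by namespace --
# something JSON can only fake with key-naming conventions.
doc = """
<order xmlns:crm="urn:example:crm" xmlns:wh="urn:example:warehouse">
  <crm:id>C-17</crm:id>
  <wh:id>W-42</wh:id>
</order>
"""
root = ET.fromstring(doc)
customer_id = root.find("{urn:example:crm}id").text
warehouse_id = root.find("{urn:example:warehouse}id").text
print(customer_id, warehouse_id)  # C-17 W-42
```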
XML is a perfectly serviceable data exchange format. The parsers and serializers work great when used properly. It's nice to have schema.
But I think people just got sick of XML because it was abused so badly with "web services", SOAP, WSDL and all those horrible technologies from the early aughts. Over-complicated balls of mud that made people miserable.
Apple's plist format might be the weirdest abuse of XML as far as I can tell. The SOAP envelopes and shit like that were horrible but plist is plain weird.
Everyone abused XML some way or another. JSON is not that "abusable" I'd say.
XML is a beast to parse. It's slow to parse and verbose, yet it still doesn't give you a human-friendly text format. It's got a number of weird features inherited from SGML. Every parser needs a quirks mode, since nobody can write good schemas and schema parsers.
XML is a really bad interchange format. It's OK for a document markup language, and that's where it survives.
When I was doing XML/Java stuff 10 years ago, you'd take your XSD and generate domain classes as a build step. It was more complicated, but it was also 100% reliable because the tools were all rock solid. Written by the guy who made Jenkins.
Many languages have libraries built in that do something reasonable with JSON. Usually you just make a class or struct, instantiate it, and then generate JSON, no need to have a separate compile step. When going the other direction, I usually just format the JSON, copy that into my code, then fix the compile errors.
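The no-compile-step workflow described above, sketched in Python (class name and fields invented for illustration):

```python
import json
from dataclasses import dataclass, asdict

# No code-generation step: define the class, serialize, parse, done.
@dataclass
class ServerConfig:
    host: str
    port: int

cfg = ServerConfig(host="localhost", port=8080)
wire = json.dumps(asdict(cfg))           # '{"host": "localhost", "port": 8080}'
back = ServerConfig(**json.loads(wire))  # straight back into the class
print(back)
```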
XML has all that tooling because it needs it. JSON is a lot more straightforward, is more compact, and is faster to parse and (probably) generate.
If you're going to go through the effort of a compile step, you should probably just use a binary protocol, which will get you even better performance and give you documentation out of the box (e.g. protocol buffers schemas are very readable).
I see absolutely no reason to use XML these days as a data format, but it's still a reasonable choice as a markup format (you know, what the M stands for).
> Many languages have libraries built in that do something reasonable with JSON.
What about cross-language? In C# I define a class containing a `DateTime` field, export the schema with xsd, and generate classes for Java with xjc, and get back a field of (an equivalent of) `DateTime` type. Doing what you suggest with JSON, I'd get a "string". Thanks but no thanks.
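The "you'd get a string" complaint, illustrated in Python (field name and timestamp invented): plain JSON has no date type, so the timestamp survives the round trip only as text, and each consumer must re-parse it by out-of-band convention.

```python
import json
from datetime import datetime

# Plain JSON has no date type: the value arrives as a string.
payload = json.loads('{"created": "2024-05-01T12:30:00"}')
print(type(payload["created"]))  # str, not a datetime

# Re-parsing is manual and by convention -- nothing in the document
# says this field is a timestamp.
created = datetime.fromisoformat(payload["created"])
print(created.year)  # 2024
```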
> If you're going to go through the effort of a compile step, you should probably just use a binary protocol, […] I see absolutely no reason to use XML these days as a data format,
In our product we use a relational db (SQLServer) combined with XML. Each table has a structured part which is put into relational columns, plus an extensions part that is put into a "Data" XML column for semi-structured data. SQLServer supports XQuery so we can query the semi-structured data from SQL when needed.
This wouldn't fly with a binary format.
EDIT: yes, SQLServer also supports JSON, but has special optimizations for XML (e.g., it can understand schema types, it supports XML indexes which "shred" XML to a more efficient binary representation based on schema, etc.)
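To make the hybrid query concrete, a sketch using SQL Server's real XML methods `value()` and `exist()`; the `Products` table and `/Ext/Color` path are hypothetical:

```sql
-- hypothetical table: relational columns plus a semi-structured "Data" XML column
SELECT Id,
       Name,
       Data.value('(/Ext/Color)[1]', 'nvarchar(50)') AS Color
FROM   Products
WHERE  Data.exist('/Ext/Color') = 1;
```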