STOMP – The Simple Text Oriented Messaging Protocol

viraptor · on April 10, 2016

I'm a bit confused why people want things like this. I mean, a simple protocol is great. But why text? Text protocol for arbitrary data is always going to make things more complicated than necessary. And even "text" STOMP includes "NULL / octet 0" in the description.

Here are some things that are complicated in text: escaping data, escaping line endings, type of line endings, escaping headers, encoding and string comparison, data delimitation, preallocating right buffer sizes, parsing long values, parsing numeric values, potential for confusion on what's opaque and what's parsable.

All of these are solved by even the most trivial binary TLV:

    4-byte tag, 4-byte length, length-bytes data

It's not going to be telnet-compatible, but it prevents so many issues from ever arising. If you need to debug it, it's trivial to even look at the hex dump of packets.
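A minimal Python sketch of the TLV layout described above. The comment does not specify byte order or integer signedness, so big-endian unsigned integers are an assumption here, and `encode_tlv` / `decode_tlvs` are hypothetical names used only for illustration.

```python
import struct

# Assumed layout: 4-byte big-endian tag, 4-byte big-endian length, then the payload.
HEADER = struct.Struct(">II")


def encode_tlv(tag: int, data: bytes) -> bytes:
    """Prefix the payload with its tag and its length in bytes."""
    return HEADER.pack(tag, len(data)) + data


def decode_tlvs(buf: bytes):
    """Yield (tag, data) pairs from a buffer of concatenated records."""
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < HEADER.size:
            raise ValueError("truncated header")
        tag, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        if len(buf) - offset < length:
            raise ValueError("truncated payload")
        yield tag, buf[offset:offset + length]
        offset += length


# Payloads may contain NUL bytes and line endings without any escaping.
packet = encode_tlv(1, b"foo bar") + encode_tlv(2, b"\x00\n\r\n")
records = list(decode_tlvs(packet))
```

Because the length is known before the payload is read, the receiver never scans for a delimiter and can size its buffer up front, which is the point the comment is making.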

/rant (I don't understand the obsession with text formats)

brianm · on April 10, 2016

I'm the author of the original Stomp protocol. Much of what changed in 1.1 and 1.2 came from people smarter than me, so blame me for the ugly and them for the good.

Why text? Because I wanted to be able to debug it with netcat, implement minimal but acceptable clients by inspection of the wire, etc. More importantly, I wanted anyone else to be able to do the same.

At the time, AMQP was just starting to get steam, and it was a gross, exceptionally difficult to implement, binary protocol. (AMQP has evolved a huge amount from those early forms.) In looking for something better, the ease of using SMTP and HTTP and the fact that I often did do them directly in netcat to debug or understand behavior, decided the direction.

Sure, SMTP, HTTP, etc., style protocols can be wonky at times, but the number of times I have in fact done them by hand is very high. Avoiding the need to break out Wireshark, and write a message packer on the fly as Wireshark only supports reading what comes across, makes using it, when you don't have a client yet, great. I've had people implement clients, from learning about the protocol to using the working client, during a half hour talk about Stomp. They can do this because it bootstraps on knowledge of HTTP, and text is easy on humans (if more involved for computers).
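For readers who have not seen the wire format, here is a rough Python sketch of the kind of frame one could just as well type into netcat by hand. It assumes the STOMP 1.2 layout (command line, `name:value` header lines, a blank line, the body, then a NUL octet) and the 1.2 header escapes; `build_frame` and `escape` are hypothetical helper names, not part of any existing client.

```python
def escape(value: str) -> str:
    """Escape a header name or value with the STOMP 1.2 escapes (backslash first)."""
    return (value.replace("\\", "\\\\")
                 .replace("\r", "\\r")
                 .replace("\n", "\\n")
                 .replace(":", "\\c"))


def build_frame(command: str, headers: dict, body: bytes = b"") -> bytes:
    """Serialize one frame: command, headers, blank line, body, NUL terminator.

    Assumption: CONNECT/CONNECTED frames, which the 1.2 spec exempts from
    escaping, are not handled specially in this sketch.
    """
    lines = [command]
    lines += [f"{escape(name)}:{escape(value)}" for name, value in headers.items()]
    head = "\n".join(lines) + "\n\n"
    return head.encode("utf-8") + body + b"\x00"


frame = build_frame(
    "SEND",
    {"destination": "/queue/a", "content-type": "text/plain"},
    b"hello queue a",
)
# Show the frame as it would appear in a terminal, with ^@ standing in for NUL.
print(frame.decode("utf-8").replace("\x00", "^@"))
```

Everything before the NUL is readable text, which is why the frame can be written by hand; the escaping step is also a small instance of the cost the parent comment complains about.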

This ease of examination and manipulation is the same reason that JSON is (approximately) infinitely more prevalent than protobuf, thrift, avro, messagepack, or even smile (which is the exact json semantics, but way more compact and fast to parse). The desperate perl hacker of XML fame is a very real thing, don't ignore the implementation and debug affordances provided by text-based protocols.

_wmd · on April 10, 2016

Hey Brian,

Just wanted to say thanks for Stomp, protocols that are simple enough to be easily reimplemented are definitely underrated.

The Python story for Stomp is not so great, especially working with Twisted (e.g. there I think stompest is the best option available). On profiling I discovered this module's parser was horrendous, contributing something like 30% to my runtime.

Had it been some binary protocol with a complex spec, I might have spent days reimplementing it, but producing a simple and fast parser took only an hour or two, and now my problem is solved.

For others complaining about Stomp, my reimplementation is 116 lines of perfectly testable code, and for a problem domain as simple as wrapping messages up in a few verbs, this is how it should be.

(Will be releasing tinystomp.py at some point, but it needs a little tidying up and documenting first!)

viraptor · on April 10, 2016

> but the number of times I have in fact done them by hand is very high

How many times did you do them correctly and according to spec? And how many times did you just rely on servers accepting things that aren't terribly incorrect? Maybe you actually followed RFCs, I don't know... But from my experience with looking at short code that pretends to do HTTP/SMTP, they often work by accident and a new encoding or some special characters thrown into the message break them pretty bad :(

I spent a few years working with SIP, which way too many companies treat as "hey, it looks like HTTP, and is simple!" and follow with their own interpretation of how things should work. It's awful.

weberc2 · on April 10, 2016

Would you consider putting this rationale on the Stomp webpage? I spent quite a while trying to understand what problem this solved before finding your comment.

mike_hock · on April 10, 2016

No, you cannot implement a conforming client or server of any protocol by inspecting the wire.

How do you know there aren't other cases that you're supposed to handle that simply didn't happen to occur in the sessions that you sniffed?

HTTP is the prime example of why being a text protocol doesn't make it simpler.

wtbob · on April 10, 2016

> If you need to debug it, it's trivial to even look at the hex dump of packets.

The problem, of course, is that it's not trivial to generate those packets in an interactive fashion. That's what's great about protocols such as SMTP & HTTP: you can learn them just by telnetting to a port.

Of course, there are advantages to TLV formats, which is probably a reason why Ron Rivest invented canonical S-expressions, which enable both a readable symbolic form (e.g. (foo bar baz)) and a fast-parsing canonical form (e.g. (3:foo3:bar3:baz)), as well as forms for hex (e.g. (#666f6f# #626172# #62617a#)) and base64 (e.g. (|Zm9v| |YmFy| |YmF6|)) transmission. The intended use case was to transmit tagged data like public keys:

    (public-key
     (rsa-pkcs1-md5
      (e #03#)
      (n
       |ANHCG85jXFGmicr3MGPj53FYYSY1aWAue6PKnpFErHhKMJa4HrK4WSKTO
       YTTlapRznnELD2D7lWd3Q8PD0lyi1NJpNzMkxQVHrrAnIQoczeOZuiz/yY
       VDzJ1DdiImixyb/Jyme3D0UiUXhd6VGAz0x0cgrKefKnmjy410Kro3uW1|)))

which could be transmitted in an efficient format (base64-decode the following to see what it'd look like):

    KDEwOnB1YmxpYy1rZXkoMTM6cnNhLXBrY3MxLW1kNSgxOmUxOgMpKDE6bjE
    yOToA0cIbzmNcUaaJyvcwY+PncVhhJjVpYC57o8qekUSseEowlrgesrhZIpM
    5hNOVqlHOecQsPYPuVZ3dDw8PSXKLU0mk3MyTFBUeusCchChzN45m6LP/JhU
    PMnUN2IiaLHJv8nKZ7cPRSJReF3pUYDPTHRyCsp58qeaPLjXQquje5bUpKSk=

Of course, since it was eminently readable and part of an eminently well-thought-out standard, the thing completely died on the vine, and now we're stuck with JSON and XML and YAML and why do I even bother getting up in the morning computing is so backwards it's not even funny.

http://people.csail.mit.edu/rivest/Sexp.txt

danbruc · on April 10, 2016

Seems pretty obvious to me, why that failed - the linked specification describes how to represent a tree of pairs of byte strings. What do I do with that? That's not much more helpful than saying that I can encode my information as a sequence of zeros and ones.

It is redundant and vague. Why six ways to represent a byte string? Or even more if you count things like quoted strings and length prefixed quoted strings separately. It has the feel of just do whatever you want. As some kind of type information one can associate display hints with each byte string. What are valid display hints? Arbitrary byte strings, of course. »Many of the MIME types work here.«

Where would you even start to build a useful library around this specification? It is so general that it is useless.

wtbob · on April 11, 2016

> Seems pretty obvious to me, why that failed - the linked specification describes how to represent a tree of pairs of byte strings. What do I do with that?

Whatever you want to. Once you have lists, trees and byte strings, you have everything you need (note that JSON is this, with some sugar for alists).

> Why six ways to represent a byte string?

Because different approaches make sense in different contexts: for a single word the word itself is completely fine; for an English phrase "this is a small phrase" works; for certain constants hex works, e.g. #DEADBEEF#; for large binary strings then Base64 encoding works, since it's opaque anyway. And all of that is really for display: for transmission (and cryptographic hashing) the length-prefixed canonical form works perfectly, but is still visually-inspectable in time of need (unlike, say, a pure-binary format).
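As a small illustration of those alternatives, this Python sketch renders one byte string in three of the forms mentioned: length-prefixed, hex, and Base64. The function name `display_forms` is hypothetical, and the sketch covers only these three forms, not the token or quoted-string forms.

```python
import base64


def display_forms(data: bytes) -> dict:
    """Return the same byte string in three S-expression atom notations."""
    return {
        # Canonical (verbatim) form: decimal length, colon, raw bytes.
        "verbatim": str(len(data)).encode("ascii") + b":" + data,
        # Hex form: hex digits between '#' marks.
        "hex": b"#" + data.hex().encode("ascii") + b"#",
        # Base64 form: Base64 text between '|' marks.
        "base64": b"|" + base64.b64encode(data) + b"|",
    }


forms = display_forms(b"foo")
```

All three values denote the same three bytes; only the notation differs, which is why the choice can be left to whatever reads best in context.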

> It has the feel of just do whatever you want.

It's a tool for one's toolbox, not a finished product: you use it to build whatever you want.

> Where would you even start to build a useful library around this specification? It is so general that it is useless.

How does one build a useful library around JSON? Canonical S-expressions live at the same level of abstraction (but are efficient and visually appealing, unlike JSON).

Are you aware that S-expressions are how Lisp code is represented? This takes that same universal code format and turns it into a universal data format. How would you build a library around it? How do Lisp implementations build a useful library around S-expressions? Simple: they offer functions like READ to read in structured data, and functions like EVAL to evaluate it as code. Likewise, the user of canonical S-expressions would have some library to read in the expressions, and some other library (or his own code) to evaluate that structured data as code in whatever format he's using.

I point you to the SPKI RFCs as an example.

In the context of the parent to my post, one can use S-expressions as an alternative to tag-length-value. Instead of 0x01666f6f20626172, one might have (name "foo bar") as something a human being can read & write, which encodes down to (4:name7:foo bar), which is fast to parse, reasonably efficient for transmission, and can be read by a human in duress. And if for some reason one wanted a copy-pastable version (maybe for end users to paste auth tokens, or to survive emails, or something) then one could use {KDQ6bmFtZTc6Zm9vIGJhcik} (that's just Base64-encoded).
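To make the encoding step concrete, here is a minimal Python sketch of a canonical-form encoder and decoder. It models an S-expression as nested Python lists of `bytes`, ignores display hints, and uses the hypothetical names `encode` and `decode`; it is an illustration under those assumptions, not a full implementation of Rivest's draft.

```python
import base64


def encode(node) -> bytes:
    """Encode bytes as <length>:<bytes> and a list as its items in parentheses."""
    if isinstance(node, bytes):
        return str(len(node)).encode("ascii") + b":" + node
    return b"(" + b"".join(encode(child) for child in node) + b")"


def decode(buf: bytes, pos: int = 0):
    """Parse one node starting at pos; return (node, position after it)."""
    if buf[pos:pos + 1] == b"(":
        pos += 1
        items = []
        while buf[pos:pos + 1] != b")":
            if pos >= len(buf):
                raise ValueError("unterminated list")
            item, pos = decode(buf, pos)
            items.append(item)
        return items, pos + 1
    colon = buf.index(b":", pos)       # raises ValueError if no length prefix
    length = int(buf[pos:colon])
    start = colon + 1
    if start + length > len(buf):
        raise ValueError("truncated atom")
    return buf[start:start + length], start + length


wire = encode([b"name", b"foo bar"])
parsed, end = decode(wire)
# Copy-pastable form: Base64 of the canonical bytes, wrapped in braces.
# b64encode adds '=' padding, which the example in the comment leaves off.
transport = b"{" + base64.b64encode(wire) + b"}"
```

The decoder never inspects the payload bytes of an atom: it reads the length, then slices, which is the same property that makes a binary TLV quick to parse.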

Or you could use JSON: {"name": "foo bar"}. Which is larger, more error-prone to type, uglier, harder to parse, and still needs to be evaluated in its context anyway (What if name isn't the only required attribute? What if it must be in HTML format? &c. &c. &c.).

danbruc · on April 11, 2016

I don't think that canonical S-expressions and JSON are really on the same level. JSON enforces much stricter structure or constraints - key value pairs with string keys and a small set of possible value types, numbers, strings, booleans, arrays, you name it. You have to build all of this on top of canonical S-expressions, you get two byte sequences, the display hint and the actual value, and then it's your task to interpret the raw byte sequences. It's also your task to figure out whether a list of pairs represents a map or a list of items with two values. It's definitely a powerful format but in order to be able to use it, you have to first build a lot of additional functionality on top of it. And by the way, this is not a mindless defense of JSON, I prefer to stay away from all things JavaScript as far as I can.

IgorPartola · on April 10, 2016

When it says that it is difficult to implement the server correctly, I think it's an immediate red flag. We went through this with HTTP (nginx retrying non-idempotent requests, anyone?) and SMTP. I think the lesson learned is that any future protocols should require really good specs and every implementation should follow them.

vidarh · on April 10, 2016

It's not saying it is difficult to implement the server correctly. It is saying "the server side may be hard to implement well". There's a subtle difference there. The reason is not protocol complexity or a poor spec. The protocol is simple and well documented. Rather it is an admission that implementing a queueing server can be hard, depending on the semantics you choose to implement, and certainly is harder than the client code.

The protocol is perhaps not the most efficient. Its primary benefit is the simplicity and level of support in various environments. Having Stomp support for a message broker is a great way of adding low-effort interoperability.

A Stomp based message broker was my first serious Ruby project a decade or so ago, and getting the server "right" for our use case took ~500 lines of code (using sqlite for queues marked persistent), which is incidentally about the same size as the spec.

alphapapa · on April 10, 2016

The web site seems to lack a description of STOMP's purpose, or examples of how it would be used.

kgwxd · on April 10, 2016

The "1.2" in the nav bar is a separate link, I missed it at first too: http://stomp.github.io/stomp-specification-1.2.html

vidarh · on April 10, 2016

It's a protocol for message passing, such as for a message broker / queue. It's simple enough that lots of message brokers implement it either as their primary protocol or to offer interoperability.

KaiserPro · on April 10, 2016

I don't understand why you'd want to replicate HTTP style interactions for this kind of job.

The headers are massive, take up most of the message and are overly verbose.

I know people fear binary protocols, but they really aren't something to worry about if you make the correct tools (after all, text is binary; it's just that there are a great many tools to read it).

If you are serious and actually want to test your protocol/code, then you need to implement a parser that's usable (either in Wireshark or pipeable from netcat). You've done most of the work making the parser for the actual protocol; going the extra mile and wrapping it so it can take STDIN shouldn't be a terrible drag.

_wmd · on April 10, 2016

Parsing a few bytes of extra text pales in comparison to sending anything across the network. For example, in the time it takes to call an uncontended futex() on an Intel Xeon, libxml2 can parse about a kilobyte of XML.

takno · on April 10, 2016

I thought maybe there was a new version or something. How has the spec for a perfectly common technology which has been stable for several years got this many upvotes?

jnordwick · on April 10, 2016

Why wouldn't you try to make this as easy to parse as possible, such as by prepending a size to the message at least, and possibly to other fields internally too? Make the size a fixed width and it makes reading easier.

I deal with crappy protocols on a daily basis in the form of FIX and other exchanges. I hope to never have to deal with this one.

chvid · on April 10, 2016

Am I the only one who is confused by this protocol: what problem does it solve and what exactly does it do?

em3rgent0rdr · on April 10, 2016

2012 is latest release. Need to put (2012) in title, I think.

xaduha · on April 10, 2016

It's a spec. The fact that it doesn't change every month is a plus. It doesn't mean that a site or implementations weren't updated since 2012.