The unintentional side-effects of a bad concurrency model (joearms.github.io)
91 points by davidw on Jan 26, 2016 | 24 comments



After ten years in the automotive industry I can confirm that some observations from the article also hold there: if you look at the system architecture (parts, components, connections) of a car, you can very reliably identify the company's organizational structure. You would also see that the architecture is optimized for solving the problem with the least amount of effort and redundancy. Nevertheless, working in such a gigantic environment I also concluded that this still makes sense. You simply need a well defined boundary up to which you and your team are responsible, and a well defined interface against which you are developing. In the electrical engineering world these are often interfaces described in the form of messages defined on classical bus systems (e.g. CAN), which works quite decently. I worked a lot on the introduction of IP-networking-based interfaces, which (besides allowing other network layers and higher speeds) can also be used to describe interfaces between different software parts that might still be deployed on the same hardware. This yields the advantages that Joe Armstrong describes here: different parties can work very independently, using different technologies, and you can compose the results into a complete system.

However, I can't say that I would use the approach for everything, because it comes with a certain overhead - in performance as well as in the development process. E.g. to have a large number of developers (could easily be > 100) share the same understanding of an interface and develop against it, you really need a perfect static definition of the interface, for at least some parts a dynamic description (state and sequence charts), and test tools that work against both sides. The interface definition must be maintained by someone (probably neither the group that implements the server side nor the one that implements the client side), and changes will be much, much slower than a change somewhere in a monolithic codebase (it can take months from a change to an interface description until all sides have adopted the new version). I would also heavily recommend the use of some interface definition language (anything from custom to protobuf, thrift, swagger, etc.) and the use of a framework and code generation tools on both the client and server side. Manually processing JSON, or even worse writing bytes manually to a socket, won't work at scale.


I don't think it's really ideal for client-side and server-side work to be separated (this is more a talent constraint). The most efficient organizations I have worked for are the ones that split themselves up into feature teams instead of strict front-end and backend teams. Otherwise, you end up with people waiting around for each other to make changes, which forces them to context-switch between different tasks (and everyone loses focus). Not to mention that the backend devs often don't get the interface 100% perfect the first time, and you end up having to iterate over those changes - which means more waiting around and more context switching (unless the required changes are minor, in which case they can pair together).

With feature teams, you give people ownership over a feature of the software and they are responsible for bringing it to completion. The great advantage of fullstack developers is that a single developer can work on a feature end-to-end - it gets rid of communication problems, greatly speeds up development, and reduces the backend-frontend "blame game" factor that you often see in large companies.

I like the way the article portrays people as concurrent processes, but I think there is a fundamental difference between people and processes that affects things: people don't have perfect memory. When someone offloads a task to a different person, starts working on new tasks asynchronously, and then comes back to the previous task, they may not be able to resurrect all the thoughts they had the last time they looked at the same code (not at the same level of depth, anyway). So they lose a bit of time getting back into the mindset.


Some backends are fairly sophisticated and require extensive domain knowledge: designing databases that can be scaled, caching systems, security issues, and so on.

On the client side you may need someone who is an expert in C#, in 3D rendering and geometry, and in audio. Finding someone great at both is going to be very hard, and frankly not worth the benefits unless you have a very small team.

That said, I believe front and backend should work closely together. There is no reason that the frontend should be blocked, because the API design can be figured out before the backend is written and the frontend can mock out the missing features until they are available.


"That said I believe front and backend should work closely together. There is no reason that the frontend should be blocked because the API design can be figured out before the backend is written and the frontend can mock out the missing features until they are available."

Came here to say exactly this. At my current workplace, one of the very first steps in any project that involves both FE and BE work is for FE and BE engineers to get together and design the API. Once that's done, work can proceed on both sides in a largely asynchronous and independent manner. And one of the first things to be implemented on the BE is API endpoints with multiple mocks that can be requested by the FE, so the FE can start integrating with the endpoints as early as possible.


I don't think you need to find people who are great at both, but you may want to find teams that are great at both.

The impression I took from the comment you replied to is that there should be no front-end and back-end team, but instead 'cross stack' feature-oriented teams, including members who have knowledge of all parts of the stack and can work on all parts of a feature together.


> Large Erlang applications have a flat “bus like” structure. They are structured as independent parallel applications hanging off a common communication bus. This leads to architectures that are easy to understand and debug and collaborations which are easy to program.

How is it easier to debug a message-passing architecture than a monolith? Observe that the JS world is slowly moving away from using Pub/Sub for connecting components together. Each individual piece is easy to understand, but the chain of causality is very hard to determine with all those messages flying around.


Message passing vs. monoliths is an orthogonal distinction. Being a monolith has to do with deployment and life-cycle boundaries. Message passing has to do with communicating state transitions between multiple logical components, possibly running at the same time (regardless of whether they are in a monolith or not).

If you want to compare something with message-passing architectures, a better comparison would be to a shared register. Both are abstractions for communicating a shared piece of state. The argument that debugging a message-passing abstraction is easier hinges on the idea that the message-passing coordination system decouples the sender's and receiver's state transitions from each other, whereas in a shared register the state can be viewed as the state of both. In systems where there are more than a couple of pieces of shared state, it becomes untenable to put every component that can share state into the precise stateful configuration that you need to debug a particular case.

It turns out, in practice, that it is generally easier to break up monolithic applications that are message based than ones that use shared registers, but that is a side benefit of the abstraction, not the real difference between the two.

I'm a huge advocate of log-based architectures, which take message-passing architectures to another level. That is, message-passing architectures are good at limiting the things that can happen; log-based architectures do that and add the benefit of capturing what did happen.


> I'm a huge advocate of log-based architectures, which take message-passing architectures to another level. That is, message-passing architectures are good at limiting the things that can happen; log-based architectures do that and add the benefit of capturing what did happen.

AKA event-based programming. Events are what has happened. Components define which events to respond to, and how.

In practice you need a combination of events and messages (or 'commands'). Which one to use depends on non-functional requirements (NFRs) like dependency, complexity, performance, maintainability, etc.


I don't know if there is a formal taxonomy of these things, but I tend to break down message-passing architectures along the what/how axes.

So you can have events or commands being sent in your message-based architecture as the 'what'.

Log-based architectures really speak to the 'how'. That is, the medium for message exchange is a durable log.

But like you said, in the real world they tend to be hybrids across all these lines.
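As a rough sketch of the "durable log as the medium" idea (the module name, log name, and file path below are mine, purely for illustration, using Erlang's disk_log): producers append events instead of sending them point-to-point, and consumers replay the log later to see what did happen.

  -module(event_log).
  -export([open/0, append/2, replay/1]).

  %% Open (or create) a durable log on disk.
  open() ->
      {ok, Log} = disk_log:open([{name, events}, {file, "events.LOG"}]),
      Log.

  %% Producers append terms (events) rather than sending them directly.
  append(Log, Event) ->
      ok = disk_log:log(Log, Event).

  %% Consumers replay the whole log, in order, to reconstruct history.
  replay(Log) ->
      replay(Log, start, []).

  replay(Log, Cont, Acc) ->
      case disk_log:chunk(Log, Cont) of
          eof            -> lists:reverse(Acc);
          {Cont2, Terms} -> replay(Log, Cont2, lists:reverse(Terms) ++ Acc)
      end.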


> How is it easier to debug a message-passing architecture than a monolith?

You can observe the messages as they're passed, which is usually enough. In my experience in Erlang, a lot of messages include a pid to reply to, so if you get an unexpected message, you can often see where it came from, and then look at the messages that came into that process, etc (or grep the code on likely identifiers). Erlang makes this pretty easy, you can also peek at the mailbox for a process to see what other messages it hasn't read yet, etc. This model works really well for me, because I've been using it for years to debug networking applications with tcpdump and friends: figure out what's gone wrong over the wire, then you can figure out where it's gone wrong in the code.

Added: I guess that doesn't really address emergent behavior. Most of the emergent behavior I've seen is really around overload conditions; most of the debugging around that is focused on finding the bottleneck(s) (which is usually a problem in a single process -- you can tell which ones because they have a huge mailbox), why there isn't enough back pressure (it's usually because the senders aren't waiting for confirmation, or aren't waiting long enough, especially in combination with retries), and how to shed load instead of falling over (which is tricky).
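A minimal sketch of that pattern (the module and function names are mine): replies carry the sender's pid so an unexpected message can be traced back, and process_info lets you peek at another process's mailbox.

  -module(tagged_server).
  -export([start/0]).

  start() ->
      spawn(fun loop/0).

  loop() ->
      receive
          {From, Request} when is_pid(From) ->
              From ! {self(), {ok, Request}},   %% reply includes our own pid
              loop();
          Other ->
              io:format("unexpected message: ~p~n", [Other]),
              loop()
      end.

  %% Peeking at a process's pending messages and queue length from the shell:
  %% erlang:process_info(Pid, [messages, message_queue_len]).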


Immutability is part of what leads to it being easy to program, as well. An actor in a given state, receiving a given kind of message, will always behave the same way (barring some abuse of ETS tables or the process dictionary or other bad practices). Meaning that to reason about the code I usually don't need to -care- about the chain of causality. I can break each part of my complex interaction into a series of "message received -> actor does something -> message sent", and trace through the code that way, turning what would be a big complex hairy problem of mutable state and race conditions in many other languages into a fairly innocuous, straightforward series of small, simple interactions.
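A tiny sketch of that "message received -> actor does something -> message sent" shape (the module name is mine): the actor's response depends only on the incoming message and the state it recurses with, so each step can be read in isolation.

  -module(counter).
  -export([start/0]).

  start() -> spawn(fun() -> loop(0) end).

  %% All "mutation" is just recursing with a new argument, so a given
  %% message in a given state always produces the same behavior.
  loop(Count) ->
      receive
          {From, increment} ->
              From ! {self(), ok},
              loop(Count + 1);
          {From, current} ->
              From ! {self(), Count},
              loop(Count)
      end.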


Yes, as I said, each individual piece (actor) is easy to understand, locally.

But you cannot in general understand the global behavior of the system that way. One option might be to run an integration test, for example, but understanding why a certain actor received a given message will be difficult.


And what I was saying is that in an immutable system, that's -usually enough-. It's not that difficult, because the reasons for messages being sent are predictable.

You get a lot of emergent behavior in systems with mutable state that is avoided in systems that enforce immutable state, and without that it becomes far more tractable to trace back a series of events and reason about them. You got a message? Okay, who can send that message? If it's more than one, is there a reason why you aren't indicating -who- sent that message (in Erlang, including a pid as part of the message; this is done automatically any time you use a gen_server and send a synchronous message via call)? But regardless, what does that message mean? Is it the same across all senders? Because if so, your bug is in the implementation in this actor. If not, if the message means different things when sent from different places, you need to figure out what the different messages you want to send are, and send different messages for them.

If you have mutable code, even mutable message passing, the time the message arrived changes what it does. Suddenly you have situations where the message being sent, and the behavior the actor exhibits, is dependent on mutable state. It's totally correct in some instances, and totally incorrect in others; completely dependent on what has happened in the system previously as to whether ~this~ particular message, with the coded behavior, does the right thing or not. And it's that, the interplay of all events that came before, that makes things complex.
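To illustrate the gen_server point above (hypothetical module name): in the synchronous call path, the From argument of handle_call is a {Pid, Tag} tuple filled in by the runtime, so the server always knows who asked without the caller tagging the message by hand.

  -module(who_asked).
  -behaviour(gen_server).
  -export([start_link/0, ask/2]).
  -export([init/1, handle_call/3, handle_cast/2]).

  start_link() -> gen_server:start_link(?MODULE, [], []).

  %% gen_server:call/2 blocks until the server replies; the runtime
  %% attaches the caller's identity to the request automatically.
  ask(Server, Question) -> gen_server:call(Server, Question).

  init([]) -> {ok, undefined}.

  handle_call(Question, {CallerPid, _Tag}, State) ->
      io:format("~p asked: ~p~n", [CallerPid, Question]),
      {reply, {ok, Question}, State}.

  handle_cast(_Msg, State) -> {noreply, State}.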


Yeah, I can see how adhering to nice principles leads to more predictable behavior in Erlang. And clearly people do it, which seems to back up the claim.


I would say that the JS world is moving towards pub/sub, but I understand what you mean about causality.

I think the change that's happening is that devs are moving towards using a central shared store for all data (one source of truth) instead of making each component load/save data independently from each other.


I'm not sure that's a trend so much as a peculiarity of Redux/Baobab.

Without built-in immutable data structures, I think this kind of paradigm will always be harder to work with in JS than smaller, distributed, mutable models.


Or, you know...

You could just use immutable.js.

It's not like OOP languages, where the built-in data structures are built up from many layers of inheritance that all have to be made immutable, and where access is blocked at the language level (e.g. private/protected).

Immutability is not difficult in JS. It's just not immediately intuitive unless you understand the value of comparison by reference.

The == vs === distinction always seemed like a strange vestigial appendage of JS. With the recent trends, it's beginning to make a lot more sense.


It's also one of the big ideas of Om, which I think is where it started catching on.


The best projects I find are small libraries that mostly use other small libraries that mostly... Less a flat bus than a pyramid.

Within-language interfaces can be incredibly rich, conveying a huge amount of information. When you split into separate components you throw all that away, and end up spending more time parsing and marshalling than actually doing your business logic. The right data structure can make the problem look trivial, and is far easier to understand than the behaviour of a system of loosely coupled pieces.


Erlang doesn't require any programming effort to parse and marshal messages. You just send them and receive them.

  Pid ! {hello, world}

  receive
     {hello, world} ->
        io:format("Received hello.~n", [])
  end


There's a lot of machine effort, though, if the data structures are big. If you're doing computer graphics or vision and you have to send images or 3D assets around, it won't perform very well. I think Armstrong consistently underestimates the cost of messaging compared to shared memory - or rather, the fact that systems where this cost is prohibitive exist, just as much as systems where his approach works great.


A living organism - a whole made up of various subsystems built around semi-independent organs, which are made out of tissues, which in turn are made out of specialized cells, etc. - is the right model.

Cells are agents communicating with messages over different channels (slow blood vessels and fast neural pathways).

There is, at least according to the theory of evolution, no better way to structure a complex system.

And the cell machinery could be modeled as a pure-functional LISP (because code and data use exactly the same sequential structure). This, presumably, was the intuition behind MIT Scheme.

Pattern matching on messages in Erlang, along with the "become" universal server, is a beautiful way to express general agents and comes close to that uniformity, but code cannot be a message.
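For reference, Joe Armstrong's "become" universal server is just a process that waits to be told what function to turn into:

  universal_server() ->
      receive
          {become, F} ->
              F()    %% the process becomes whatever function it was sent
      end.

  %% e.g. Pid = spawn(fun universal_server/0),
  %%      Pid ! {become, fun() -> io:format("hello from my new self~n") end}.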

Procedures are enzymes, data structures are proteins, asynchronous message passing is the way to communicate, and there are protocols to follow.

One cannot possibly design better than that.


He seems to be describing what Alan Kay intended objects to be.


http://www.infoq.com/interviews/johnson-armstrong-oop

I swear I had another source, but yes. He has said that he views Erlang as OO in the Alan Kay sense. Which, honestly, was my first thought when I tried Erlang. Each process is, essentially, a concurrent closure, and closures (in the functional language world) are the poor man's objects [1].

[1] http://c2.com/cgi/wiki?ClosuresAndObjectsAreEquivalent



