The consistency problem is an open question in my mind. I definitely don't like the idea of having some data-synchronization tool to fix inconsistent data across services. I wonder what the best practice is for maintaining data consistency across services.
Ideally you don't have to sync the data because one service owns that data. Other services request that data via API. In a RESTful world, those API requests are cacheable.
But what about the situation where you have an entity service that owns the data for one piece of the domain, for example a People service, and then other services, like the Address service and the Billing service, reference a particular person. In that scenario, I can imagine the Address service and the Billing service would have a foreign key referencing a person in the People service. Then, what happens if the Person gets deleted? In that case, we've got a consistency problem, even though each service owned its data.
The People service can also store Addresses. Call it the Identity service. Include People, Businesses, Relationships, and Addresses.
The Billing service can then reference People or Businesses (and if Businesses, then sub-People), bill to an Address, etc.
No one's saying every object should be a service; you need to find the correct lines to divide across.
In our system (which has been service-oriented for five years), we don't do deletes. We do 'inactive' (UPDATE table SET ACTIVE=0…), but never deletes.
Especially in the case of your billing example, you never want to delete a person or address, because that's historical data you need to retain; we just keep everything. If it goes in the database, it's because we want to keep it forever.
You could have a service bus, where you publish a "PersonDeleted" message that the other services would subscribe to. It decouples the Person service from all the other related entity services.
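A minimal sketch of that idea, assuming an in-memory bus and hypothetical record stores (a real deployment would use something like RabbitMQ or Kafka, but the decoupling is the same):

```python
from collections import defaultdict

class ServiceBus:
    """Toy in-memory stand-in for a real message bus."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self.subscribers[topic]:
            handler(payload)

bus = ServiceBus()

# The Address and Billing services react to deletions without the
# People service knowing they exist.
address_records = {42: "1 Main St"}
billing_records = {42: "invoice-9"}

bus.subscribe("PersonDeleted", lambda pid: address_records.pop(pid, None))
bus.subscribe("PersonDeleted", lambda pid: billing_records.pop(pid, None))

# The People service only publishes; it holds no reference to subscribers.
bus.publish("PersonDeleted", 42)
```

The People service's only dependency is on the bus itself, which is the decoupling being described.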
You'd have to allow for propagation delay. Plus the possibility of a message storm if you delete something fairly fundamental.
> You could have a service bus, where you publish a "PersonDeleted" message that the other services would subscribe to. It decouples the Person service from all the other related entity services.
You're still screwed if you complete a transaction on the deleted person's still-existing account, now that your system is no longer transactional...
Your objection is hypothetical/abstract, and when you ground it in specific use cases, plenty of patterns emerge for dealing with the inconsistent state: for example, just-in-time/read reconciliation, batch remediation, actually making some subset of actions transactional/consistent and accepting lower availability there, and so on.
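To make the first of those concrete, here is a sketch of just-in-time (read-time) reconciliation, with hypothetical record stores and a stand-in for the cross-service lookup:

```python
# Owned by the People service (person 8 has already been deleted).
people = {7: {"name": "Ada"}}

# Owned by the Billing service; record 8 is now stale.
billing = {
    7: {"amount": 100, "orphaned": False},
    8: {"amount": 50, "orphaned": False},
}

def person_exists(person_id):
    # Stand-in for a REST call to the People service.
    return person_id in people

def read_billing(person_id):
    record = billing.get(person_id)
    if record and not person_exists(person_id):
        # Reconcile on read: flag (or clean up) the stale record
        # instead of relying on a synchronous cross-service delete.
        record["orphaned"] = True
    return record
```

The stale state is tolerated until someone actually reads it, at which point it is detected and repaired.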
I'm not saying there are no solutions, but saying "just add an event bus" is unlikely to be sufficient. Whatever you do, you're going to pay additional costs in terms of complexity.
Yeah, if you're an ACID person, this approach is going to present conceptual challenges. The propagation delay is a mostly-solved problem, which I know because lots of high-scale sites work. Getting a summary of their design decisions around this would be a huge time-saver, but I don't know of one.
So the problem you've identified is real. I used to have some bootleg footage of private Amazon tech talks where the speaker emphasized that in distributed systems it is generally a terrible idea to have transactions span entities.
I think you basically have to learn to live in an eventually consistent world. In the case of people being deleted I would imagine that the user service exposes a pub/sub interface where address and billing services subscribe to "delete" events.
Hardly need "private bootleg" footage to discover this reality. Pat Helland (at the time working at Amazon) wrote a paper about it maybe 10 years ago ("Life Beyond Distributed Transactions," CIDR 2007).
You don't HAVE to live in an eventually consistent world. If you use something like ZeroMQ or REST, you can "notify" other services of a "person deleted" event synchronously.
That has nothing to do with the fact that if your systems are distributed, you will have eventual consistency.
If System A needs to tell System B about an event in order for A and B to remain consistent, but B is down, you've got eventual consistency, because B can't become consistent with A until it's back up and has performed whatever recovery is necessary to process that event. Service discovery does nothing to solve that problem.
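That recovery step can be sketched with a retry queue (an outbox-style buffer; all names here are illustrative, not from any particular library):

```python
import collections

class DownstreamService:
    """System B: may be unreachable for a while."""
    def __init__(self):
        self.up = True
        self.deleted = set()

    def handle(self, event):
        if not self.up:
            raise ConnectionError("service unavailable")
        self.deleted.add(event)

class Publisher:
    """System A: retries undelivered events, so consistency is eventual."""
    def __init__(self, target):
        self.target = target
        self.outbox = collections.deque()

    def publish(self, event):
        self.outbox.append(event)
        self.flush()

    def flush(self):
        while self.outbox:
            try:
                self.target.handle(self.outbox[0])
            except ConnectionError:
                return  # B is down; keep the event and try again later
            self.outbox.popleft()

b = DownstreamService()
a = Publisher(b)
b.up = False
a.publish("PersonDeleted:42")  # B is down: the event waits in the outbox
b.up = True
a.flush()                      # on recovery, B catches up with A
```

While B is down, A and B disagree about person 42; only after the flush on recovery are they consistent again, which is exactly the window being described.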
In addition, the network isn't just up or down. It's varying shades (dare I say, 50 shades?) of down or broken. A single machine might not be accessible due to a switch issue. An entire rack or aisle might be compromised by a bad router or faulty routing table. A network cable might be flaky. The truth is you just don't know, and that's all inside a single LAN.
Your service discovery system might be able to see services {A,B,C}, while service A can't talk to B or C due to network issues. It happens.
The "each service owns its data" scenario shouldn't take precedence over cases like this, where consistency is aligned with obvious business rules. If Person gets deleted, then "on delete cascade" should take care of that Person's Address and Billing records.
For updates and maybe reads, it's a different story.
In my experience, you only expose APIs that are either standalone and transactionally independent (change address, delete address, etc. in your example) or composite services (say, a People service) that manage the distributed transaction. How transactions are managed under the hood varies by implementation.
One may argue that in this case the "People" service doesn't fit the description of a microservice given in the article. But we need to understand that services get called in some context, and there has to be someone there to do the plumbing. That someone can be a DB query, some code in the service, or the application calling the services. Generally you would prefer service code over the other two, hence a composite service.
IMO it may also be okay to have People, Address, and Billing under one schema if service granularity and context allow it.
cgh, I've hit the reply limit, but I wanted to ask how you'd implement the cascading deletes over services. Would the People service have to emit events indicating that a person was deleted, which the Address and Billing services would be expected to subscribe to and handle?
Sorry, I should have been more clear. I'm assuming a shared database, so cascading deletes would be defined in the table's schema. Let's pretend we're using PostgreSQL:
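A minimal sketch of schema-enforced cascading deletes, shown here via Python's sqlite3 so it runs standalone (the `ON DELETE CASCADE` DDL carries over to Postgres essentially unchanged; table and column names are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite needs this; Postgres doesn't

db.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("""CREATE TABLE addresses (
                  id INTEGER PRIMARY KEY,
                  person_id INTEGER REFERENCES people(id) ON DELETE CASCADE,
                  street TEXT)""")

db.execute("INSERT INTO people VALUES (1, 'Ada')")
db.execute("INSERT INTO addresses VALUES (1, 1, '1 Main St')")

# Deleting the person cascades to the dependent address rows
# with no application code involved.
db.execute("DELETE FROM people WHERE id = 1")
remaining = db.execute("SELECT COUNT(*) FROM addresses").fetchone()[0]
```

The database, not the services, enforces the consistency rule, which is exactly why this only works when the tables share a database.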
The original position being argued, though, was: "... where all services talk to the same database... You need to split the database up and denormalize it."
So the basic premise is that there is no shared database, and thus having the database enforce cascading deletes is not an option.
You wouldn't so much delete them as deactivate them (mark them inactive but keep them around for retrieval). The consuming service would react differently to an inactive person than to an active one.
Does anyone know?