Their solution seems like a "produce to Kafka first" but with extra steps. Regar...

anentropic · on Oct 19, 2021

When we produce first and the database update fails (because of incorrect state) it means in the worst case we enter a loop of continuously sending out duplicate messages until the issue is resolved

the article does not explain things very clearly, but I think this is describing the problem rather than their solution

Our high level idea was:

- Insert “work” into a table that acts like a queue

- “Executor” takes “work” from DB and runs it

...

A Job is an abstraction for a scheduled DB backed async activity

...

How did we solve the #2 state problem?

By recording Jobs in the service database we can do the state update within the same transaction as inserting a new Job. Combining this with a Job that produces the actual Kafka message, allows us to make the whole operation transactional. If either of the parts fails, updating the data or scheduling the job, both get rolled back and neither happens.

I think this is describing basically a Transactional Outbox

i.e. "jobs" are recorded in the postgres db as part of the same db transaction as the business logic actions

the difference from Kafka-first is that if the app decides to rollback the business logic then the message hasn't already sent