
An offset is a marker that says everything before it (in that partition) has been processed. This creates 2 issues:

1) Your application must coordinate and make sure that everything up to that offset has indeed been processed successfully.
2) Your application must stop if it encounters an error (because it can't commit an offset greater than that item's) or handle the failed item separately by logging it to another topic, database, etc.
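To make issue 1 concrete, here is a minimal sketch of the bookkeeping a consumer needs if it processes records out of order (say, on a thread pool). `OffsetTracker` and `mark_done` are made-up names for illustration, not part of any Kafka client. The only Kafka fact it relies on is that the committed offset is the position of the next record to read, i.e. last processed + 1.

```python
class OffsetTracker:
    """Hypothetical helper: tracks out-of-order completions for one partition."""

    def __init__(self, start_offset):
        # Position that is safe to commit (Kafka convention: last processed + 1).
        self.next_to_commit = start_offset
        self.done = set()

    def mark_done(self, offset):
        """Record a finished offset and return the position that is safe to commit."""
        self.done.add(offset)
        # Advance only across a contiguous run of finished offsets.
        while self.next_to_commit in self.done:
            self.done.remove(self.next_to_commit)
            self.next_to_commit += 1
        return self.next_to_commit


tracker = OffsetTracker(start_offset=0)
tracker.mark_done(1)  # offset 0 is still pending, so nothing new can be committed
tracker.mark_done(2)  # still blocked behind offset 0
tracker.mark_done(0)  # gap closed, the committable position moves past 0, 1 and 2
```

One slow or failed record holds the committable position back for everything behind it, which is exactly the blocking that per-message acknowledgement avoids.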

Other systems like Redis, Pulsar, and Google Pub/Sub provide per-message acknowledgement, so items can be processed individually without blocking other forward progress.



Ah, I see what you’re saying now. #2 was always a pain to deal with, but I think other systems have similar problems. Other messaging systems handle this with things like dead letter queues, but no matter what you use for message processing, you will need some specialized logic to handle records that can’t be processed normally. In Kafka, you can raise an exception for the offset and then move on. When dealing with the exception later, you can seek directly to the record’s offset and take it from there.
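A rough sketch of that "record the failure and move on" pattern, in plain Python with no Kafka client involved. `process_batch`, `handler`, and the `dead_letters` list are all hypothetical stand-ins; in a real app the list would be a dead letter topic or a database table.

```python
def process_batch(records, handler, dead_letters):
    """Process (offset, value) pairs in order; park failures instead of stopping."""
    last_offset = None
    for offset, value in records:
        try:
            handler(value)
        except Exception as exc:
            # Stand-in for producing to a dead letter topic or writing to a database.
            # The offset is kept so a later job can seek back to this exact record.
            dead_letters.append((offset, value, repr(exc)))
        last_offset = offset
    # Position to commit: one past the last record seen, or None for an empty batch.
    return None if last_offset is None else last_offset + 1


def parse_int(value):
    # Hypothetical handler: raises ValueError on input it can't parse.
    int(value)


dead = []
commit_at = process_batch([(10, "1"), (11, "oops"), (12, "3")], parse_int, dead)
# The record at offset 11 ends up in `dead`; the batch still commits past offset 12.
```

The trade-off is that the failed record is now out of band, so any ordering guarantee for it is gone once the offset is committed past it.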

For #1, any application that has an in-order requirement would suffer from this problem. I worked with event processing systems, so we never really had to worry about this, since each event was independent. However, there were instances where we needed to track state for certain objects being processed to make sure all of their child objects were also processed. For this we used an external store with a short TTL, since the lifetime of an object during processing was only a few minutes.
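For illustration only, the short-TTL tracking could look something like this. `TTLStore` is an in-memory stand-in I made up for the external store; the key names and the 300-second TTL are assumptions too. A real setup would use a store with built-in key expiry.

```python
import time


class TTLStore:
    """Hypothetical in-memory stand-in for an external store with key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, expiry time)

    def put(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        item = self.entries.get(key)
        if item is None:
            return default
        value, expires_at = item
        if self.clock() >= expires_at:
            # Expired: drop the entry lazily on read.
            del self.entries[key]
            return default
        return value


# Track which children of a parent object are still unprocessed.
store = TTLStore(ttl_seconds=300)
store.put("parent-1", {"child-a", "child-b"})
pending = store.get("parent-1")
pending.discard("child-a")  # child-a finished
parent_complete = not pending  # False until child-b is also processed
```

If the parent never completes, the entry simply expires, so there is no cleanup job to run.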

All in all, it just comes down to what your app’s requirements are. I don’t think Kafka is meant to replace every pub/sub service out there, but it definitely has some great use cases.



