Apache Kafka exactly-once processing explained

A blog post by Adam Warski explaining real-time processing with Apache Kafka and what its new major feature, exactly-once semantics, really means. The announcement of this feature caused a stir in the community, with some claiming that exactly-once is not mathematically possible.

The author explains and analyses how you can construct an exactly-once pipeline in Kafka, with an emphasis on where the new features come into play, what kind of guarantees you get and, more importantly, what guarantees you don't get.

It is possible to create a pipeline where, at each stage, the result of processing each message is observed exactly once, as far as Kafka is concerned. The author then focuses on:

  • The producer and its crucial feature, idempotency
  • Pipeline stages and how to atomically write data to multiple topics and partitions
  • The consumer and how it can be transactional
  • Side effects
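
To make the first point a little more concrete, here is a minimal sketch of why an idempotent producer can retry safely. It is a toy model in plain Python, not Kafka's API: the names ToyPartitionLog and send_with_retry are made up for illustration, and the check is simplified (it only discards sequence numbers that were already accepted, which is assumed to be enough to show the idea).

```python
from dataclasses import dataclass, field


@dataclass
class ToyPartitionLog:
    """Toy stand-in for one partition's log on a broker (hypothetical, not a Kafka API)."""
    records: list = field(default_factory=list)
    # Highest sequence number accepted so far, keyed by producer id.
    last_seq: dict = field(default_factory=dict)

    def append(self, producer_id: int, seq: int, value: str) -> bool:
        """Append value unless this producer already wrote this sequence number.

        Returns True if the record was appended, False if it was a duplicate.
        """
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # a retry of something already in the log
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return True


def send_with_retry(log, producer_id, seq, value, lose_ack=False):
    """Send once; if the acknowledgement is 'lost', resend with the same sequence number."""
    log.append(producer_id, seq, value)
    if lose_ack:
        # The producer cannot tell whether the write happened, so it retries.
        log.append(producer_id, seq, value)


log = ToyPartitionLog()
send_with_retry(log, producer_id=1, seq=0, value="a")
send_with_retry(log, producer_id=1, seq=1, value="b", lose_ack=True)
send_with_retry(log, producer_id=1, seq=2, value="c")
print(log.records)  # ['a', 'b', 'c']
```

The retried message "b" appears in the log only once. Without the sequence check, the same retry would leave a duplicate behind, which is the at-least-once behaviour.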

Worth your time!

Tags streaming queues apache