How to make Kafka Consumer compatible with Gevent in Python


Asynchronous task management using Gevent improves scalability and resource efficiency for distributed systems. However, using this tool with Kafka can be challenging. By Jessica Zhao and Boyang Wei.

At DoorDash, many services are Python-based, and RabbitMQ and Celery were central to our platform’s asynchronous task-queue system. We also leverage Gevent, a coroutine-based concurrency library, to further improve the efficiency of our asynchronous task processing operations.

However, when migrating to Kafka, we discovered that Gevent, the tool we use for asynchronous task processing in our point of sale (POS) system, is not compatible with Kafka. This incompatibility occurs because we use Gevent to patch our Python code libraries to perform asynchronous I/O, while the Kafka client is based on librdkafka, a C library. The Kafka consumer's blocking I/O happens inside that C library, so Gevent cannot patch it to behave in the asynchronous way we are looking for.

The article then covers:

  • Why move away from RabbitMQ/Celery to Kafka with Gevent?
  • The new challenges of migrating to Kafka
  • Replacing Kafka’s blocking call with a Gevent asynchronous call
  • Throughput comparison: Kafka vs Celery
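
The third item can be illustrated with a minimal sketch. It is not the authors' code; it only assumes the general idea of asking the consumer for a message without waiting, then yielding with gevent.sleep when nothing is ready, so the wait happens in Python where Gevent can schedule other greenlets. The names consume, StubConsumer and StubMessage are hypothetical. The stub stands in for a real client such as confluent_kafka.Consumer, whose poll() also returns None when no message is available, and the time.sleep fallback is there only so the sketch runs where gevent is not installed.

```python
import time

try:
    import gevent
    cooperative_sleep = gevent.sleep  # yields control to other greenlets
except ImportError:
    cooperative_sleep = time.sleep    # fallback so the sketch runs without gevent


def consume(consumer, handle, idle_seconds=0.05, max_idle_polls=None):
    """Poll without blocking and yield cooperatively while the topic is idle.

    max_idle_polls is a hypothetical knob that lets the demo terminate;
    a long-running worker would leave it as None and loop forever.
    """
    idle_polls = 0
    handled = 0
    while max_idle_polls is None or idle_polls < max_idle_polls:
        message = consumer.poll(0)  # timeout of 0: return immediately
        if message is None:
            idle_polls += 1
            cooperative_sleep(idle_seconds)  # the wait happens here, not in C
            continue
        idle_polls = 0
        if message.error():
            continue  # a real worker would log or handle the error
        handle(message.value())
        handled += 1
    return handled


class StubMessage:
    """Hypothetical stand-in exposing the two message methods used above."""

    def __init__(self, payload):
        self._payload = payload

    def error(self):
        return None

    def value(self):
        return self._payload


class StubConsumer:
    """Hypothetical stand-in for a Kafka consumer with a poll() method."""

    def __init__(self, payloads):
        self._payloads = list(payloads)

    def poll(self, timeout=None):
        if self._payloads:
            return StubMessage(self._payloads.pop(0))
        return None


received = []
count = consume(
    StubConsumer([b"order-1", b"order-2"]),
    received.append,
    idle_seconds=0.001,
    max_idle_polls=3,
)
```

With a real consumer, the same loop would typically run inside a greenlet (for example via gevent.spawn), so that other greenlets make progress during each idle sleep.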

We liked: Celery and Kafka show similar results on small loads, but Celery is relatively sensitive to the number of concurrent jobs it runs, while Kafka keeps processing time almost the same regardless of the load. Good read!


Tags python devops messaging microservices event-driven