I went down the rabbit hole of learning about performance in a multi-tenant Kafka cluster.
The Apache Kafka documentation covers the basics, but I found that the Confluent blog post on multi-tenancy in the cloud and the accompanying YouTube video provide good insights into operating a performant Kafka-as-a-Service in the cloud.
My notes are a summary of the excellent resources listed below.
https://kafka.apache.org/documentation/#multitenancy
https://www.youtube.com/watch?v=8hcUBhLE6_U
Multi Tenant Kafka with Confluent Cloud -
https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/
https://www.youtube.com/watch?v=8ti63z3idbs&t=2s
Optimizing Performance
An application's performance is usually bounded by the following resources -
- CPU
- Memory
- Network
- Disk
In most systems, the CPU is faster than the memory. Memory in turn is faster than the network and disk.
When data is cached, an application's read performance is bounded by memory. Its write performance depends on how fast data can be written to disk.
Quotas
A quota is not a reservation, but a means to ensure that tenants don't consume all the resources of a system.
Kafka provides 3 quotas that correlate to CPU, Memory and Disk. Each of these quotas can be configured per broker per tenant to match the finite resources available to the underlying operating system.
- Disk - Produce Bandwidth Quotas
- Memory - Consume Bandwidth Quotas
- CPU - Request Quotas
Configuring Broker Performance
What is a Tenant?
A tenant can be identified by any one of the following -
- The User Principal
- The Client Id
- Both the User Principal And the Client ID
Bandwidth Quotas
- Produce Bandwidth Quota - This throttles the rate, in bytes per second, at which clients write data into Kafka
- Consume Bandwidth Quota - This throttles the rate, in bytes per second, at which clients read data from Kafka
Request Quotas
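Request quotas throttle the share of time a tenant's requests may spend on the broker's network and request handler threads. The quota is expressed as a percentage, where 100 corresponds to one full thread.
Request quotas help protect CPU performance. They can be difficult to understand and even harder to size correctly.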
Each request and response from and to the client is associated with network threads and request handler threads. Network threads handle communication with the client, while Request threads handle processing the requests. These threads are bound to the CPU. Hence the request quota is a function of both the number of network and request handler threads.
The total request quota capacity of a broker, which is the upper bound shared by all tenants, can be calculated using the following formula -
(network threads + request handler threads) * 100
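To make the formula concrete, here is a minimal Python sketch. The function name is my own, and the thread counts used are Kafka's default broker settings (num.network.threads=3, num.io.threads=8). The tenant quota of 200 is a made-up value for illustration.

```python
def request_quota_capacity(num_network_threads: int, num_io_threads: int) -> int:
    """Total request quota capacity of one broker, in percent.

    Each thread contributes 100, so a quota of 100 equals one full thread.
    """
    if num_network_threads < 1 or num_io_threads < 1:
        raise ValueError("thread counts must be positive")
    return (num_network_threads + num_io_threads) * 100


# Kafka's default broker settings: num.network.threads=3, num.io.threads=8
capacity = request_quota_capacity(num_network_threads=3, num_io_threads=8)
print(capacity)  # 1100

# Hypothetical tenant quota: request_percentage=200, i.e. two threads' worth of time
tenant_share = 200 / capacity
print(f"{tenant_share:.1%}")  # 18.2%
```

With the defaults, a broker has a capacity of 1100, so the sum of the request quotas you expect tenants to use concurrently should stay at or below that number.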
Configuring a Kafka broker
Quotas can be configured on a Kafka broker via the kafka-configs command, with the following parameters -
- producer_byte_rate
- consumer_byte_rate
- request_percentage
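As a rough illustration of how these parameters fit together with the three ways of defining a tenant, the Python sketch below builds a kafka-configs.sh command line without running it. The helper name, the broker address localhost:9092, and the tenant names tenant-a and billing-app are all hypothetical. Check the flags against the Kafka version you run before using anything like this.

```python
import shlex
from typing import List, Optional


def quota_command(
    bootstrap_server: str,
    user: Optional[str] = None,
    client_id: Optional[str] = None,
    producer_byte_rate: Optional[int] = None,
    consumer_byte_rate: Optional[int] = None,
    request_percentage: Optional[int] = None,
) -> List[str]:
    """Build (but do not run) a kafka-configs.sh call that sets quotas."""
    # A tenant is a user principal, a client id, or both.
    if user is None and client_id is None:
        raise ValueError("a tenant needs a user principal, a client id, or both")

    quotas = {
        "producer_byte_rate": producer_byte_rate,
        "consumer_byte_rate": consumer_byte_rate,
        "request_percentage": request_percentage,
    }
    # Only include the quotas that were actually given.
    add_config = ",".join(f"{k}={v}" for k, v in quotas.items() if v is not None)
    if not add_config:
        raise ValueError("at least one quota must be given")

    cmd = ["kafka-configs.sh", "--bootstrap-server", bootstrap_server,
           "--alter", "--add-config", add_config]
    if user is not None:
        cmd += ["--entity-type", "users", "--entity-name", user]
    if client_id is not None:
        cmd += ["--entity-type", "clients", "--entity-name", client_id]
    return cmd


# Hypothetical tenant identified by both user principal and client id
cmd = quota_command("localhost:9092", user="tenant-a", client_id="billing-app",
                    producer_byte_rate=1048576, consumer_byte_rate=2097152,
                    request_percentage=200)
print(shlex.join(cmd))
```

Passing only user or only client_id produces the command for the other two tenant definitions.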
Other Quotas
Besides bandwidth and request quotas, it is also possible to limit the creation of topics and the number of connections per broker.
Effective Capacity
When configuring quotas, it is important to recognize that brokers are also responsible for replicating their data, and this should be taken into account during capacity planning. The effective capacity (diagram below) is what is available to be divided among the various tenants.
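A back-of-the-envelope Python sketch of this idea follows. It rests on a simplifying assumption of my own, not something taken from the sources: in a balanced cluster, every byte a tenant produces is written replication-factor times, so the tenant-facing produce capacity of a broker is roughly its raw write bandwidth divided by the replication factor. The bandwidth figure and the tenant names are made up.

```python
from typing import Dict, List


def effective_produce_capacity(raw_write_bytes_per_sec: float,
                               replication_factor: int) -> float:
    """Tenant-facing write capacity of one broker under the simplified model.

    Assumption: each produced byte costs `replication_factor` writes
    cluster-wide, spread evenly across brokers.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_write_bytes_per_sec / replication_factor


def split_evenly(capacity: float, tenants: List[str]) -> Dict[str, float]:
    """Divide the effective capacity equally among tenants."""
    if not tenants:
        raise ValueError("need at least one tenant")
    return {tenant: capacity / len(tenants) for tenant in tenants}


# Made-up numbers: 300 MB/s raw write bandwidth, replication factor 3
effective = effective_produce_capacity(300_000_000, replication_factor=3)
print(effective)  # 100000000.0

# Four hypothetical tenants sharing the broker equally
shares = split_evenly(effective, ["tenant-a", "tenant-b", "tenant-c", "tenant-d"])
print(shares["tenant-a"])  # 25000000.0
```

The model ignores consumer traffic and other overheads. Its only purpose is to show why the raw hardware figure is not the number to hand out as quotas.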
Sources
https://kafka.apache.org/documentation/#multitenancy
https://www.confluent.io/blog/cloud-native-multi-tenant-kafka-with-confluent-cloud/