How Does Kafka Perform When You Need Low Latency?

Most Kafka benchmarks appear to test high throughput but not low latency. Apache Kafka was traditionally used for high throughput rather than latency-sensitive messaging, but it does have a low-latency configuration. (Mostly setting linger.ms=0 and reducing buffer sizes). In this configuration, you can get below 1-millisecond latency a good percentage of the time for modest throughputs.

Benchmarks tend to focus on clustering Kafka in a high-throughput configuration. While this is perhaps the most common use case, how does it perform if you need lower latencies?