Improving Serialization and Memory Efficiency With a LongConverter

Chronicle Wire is a powerful open-source serialization library for high-performance data exchange in various binary and text formats, including YAML.

Strings in your data structures can carry significant overhead in memory usage and access patterns. Each String involves two objects: the String object itself and the char[] or byte[] that holds the actual text. Strings are also immutable, and object pooling tends to create many objects for garbage collection during initialization and deserialization.
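To make the idea concrete, here is a minimal sketch of storing a short identifier in a single long instead of a String. It is a hand-rolled base-40 codec written for illustration, not Chronicle Wire's LongConverter implementation; the class name, the alphabet and the 11-character limit are all assumptions of this sketch.

```java
/** Illustrative base-40 codec: packs up to 11 characters into one primitive long. */
final class ShortTextAsLong {
    // 40 symbols. Index 0 ('.') also means "no character", so leading '.' is not preserved.
    private static final String ALPHABET = ".ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789:-+";
    static final int MAX_LENGTH = 11; // 40^11 fits comfortably in a signed 64-bit long

    /** Encodes the text as a base-40 number, most significant character first. */
    static long parse(CharSequence text) {
        if (text.length() > MAX_LENGTH)
            throw new IllegalArgumentException("too long: " + text);
        long value = 0;
        for (int i = 0; i < text.length(); i++) {
            int digit = ALPHABET.indexOf(text.charAt(i));
            if (digit < 0)
                throw new IllegalArgumentException("unsupported character: " + text.charAt(i));
            value = value * 40 + digit;
        }
        return value;
    }

    /** Decodes the value and appends the text to an existing (reusable) StringBuilder. */
    static void append(StringBuilder out, long value) {
        if (value < 0)
            throw new IllegalArgumentException("negative value: " + value);
        int start = out.length();
        for (long v = value; v != 0; v /= 40)
            out.append(ALPHABET.charAt((int) (v % 40)));
        // Digits were appended least-significant first, so reverse just that section.
        for (int i = start, j = out.length() - 1; i < j; i++, j--) {
            char tmp = out.charAt(i);
            out.setCharAt(i, out.charAt(j));
            out.setCharAt(j, tmp);
        }
    }
}
```

A field held this way is a primitive, so it needs no separate String and array objects, and decoding into a reused StringBuilder avoids creating a new String on every read.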

How to Optimize CPU Performance Through Isolation and System Tuning

CPU isolation and efficient system management are critical for any application that requires low-latency, high-performance computing. These measures are especially important for high-frequency trading systems, where split-second decisions on buying and selling stocks must be made. To achieve this level of performance, such systems require dedicated CPU cores that are free from interruptions by other processes, together with wider system tuning.

In modern production environments, there are numerous hardware and software hooks that can be adjusted to improve latency and throughput. However, finding the optimal settings for a system can be challenging as it requires navigating a multidimensional search space. To accomplish this efficiently, it is necessary to understand the tuning landscape and to use tools and strategies that facilitate effective changes.

Generating Unique Identifiers Based on Timestamps in Distributed Applications

We build applications that must process very high numbers of events with minimum latency. Generating unique IDs for these events using the traditional method of UUIDs introduces an unacceptable time overhead into our applications, so an alternative approach is needed.

I recently wrote an article on how timestamps can be used as unique identifiers, because they are much cheaper to generate than other kinds of unique identifier, taking only a fraction of a microsecond.
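As a rough illustration of the timestamp approach, the sketch below hands out nanosecond-scale timestamps that are strictly increasing within one JVM. The class and method names are hypothetical, and it does not address uniqueness across hosts or processes, which a distributed deployment would need to handle separately (for example by sharing state or reserving some digits for a host identifier).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical single-JVM generator of unique, time-based identifiers. */
final class UniqueTimestamps {
    private final AtomicLong last = new AtomicLong();

    /** Returns a value close to the wall clock in nanoseconds, never repeated by this instance. */
    long nextId() {
        // The clock only has millisecond resolution here; collisions are resolved by bumping.
        long now = System.currentTimeMillis() * 1_000_000L;
        while (true) {
            long prev = last.get();
            // If the clock has not advanced (or went backwards), use the next free nanosecond.
            long candidate = Math.max(now, prev + 1);
            if (last.compareAndSet(prev, candidate))
                return candidate;
        }
    }
}
```

Each call costs a clock read and one compare-and-set in the uncontended case, and it allocates nothing.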

Java Is Very Fast, if You Don’t Create Many Objects

This article looks at a benchmark passing events over TCP/IP at 4 billion events per minute using the net.openhft.chronicle.wire.channel package in Chronicle Wire (open source) and why we aim to avoid object allocations. 

One of the key optimisations is creating almost no garbage. Allocation is supposed to be a very cheap operation, and garbage collection of very short-lived objects is also very cheap. Does not allocating really make such a difference? What difference does one small object per event (44 bytes) make to the performance in a throughput test where GC pauses are amortised?
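A back-of-envelope calculation gives a feel for the scale involved. Assuming every event allocates exactly one 44-byte object, 4 billion events per minute works out to roughly 2.9 GB of short-lived garbage per second. The helper below is only a sketch of that arithmetic; the class and method names are made up for the example.

```java
/** Back-of-envelope arithmetic: allocation rate implied by one small object per event. */
final class AllocationRate {
    /** Bytes allocated per second for a given event rate and per-event allocation size. */
    static double bytesPerSecond(double eventsPerMinute, int bytesPerEvent) {
        return eventsPerMinute / 60.0 * bytesPerEvent;
    }

    public static void main(String[] args) {
        double rate = bytesPerSecond(4e9, 44); // figures taken from the text above
        System.out.println(rate / 1e9 + " GB/s of garbage");
    }
}
```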

Using Pausers in Event Loops

Typically in low-latency development, a trade-off must be made between minimizing latency and avoiding excessive CPU utilization. This article explores how Chronicle’s Pausers — an open-source product — can be used to automatically apply a back-off strategy when there is no data to be processed, providing balance between resource usage and responsive, low-latency, low-jitter applications.
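The back-off idea can be sketched in plain Java as follows. This is not Chronicle's Pauser implementation; the class name, thresholds and park times are assumptions chosen only to show the shape of the strategy: spin while work is likely to arrive soon, then yield, then sleep for progressively longer periods.

```java
import java.util.concurrent.locks.LockSupport;

/** Hypothetical back-off helper for an event loop (thresholds are illustrative, not tuned). */
final class BackOffPauser {
    private static final int SPIN_LIMIT = 1_000;          // busy-spin for the first idle calls
    private static final int YIELD_LIMIT = 2_000;         // then yield the CPU
    private static final long MAX_PARK_NANOS = 1_000_000; // finally park, capped at 1 ms

    private int idleCount;

    /** Call when the loop found no work; waits longer the longer the loop stays idle. */
    void pause() {
        if (idleCount < Integer.MAX_VALUE)
            idleCount++;
        if (idleCount <= SPIN_LIMIT) {
            Thread.onSpinWait();                           // lowest latency, highest CPU use
        } else if (idleCount <= YIELD_LIMIT) {
            Thread.yield();
        } else {
            int shift = Math.min(idleCount - YIELD_LIMIT, 10);
            LockSupport.parkNanos(Math.min(1_000L << shift, MAX_PARK_NANOS));
        }
    }

    /** Call when work was done, so the next idle period starts with busy-spinning again. */
    void reset() {
        idleCount = 0;
    }

    int idleCount() {
        return idleCount;
    }
}
```

In an event loop this would typically be used as: call reset() after an iteration that processed data, and pause() after an iteration that found nothing to do.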

Description of the Problem

In a typical application stack, multiple threads are used for servicing events, processing data, pipelining, and so on. An important design consideration is how threads become aware that there is work to do, with some general approaches including:

Event-Driven Order Processing Program

Following the Hello World example of a simple, independently deployable real-time Event-Driven microservice, this article looks at a more realistic example of an Order Processor with a New Order Single in and an Execution Report out. 

A New Order Single is a standard message type for the order of one asset in the FIX protocol used widely by financial institutions such as banks. The reply is typically one or more Execution Reports updating the status of that order.
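The message flow can be sketched as one event in and one event out. The example below is a heavily simplified assumption, not the article's code: it uses a handful of FIX field names (ClOrdID, Symbol, Side, OrderQty, Price, OrdStatus, LeavesQty), a made-up validation rule, and Java records, so it needs Java 16 or later.

```java
import java.util.function.Consumer;

/** Simplified sketch: a New Order Single comes in, an Execution Report goes out. */
final class OrderProcessorSketch {
    record NewOrderSingle(String clOrdID, String symbol, char side, double orderQty, double price) {}
    record ExecutionReport(String clOrdID, String symbol, char ordStatus, double leavesQty) {}

    static final char ORD_STATUS_NEW = '0';      // FIX OrdStatus value for New
    static final char ORD_STATUS_REJECTED = '8'; // FIX OrdStatus value for Rejected

    private final Consumer<ExecutionReport> out;

    OrderProcessorSketch(Consumer<ExecutionReport> out) {
        this.out = out;
    }

    /** Handles one inbound order and publishes exactly one report (illustrative rule only). */
    void newOrderSingle(NewOrderSingle nos) {
        boolean valid = nos.orderQty() > 0 && nos.price() > 0;
        char status = valid ? ORD_STATUS_NEW : ORD_STATUS_REJECTED;
        double leavesQty = valid ? nos.orderQty() : 0;
        out.accept(new ExecutionReport(nos.clOrdID(), nos.symbol(), status, leavesQty));
    }
}
```

Because the processor only sees an input event and an output listener, it can be tested by feeding it orders and collecting the reports, without any transport in place.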

How BDD Works Well With EDA

Behaviour-Driven Development (BDD) and Event-Driven Architecture (EDA) work well together as they complement each other’s strengths and weaknesses. Using both can result in a shorter time to market for new functionality and a more maintainable system.

Behaviour-Driven Development encourages a common language between users and developers, describing requirements in a form that users can understand and that can also be checked automatically as the application is developed and maintained. BDD increases the inclusion of users, focuses on requirements capture, and maintains the velocity of development as the application increases in complexity.

Deployment of Low-Latency Solutions in the Cloud

Traditionally, companies with low-latency requirements deployed to bare-metal servers, eschewing the convenience and programmability of virtualization and containerization in an effort to squeeze maximum performance and minimal latency from “on-premises” (often co-located) hardware.

More recently, these companies are increasingly moving to public and private “cloud” environments, either for satellite services around their tuned low-latency/high-volume (LL/HV) systems or in some cases for LL/HV workloads themselves.  

How Does Kafka Perform When You Need Low Latency?

Most Kafka benchmarks appear to test high throughput but not low latency. Apache Kafka was traditionally used for high throughput rather than latency-sensitive messaging, but it does have a low-latency configuration (mostly setting linger.ms=0 and reducing buffer sizes). In this configuration, you can get below 1-millisecond latency a good percentage of the time for modest throughputs.
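A producer configured along these lines might be set up as in the sketch below. Only linger.ms=0 comes from the text; batch.size and buffer.memory are real producer settings, but the values shown are illustrative assumptions, not the ones used in the benchmark, and would need measuring on your own system.

```java
import java.util.Properties;

/** Sketch of producer settings biased towards latency rather than throughput. */
final class LowLatencyProducerConfig {
    static Properties lowLatencyProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("linger.ms", "0");          // from the text: do not wait to fill a batch
        props.setProperty("batch.size", "4096");      // assumed value: smaller than the default
        props.setProperty("buffer.memory", "4194304"); // assumed value: smaller than the default
        return props;
    }
}
```

These properties would then be passed to a Kafka producer along with the usual serializer settings.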

Benchmarks tend to focus on clustering Kafka in a high-throughput configuration. While this is perhaps the most common use case, how does it perform if you need lower latencies?

Kafka vs Chronicle for Microservices

Apache Kafka is a common choice for inter-service communication. Kafka facilitates the parallel processing of messages and is a good choice for log aggregation. Kafka also claims to offer low latency and high throughput. However, is Kafka fast enough for many microservices applications?

When I wrote the open-source Chronicle Queue, my aim was to develop a messaging framework with microsecond latencies. In this article, I will describe why Kafka does not scale in terms of throughput as easily as Chronicle Queue for microservices applications.

Visualizing Delay as a Distance

To illustrate the difference, let me start with an analogy. Signals travel through optical fiber and copper at about two-thirds the speed of light in a vacuum, so very short delays can be visualized as the distance a signal can travel in that time. This can really matter when you have machines in different data centers.
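Using the two-thirds figure, a delay converts to a distance as in the sketch below: roughly 20 cm per nanosecond, 200 m per microsecond, and 200 km per millisecond. The class and method names are made up for the example.

```java
/** Converts a delay into the distance a signal covers in fiber or copper in that time. */
final class DelayAsDistance {
    static final double SPEED_OF_LIGHT_M_PER_S = 299_792_458.0; // in a vacuum
    static final double MEDIUM_FACTOR = 2.0 / 3.0;              // approximation from the text

    static double meters(double delayNanos) {
        return delayNanos * 1e-9 * SPEED_OF_LIGHT_M_PER_S * MEDIUM_FACTOR;
    }

    public static void main(String[] args) {
        System.out.println("1 ns = " + meters(1) + " m");
        System.out.println("1 us = " + meters(1_000) + " m");
        System.out.println("1 ms = " + meters(1_000_000) / 1_000 + " km");
    }
}
```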

Low Latency Microservices, a Retrospective

I wrote an article on low latency microservices almost five years ago now. In that time, my company has worked with a number of tier-one investment banks to implement and support those systems. What has changed in that time and what lessons have we learned?

Read this article to find out what we learned from five years of developing and supporting low latency microservices.