Deployment of Low-Latency Solutions in the Cloud

Traditionally, companies with low-latency requirements deployed to bare-metal servers, eschewing the convenience and programmability of virtualization and containerization in an effort to extract maximum performance and minimum latency from "on-premises" (often co-located) hardware.

More recently, these companies have been increasingly moving to public and private "cloud" environments, either for satellite services around their tuned low-latency/high-volume (LL/HV) systems or, in some cases, for the LL/HV workloads themselves.

Performance of Pipeline Architecture: The Impact of the Number of Workers

As technology has advanced, the rate at which data is produced has grown. In many application domains it is critical to process this data in real time rather than with a store-and-process approach. For real-time processing, many applications adopt the pipeline architecture to process data in a streaming fashion. The pipeline architecture is a parallelization method that decomposes a program into multiple stages, where each stage consists of a queue and a worker. Each stage takes the output of the previous stage as its input, processes it, and passes its own output on as the input of the next stage. One key factor that affects the performance of a pipeline is the number of stages (and, since each stage has its own worker, the number of workers). In this article, we first investigate the impact of the number of stages on performance, and we show that the number of stages that yields the best performance depends on the workload characteristics.
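To make the structure concrete, here is a minimal sketch (our own illustration, not the article's benchmark code) of a three-stage pipeline in Python, using queue.Queue as each stage's buffer and one worker thread per stage; the stage functions (parse, square, print) are placeholders chosen only for the example:

import queue
import threading

SENTINEL = object()  # marks the end of the stream

def worker(in_q, out_q, fn):
    # A stage's worker: consume from this stage's queue, process, forward.
    while True:
        item = in_q.get()
        if item is SENTINEL:
            if out_q is not None:
                out_q.put(SENTINEL)  # let the next stage shut down too
            break
        result = fn(item)
        if out_q is not None:
            out_q.put(result)

# A three-stage pipeline: parse -> transform -> emit.
q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    (q1, q2, lambda x: int(x)),      # stage 1: parse
    (q2, q3, lambda x: x * x),       # stage 2: transform
    (q3, None, lambda x: print(x)),  # stage 3: emit (no downstream queue)
]
threads = [threading.Thread(target=worker, args=s) for s in stages]
for t in threads:
    t.start()
for raw in ["1", "2", "3"]:
    q1.put(raw)
q1.put(SENTINEL)
for t in threads:
    t.join()

Each worker blocks on its queue until input arrives, so a stage runs only when the previous stage has produced work, and the sentinel is forwarded downstream so every stage exits cleanly.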

Background

The pipeline architecture is commonly used when implementing applications in multithreaded environments. We can think of it as a collection of connected components (or stages), where each stage consists of a queue (a buffer) and a worker.
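Expressed as code, one plausible way to model such a component (a sketch under our own naming, not an interface from the article) is a Stage object that owns its buffer and its worker thread, plus a helper that links stages together:

import queue
import threading

class Stage:
    # One pipeline component: a queue (buffer) plus a worker thread.
    def __init__(self, fn):
        self.queue = queue.Queue()
        self.fn = fn
        self.next = None  # downstream stage, set when stages are connected
        self.thread = threading.Thread(target=self._run)

    def _run(self):
        while True:
            item = self.queue.get()
            if item is None:  # None serves as the end-of-stream marker here
                if self.next:
                    self.next.queue.put(None)
                return
            result = self.fn(item)
            if self.next:
                self.next.queue.put(result)

def connect(*stages):
    # Link the stages into a pipeline and start their workers.
    for a, b in zip(stages, stages[1:]):
        a.next = b
    for s in stages:
        s.thread.start()
    return stages[0]  # head of the pipeline

To run such a pipeline, feed items into the head stage's queue and finish with None, which each stage forwards downstream before its worker exits.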