Configuring RaptorX Multi-Level Caching With Presto

RaptorX and Presto: Background and Context

Meta introduced a multi-level cache for Presto at PrestoCon 2021, the open-source community conference for Presto. Code-named the “RaptorX Project,” it aims to make Presto 10x faster on Meta-scale petabyte workloads. The feature is available only in PrestoDB, not in other versions or forks of the Presto project.

Presto is the open-source SQL query engine for data analytics and the data lakehouse. Because it separates compute from storage, you can scale each tier independently and reduce costs. However, this storage-compute disaggregation also brings new challenges for query latency: scanning huge amounts of data across the network between the storage tier and the compute tier makes queries I/O-bound. As with any database, optimized I/O is a critical concern for Presto. When possible, the priority is to avoid I/O altogether, which makes memory utilization and caching structures critically important.
