HTAP: One Size Fits All?

An important idea in the database world is that specialized databases will outperform general-purpose databases. Michael Stonebraker, an A. M. Turing Award Laureate and one of the most influential people in the database world, also discussed this in his paper, One Size Fits All: An Idea Whose Time Has Come and Gone.

This is a rational judgment because it's tough enough to build a database that supports either Online Transactional Processing (OLTP) or Online Analytical Processing (OLAP) workloads, let alone one that supports both at the same time. But the dilemma is, that today, many users are facing increasing demands with mixed OLTP and OLAP workloads. How do we crack this then?

Empower Your Business With Big Data + Real-Time Analytics in TiDB

Big data, in recent years, is not just a buzzword, but also a growing need for ambitious companies. With skyrocketing data scale and stringent requirements on data freshness, big data-related scenarios are becoming complicated and multi-dimensional. Thus, many companies use real-time data warehouses to meet their business demand.

But data warehouses are not the only option. An emerging category of databases, Hybrid Analytical/Transactional Processing (HTAP) databases, can serve you just as well as data warehouses, if not better. HTAP databases can handle Online Transactional Processing (OLTP) workloads and respond quickly to big data analytical requests in real time.

Reducing Real-Time Query Latency With a Scale-Out HTAP Database

Industry: Automobile

Authors:
  • Xianqi Jin (DBA at Autohome).
  • Fan Zhang (R&D at Autohome).
  • Technical Architecture Team of Autohome Technical College.
TiDB Cover Image

Autohome is the leading online destination for automobile consumers in China. It's one of the most visited auto websites in the world. We provide professionally produced and user-generated content, a comprehensive automobile library, and extensive automobile listing information, covering the entire car purchase and ownership cycle.

Why We Chose a Scale-Out Database as a MySQL Alternative

Chehaoduo is an online trading platform for both new cars and personal used cars. Founded in 2015, it is now one of the largest auto trading platforms in China, valued at $9 billion in its series D round of funding last year.

In the early stages of Chehaoduo, to quickly adapt to our application development, we chose MySQL as our major database. However, as our business evolved, we were greatly troubled by the complication of MySQL sharding and schema changes. In the face of this dilemma, we found an alternative database to MySQL: TiDB, an open-source, MySQL compatible database that scales out to hold massive data.

Today’s World Calls for a New Kind of Database

Over the past decade, applications have become more and more data-intensive. Dynamic data, analytics, and models are now at the core of any application that matters. To support these requirements, there is a commonly held, but often incorrect, belief that modern applications need to be built on top of a variety of special-purpose databases, each built for a specific workload. It is said that this allows you to pick the best ones to solve your application needs. 

This trend is apparent when you look at the plethora of open-source data tools that have proliferated in recent years. Each one was built to scratch an itch, optimized for specific, narrow use cases seen in a smattering of projects. In response, some of the cloud vendors have packaged up these multiple database technologies for you to choose from, commonly forking from existing open-source projects. You’re then meant to wire together several of these tools into the needed data solution for each application.  

TiDB Dashboard: Easier Troubleshooting for Distributed Databases

TiDB

It's challenging to troubleshoot issues in a distributed database because the information about the system is scattered in different machines.

TiDB is an open-source, distributed SQL database that supports Hybrid Transactional/Analytical Processing (HTAP) workloads. Before version 4.0, it could be difficult to efficiently troubleshoot TiDB's system problems. To diagnose a TiDB cluster's issues, even an experienced database administrator (DBA) needed to understand TiDB's basic architecture, get familiar with thousands of TiDB monitoring metrics, and gain experience in the field to ensure that when they encountered similar problems next time, they could fix them more quickly.

SQL Plan Management With TiDB: A Review

graphic

The SQL execution plan is a critical factor that affects SQL statement performance. The stability of the SQL execution plans heavily influences the entire cluster's performance. If a relational database's optimizer chooses a wrong execution plan for a query, it usually has a negative impact on the system; for example, operations might take longer to respond or the database might get overloaded.

We've done a lot of work on optimizer stability for TiDB. However, SQL execution plans are affected by various factors. The execution plan may encounter unanticipated changes. As a result, the execution time might be too long.

How We Implement 10x Faster Expression Evaluation With Vectorized Execution

The query execution engine plays an important role in database system performance. TiDB, an open-source MySQL-compatible Hybrid Transactional/Analytical Processing (HTAP) database, implemented the widely-used Volcano model to evaluate queries. Unfortunately, when querying a large dataset, the Volcano model caused high interpretation overhead and low CPU cache hit rates.

Inspired by the paper MonetDB/X100: Hyper-Pipelining Query Execution, we began to employ vectorized execution in TiDB to improve query performance. (Besides this article, we also suggest you take Andy Pavlo’s course on Query Execution, which details principles about execution models and expression evaluation.)