How to Upgrade TiDB Safely

As a fast-growing open source NewSQL database, TiDB frequently releases new features and improvements. If you are a TiDB user, you may have found it hard to decide whether to upgrade. You may also have wondered how to make your upgrade journey safer, smoother, and even unnoticed by the business.

On the one hand, new TiDB versions bring features that can support new demands in your business and fix known security vulnerabilities or bugs.

Time Synchronization in Distributed Systems: TiDB’s Timestamp Oracle

Today, distributed databases lead the market, but time synchronization in distributed systems remains a hard nut to crack. Because of clock skew, the clocks on different nodes of a distributed database cannot be synchronized perfectly. Computer scientists have proposed several solutions, such as the logical clock by Leslie Lamport (the 2013 Turing Award winner), the hybrid logical clock, and TrueTime.

PingCAP’s TiDB, an open-source distributed NewSQL database, adopts a timestamp oracle (TSO) to deliver the time service and uses a centralized control service, Placement Driver (PD), to allocate monotonically increasing timestamps.
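The idea of a central service handing out monotonically increasing timestamps can be illustrated with a minimal sketch. It assumes the commonly documented TSO layout of a millisecond physical part followed by an 18-bit logical counter; the class and method names are hypothetical, and PD's real implementation (persistence, leader election, batching) is considerably more involved.

```python
import threading
import time

LOGICAL_BITS = 18  # assumed layout: low 18 bits hold a logical counter


class TimestampOracle:
    """Toy single-process oracle; an illustration, not PD's implementation."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock          # returns wall-clock time in milliseconds
        self._lock = threading.Lock()
        self._physical = 0
        self._logical = 0

    def next_ts(self) -> int:
        with self._lock:
            now = self._clock()
            if now > self._physical:
                # The clock advanced: adopt it and restart the counter.
                self._physical = now
                self._logical = 0
            else:
                # Same millisecond, or the clock went backwards:
                # keep the old physical part and bump the counter.
                self._logical += 1
                if self._logical >= (1 << LOGICAL_BITS):
                    # Counter exhausted: borrow the next millisecond.
                    self._physical += 1
                    self._logical = 0
            return (self._physical << LOGICAL_BITS) | self._logical
```

Because every caller goes through the same lock and the physical part never decreases, each returned value is strictly greater than the previous one, even if the local clock jumps backwards.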

HTAP: One Size Fits All?

An important idea in the database world is that specialized databases will outperform general-purpose databases. Michael Stonebraker, an A. M. Turing Award Laureate and one of the most influential people in the database world, also discussed this in his paper, "One Size Fits All": An Idea Whose Time Has Come and Gone.

This is a rational judgment because it's tough enough to build a database that supports either Online Transactional Processing (OLTP) or Online Analytical Processing (OLAP) workloads, let alone one that supports both at the same time. But the dilemma is that today many users face growing demand for mixed OLTP and OLAP workloads. How do we crack this?

TiDB Operator Source Code Reading (V): Backup and Restore

In our last article, we learned how to implement a component control loop in TiDB Operator. This time, I’ll move on to a new but important topic: backup and restore.

Backup and restore are two of the most important and frequently used operations when you maintain a database. To ensure data safety, database maintainers usually need a set of scripts that automatically back up the data and recover the dataset when data is corrupted. A well-designed backup and restore platform should allow you to:

How to Troubleshoot RocksDB Write Stalls in TiKV

TiDB, an open-source, distributed NewSQL database, can experience write performance degradation for several reasons. This troubleshooting guide discusses write performance degradation related to RocksDB's built-in write stall feature. RocksDB is an open-source, mature, high-performance key-value store. It is optimized for fast, low-latency storage such as flash drives and high-speed disk drives.

We will also discuss how to resolve this issue in TiKV, a highly scalable, low-latency, and easy-to-use key-value database that uses RocksDB as its storage engine.

TiDB Operator Source Code Reading (Part 4): Implementing a Component Control Loop

In our last article, we introduced how TiDB Operator orchestrates control loop events to manage the lifecycles of TiDB components. The TidbCluster controller manages TiDB components' lifecycles, and the member manager of each TiDB component encapsulates that component's specific management logic.

In this post, I'll explain in detail how we implement a component control loop by taking PD as an example. You'll learn about the PD member manager and its lifecycle management operations. I'll also compare other components with PD and show their differences.

Easy Local Development with TiDB

When you develop an application, you begin by coding and testing in your local environment. Many applications interface with a database, so in this early stage, you might use SQLite rather than the database brand used in production. This is an issue, however, because ideally, you want to develop the application with the production database in mind.

With a distributed system, however, setting up, starting, and stopping the required components can become error-prone and time-consuming.

Best Practices for TiDB Load Balancing

Load balancing distributes connections from applications to TiDB Server instances. This helps to distribute the load over multiple machines and, depending on the load balancing option, can automatically reroute connections if a TiDB instance becomes unavailable.
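The behavior described above, spreading connections across instances and routing around one that is unavailable, can be sketched as a round-robin picker with a health set. This is an illustrative toy rather than any particular load balancer's implementation; the class name and backend addresses are hypothetical (4000 is TiDB's default SQL port).

```python
class RoundRobinBalancer:
    """Hands out backends in rotation, skipping any marked as down."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._next = 0

    def mark_down(self, backend):
        # Called when a health check or connection attempt fails.
        self._healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a previously failed backend passes a health check.
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, starting at the cursor.
        for _ in range(len(self._backends)):
            backend = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if backend in self._healthy:
                return backend
        raise RuntimeError("no healthy TiDB instance available")


lb = RoundRobinBalancer(["tidb-0:4000", "tidb-1:4000", "tidb-2:4000"])
lb.mark_down("tidb-1:4000")  # subsequent picks skip this instance
```

Round-robin is only one policy; options such as least-connections differ in how `pick` chooses among the healthy backends.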

Load Balancing Types

There are many different ways to implement a load balancer. This section describes the most common types.

How We Trace a KV Database With Less Than 5% Performance Impact

TiKV is a distributed key-value database. It has higher performance requirements than a regular application, so tracing tools must have minimal impact. This article describes how we trace the time consumption of every request in TiKV with less than 5% performance impact.

Background Knowledge

Logs, metrics, and traces are the three pillars of system observability. The following figure shows their relationship:

TiDE: Developing a Distributed Database in a Breeze

Contributing to TiDB's codebase is not easy, especially for newbies. As a distributed database, TiDB has multiple components and numerous tools, written in multiple languages, including Go and Rust. Getting started with such a complicated system takes quite an effort.

So, in order to welcome newcomers to TiDB and make it easier for them to contribute to our community, we've developed a TiDB integrated development environment: TiDE. Created during TiDB Hackathon 2020, TiDE is a Visual Studio Code extension that makes developing TiDB a breeze. With this extension, developing a distributed system can be as easy as developing a local one.

Managing Your Data Lifecycle With Time to Live Tables

Your organization is growing with each passing day; so is your data. More data brings more business opportunities, but it also begets higher storage costs. Do you want a better way to manage the cost? We want the same thing for our open source database, TiDB.

TiDB is a distributed SQL database designed for massive data. Our goal is to support large-scale datasets at a reasonable cost. At TiDB Hackathon 2020, we took a big step in that direction. We introduced a feature, the time to live (TTL) table, that enables TiDB to automatically manage the lifecycle of data according to its lifetime. TiDB makes sure every portion of its resources is consumed by high-value, fresh data.
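The core rule behind a TTL table, dropping a row once its age exceeds the configured lifetime, can be shown in a few lines. This is a conceptual sketch only: the `Row` type and `purge_expired` function are hypothetical, and it says nothing about how the Hackathon project implements expiry inside TiDB's storage layer.

```python
from dataclasses import dataclass


@dataclass
class Row:
    key: str
    written_at: float  # seconds since some epoch


def purge_expired(rows, ttl_seconds, now):
    """Return only the rows whose age is still below the TTL."""
    return [row for row in rows if now - row.written_at < ttl_seconds]
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the rule deterministic and easy to test.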

TiDB Operator Source Code Reading (Part 2): Operator Pattern

In my last article, I introduced the TiDB Operator's architecture and what it is capable of. But how does TiDB Operator code run? How does TiDB Operator manage the lifecycle of each component in the TiDB cluster?

In this post, I'll present Kubernetes's Operator pattern and how it is implemented in TiDB Operator. More specifically, we'll go through TiDB Operator's major control loop, from its entry point to the trigger of the lifecycle management.
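At its heart, the Operator pattern is a control loop that compares the desired state with the observed state and derives the actions needed to close the gap. The sketch below illustrates that comparison step only; `ComponentState` and `reconcile` are hypothetical names, and TiDB Operator's actual controllers are written in Go against the Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class ComponentState:
    replicas: int
    version: str


def reconcile(desired, actual):
    """Return the actions needed to move `actual` toward `desired`."""
    actions = []
    if actual.version != desired.version:
        actions.append(f"upgrade {actual.version} -> {desired.version}")
    if actual.replicas < desired.replicas:
        actions.append(f"scale out by {desired.replicas - actual.replicas}")
    elif actual.replicas > desired.replicas:
        actions.append(f"scale in by {actual.replicas - desired.replicas}")
    return actions  # an empty list means the component has converged
```

A controller would call a function like this every time a watched resource changes, apply the resulting actions, and repeat until the list comes back empty.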

TiDB on KubeSphere: Release a Cloud-Native Distributed Database to the KubeSphere App Store

KubeSphere, an open-source, distributed operating system with Kubernetes as its kernel, helps you manage cloud-native applications on a GUI container platform. TiDB is an open-source, cloud-native database that runs smoothly on Kubernetes.

In my last blog post, I talked about how to deploy TiDB on KubeSphere. If you want TiDB to be available to tenants across the workspace, you can release the TiDB app to the KubeSphere public repository, also known as the KubeSphere App Store. In this way, all tenants can easily deploy TiDB in their project, without having to repeat the same steps.

TiDB on KubeSphere: Using Cloud-Native Distributed Database on Kubernetes Platform Tailored for Hybrid Cloud

In a world where Kubernetes has become the de facto standard for building application services that span multiple containers, running a cloud-native distributed database is an important part of the Kubernetes experience. TiDB, a cloud-native, open-source NewSQL database that supports hybrid transactional and analytical processing (HTAP) workloads, meets those needs admirably. Its architecture is suitable for Kubernetes, and it is MySQL-compatible. TiDB also features horizontal scalability, strong consistency, and high availability.

In addition to TiDB, I am also using KubeSphere, an open-source distributed operating system that manages cloud-native applications with Kubernetes as its kernel. It provides a plug-and-play architecture for the seamless integration of third-party applications to boost its ecosystem. KubeSphere can run anywhere because it is highly pluggable and requires no hacking into Kubernetes.