Today, distributed databases lead the market, but time synchronization in distributed systems remains a hard nut to crack. Due to the clock skew, the time in different nodes of a distributed database cannot be synchronized perfectly. Many computer scientists have proposed solutions such as the logic clock by Leslie Lamport (the 2013 Turing Award winner), the hybrid logical clock, and TrueTime.
PingCAP’s TiDB, an open-source distributed NewSQL database, adopts timestamp oracle (TSO) to deliver the time service and uses a centralized control service — Placement Driver (PD) — to allocate the monotonically increasing timestamps.
While it may not be a daunting task to set up Kafka on a local machine or within a particular network and produce/consume messages, people do face challenges when they try to make it work across the network.
Let’s consider a hybrid scenario where your software solution is distributed across two different platforms (say AWS and Azure or on-premise and Azure), and there is a need to route messages from Kafka cluster hosted on one platform to the one hosted on another. This could be a valid business scenario wherein you are trying to consolidate your solution with one cloud platform, and in the interim, you need to have this routing in place till you complete your migration. Even in the long term, there may be a need to maintain solution across multiple platforms for various business and technical reasons.
To net the full benefits of a distributed system, applications' underlying architectures must achieve various company-level objectives including agility, velocity, and speed to market. Implementing a reliable observability strategy, plus the right tools for your specific business requirements, will give teams the insights needed to properly operate and manage their entire distributed ecosystem on an ongoing basis.
This Refcard covers the three pillars of observability — metrics, logs, and traces — and how they not only complement an organization's monitoring efforts but also work together to help profile, interpret, and optimize system-wide performance.
Milvus is an open-source vector database designed to manage massive million, billion, or even trillion vector datasets. Milvus has broad applications spanning new drug discovery, computer vision, autonomous driving, recommendation engines, chatbots, and much more.
In March 2021, Zilliz, the company behind Milvus released the platform's first long-term support version — Milvus v1.0. After months of extensive testing, a stable, production-ready version of the world's most popular vector database is ready for prime time. This blog article covers some Milvus fundamentals as well as key features of v1.0.
Linux's memory management system is clear to the user. However, if you're not familiar with its working principles, you might meet unexpected performance issues. That's especially true for sophisticated software like databases. When databases are running in Linux, even small system variations might impact performance.
After an in-depth investigation, we found that Transparent Huge Page (THP), a Linux memory management feature, often slows down database performance. In this post, I'll describe how THP causes performance to fluctuate, the typical symptoms, and our recommended solutions.
PalFish is a fast-growing online education platform that focuses on English learning. It offers tailored English speaking experience to English as a Second Language (ESL) students. As of October 2020, PalFish has over 40 million users, of which more than 2 million are paid users.
As our business rapidly grew, the surge of data posed a severe challenge to our MongoDB database. MongoDB (2.x and 3.x) does not support transactions and has no predefined schema to directly regulate data. This blocked our business growth. To solve these problems, we migrated from MongoDB to TiDB, an open-source, MySQL-compatible, distributed SQL database that supports Hybrid Transactional/Analytical Processing (HTAP) workloads. This turned out to be the right move.
TiDB, an open-source, distributed SQL database, provides detailed monitoring metrics through Prometheus and Grafana. These metrics are often the key to troubleshooting performance problems in the cluster.
However, for novice TiDB users, understanding hundreds of monitoring metrics can be overwhelming. You may wonder:
Chaos Mesh ® is an open-source chaos engineering platform for Kubernetes. Although it provides rich capabilities to simulate abnormal system conditions, it still only solves a fraction of the Chaos Engineering puzzle. Besides fault injection, a full chaos engineering application consists of hypothesizing around defined steady states, running experiments in production, validating the system via test cases, and automating the testing.
This article describes how we use TiPocket, an automated testing framework to build a full Chaos Engineering testing loop for TiDB, our distributed database.
In this post, we are going to cover the pod priority class, pod disruption budget, and the relationship of these constructs' with pod eviction. Okay, enough of talking, let’s start with pod priority class.
PriorityClass and Preemption
PriorityClass is a stable Kubernetes object from version 1.14, and it is a part of the scheduling group used for defining a mapping between priority class name and the integer value of the priority. PriorityClass is straightforward to understand; the higher the value of the integer, the higher is the priority. Take, for example, a PriorityClass with an integer value of ten and another with an integer value of twenty; the later one holds a higher priority than the first one.
Chaos Engineering is a way to test a production software system's robustness by simulating unusual or disruptive conditions. For many people, however, the transition from learning Chaos Engineering to practicing it on their own systems is daunting. It sounds like one of those big ideas that require a fully-equipped team to plan ahead. Well, it doesn't have to be. To get started with chaos experimenting, you may be just one suitable platform away.
Chaos Mesh is an easy-to-use, open-source, cloud-native Chaos Engineering platform that orchestrates chaos in Kubernetes environments. This 10-minute tutorial will help you quickly get started with Chaos Engineering and run your first chaos experiment with Chaos Mesh.
The query execution engine plays an important role in database system performance. TiDB, an open-source MySQL-compatible Hybrid Transactional/Analytical Processing (HTAP) database, implemented the widely-used Volcano model to evaluate queries. Unfortunately, when querying a large dataset, the Volcano model caused high interpretation overhead and low CPU cache hit rates.
Inspired by the paper MonetDB/X100: Hyper-Pipelining Query Execution, we began to employ vectorized execution in TiDB to improve query performance. (Besides this article, we also suggest you take Andy Pavlo’s course on Query Execution, which details principles about execution models and expression evaluation.)
Hi Bernd, tell us who you are and what lead you into microservices?
Our customers. Over the last years they adopted that architectural style more and more – and of course, had questions around it. Then I saw a lot of misunderstandings around how to implement end to end business processes or workflows in these microservices architectures – as people tried to avoid mistakes made with BPM and SOA. That got me into thinking about that whole topic and I got quite enthusiastic about it – which lead to a couple of articles and more than 100 talks around the world.