ClickHouse or StarRocks? Here is a Detailed Comparison

A New Choice of Columnar DBMS

Hadoop was developed 13 years ago. Its vendors have been enthusiastic about offering open-source plug-ins as well as technical solutions. On one hand, this has solved users' problems; on the other, it has driven up maintenance costs, and Hadoop has gradually lost market share as a result. Users are calling for a simple, scalable, low-cost database, so columnar databases have been getting more attention.

Brief Intro to ClickHouse

ClickHouse is an open-source database developed by Yandex, the company behind Russia's largest search engine. It performs better than many commercial MPP databases, such as Vertica or InfiniDB. ClickHouse has also gained popularity among companies other than Yandex: ordinary analytical workloads, which are more structured and see fewer data changes, can be flattened into wide tables and then loaded into ClickHouse.

Building a ClickHouse Visualization with Altinity and Cube

I like counting stars. Especially stars on GitHub! Tracking the growth of popular GitHub repositories has always been interesting to me. That's why I decided to use the public data set of GitHub events in ClickHouse to create dashboards with actionable metrics.

In this tutorial, I'll explain how to build a custom front-end visualization that fetches data from a ClickHouse instance. I'll use a managed instance of ClickHouse from Altinity Cloud and Cube Cloud as the metrics API layer.

Tips for High-Performance ClickHouse Clusters with S3 Object Storage

In our previous blog posts, we explained the various ways that ClickHouse can use S3 object storage. To keep things simple we generally focused on single-node operation. However, ClickHouse often runs in a cluster, and cluster operation poses some interesting questions regarding S3 usage. They include parallelizing data load across nodes, benefits of horizontal vs. vertical scaling, and avoiding unnecessary replication. 

In this article, we will discuss how ClickHouse clusters can be used with S3 efficiently thanks to two important new features: the ‘s3Cluster’ table function and zero-copy replication. We hope our description will pave the way for more ClickHouse users to exploit scalable, inexpensive object storage in their deployments.

Goodbye XML, Hello SQL! ClickHouse User Management Goes Pro

Access control is one of the essential features of database management. Starting in late 2019, ClickHouse contributor Vitaly Baranov began to introduce robust, full-featured Role-Based Access Control (RBAC). As a result of this work, which included a huge number of tests implemented by the Altinity QA team, ClickHouse can now rightfully boast enterprise-level access control. Best of all, the commands are all in SQL.

User management is the front gate of RBAC. It controls access to ClickHouse itself. This article digs into new commands like CREATE USER that allow you to create, change, and delete users conveniently. We’ll focus on ways to control authentication for single ClickHouse servers. 

Monitoring ClickHouse on Kubernetes With Prometheus and Grafana

The ClickHouse Kubernetes operator is great at spinning up data warehouse clusters on Kubernetes. Once they are up, though, how can you see what they are actually doing? It’s time for monitoring!  

In this article, we’ll explore how to configure two popular tools for building monitoring systems: Prometheus and Grafana. The ClickHouse Kubernetes operator includes scripts to set these up quickly and add a basic dashboard for clusters. 

How to Mutate Data in a System Designed for Immutable Data

In a post published on our blog earlier this year, we described some of the decision-making that went into the design and architecture of Snuba, the primary storage and query service for Sentry’s event data. This project started out of necessity; months earlier, we discovered that the time and effort required to continuously scale our existing PostgreSQL-based solution for indexing event data was becoming an unsustainable burden.

Sentry’s growth led to increased write and read load on our databases, and, even after countless rounds of query and index optimizations, we felt that our databases were always a hair’s breadth from the next performance tipping point or query planner meltdown. Increased write load also led to increased storage requirements (if you’re doing more writes, you’re going to need more places to put them), and we were running what felt like an inordinate number of servers with a lot of disks for the data they were responsible for storing. We knew that something had to change.

ClickHouse Monitoring: Key Metrics to Monitor

If you keep up to date with the latest developments in the world of databases, you are probably familiar with ClickHouse, an open-source columnar database management system designed for OLAP. Developed by Yandex, ClickHouse was open-sourced in 2016, which makes it one of the most recent database management systems to become widely available as an open-source tool.

Because ClickHouse supports real-time, high-speed reporting, it's a powerful tool, especially for modern DevOps teams who need fast and flexible ways of analyzing data.