Setting Up Apache Druid on Kubernetes in Under 30 Minutes

I was introduced to Apache Druid a year and a half ago. Since then, I've focused on operationalizing Apache Druid on Kubernetes (K8s). I started with Helm charts to spin up Druid clusters in this complex, distributed Druid + K8s system, but I soon realized that Helm charts alone were not enough.

I’ve written Golang-based operators and custom Kubernetes controllers for different use cases, and contributed to various open-source (OSS) operators, so I was familiar with extending Kubernetes using Custom Resource Definitions (CRDs). I was thrilled to discover the Druid Operator, which had just been open-sourced in the Druid community in late 2019. The project was less than a month old when I started contributing to it.

Analytics on Kafka Event Streams Using Druid, Elasticsearch, and Rockset

Everything you need to get started analyzing Kafka Event Streams

Events are messages that a system sends to notify operators or other systems about a change in its domain. As event-driven architectures powered by systems like Apache Kafka become more prominent, many applications in the modern software stack now rely on events and messages to operate effectively. In this blog post, we will examine three different data backends for event data: Apache Druid, Elasticsearch, and Rockset.

Using Event Data

Events are commonly used by systems in the following ways: