Kafka on Kubernetes, the Strimzi Way! (Part 1)

Some of my previous blog posts (such as Kafka Connect on Kubernetes, the easy way!) demonstrate how to use Kafka Connect in a Kubernetes-native way. This is the first in a series of blog posts that will cover Apache Kafka on Kubernetes using the Strimzi Operator. In this post, we will start off with the simplest possible setup, i.e. a single-node Kafka (and ZooKeeper) cluster (a minimal manifest for it is sketched below), and cover:

  • Strimzi overview and setup
  • Kafka cluster installation
  • Kubernetes resources used/created behind the scenes
  • Testing the Kafka setup using clients within the Kubernetes cluster

The code is available on GitHub - https://github.com/abhirockzz/kafka-kubernetes-strimzi
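To give a flavour of what this looks like, here is a minimal sketch of the kind of Kafka custom resource the rest of this post builds on. It assumes the Strimzi operator is already installed in the cluster; the cluster name (my-kafka-cluster), the apiVersion and some field names vary across Strimzi releases, so treat it as illustrative rather than copy-paste ready.

```yaml
# A minimal, single-node Kafka (and ZooKeeper) cluster managed by Strimzi.
# Illustrative sketch: apiVersion and some field names differ across Strimzi versions.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-kafka-cluster
spec:
  kafka:
    replicas: 1                      # single Kafka broker
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
    storage:
      type: ephemeral                # no persistent volumes, fine for experimentation
  zookeeper:
    replicas: 1                      # single ZooKeeper node
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}
```

Applying a resource like this is all it takes: the operator then creates the underlying StatefulSets, Services, ConfigMaps and Secrets, which is exactly what the "behind the scenes" part of this post digs into.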

Kafka Connect on Kubernetes The Easy Way!

This tutorial shows how to set up and use Kafka Connect on Kubernetes using Strimzi, with the help of an example.

Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems using source and sink connectors. Although it's not too hard to deploy a Kafka Connect cluster on Kubernetes (just "DIY"!), I love the fact that Strimzi enables a Kubernetes-native way of doing this using the Operator pattern with the help of Custom Resource Definitions.
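For context, a Strimzi-managed Connect cluster is itself declared as a custom resource. The sketch below is a hypothetical minimal example, assuming the Strimzi CRDs are installed; the cluster name, bootstrap address and storage topic names are placeholders.

```yaml
# Hypothetical minimal KafkaConnect cluster managed by the Strimzi operator.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster
  annotations:
    strimzi.io/use-connector-resources: "true"   # manage connectors as KafkaConnector resources
spec:
  replicas: 1
  bootstrapServers: my-kafka-cluster-kafka-bootstrap:9092   # bootstrap Service created by Strimzi
  config:
    group.id: my-connect-cluster
    offset.storage.topic: connect-offsets
    config.storage.topic: connect-configs
    status.storage.topic: connect-status
```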

Azure Event Hubs: Role Based Access Control (RBAC) in action

Azure Event Hubs is a streaming platform and event ingestion service that can receive and process millions of events per second. In this blog, we are going to cover one of the security aspects related to Azure Event Hubs.

Shared Access Signature (SAS) is a commonly used authentication mechanism for Azure Event Hubs. It can be used to enforce granular control over the type of access you want to grant, and it works by configuring rules on Event Hubs resources (a namespace or an event hub). However, it is recommended that you use Azure AD credentials (over SAS) whenever possible, since they provide similar capabilities without the need to manage SAS tokens or worry about revoking a compromised SAS.

Manage Azure Event Hubs With Azure Service Operator on Kubernetes

Azure Service Operator is an open source project to help you provision and manage Azure services using Kubernetes. Developers can use it to provision Azure services from any environment, be it Azure, any other cloud provider, or on-premises — Kubernetes is the only common denominator!

It can also be included as a part of CI/CD pipelines to create, use, and tear down Azure resources on demand. Behind the scenes, all the heavy lifting is taken care of by a combination of Custom Resource Definitions, which define the Azure resources, and the corresponding Kubernetes operator(s), which ensure that the state declared in those resources is reflected in Azure as well.
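As a rough illustration of the pattern, provisioning an Event Hub could look something like the sketch below. The kind, apiVersion and field names are assumptions based on an older (v1alpha1-era) Azure Service Operator release, and the resource group and namespace names are placeholders; check the operator's own samples for the exact schema of the version you install.

```yaml
# Illustrative only: declares an Event Hub inside an existing namespace and resource group.
# Kind, apiVersion and field names are assumptions; verify against your installed CRDs.
apiVersion: azure.microsoft.com/v1alpha1
kind: Eventhub
metadata:
  name: my-eventhub
spec:
  location: eastus
  resourceGroup: my-resource-group          # placeholder resource group
  namespace: my-eventhubs-namespace         # placeholder Event Hubs namespace
  properties:
    messageRetentionInDays: 1
    partitionCount: 2
```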

Change Data Capture Architecture Using Debezium, Postgres, and Kafka

Change Data Capture (CDC) is a technique used to track row-level changes in database tables in response to create, update, and delete operations. Different databases use different techniques to expose these change data events - for example, logical decoding in PostgreSQL and the binary log (binlog) in MySQL. This is a powerful capability, but it is useful only if there is a way to tap into these event logs and make them available to other services that depend on that information.

Debezium does just that! It is a distributed platform that builds on top of the Change Data Capture features available in different databases. It provides a set of Kafka Connect connectors that tap into row-level changes (using CDC) in database tables and convert them into event streams. These event streams are sent to Apache Kafka, a scalable event streaming platform - a perfect fit! Once the change log events are in Kafka, they are available to all the downstream applications.
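Tying the pieces together, a Debezium PostgreSQL connector can be declared as a Strimzi KafkaConnector resource running on the Connect cluster sketched earlier. The example below is a sketch with placeholder database host, credentials and names; exact connector options depend on the Debezium version (for instance, newer releases replace database.server.name with topic.prefix).

```yaml
# Sketch of a Debezium PostgreSQL source connector managed via Strimzi's KafkaConnector CRD.
# Database host, credentials and names are placeholders.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: postgres-source-connector
  labels:
    strimzi.io/cluster: my-connect-cluster    # the KafkaConnect cluster that runs this connector
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1
  config:
    database.hostname: postgres
    database.port: "5432"
    database.user: postgres
    database.password: password               # use a Kubernetes Secret in real deployments
    database.dbname: inventory
    database.server.name: dbserver1           # prefix for the change event topics
    plugin.name: pgoutput                     # PostgreSQL logical decoding plugin
```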