Apache Kafka Essentials

Dive into Apache Kafka: Readers will review its history and fundamental components — Pub/Sub, Kafka Connect, and Kafka Streams. Key concepts in these areas are supplemented with detailed code examples that demonstrate producing and consuming data, using connectors for easy data streaming and transformation, performing common operations in KStreams, and more.

Developing an Enterprise-Level Apache Cassandra Sink Connector for Apache Pulsar

When DataStax started investing in streaming with Apache Pulsar™, we knew that one of the first things people would want to do was connect existing enterprise data sources to Apache Cassandra™ using Pulsar.

Apache Pulsar has a powerful framework called Pulsar IO that enables exactly this kind of use case. At DataStax, we already had a best-in-class Kafka Connect sink that lets you store structured data coming from one or more Kafka topics in DataStax Enterprise, Apache Cassandra, and Astra.
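
To make the starting point concrete, here is a minimal sketch of a Kafka Connect sink configuration for the DataStax connector. The connector name, topic, keyspace, and table are hypothetical, and the property names should be verified against the documentation for your connector version:

```properties
# Sketch of a DataStax Kafka Connect sink configuration.
# The connector name, topic ("orders"), keyspace ("demo"), and table
# ("orders") are hypothetical; verify property names for your version.
name=cassandra-sink-example
connector.class=com.datastax.oss.kafka.sink.CassandraSinkConnector
tasks.max=1
topics=orders
contactPoints=cassandra1.example.com
loadBalancing.localDc=dc1
# Map fields of records on the "orders" topic to columns of demo.orders
topic.orders.demo.orders.mapping=id=value.id, amount=value.amount
```

The mapping line tells the sink which fields of the incoming records land in which columns of the target table.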

Streaming Data From Files Into Multi-Broker Kafka Clusters

There are multiple ways to ingest data streams into an Apache Kafka topic and subsequently deliver them to the various types of consumers subscribed to that topic. Data collected continuously from the topic by consumers passes through multiple data pipelines and then through stream processing engines such as Apache Spark, Apache Flink, or Amazon Kinesis before eventually landing in the real-time applications that deliver the final data-driven decisions. Across finance, manufacturing, insurance, telecom, healthcare, commerce, and more, real-time applications are becoming the best way for organizations to take immediate action and gain insights from up-to-date data. Today, Apache Kafka forms the central nervous system that carries data from every part of the business to the operational hubs where decisions are made.
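
Before a file's contents can be streamed anywhere, the target topic has to exist on the cluster. As a minimal sketch in Java, assuming a three-broker cluster at the addresses shown (the topic name, partition count, and replication factor are illustrative), a replicated topic can be created with Kafka's AdminClient:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

// Creates a topic replicated across a multi-broker cluster.
// Broker addresses and topic name are illustrative.
public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, replication factor 3 (requires at least 3 brokers)
            NewTopic topic = new NewTopic("file-lines", 3, (short) 3);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```

With a replication factor of three, the topic's data survives the loss of any single broker, which is the point of running a multi-broker cluster in the first place.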

Text files contain unformatted ASCII text and are commonly used to store information. Each line of the file represents a data record, and the file can grow continuously as new records arrive. Every addition of a new line or lines to the text file, whether by humans or applications (with no modification of lines already written), that is subsequently moved or sent to a different location can be considered data streaming from the file. Each newly appended line or row can then be analyzed continuously by exporting it to the Kafka topic and importing it with the consumers subscribed to that topic.
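
A minimal sketch of that tailing pattern in plain Java, assuming the illustrative topic and brokers from the previous example and a hypothetical input file at /tmp/records.txt:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

// Tails a text file and publishes every newly appended line to a Kafka topic.
// A production tailer would also handle partial lines and file rotation.
public class FileStreamProducer {
    public static void main(String[] args) throws IOException, InterruptedException {
        Properties props = new Properties();
        // List several brokers so the client survives one of them failing.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props);
             BufferedReader reader = new BufferedReader(new FileReader("/tmp/records.txt"))) {
            while (true) {
                String line = reader.readLine();
                if (line == null) {
                    Thread.sleep(500);   // no new data yet; poll again shortly
                } else {
                    producer.send(new ProducerRecord<>("file-lines", line));
                }
            }
        }
    }
}
```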

Kafka for XML Message Integration and Processing

XML messages and XML Schema are not very common in the Apache Kafka and event streaming world! Why? Many people call XML legacy. It is complex, verbose, and often associated with the ugly WS-* hell (SOAP, WSDL, etc.). On the other hand, every company older than five years uses XML. It is well understood, provides a good structure, and is human- and machine-readable.

This post is not meant to start another flame war between XML and other technologies such as JSON (which now also provides JSON Schema), Avro, or Protobuf. Instead, I will walk you through the three main approaches to integrating Kafka with XML messages, as there is still vast demand for implementing this integration today (often to integrate legacy applications and middleware).
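
One straightforward option is to treat each XML message as an opaque string and parse it in the consuming application, keeping the broker entirely unaware of XML. A minimal sketch using the plain Kafka consumer API and the JDK's built-in DOM parser (the broker address, consumer group, and topic name are assumptions):

```java
import java.io.StringReader;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Consumes XML payloads as plain strings and parses them in application code.
public class XmlTopicConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker
        props.put("group.id", "xml-demo");                   // hypothetical group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("xml-orders")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    Document doc = builder.parse(
                            new InputSource(new StringReader(record.value())));
                    System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
                }
            }
        }
    }
}
```

Once the Document is in hand, the payload can be validated against an XML Schema or converted to JSON or Avro further downstream.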

Reading AWS S3 File Content to Kafka Topic

Apache Camel

Apache Camel is an open-source framework for message-oriented middleware with a rule-based routing and mediation engine. It provides a Java object-based implementation of the Enterprise Integration Patterns, using an application programming interface to configure routing and mediation rules.
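
Applied to the use case in this article's title, a Camel route can read objects from an S3 bucket and hand their contents to Kafka. A minimal sketch in Camel's Java DSL, assuming Camel 3.x with the camel-aws2-s3 and camel-kafka components on the classpath; the bucket name, credentials, region, topic, and broker address are all placeholders:

```java
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.main.Main;

// A Camel route that reads objects from an S3 bucket and publishes their
// contents to a Kafka topic. Bucket, credentials, and topic are placeholders.
public class S3ToKafkaRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("aws2-s3://my-bucket?accessKey=RAW(KEY)&secretKey=RAW(SECRET)&region=us-east-1")
            .convertBodyTo(String.class)                   // treat object content as text
            .to("kafka:s3-files?brokers=localhost:9092");  // publish to the topic
    }

    public static void main(String[] args) throws Exception {
        Main main = new Main();
        main.configure().addRoutesBuilder(new S3ToKafkaRoute());
        main.run(args);   // runs until terminated
    }
}
```

The route definition is the entirety of the integration logic; Camel's components handle the S3 polling and the Kafka producer underneath.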

Red Hat AMQ Streams

Red Hat AMQ Streams is a massively scalable, distributed, and high-performance data streaming platform based on the Apache ZooKeeper and Apache Kafka projects.

Apache Kafka and SAP ERP Integration Options

A question I get every week from customers across the globe is, "How can I integrate my SAP system with Apache Kafka?" This post explores the various alternatives, including connectors, third-party tools, and custom glue code, and the trade-offs between the different options.

After exploring what SAP is, I will discuss several integration options between Apache Kafka and SAP systems: