Building Scalable Streaming Applications

DataStax has recently released its Astra Streaming, enabling developers to build streaming applications on top of an elastically scalable, multi-cloud messaging and event streaming platform powered by Apache Pulsar. This article will walk you through a short demo that will provide a great starting point for familiarizing yourself with this powerful new streaming service.

Here’s what you will learn:

Why Pulsar Beats Kafka for a Scalable, Distributed Data Architecture

The leading open-source event streaming platforms are Apache Kafka and Apache Pulsar. For enterprise architects and application developers, choosing the right event streaming approach is critical, as these technologies will help their apps scale up around data to support operations in production.

Everyone wants results faster. We want applications that know what we want, even before we know ourselves. We want systems that constantly check for fraud or security issues to protect our data. We want applications that are smart enough to react and change plans when faced with the unexpected. And we want those services to be continuously available.

Bring Streaming to Apache Cassandra with Apache Pulsar

Twitch, YouTube, Instagram, Facebook — virtually every major brand nowadays uses live streaming to connect and engage their audience. For enterprises and developers building cloud-native applications, this growing trend creates a need for streaming technologies that can reliably handle the rush of massive amounts of data, while also being flexible and easy to manage for developers.

One such technology is Apache Pulsar® — an open-source, distributed messaging and streaming platform that’s easy to deploy, simple to scale, and packed with developer-friendly APIs. So the next question is: how can you stream from Pulsar to Apache Cassandra®, the powerful NoSQL database designed to support data-heavy applications in the cloud?

Join our beginner-friendly Pulsar workshop on YouTube and learn how to connect Pulsar with Cassandra for streaming! In this post, we’ll set the scene with an introduction to Pulsar and guide you through four hands-on exercises where you’ll use these free, cloud-native technologies: Katacoda, Kesque, GitPod, and DataStax Astra DB. Each exercise will also be linked to the step-by-step instructions on the DataStax Developers GitHub wiki.

Simplify Migrating From Kafka to Pulsar With Kafka Connect Support

Large-scale implementations of any system, such as the event-streaming platform Apache Kafka, often involve customizations and tools and plugins developed in-house. When it’s time to transition from one system to another, the task can become complicated, drawn-out, and error-prone. Often the benefits of an alternative system (which can include significant cost savings and other efficiencies) are outweighed by the risks and costs of migration. As a result, an organization can end up locked into a suboptimal situation, footing a bigger bill than necessary and missing out on modern features that help move the business forward faster. 

These risks and costs can be mitigated by making the transition process iterative, breaking off the vendor lock-in in small, manageable steps, and avoiding the "big bang" switch that often results in delayed delivery and increases the cost of running two systems in parallel for A|B testing. 

Comparing Pulsar and Kafka From a CTO’s Point of View

When upper management assesses a new technology, they view it from a different perspective than middle management, architects, or data engineers. Upper management is not just looking at benchmarks and feature lists; they’re looking for long-term viability and how it gives their company a clear competitive edge. In addition, they’re optimizing for time to market and costs.

As the Managing Director for Big Data Institute, technology assessment is a key part of my role. We help companies identify and adopt the best technologies for their business needs. Our clients especially appreciate our vendor-neutral approach.