Building Scalable Streaming Applications

DataStax recently released Astra Streaming, enabling developers to build streaming applications on top of an elastically scalable, multi-cloud messaging and event streaming platform powered by Apache Pulsar. This article walks you through a short demo that provides a great starting point for familiarizing yourself with this powerful new streaming service.

Here’s what you will learn:

Why Pulsar Beats Kafka for a Scalable, Distributed Data Architecture

The leading open-source event streaming platforms are Apache Kafka and Apache Pulsar. For enterprise architects and application developers, choosing the right event streaming approach is critical, as these technologies determine how their applications scale around data to support production operations.

Everyone wants results faster. We want applications that know what we want, even before we know ourselves. We want systems that constantly check for fraud or security issues to protect our data. We want applications that are smart enough to react and change plans when faced with the unexpected. And we want those services to be continuously available.

Bring Streaming to Apache Cassandra with Apache Pulsar

Twitch, YouTube, Instagram, Facebook — virtually every major brand nowadays uses live streaming to connect and engage their audience. For enterprises and developers building cloud-native applications, this growing trend creates a need for streaming technologies that can reliably handle the rush of massive amounts of data, while also being flexible and easy to manage for developers.

One such technology is Apache Pulsar® — an open-source, distributed messaging and streaming platform that’s easy to deploy, simple to scale, and packed with developer-friendly APIs. So the next question is: how can you stream from Pulsar to Apache Cassandra®, the powerful NoSQL database designed to support data-heavy applications in the cloud?

Join our beginner-friendly Pulsar workshop on YouTube and learn how to connect Pulsar with Cassandra for streaming! In this post, we’ll set the scene with an introduction to Pulsar and guide you through four hands-on exercises where you’ll use these free, cloud-native technologies: Katacoda, Kesque, GitPod, and DataStax Astra DB. Each exercise will also be linked to the step-by-step instructions on the DataStax Developers GitHub wiki.
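
If you want to poke at Pulsar before the workshop, the round trip is only a few lines. Here is a minimal sketch using the pulsar-client Python package; the service URL, topic, and subscription names are placeholders, not part of the workshop materials:

    import pulsar

    client = pulsar.Client("pulsar://localhost:6650")

    # Create the subscription first so the message below is retained for it.
    consumer = client.subscribe("sensor-events", subscription_name="cassandra-writer")

    producer = client.create_producer("sensor-events")
    producer.send(b"reading:21.5")

    msg = consumer.receive()
    print(msg.data())  # from here, a sink could write each event to Cassandra
    consumer.acknowledge(msg)

    client.close()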

7 Reasons to Choose Apache Pulsar over Apache Kafka

So why did we build our messaging service using Apache Pulsar?

At DataStax, our mission is to empower developers to build cloud-native distributed applications by making cloud-agnostic, high-performance messaging technology easily available to everyone. Developers want to write distributed applications or microservices but don’t want the hassle of managing complex message infrastructure or getting locked into a particular cloud vendor. They need a solution that just works. Everywhere.

Apache Kafka Essentials

Dive into Apache Kafka: Readers will review its history and fundamental components — Pub/Sub, Kafka Connect, and Kafka Streams. Key concepts in these areas are supplemented with detailed code examples that demonstrate producing and consuming data, using connectors for easy data streaming and transformation, performing common operations in KStreams, and more.
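
As a taste of what those examples cover, here is a minimal produce/consume round trip sketched with the confluent-kafka Python client (the ebook's own examples may differ; the broker address and topic name are placeholders):

    from confluent_kafka import Consumer, Producer

    # Produce one record to the "orders" topic.
    producer = Producer({"bootstrap.servers": "localhost:9092"})
    producer.produce("orders", key="order-1", value='{"amount": 42}')
    producer.flush()  # block until delivery is confirmed

    # Consume it back from the beginning of the topic.
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "orders-readers",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["orders"])
    msg = consumer.poll(timeout=10.0)  # returns None if nothing arrives in time
    if msg is not None and msg.error() is None:
        print(msg.key(), msg.value())
    consumer.close()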

Simplify Migrating From Kafka to Pulsar With Kafka Connect Support

Large-scale implementations of any system, such as the event-streaming platform Apache Kafka, often involve customizations, in-house tools, and plugins. When it’s time to transition from one system to another, the task can become complicated, drawn-out, and error-prone. Often, the benefits of an alternative system (which can include significant cost savings and other efficiencies) are outweighed by the risks and costs of migration. As a result, an organization can end up locked into a suboptimal situation, footing a bigger bill than necessary and missing out on modern features that help move the business forward faster.

These risks and costs can be mitigated by making the transition process iterative: breaking vendor lock-in down into small, manageable steps and avoiding the "big bang" switch that often results in delayed delivery and increases the cost of running two systems in parallel for A/B testing.

Solace PubSub+ vs. Kafka: Implementation of the Publish-Subscribe Messaging Pattern

In this post, I’ll explain how pub/sub messaging pattern implementation differs in Kafka and Solace PubSub+.

Publish-Subscribe, also known as pub/sub, is a popular messaging pattern commonly used in today’s systems to help them distribute data efficiently and, among other things, scale. The pub/sub messaging pattern can be easily implemented through an event broker such as Solace PubSub+, Kafka, RabbitMQ, or ActiveMQ.

ELI5: What Is the Publish-Subscribe Messaging Pattern?

Introduction

Used in microservices architecture (a method of designing software applications that is rapidly growing in popularity), the publish-subscribe messaging pattern is a form of asynchronous communication in which messages are published to a topic and received, in real time, by consumers who subscribe to that topic.

Now, what does that really mean? What are the advantages of publish-subscribe? How can you explain it to someone non-technical?
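
Before reaching for an analogy, it helps to see how small the core of the pattern really is. A minimal, framework-free Python sketch (all names here are illustrative):

    from collections import defaultdict

    class Broker:
        """Fans messages out by topic; publishers never see subscribers."""

        def __init__(self):
            self._subscribers = defaultdict(list)  # topic -> list of callbacks

        def subscribe(self, topic, callback):
            self._subscribers[topic].append(callback)

        def publish(self, topic, message):
            for callback in self._subscribers[topic]:
                callback(message)

    broker = Broker()
    broker.subscribe("news", lambda msg: print("reader got:", msg))
    broker.publish("news", "pub/sub decouples senders from receivers")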

Google Cloud Pub/Sub – Overview

Introduction

Google Cloud Pub/Sub is a fully managed, real-time messaging service that lets our applications and services send and receive messages independently of one another. It helps us build robust and scalable applications by integrating them asynchronously, and it provides the resilience to handle millions of messages simultaneously.

Why Google Cloud Pub/Sub?

Google Cloud Pub/Sub fits a wide range of use cases. In general, if you want to process large amounts of data for analytics, or if you want to simplify the development of event-driven microservices, then Google Cloud Pub/Sub is the right choice.
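
For a feel of the API, here is a hedged sketch of a publish and a streaming pull with the google-cloud-pubsub Python library; the project, topic, and subscription names are placeholders and must already exist:

    from concurrent.futures import TimeoutError
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "my-topic")

    # publish() returns a future; result() blocks until the server assigns an ID.
    future = publisher.publish(topic_path, b"order received", origin="checkout")
    print("published message", future.result())

    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path("my-project", "my-subscription")

    def callback(message):
        print("got:", message.data)
        message.ack()  # acknowledge so Pub/Sub stops redelivering

    streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
    try:
        streaming_pull.result(timeout=30)  # process pushed messages for 30 seconds
    except TimeoutError:
        streaming_pull.cancel()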

Why Pub/Sub Isn’t Enough for Modern Apps

Chat notifications in Slack. Your Uber driver’s current position. Gone are the days when an app simply presented static data or invoked the occasional API; today’s users expect applications to be fully responsive, not just in terms of UI but in terms of data too.

This shift in how applications use data is universal, but introducing live data into existing applications is not a trivial task. As developers tackle these new requirements, they’ll quickly come up against several hard realizations.

API Publish/Subscribe Between Zato Services

One of the additions in the upcoming Zato 3.2 release is an extension to its publish/subscribe mechanism that lets services publish messages directly to other services. Let’s see how to use it and how it compares to other means of invoking one’s API services.

How Does It Work?

In your Zato service, you can publish a message to any other service, as shown below. Simply point self.pubsub.publish at the target service by name, and that service will receive your message.
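
A sketch of what that looks like end to end; the service names are illustrative, and details of the receiving side may vary between Zato versions:

    from zato.server.service import Service

    class MyPublisher(Service):
        def handle(self):
            # Address the target service by its name; the message is queued
            # and delivered to that service asynchronously.
            self.pubsub.publish('my.api.target', data='Hello from pub/sub')

    class MyTarget(Service):
        name = 'my.api.target'

        def handle(self):
            # The published payload arrives as this service's request.
            self.logger.info('Received: %s', self.request.raw_request)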

MQTT – Message Queue Telemetry Transport

What Is MQTT

  • A message protocol with “a small code footprint and on-the-wire footprint”.
  • MQTT is a publish-subscribe-based messaging protocol.
  • On top of TCP/IP.
  • Requires a broker (e.g. Mosquitto, HiveMQ, Azure IoT Hub).
  • An ISO standard (ISO/IEC 20922).
  • A message bus for unreliable, high-latency, low-bandwidth networks.
  • The payload is a plain byte array.

MQTT PUB/SUB

  • The protocol uses a publish/subscribe architecture in contrast to HTTP with its request/response paradigm.
  • Publish/Subscribe is event-driven and enables messages to be pushed to clients.
  • The central communication point is the MQTT broker, which is in charge of dispatching all messages between the senders and the rightful receivers.
  • Each client that publishes a message to the broker includes a topic in the message. The topic is the routing information for the broker.
  • Each client that wants to receive messages subscribes to a certain topic, and the broker delivers all messages with a matching topic to that client.
  • Therefore, the clients don’t have to know each other; they communicate only over the topic.
  • This architecture enables highly scalable solutions without dependencies between the data producers and the data consumers (see the sketch after this list).
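
A minimal sketch of that topic-based decoupling using the paho-mqtt 1.x client API; the broker host and topic are placeholders, and any MQTT broker (e.g. Mosquitto) will do:

    import paho.mqtt.client as mqtt

    def on_connect(client, userdata, flags, rc):
        client.subscribe("sensors/temperature")         # the topic is the routing key
        client.publish("sensors/temperature", b"21.5")  # sender and receiver never meet

    def on_message(client, userdata, msg):
        print(msg.topic, msg.payload)  # the payload is a plain byte array

    client = mqtt.Client()
    client.on_connect = on_connect
    client.on_message = on_message
    client.connect("localhost", 1883)  # the broker dispatches all messages
    client.loop_forever()  # keep the TCP connection open for pushed messages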

… and What About REST?

  • HTTP/REST is useful to handle documents and resources.
  • MQTT is useful to handle messages.
  • HTTP/REST can be complex and is not always the best solution for simple messages.
  • The minimum MQTT packet size is 2 bytes plus the payload.
  • MQTT supports 1-to-1, 1-to-many, and many-to-many messages.
  • Request and response vs publisher and subscriber.

Architecture

The difference from HTTP is that a client doesn’t have to pull the information it needs; instead, the broker pushes the information to the client whenever there is something new.

Therefore, each MQTT client keeps a permanently open TCP connection to the broker. If this connection is interrupted for any reason, the MQTT broker can buffer messages and send them to the client once it is back online.

How to Implement Producer/Consumer With System.Threading.Channels

What’s this “Producer/Consumer” thing? It’s around us, everywhere. Every time you see some kind of workflow with multiple serial steps, that’s an example. A production line in a car factory, a fast-food kitchen, even the postal service.

So why do we care about it? Well, that’s easy: in almost every piece of software we write, there’s a pipeline to fulfill. And as in every pipeline, once a step is completed, its output is redirected to the next one in line, freeing up space for another execution.
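
The article’s code uses C# and System.Threading.Channels; the same shape can be sketched with Python’s asyncio.Queue, which plays the role of a bounded channel here:

    import asyncio

    async def producer(queue: asyncio.Queue) -> None:
        for i in range(5):
            await queue.put(f"item-{i}")  # suspends while the queue is full
        await queue.put(None)  # sentinel: tell the consumer we're done

    async def consumer(queue: asyncio.Queue) -> None:
        while True:
            item = await queue.get()  # suspends until an item is available
            if item is None:
                break
            print("processed", item)

    async def main() -> None:
        queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # bounded, like a channel
        await asyncio.gather(producer(queue), consumer(queue))

    asyncio.run(main())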

Publish/Subscribe and Asynchronous API Integrations

This article introduces features built into Zato that let you take advantage of publish/subscribe topics and message queues in communication between Zato services, API clients, and backend systems.

Overview

Let's start by recalling the basic means through which services can invoke each other.

Tutorial on wxPython 4 and PubSub

The Publish-Subscribe pattern is pretty common in computer science and very useful too. The wxPython GUI toolkit has had an implementation of it for a very long time in wx.lib.pubsub. This implementation is based on the PyPubSub package. While you could always download PyPubSub and use it directly instead, it was nice to be able to just run wxPython without an additional dependency.

However, as of wxPython 4.0.4, wx.lib.pubsub is now deprecated and will be removed in a future version of wxPython. So now you will need to download PyPubSub or PyDispatcher if you want to use the Publish-Subscribe pattern easily in wxPython.
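
Standalone PyPubSub usage is only a few lines; a minimal sketch (the topic and argument names are illustrative), independent of any wxPython event loop:

    from pubsub import pub

    def on_status(message):
        print("listener got:", message)

    pub.subscribe(on_status, "status")          # register a listener for a topic
    pub.sendMessage("status", message="ready")  # keyword args must match the listener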

Event-Driven Pub-Sub Design

Problem Statement

In a complex, enterprise-level application, fundamental entities can be updated at various integration points or endpoints, and many business rules are built on these entities' data. As the system grows bigger and bigger, it becomes very difficult to track all the integration points that update the entities' data.

In such cases, it is hard to get things right: identifying every integration point that must be touched to plug in a new business rule becomes a serious challenge.
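
The pub/sub fix implied here is to invert the dependency: every integration point publishes an "entity updated" event, and each business rule subscribes to it, so new rules plug in without touching the update paths. A hypothetical sketch of that registry:

    from collections import defaultdict

    _rules = defaultdict(list)  # event type -> business rules to run

    def subscribe(event_type, rule):
        _rules[event_type].append(rule)

    def publish(event_type, entity):
        for rule in _rules[event_type]:
            rule(entity)

    # A new rule registers once, no matter how many endpoints update the entity.
    subscribe("customer.updated", lambda c: print("re-check credit limit for", c["id"]))

    # Any integration point that changes the entity just publishes the event.
    publish("customer.updated", {"id": 42, "tier": "gold"})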

How Are Your Microservices Talking?

Microservices communication

In this piece, we’ll look at the challenges of refactoring SOAs to MSAs, in light of different communication types between microservices, and see how pub/sub message transmission, as a managed Apache Kafka service, can mitigate or even eliminate these challenges.

If you’ve developed or updated any kind of cloud-based application in the last few years, chances are you’ve done so using a Microservices Architecture (MSA), rather than the slightly more dated Service-Oriented Architecture (SOA). So, what’s the difference?