data in motion | The Blog Pros

January 10, 2022

Streaming Data Exchange With Kafka and a Data Mesh in Motion

Data Mesh is a new architecture paradigm that gets a lot of buzz these days. Every data and platform vendor describes how to build the best Data Mesh with their platform. The Data Mesh story includes cloud providers like AWS, data analytics vendors like Databricks and Snowflake, and Event Streaming solutions like Confluent. This blog post looks deeper into this principle to explore why no single technology is the perfect fit to build a Data Mesh. Examples show why an open and scalable decentralized real-time platform like Apache Kafka is often the heart of the Data Mesh infrastructure, complemented by many other data platforms, to solve business problems.

Data at Rest vs. Data in Motion

Before we get into the Data Mesh discussion, it is crucial to clarify the difference and relevance of Data at Rest and Data in Motion:

October 21, 2021

How To Discover Personal Data in Cloud Storage

Data loss prevention tools are often employed to discover and monitor personal data in the cloud, but how effective and costly are they?

Personal data laws have been a bit of a spanner in the works and made everyone have a bit of a rethink about how they store client data that could be classified as “personal”. The thing is, which data can be classed as personal can change depending on whether it is paired with other data. This means that data that has the potential to be personal could be pretty much anywhere.

September 26, 2021

Achieving Data Agility With the Combined Strengths of AWS and Confluent

To create a genuinely reliable and highly scalable platform in the Cloud, companies often find they must combine technologies to meet their real-time demands. Many businesses are struggling with real-time streaming capabilities because it is challenging to accomplish elegantly and meet the requirements for effective data transformation and data movement.

To create a cloud-based powerhouse, you need to combine tools that can complement each other’s strengths. One of our favorite combinations is AWS and Confluent. AWS is the standard cloud provider for scalability and flexibility, especially for those using AWS serverless services. Confluent is best-of-breed for messaging, streaming, and real-time needs. Together, they create a synergy that helps businesses respond to their data agility needs.

August 9, 2021

Serverless Kafka in a Cloud-Native Data Lake Architecture

Apache Kafka became the de facto standard for processing data in motion. Kafka is open, flexible, and scalable. Unfortunately, the latter makes operations a challenge for many teams. Ideally, teams can use a serverless Kafka SaaS offering to focus on business logic. However, hybrid scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden. This blog post explores how to leverage cloud-native and serverless Kafka offerings in a hybrid cloud architecture. We start from the perspective of data at rest with a data lake and explore its relation to data in motion with Kafka.

Data at Rest - Still the Right Approach?

Data at Rest means storing data in a database, data warehouse, or data lake. As a result, the data is processed too late in many use cases, even if a real-time streaming component (like Kafka) ingests the data. The data processing is still a web service call, SQL query, or map-reduce batch process away from providing a result to your problem.

June 26, 2021

Apache Kafka for Industrial IoT and Manufacturing 4.0

This post explores use cases and architectures for processing data in motion with Apache Kafka in Industrial IoT (IIoT) across verticals such as automotive, energy, steel manufacturing, oil and gas, cybersecurity, shipping, and logistics. Use cases include predictive maintenance, quality assurance, track and trace, real-time locating system (RTLS), asset tracking, customer 360, and more. Examples include BMW, Bosch, Baader, Intel, Porsche, and Devon.

Why Kafka Is a Key Piece of the Evolution for Industrial IoT and Manufacturing

Industrial IoT was a mess of monolithic and proprietary technologies over the past decades. Modbus, Siemens S7, SCADA, and similar "concepts" controlled the industry. Vendors locked in enterprises by intentionally building incompatible products without open interfaces. These systems still run on Windows XP or similar outdated, unsupported operating systems and were built without security in mind.

March 23, 2021

Apache Kafka in the Airline, Aviation and Travel Industry

Aviation and travel are notoriously vulnerable to social, economic, and political events, as well as the ever-changing expectations of consumers. Coronavirus is just a piece of the challenge. This post explores use cases, architectures, and references for Apache Kafka in the aviation industry, including airlines, airports, global distribution systems (GDS), aircraft manufacturers, and more. Kafka was relevant pre-covid and will become even more important post-covid.

Airlines and Aviation are Changing — Beyond Covid-19!

Aviation and travel are notoriously vulnerable to social, economic, and political events. Recent months have been particularly testing because of the global Covid-19 pandemic. But change is coming not just because of the Coronavirus but also because of the ever-changing expectations of consumers.

February 12, 2021

Intro To Apache Kafka: How Kafka Works

Introduction

We recently published a series of tutorial videos and tweets on the Apache Kafka® platform. So now you know there’s a thing called Kafka, but before you put your hands to the keyboard and start writing code, you need to form a mental model of what the thing is. These videos give you the basics you need for a broad grasp of Kafka, enough to continue learning and eventually start coding. This article summarizes those videos.

Events

Pretty much all of the programs you’ve ever written respond to events of some kind: the mouse moving, input becoming available, web forms being submitted, bits of JSON being posted to your endpoint, the sensor on the pear tree detecting that a partridge has landed on it, etc. Kafka encourages you to see the world as sequences of events, which it models as key-value pairs. The key and the value have some kind of structure, usually represented in your language’s type system, but fundamentally they can be anything. Events are immutable, as it is (sometimes tragically) impossible to change the past.
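To make that mental model concrete, here is a minimal sketch in plain Python of events as immutable key-value pairs collected in an append-only sequence. It does not use Kafka or any Kafka client library; the names Event, EventLog, append, and read are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Generic, List, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


@dataclass(frozen=True)
class Event(Generic[K, V]):
    """A key-value pair. frozen=True blocks reassigning key or value."""
    key: K
    value: V


class EventLog(Generic[K, V]):
    """Hypothetical in-memory stand-in for a sequence of events."""

    def __init__(self) -> None:
        self._events: List[Event[K, V]] = []

    def append(self, key: K, value: V) -> int:
        """Add an event at the end and return its position in the sequence."""
        self._events.append(Event(key, value))
        return len(self._events) - 1

    def read(self, start: int = 0) -> Tuple[Event[K, V], ...]:
        """Return the events from position `start` onward, in order."""
        return tuple(self._events[start:])


if __name__ == "__main__":
    log: EventLog[str, str] = EventLog()
    log.append("pear-tree-sensor", "partridge landed")
    log.append("web-form", "submitted")

    first = log.read()[0]
    try:
        first.value = "nothing happened"  # attempt to change the past
    except FrozenInstanceError:
        print("events cannot be modified once recorded")
```

Note that a frozen dataclass only prevents rebinding the key and value fields. If the value is itself a mutable object such as a dict, its contents could still be changed, so this sketch uses strings.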