Applying Kappa Architecture to Make Data Available Where It Matters

Introduction 

Banks are accelerating their modernization efforts to rapidly develop and deliver top-notch digital experiences to their customers. To achieve the best possible customer experience, decisions need to be made at the edge, where customers interact, and making those decisions requires access to the associated data. Traversing the bank’s back-end systems, such as mainframes, from the digital experience layer is not an option if the goal is to provide customers with the best digital experience. Therefore, to make decisions quickly and with low latency, the associated data should be available closer to the customer experience layer.

Thankfully, over the last few years, data processing architecture has evolved from ETL-centric processing to real-time or near-real-time streaming architectures. Patterns such as change data capture (CDC) and command query responsibility segregation (CQRS) have evolved alongside architecture styles like Lambda and Kappa. While both styles have been used extensively to bring data to the edge and process it, over time data architects and designers have come to favor Kappa architecture over Lambda architecture for real-time data processing. Combined with advances in event streaming, Kappa architecture is gaining traction in consumer-centric industries. This has greatly helped them improve customer experience, and, especially for large banks, it is helping them remain competitive with FinTechs, which have already aggressively adopted event-driven data streaming architectures to drive their digital-only experiences.
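To make the pattern concrete, here is a minimal sketch of a Kappa-style pipeline using Kafka Streams. The topic names and the balance-folding logic are hypothetical; the point is that a single streaming code path serves both historical and fresh data:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

import java.util.Properties;

public class EdgeBalanceProcessor {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "edge-balance-processor");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.Long().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // One stream, one code path: every transaction event flows through the
        // same topology, whether it was produced a second or a month ago.
        KStream<String, Long> transactions = builder.stream("account-transactions");

        // Continuously fold transaction amounts into a per-account balance
        // that edge services can read with low latency.
        KTable<String, Long> balances = transactions
                .groupByKey()
                .reduce(Long::sum);

        balances.toStream().to("account-balances");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Because there is only one processing path, reprocessing history in the Kappa style means replaying the same topology over the retained log, rather than maintaining a separate batch layer as Lambda does.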

Streaming Data Exchange With Kafka and a Data Mesh in Motion

Data Mesh is a new architecture paradigm that gets a lot of buzz these days. Every data and platform vendor describes how to build the best Data Mesh with their platform. The Data Mesh story includes cloud providers like AWS, data analytics vendors like Databricks and Snowflake, and event streaming solutions like Confluent. This blog post looks deeper into this principle to explore why no single technology is a perfect fit for building a Data Mesh. Examples show why an open and scalable decentralized real-time platform like Apache Kafka is often the heart of the Data Mesh infrastructure, complemented by many other data platforms, to solve business problems.

Data at Rest vs. Data in Motion

Before we get into the Data Mesh discussion, it is crucial to clarify the difference and relevance of Data at Rest and Data in Motion:

Protect Your Invariants!

How do you handle constraints and validation inside your application? Most developers put this logic somewhere close to their application's boundary, which actually prevents them from having a rich domain model that could ensure consistency.

Developers tend to get confused when they need to find a good place for their business logic. I suspect the reasons are mostly related to the bad examples circulating in the documentation of popular frameworks, and to bad habits from coding on old-fashioned enterprise platforms like J2EE. Most of the time, developers are afraid to keep these vital rules in the language elements most relevant to the target domain: I'm talking about the simple classes reflecting the "nouns" of the business logic. Often, you see something like the example below:
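The original example is not reproduced here, but the anti-pattern in question typically looks something like this hypothetical sketch, with the invariant enforced at the boundary while the domain class stays a bare data holder:

```java
// Hypothetical sketch of the anti-pattern: the domain "noun" is a bare data
// holder, and the invariant lives at the application boundary instead.
class Customer {
    private String email;

    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; } // no checks here
}

class CustomerRegistrationController {

    public void register(String email) {
        // The rule is enforced far from the domain model, so any other code
        // path can still build a Customer with an invalid email.
        if (email == null || !email.contains("@")) {
            throw new IllegalArgumentException("Invalid email address");
        }
        Customer customer = new Customer();
        customer.setEmail(email);
        // ... persist the customer ...
    }
}
```

A richer model would instead validate the email inside Customer itself (for instance, in its constructor), so the invariant travels with the object wherever it is created.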

Why Develop a Decentralized Application Architecture for Cloud-Native, API-centric, and Microservices Patterns

This article outlines the cell-based architecture, which was published as an open specification on GitHub in summer 2018. Our approach creates a pragmatic and technology-neutral reference architecture that addresses the requirement for agility. It can be instantiated to create an effective and agile method for digital enterprises, deployed in private, public, or hybrid cloud environments.

When I present the new architecture at technology events, one common question is why we are defining a new reference architecture in an already crowded market. This article lists the motivating factors.

Challenges With Implementing DDD

I have finished reading the DDD books by Eric Evans (the blue book) and Vaughn Vernon (the red book) and would like to share my personal takeaway notes on the challenges of implementing DDD.

Implementing Domain-Driven Design Books

Although all the original ideas come from the blue book, I would say that the red one gives a better understanding of how to apply the DDD approach. That is no surprise, as the blue book was published in 2003 and its examples naturally feel a bit dated; nevertheless, the ideas of DDD are the same in both books.

Don’t Build Distributed Monoliths!

Microservices have been a hype topic for the last several years, and many developers are using this concept when structuring and implementing code nowadays. However, as always, every technology has advantages and disadvantages. So when I’m asked whether microservices architectures make sense, my answer is: It depends!

Cloud-native architectures and microservices clearly have a lot of benefits. One benefit is the simplicity of smaller modules. For example, I used to work on a product that had grown for many years and had several million lines of code. Developers were scared to change even a few lines, since the effects were not predictable. As a result, productivity was very low. Smaller services would certainly have helped a lot in handling that complexity.

Microservices Powered By Domain-Driven Design

Have you been finding it difficult to model the boundaries of your system’s microservices? Have you been slowed down by the technical complexity of your codebase? Has your team been stepping on each other’s toes?

If the answer to any of these questions is yes, then Domain-Driven Design is likely to be useful to your team!

The Concept of Domain-Driven Design Explained

Using microservices means building applications from loosely coupled services. The application consists of several small services, each representing a separate business goal. They can be developed and maintained individually, after which they are joined into a complex application.

Microservices is an architectural design model with a specific bounded context, configuration, and dependencies. These result from the architectural principles of domain-driven design and DevOps. Domain-driven design is the idea of solving an organization's problems through code.
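To illustrate the bounded-context idea, here is a minimal, hypothetical sketch in which the ordering context owns its own model of an order and its lifecycle rules, while another context models the same business noun differently:

```java
// Hypothetical sketch: each bounded context keeps its own model of "Order".
// In the ordering context, an order is an aggregate with lifecycle rules.
class Order {
    enum Status { CREATED, DISPATCHED, CANCELLED }

    private final String orderId;
    private Status status = Status.CREATED;

    Order(String orderId) {
        this.orderId = orderId;
    }

    void dispatch() {
        if (status != Status.CREATED) {
            throw new IllegalStateException("Only a created order can be dispatched");
        }
        status = Status.DISPATCHED;
    }
}

// In the shipping context, the same business noun is just an immutable
// description of what must be delivered; the two models evolve separately.
record Shipment(String orderId, String address) {}
```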

Domain Events Versus Change Data Capture

The building of change data capture (CDC) and event-based systems has recently come up several times in my discussions with people and in my online trawling. I sensed enough confusion around them that I figured it was worth talking about here.
CDC and event-based communication are two very different things that look similar to some extent, hence the confusion. Beware: confusing one for the other can lead to very difficult architectural situations.

What Are These Things?

Change Data Capture (CDC) typically refers to a mechanism for capturing all changes happening to a system's data. The need for such a system is not difficult to imagine: auditing sensitive information, replicating data across multiple DB instances or data centers, moving changes from transactional databases to data lakes/OLAP stores. Transaction management in ACID-compliant databases is essentially CDC. A CDC system is a record of every change ever made to an entity, along with the metadata of that change (changed by, change time, etc.).
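As a rough illustration (all field names here are hypothetical), a CDC entry is essentially a low-level change record plus its metadata, with no business meaning attached:

```java
import java.util.Map;

// Hypothetical shape of a CDC entry: one low-level change plus its metadata.
// Every write is captured, whether or not it means anything in business terms.
class ChangeRecord {
    String table;               // e.g. "orders"
    String operation;           // INSERT, UPDATE, or DELETE
    String changedBy;           // who made the change
    long changeTimeMillis;      // when the change happened
    Map<String, Object> before; // column values before the change
    Map<String, Object> after;  // column values after the change
}
```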
I have written about events on this blog before and have described them as announcements of something that has happened in the system domain, with relevant data about that occurrence. At a glance, this might seem to be the same as CDC (something changes in a system and needs to be communicated to other systems), which is exactly what CDC is about.

However, there is a key distinction to be made here. Events are defined at a far higher level of abstraction than data changes because they are meaningful changes to the domain. Data representing an entity can change without it having any "business" impact on the overall entity that the data represents. There can be several sub-states of an order that an order management system might maintain internally but which do not matter to the outside world. 

An order moving to or between these internal states would not generate events, but the changes would still be logged in the CDC system. On the other hand, there are states that the rest of the world cares about (created, dispatched, etc.) and that the order management system explicitly exposes to the outside world. Changes to or from these states would generate events.
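A domain event, by contrast, is expressed in the language of the business. A hypothetical sketch:

```java
import java.time.Instant;

// Hypothetical domain event: it announces a business-meaningful state change
// ("the order was dispatched"), not a raw row update. Internal sub-state
// transitions never appear here, even though CDC would still record them.
record OrderDispatched(String orderId, String carrier, Instant dispatchedAt) {}
```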

Low-code Limits Customization Flexibility: Myth or Reality?

It’s not controversial to say that low-code has become a trend in the development of turnkey business solutions. However, low-code is not traditionally the first choice for developing systems that handle complex business tasks. One of the main drawbacks associated with low-code development is the limited functionality of the applications created: often, platforms do not provide an easy way to add custom code.

In this article, we will discuss whether a low-code platform can be flexible enough to describe the logic of really sophisticated business processes.

3 Keys to Efficient Enterprise Microservices Governance

An enterprise normally has a few thousand microservices, with each team having the autonomy to choose its own technology stack. It is therefore inevitable that an enterprise needs a microservices governance mechanism to avoid building an unmanageable and unstable architecture.

Any centralized governance goes against the core principle of microservices architectures, i.e., “provide autonomy and agility to the team.” But that doesn’t mean we should not have centralized policies, standards, and best practices that each team should follow. With enterprise-scale integrations across multiple systems and complex operations, the question is, “How do we effectively provide decentralized governance?”

Assessing Legacy ERP Systems With Wardley Maps

Let's talk about today's Swiss army knife of software systems, called "Enterprise Resource Planning Systems" (ERP systems). These are really powerful tools, no question, but in some situations they can cause more harm than good, especially when they are really old and called "legacy systems." So, I want to tell a fictional story that shows how an organization can get into deep trouble. For this, I’ll try to use Nick Tune’s brand-new Core Domain Patterns and Wardley Mapping (if you aren't familiar with Wardley Maps, I recommend watching the YouTube video "Investing in innovation").

To better understand the context around ERP systems, we first take a brief look at the role of IT systems over the past decades:

JavaLand 2019 Retrospective

In this article, I talk about my impressions of the JavaLand 2019 conference. This was my second time at the international conference, which this year took place in the theme park "Phantasialand" in Bruehl, near Cologne, Germany, from March 18th to 20th.

Additionally, you can download the presentations here, as well as lecture recordings here.

Scalability: Think in Terms of TCO

A scalable system is one that can easily scale its resources to meet an increasing workload without affecting performance. The workload could be anything from an increase in users or storage to a growing number of transactions.

To build an easy-to-scale system, it is crucial to take an evolutionary view of the software development cycle. An architect should focus on designing a scalable software architecture from the early phases of the product life cycle.

Consistency Through Compensation in Microservices

This article addresses the eventual consistency aspect of transactions in a microservices environment, where a transaction spans more than one microservice and a failure midway through the transaction is a real possibility.

Existing Business Use Case

Suppose that we currently have a monolithic order management system that is backed by an RDBMS.
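To preview where the article is heading, here is a minimal, hypothetical sketch of compensation in a saga-style flow: instead of relying on a database rollback, each completed step registers an explicit undo action that runs if a later step fails:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of compensation: each completed step registers an
// "undo" action; if a later step fails, the undo actions run in reverse order.
public class PlaceOrderSaga {

    private final Deque<Runnable> compensations = new ArrayDeque<>();

    public void run() {
        try {
            reserveInventory();
            compensations.push(this::releaseInventory);

            chargePayment();
            compensations.push(this::refundPayment);

            createShipment(); // suppose this step fails midway
        } catch (RuntimeException e) {
            // No distributed ACID rollback is available across services,
            // so we compensate: undo the completed steps, most recent first.
            while (!compensations.isEmpty()) {
                compensations.pop().run();
            }
            throw e;
        }
    }

    private void reserveInventory() { /* call the inventory service */ }
    private void releaseInventory() { /* compensating call */ }
    private void chargePayment()    { /* call the payment service */ }
    private void refundPayment()    { /* compensating call */ }
    private void createShipment()   { /* call the shipping service */ }
}
```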

Parameter Resolvers in Axon

Axon is an open-source Java framework for building systems in a CQRS (Command Query Responsibility Segregation), DDD (Domain-Driven Design), and Event Sourcing manner. Axon provides a high level of location transparency, which makes it easy to split the system into several microservices. You can download the fully open-source package here. The Axon Framework is message-based, and commands, events, and queries are the supported types of messages. For each of these messages, we can define a handler (a method or constructor). In some cases, the message itself carries enough information to be handled, but in many others we also depend on other components and/or variables (message metadata, such as correlation information, is a good example). Axon provides a really powerful mechanism for injecting these dependencies into message handlers: Parameter Resolvers. This is the story about them. At the end of this article, you can find a cheat sheet of all the resolvers provided by the Axon Framework.
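As a small taste of what parameter resolvers do, a handler can declare parameters beyond the message payload and have Axon supply them. A minimal sketch (the event type is hypothetical; the annotations are Axon's):

```java
import org.axonframework.eventhandling.EventHandler;
import org.axonframework.eventhandling.Timestamp;
import org.axonframework.messaging.annotation.MetaDataValue;

import java.time.Instant;

public class OrderEventListener {

    // Hypothetical event payload.
    public record OrderDispatched(String orderId) {}

    // Axon inspects the parameter list and picks a ParameterResolver for each
    // position: the payload, the event's timestamp, and a metadata entry.
    @EventHandler
    public void on(OrderDispatched event,
                   @Timestamp Instant occurredAt,
                   @MetaDataValue("correlationId") String correlationId) {
        // handle the event using the resolved values
    }
}
```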

Anatomy

There are two important components that support this mechanism: the ParameterResolver and the ParameterResolverFactory (see Figure 1).
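To sketch how the two fit together (this is a simplified paraphrase, not the exact Axon API): the factory is consulted once per handler parameter and, if it can serve that position, returns a resolver that is later invoked for each incoming message:

```java
import java.lang.reflect.Executable;
import java.lang.reflect.Parameter;

// Simplified paraphrase of the two contracts (the real Axon interfaces carry
// generics on the message type and a few more methods).
interface ParameterResolver<T> {
    boolean matches(Object message);         // can this resolver serve this message?
    T resolveParameterValue(Object message); // produce the value to inject
}

interface ParameterResolverFactory {
    // Called once per handler parameter: return a resolver for
    // parameters[parameterIndex] of the given handler, or null if this
    // factory does not know how to resolve that parameter.
    ParameterResolver<?> createInstance(Executable executable,
                                        Parameter[] parameters,
                                        int parameterIndex);
}
```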