Request Tracing in Spring Cloud Stream Data Pipelines With Kafka Binder

What Is a Data Pipeline?

A data pipeline is a process that moves data from one or more sources to a destination, either in batches or as a stream. Along the way, raw data is filtered, cleaned, transformed, enriched, and fed into data lakes or data warehouses. In an enterprise, data is often scattered across multiple systems in different formats; data pipelines collect that data from all the sources into a common format so that it is suitable for business intelligence and analytics.
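The filter/clean/transform steps above can be sketched as a tiny in-memory pipeline stage. This is an illustrative example only; the class, method, and record format are hypothetical, not from the article, and a real Spring Cloud Stream pipeline would express each stage as a function bound to a Kafka topic rather than a plain list.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class PipelineSketch {

    // One pipeline stage: filter out invalid records, clean whitespace,
    // and transform the survivors into a normalized format.
    static List<String> clean(List<String> raw) {
        return raw.stream()
                .map(String::trim)                     // clean: strip stray whitespace
                .filter(r -> !r.isEmpty())             // filter: drop blank records
                .filter(r -> r.matches(".+,\\d+"))     // filter: keep only "name,number" rows
                .map(r -> r.toLowerCase(Locale.ROOT))  // transform: normalize case
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Raw records as they might arrive from multiple sources
        List<String> raw = List.of("  ALICE,42 ", "", "bob,17", "carol,NaN");
        System.out.println(clean(raw)); // normalized records ready for the sink
    }
}
```

In a streaming pipeline the same logic runs continuously on each message instead of once over a batch, but the shape of the stage (filter, clean, transform) is the same.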

What Is Request Tracing? 

In a distributed system, a request travels through multiple services before it completes. These services can be hosted in different network zones, different VMs, different cloud providers, or any combination of these, which makes triaging an issue tedious and time-consuming. Request tracing addresses this: a unique id is minted at the origin of the request and carried forward to every system the request travels through, so that with this single id we can trace the entire journey of the request.
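The mint-at-origin, propagate-everywhere idea can be sketched with plain message headers. The header name and helper below are hypothetical, chosen for illustration; real tracing libraries such as Spring Cloud Sleuth or Micrometer Tracing use their own header conventions (e.g. B3 or W3C `traceparent`) and manage propagation automatically.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class TraceSketch {

    // Hypothetical header key for the trace id
    static final String TRACE_ID = "X-Trace-Id";

    // Mint a trace id at the origin if none is present;
    // otherwise carry the existing id forward unchanged.
    static Map<String, String> withTraceId(Map<String, String> headers) {
        Map<String, String> out = new HashMap<>(headers);
        out.putIfAbsent(TRACE_ID, UUID.randomUUID().toString());
        return out;
    }

    public static void main(String[] args) {
        // Origin service: no trace id yet, so one is minted
        Map<String, String> origin = withTraceId(new HashMap<>());

        // Downstream service: the same id is propagated, not re-minted
        Map<String, String> downstream = withTraceId(origin);

        System.out.println(
            origin.get(TRACE_ID).equals(downstream.get(TRACE_ID))); // same id end to end
    }
}
```

Because every hop applies the same "reuse if present" rule, logs from all services can be correlated by filtering on the one id minted at the origin.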
