APIs Outside, Events Inside

This is an article from DZone's 2022 Enterprise Application Integration Trend Report.

In the echo chambers of application development, we constantly hear the mantra "API-first," but this slogan has a fundamental flaw: APIs should typically be the last choice when building a distributed application. The correct war cry ought to instead be: "APIs outside, events inside."
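To make the "events inside" half of that slogan concrete, here is a minimal sketch of a service publishing a domain event to an internal broker. Kafka, the localhost broker address, and the order-events topic are illustrative assumptions rather than anything this article prescribes; the point is that internal collaboration happens by emitting facts, while request/response APIs are reserved for the outside edge.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderEventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Inside the system, the service announces a fact ("order 42 was placed")
            // instead of calling other services' APIs directly.
            ProducerRecord<String, String> event =
                    new ProducerRecord<>("order-events", "order-42", "{\"status\":\"PLACED\"}");
            producer.send(event);
        }
    }
}
```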

How To Set Up a Scalable and Highly-Available GraphQL API in Minutes

A modern GraphQL API layer for cloud-native applications needs to possess two characteristics: horizontal scalability and high availability. 

Horizontal scaling adds more machines to your API infrastructure, whereas vertical scaling adds more CPUs, RAM, and other resources to the existing machine that runs the API layer. Vertical scaling works up to a point, but a horizontally scalable API layer can grow beyond the capacity of any single machine. 
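One way to picture that is a stateless resolver: if the API layer keeps no per-instance state, you can run as many identical replicas as you need behind a load balancer. The sketch below uses Spring for GraphQL and a hypothetical greeting query purely for illustration; it is not necessarily the stack the article builds on.

```java
import org.springframework.graphql.data.method.annotation.Argument;
import org.springframework.graphql.data.method.annotation.QueryMapping;
import org.springframework.stereotype.Controller;

// A stateless resolver: it holds no session or in-memory data, so the API layer
// can be scaled horizontally simply by running more replicas behind a load balancer.
@Controller
public class GreetingController {

    // Maps to a hypothetical "greeting(name: String): String" field in the GraphQL schema.
    @QueryMapping
    public String greeting(@Argument String name) {
        return "Hello, " + (name != null ? name : "world") + "!";
    }
}
```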

Getting Started With Observability for Distributed Systems

To reap the full benefits of a distributed system, an application's underlying architecture must support various company-level objectives, including agility, velocity, and speed to market. Implementing a reliable observability strategy, plus the right tools for your specific business requirements, will give teams the insights needed to properly operate and manage their entire distributed ecosystem on an ongoing basis.

This Refcard covers the three pillars of observability — metrics, logs, and traces — and how they not only complement an organization's monitoring efforts but also work together to help profile, interpret, and optimize system-wide performance.
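As a small taste of the metrics pillar, the sketch below uses Micrometer (assuming the micrometer-core dependency); the metric names and tags are made up for illustration, and logs and traces would be layered on with their own tooling.

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class CheckoutMetrics {
    public static void main(String[] args) {
        // In production this registry would typically export to Prometheus, Datadog, etc.
        MeterRegistry registry = new SimpleMeterRegistry();

        Counter orders = Counter.builder("orders.placed")  // hypothetical metric name
                .tag("region", "us-east-1")                // hypothetical tag
                .register(registry);

        Timer latency = Timer.builder("checkout.latency")
                .register(registry);

        // Record a timed unit of work and count it.
        latency.record(orders::increment);

        System.out.println("orders.placed = " + orders.count());
    }
}
```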

Distributed Tracing With Spring Cloud Sleuth and Zipkin

In the case of a single giant application that does everything, which we usually refer to as a monolith, tracing the incoming request within the application is easy. We can follow the logs and then figure out how the request is being handled. There is nothing else we have to look at but the application logs themselves. 

Over time, as the codebase grows, a monolith becomes difficult to scale, whether that means serving a large number of requests or delivering new features to the customer. This leads to breaking the monolith down into microservices, which makes it possible to scale individual components and to deliver faster. 
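The snippet below is a rough sketch of what Sleuth-instrumented code tends to look like, assuming spring-cloud-starter-sleuth and spring-cloud-sleuth-zipkin are on the classpath and a hypothetical inventory-service sits downstream; the application code stays ordinary, while trace and span IDs show up in the logs and travel along with outgoing calls.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class OrderController {

    private static final Logger log = LoggerFactory.getLogger(OrderController.class);
    private final RestTemplate restTemplate;

    // Injecting the RestTemplate as a bean lets Sleuth instrument it, so trace
    // headers are added to the outgoing call automatically.
    public OrderController(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @GetMapping("/orders/{id}")
    public String getOrder(@PathVariable String id) {
        log.info("Looking up order {}", id); // Sleuth prefixes this line with [app,traceId,spanId]
        // Hypothetical downstream service; the same trace ID follows the request there,
        // and Zipkin can stitch both hops into a single trace.
        return restTemplate.getForObject("http://inventory-service/stock/" + id, String.class);
    }
}
```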

Microservice Architecture Roadmap

Why Microservice Architecture?


Nowadays, with the rise of social media, fast internet access, and similar trends, people use applications more and more. As a result of these behavioral changes, monolithic applications have to absorb a tremendous volume of change.

HarperDB: More Than a Database

Introduction

I recently had a very interesting conversation on our podcast with Ron Lewis, the Director of Innovation and Engineering at Lumen Technologies. Ron brought up the notion that HarperDB is more than just a database, and for certain users or projects, HarperDB is not serving as a database at all. How can this be possible?

Database, Explained

Well, what really is a database? Wikipedia states “In computing, a database is an organized collection of data stored and accessed electronically from a computer system.” Another site simply states that “A database is a systematic collection of data. They support electronic storage and manipulation of data. Databases make data management easy.”

Utilizing BigQuery as a Data Warehouse in a Distributed Application

Introduction

Data plays an integral part in any organization. Given the data-driven nature of modern organizations, almost all business and technological decisions are based on the available data. Let's assume that we have an application distributed across multiple servers in different regions of a cloud service provider, and we need to store its data in a centralized location. The obvious solution would be some type of database. However, traditional databases are ill-suited to handling extremely large datasets and lack the features that support data analysis. In that kind of situation, we need a proper data warehousing solution like Google BigQuery.

What is Google BigQuery?

BigQuery is an enterprise-grade, fully managed data warehousing solution that is a part of the Google Cloud Platform. It is designed to store and query massive data sets while enabling users to manage data via the BigQuery data manipulation language (DML) based on the standard SQL dialect.
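As a hedged illustration of how an application talks to BigQuery, the sketch below uses the google-cloud-bigquery Java client with application-default credentials; the project, dataset, and table names are hypothetical.

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

public class EventCountsByRegion {
    public static void main(String[] args) throws InterruptedException {
        // Picks up application-default credentials and the default project.
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // Hypothetical dataset/table the distributed application writes its events to.
        QueryJobConfiguration query = QueryJobConfiguration.newBuilder(
                "SELECT region, COUNT(*) AS events "
              + "FROM `my_project.app_logs.events` "
              + "GROUP BY region ORDER BY events DESC")
              .build();

        TableResult result = bigquery.query(query);
        result.iterateAll().forEach(row ->
                System.out.println(row.get("region").getStringValue() + ": "
                        + row.get("events").getLongValue()));
    }
}
```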

Cloud-Native and MongoDB

NoSQL, commonly read as "not only SQL," is a term coined many years back, and since then a range of products has come to market, spanning key-value, document, columnar, and graph databases. MongoDB started its journey as the leading document database, with Couchbase as a close competitor. Several languages and frameworks provide out-of-the-box connectors for MongoDB, much like JDBC connectors let Java programs talk to an RDBMS; Mongoose, for example, is a Node.js module available through npm and one of the key components of the MEAN (MongoDB, Express, AngularJS, Node.js) and MERN (MongoDB, Express, ReactJS, Node.js) stacks.
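To make the connector idea concrete on the Java side, here is a minimal sketch using the synchronous MongoDB Java driver (mongodb-driver-sync); Mongoose plays the analogous role for Node.js. The connection string, database, and collection names are assumptions for illustration.

```java
import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;

public class MongoConnectorExample {
    public static void main(String[] args) {
        // Assumes a local MongoDB instance; a DBaaS offering would just swap the URI.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("shop");                // hypothetical database
            MongoCollection<Document> users = db.getCollection("users");  // hypothetical collection

            users.insertOne(new Document("name", "Ada").append("plan", "pro"));

            Document found = users.find(new Document("name", "Ada")).first();
            System.out.println(found.toJson());
        }
    }
}
```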

Over the years, many such platforms have realized that just having a database is not enough; the future is a cloud-native offering delivered as Database-as-a-Service (DaaS). That means the platform needs to support APIs as the key enabler for connecting all kinds of consumers. The classical approach of an on-premises installation with an SDK-based connector will not stand long in this "Journey to Cloud" era.

Distributed Vs Replicated Cache

Introduction

Caching facilitates faster access to data that is requested repeatedly. That data might have to be fetched from a database, retrieved over a network call, or produced by an expensive computation. We can avoid repeating those calls by storing the data closer to the application (generally in memory or on local disk). Of course, all of this comes at a cost. We need to consider the following factors when implementing a cache:

  1. Additional memory is needed for the application to cache the data.
  2. What if the cached data is updated? How do we invalidate the cache? (Needless to say, caching works best when the cached data does not change often.)
  3. We need eviction policies (LRU, LFU, etc.) in place to delete entries when the cache grows too big (see the sketch after this list).
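As a quick illustration of point 3, here is a common in-process LRU sketch built on LinkedHashMap; it is a teaching example under assumed sizes, not a production cache, and it says nothing yet about the distributed case discussed next.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal in-process LRU cache: once the map exceeds maxEntries,
// the least-recently-accessed entry is evicted automatically.
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true);   // accessOrder = true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");                     // touch "a" so "b" becomes the eldest entry
        cache.put("c", "3");                // evicts "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```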

Caching becomes more complicated when we think of distributed systems. Let us assume we have our application deployed in a 3-node cluster:

Serverless Apache Spark: Data Flow Cloud Service

Apache Spark is very close to becoming the industry standard among distributed big data processing platforms; you can encounter Spark in almost every company working on big data. Thanks to its performance and its many programming interfaces, we can use this technology in our on-premises systems as well as through the interfaces offered by cloud providers.

In the past few weeks, Oracle added another service to its cloud portfolio and launched a serverless Spark execution engine on Oracle Cloud Infrastructure, designated Data Flow. Users who want to run Spark can now quickly spin up their Spark execution engines and deploy their applications to this environment.
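A Spark application itself does not change much between environments; the sketch below is a generic Java job that could, in principle, be packaged and submitted to a managed service such as Data Flow. The input path and the way arguments are passed are assumptions for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class LineCountJob {
    public static void main(String[] args) {
        // The same jar can run locally, on an on-premises cluster, or be uploaded
        // to a managed/serverless Spark service.
        SparkSession spark = SparkSession.builder()
                .appName("line-count")
                .getOrCreate();

        // Hypothetical input path; on a cloud service this would typically be an
        // object-storage URI passed in as an application argument.
        Dataset<Row> lines = spark.read().text(args.length > 0 ? args[0] : "input.txt");

        long nonEmpty = lines.filter("value <> ''").count();
        System.out.println("Non-empty lines: " + nonEmpty);

        spark.stop();
    }
}
```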

Programming Microservices Communication With Istio

Microservices are great, but they are also a pain to operate. With so many moving parts, it becomes difficult to maintain the system and to identify bottlenecks. The mindset has to change: applications need to be designed for failure. 

Applications are split into multiple smaller services, and they should be. These small services have to be interconnected: they should know where the other services live and how to connect to them, and they should be consistent and highly available. The connections should be intelligent and configurable. The services should be secured, traffic between them should be encrypted, and requests should be properly authorized and authenticated. And since these services scale up and down autonomously, it is important to monitor and observe them for issues, errors, exceptions, latency, and the like.