Coordinating an Apache Ignite Cluster With GridGain Control Center

Bundling various data sources, APIs, services, applications, and several data streams while managing application data integration can become cumbersome. It’s so complex that it typically results in application performance loss. So, database administrators use Apache Ignite, a distributed database that delivers high-performance computing at in-memory speed. Integrating Apache Ignite as an in-memory caching or distributed database solution helps improve the velocity and performance of a complex architecture.

But, at the same time, this solution presents new challenges: we’re integrating yet another component into our already complex architecture. GridGain provides a solution to this challenge, enabling the monitoring, management, and troubleshooting of Apache Ignite clustered environments, whether they’re running as an on-premises solution or as a SaaS offering in the cloud.

Why My In-Memory Cluster Underperforms: Negating Network Impact

Memory access is so much faster than disk I/O that many of us expect to gain striking performance advantages by merely deploying a distributed in-memory cluster and starting to read data from it. However, we sometimes overlook the fact that a network connects the cluster nodes with our applications, and it can quickly diminish the positive effects of having an in-memory cluster if a lot of data is transferred continuously over the wire.

That said, using the proper data access patterns provided by distributed in-memory technologies can negate the effect of network latency. In this article, we're using the APIs of Apache Ignite's in-memory computing platform to see how the performance of our application changes if we put less pressure on the communication channels. The ultimate goal is to be able to deploy horizontally scalable in-memory clusters that can tap into the pool of RAM and CPUs spread across all machines with minimal impact from the network.
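To make the idea of "less pressure on the communication channels" concrete, here is a minimal sketch in plain Java. It does not use Apache Ignite's API; the `Node` class, the `transferred` counter, and the modulo-based data placement are all hypothetical stand-ins that only model how many values would cross the network under two access patterns: pulling every entry to the application versus running the aggregation where the data lives and returning one partial result per node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ColocationSketch {
    /** A toy "node" that owns a slice of the data in its local memory (hypothetical, not an Ignite class). */
    static class Node {
        final Map<Integer, Integer> local = new HashMap<>();

        /** Aggregates where the data lives, so only a single number has to leave the node. */
        long sumLocally() {
            long sum = 0;
            for (int v : local.values()) {
                sum += v;
            }
            return sum;
        }
    }

    /** Counts the values that crossed the simulated network. */
    static long transferred = 0;

    /** Spreads keys 0..entries-1 across the nodes with a simple modulo (an assumption for illustration). */
    static List<Node> buildCluster(int nodes, int entries) {
        List<Node> cluster = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            cluster.add(new Node());
        }
        for (int key = 0; key < entries; key++) {
            cluster.get(key % nodes).local.put(key, key);
        }
        return cluster;
    }

    /** Pattern 1: fetch every entry to the application, then aggregate there. */
    static long sumByFetchingAll(List<Node> cluster) {
        long sum = 0;
        for (Node node : cluster) {
            for (int v : node.local.values()) {
                transferred++; // each value travels over the wire
                sum += v;
            }
        }
        return sum;
    }

    /** Pattern 2: each node aggregates locally and sends back one partial result. */
    static long sumByColocatedCompute(List<Node> cluster) {
        long sum = 0;
        for (Node node : cluster) {
            transferred++; // only the partial sum travels over the wire
            sum += node.sumLocally();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Node> cluster = buildCluster(4, 1000);

        transferred = 0;
        long fetchedSum = sumByFetchingAll(cluster);
        long fetchedTransfers = transferred;

        transferred = 0;
        long colocatedSum = sumByColocatedCompute(cluster);
        long colocatedTransfers = transferred;

        System.out.println("fetch-all: sum=" + fetchedSum + ", values transferred=" + fetchedTransfers);
        System.out.println("colocated: sum=" + colocatedSum + ", values transferred=" + colocatedTransfers);
    }
}
```

With 4 nodes and 1,000 entries, both patterns compute the same sum, but the first moves 1,000 values across the simulated network while the second moves 4. A real cluster adds serialization, request overhead, and failure handling that this toy model ignores.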

Apache Ignite: Partitioned Cache

Background

Many enterprise applications are distributed, deployed on multiple servers, and accessed by many interfacing applications. In this series, we will go through various usage scenarios for Apache Ignite in large applications.

We will implement the following scenario in this article:

How to Use Caching With Azure Cosmos DB

Cosmos DB is the new NoSQL database released in Azure Cloud by Microsoft. Unlike relational databases, Cosmos DB is scalable as it is a hosted database service, so it enjoys a lot of popularity among high-transaction .NET and .NET Core applications.

However, when using Cosmos DB, you need to be wary of performance bottlenecks and the cost overhead of accessing the database, as Microsoft charges you for each transaction to Cosmos DB. While Cosmos DB is scalable in terms of transaction capacity, it is not as fast, because the database service lives in a different VNet or subscription from the applications. So even if your applications are running in Azure cloud, accessing the database across the VNet is a huge blow to performance.
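The cost and latency argument above can be illustrated with a minimal cache-aside sketch. It is written in plain Java for illustration and does not use the Cosmos DB SDK or any caching product; the `CacheAsideSketch` class, its `database` function, and the `databaseReads` counter are hypothetical names, and cache-aside is an assumed pattern chosen for the example rather than something the text prescribes. The point is only that repeated reads of the same key reach the billed, remote database once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CacheAsideSketch<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> database; // stand-in for a remote, billed database call
    private int databaseReads = 0;

    public CacheAsideSketch(Function<K, V> database) {
        this.database = database;
    }

    /** Serves from the local cache when possible; otherwise reads the database and caches the result. */
    public V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            databaseReads++; // this is the round trip we pay for
            value = database.apply(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    /** Drops a cached entry, for example after the underlying record has been updated. */
    public void invalidate(K key) {
        cache.remove(key);
    }

    /** Number of reads that actually reached the database. */
    public int databaseReads() {
        return databaseReads;
    }

    public static void main(String[] args) {
        CacheAsideSketch<String, String> store =
                new CacheAsideSketch<>(id -> "document-" + id); // fake database lookup

        store.get("42");
        store.get("42");
        store.get("42");
        System.out.println("database reads after three gets: " + store.databaseReads());

        store.invalidate("42");
        store.get("42");
        System.out.println("database reads after invalidation: " + store.databaseReads());
    }
}
```

A production cache would also need expiration, size limits, and a strategy for keeping entries consistent with writes made by other application instances, none of which this sketch attempts.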

Thoughts on Apache Ignite

Yahoo! JAPAN is one of the largest e-commerce platforms in Japan, and we are constantly working on improving users' shopping experience by trying to understand their needs and behaviors and applying that knowledge to provide a better shopping experience. We have a large number of services and APIs for that, and as the number of requests to them grows (due to the release of new services or an increase in calls from numerous microservices), we have to be able to scale accordingly and guarantee sub-second data access.

One service that must process tens of thousands of requests per second, and is expected to grow in the near future, is 'Recent purchases.' Querying the service returns the latest n items for particular categories at a specified depth. Originally, the data was stored in an RDBMS and reached 500+ million records, and the old DB infrastructure couldn't cope with the increasing number of requests, which prevented other users from adopting the service. It was time to renovate the DB to meet the higher demand from users.

Using Apache Ignite Thin Client

Starting with version 2.4.0, Apache Ignite introduced a new way to connect to the Ignite cluster, which allows communication with the cluster without starting an Ignite client node. Historically, Apache Ignite has provided two types of nodes: client and server. An Ignite client node is intended as a lightweight mode, which does not store data (however, it can hold a near cache) and does not execute any compute tasks. Mainly, the client node is used to communicate with the server remotely and allows manipulating the Ignite caches using the whole set of Ignite APIs. There are two main downsides to the Ignite client node:

  • Whenever an Ignite client node connects to the Ignite cluster, it becomes part of the cluster topology. The bigger the topology, the harder it is to maintain.
  • In client mode, an Apache Ignite node consumes a lot of resources to perform cache operations.

To solve the above problems, Apache Ignite provides a new binary client protocol for implementing thin Ignite clients in any programming language or on any platform.