Building Resilience With Chaos Engineering and Litmus

The scalability, agility, and continuous delivery offered by microservices architecture make it a popular option for businesses today. Nevertheless, microservices architectures are not invulnerable to disruptions. Various factors, such as network communication, inter-service dependencies, external dependencies, and scalability issues, can contribute to outages.

Prominent companies such as Slack, Twitter, Robinhood Trading, Amazon, Microsoft, and Google have recently experienced outages that resulted in significant downtime costs. These incidents underscore how varied the causes of outages in microservices architectures can be: configuration errors, database issues, infrastructure scaling failures, and code problems.

Log Monitoring and Alerting With Grafana Loki

In a production environment, even a few microseconds of downtime is intolerable, and debugging such issues is time-critical. Proper logging and monitoring of the infrastructure helps in debugging these scenarios. It also helps to optimize cost and other resources proactively, and to detect impending issues before they occur. There are various logging and monitoring solutions available in the market. In this post, we will walk through the steps to deploy Grafana Loki in a Kubernetes environment. We chose Grafana Loki because of its seamless compatibility with Prometheus, a widely used tool for collecting metrics. The stack consists of three components: Promtail, Loki, and Grafana (PLG), each of which we will look at briefly before proceeding to the deployment. This post also offers insight into the architectural differences between PLG and other common logging and monitoring stacks such as Elasticsearch-Fluentd-Kibana (EFK).

Logging, Monitoring, and Alerting With Grafana Loki

Before proceeding with the steps for deploying Grafana Loki, we will briefly look at each tool.

Consul Deployment Patterns: A Brief Overview

If you've ever delved into service mesh, key-value store, or service discovery solutions in the cloud-native space, you have almost certainly come across Consul. Consul, developed by HashiCorp, is a multi-purpose solution that primarily provides the following features:

  • Service discovery and Service Mesh features with Kubernetes.
  • Secure communication and observability between the services.
  • Automated load balancing.
  • Key-Value store.
  • Consul watches.

This blog post briefly explains the Consul deployment patterns to use when making configuration changes that are stored in the Key-Value store. It also explains how to discover and synchronize with services running outside the Kubernetes cluster, and how to enable Service Mesh features with Consul. We broadly categorize Consul deployment patterns as in-cluster patterns (Consul deployed in a Kubernetes cluster) and hybrid patterns (Consul deployed outside a Kubernetes cluster).
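To make the Key-Value store more concrete, here is a minimal Python sketch of reading configuration from Consul over its HTTP API. It assumes a Consul agent reachable at http://127.0.0.1:8500 (the default HTTP port) and uses a hypothetical key, app/config/log_level, chosen only for illustration. The KV read endpoint returns a JSON list whose Value fields are base64-encoded, so the sketch separates decoding from fetching. Treat it as an illustration, not as the deployment patterns themselves.

```python
import base64
import json
import urllib.request


def decode_kv_entries(raw_json: str) -> dict:
    """Decode the JSON body returned by Consul's KV read endpoint.

    Each entry carries a "Key" and a base64-encoded "Value";
    Consul returns null for a key that has no value.
    """
    decoded = {}
    for entry in json.loads(raw_json):
        value = entry.get("Value")
        if value is None:
            decoded[entry["Key"]] = ""
        else:
            decoded[entry["Key"]] = base64.b64decode(value).decode("utf-8")
    return decoded


def fetch_kv(key: str, consul_addr: str = "http://127.0.0.1:8500") -> dict:
    """Fetch one key from the KV store.

    The address is an assumption (a local agent on the default port);
    in a cluster it would point at the Consul service instead.
    """
    url = f"{consul_addr}/v1/kv/{key}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return decode_kv_entries(resp.read().decode("utf-8"))


# Example call (requires a running Consul agent; the key is hypothetical):
# print(fetch_kv("app/config/log_level"))
```

An application could call fetch_kv at startup or on a timer. Consul watches, listed among the features above, are the mechanism for reacting to changes without polling.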