Monitoring and the ELK Stack

Any application monitoring solution should maintain an open design, build upon proven technologies, be accessible, and require a low learning curve. The end goal is simple: provide teams with the ability to identify issues or unexpected behavior within minutes, if not seconds. The ELK Stack meets these expectations and more. This Refcard covers the basic components of the ELK Stack, how they map to a log analysis workflow, and step-by-step instructions for installation, configuration, and reporting.

Getting Started With Log Management

The reality of modern application design means that when an unexpected issue occurs, finding its root cause can be difficult. This is where centralized log management can provide a great deal of assistance. This Refcard teaches you the basic flow of a log management process, provides a comprehensive checklist of questions to consider when evaluating log management solutions, advises you on what you should and should not log, and covers advanced log management functionality.

The Importance of Access Logs in Performance Issue Analysis

An access log is generated by the web server to record details about each request it has processed: status code, response time, URL, protocol, response size, client IP address, and so on. Load balancers create similar log files for the requests they handle. These logs play an important role in any performance analysis, yet they are neglected most of the time, partly from lack of awareness and partly because APM tools abstract them away, letting users focus on coarse-grained visualizations rather than fine-grained data. Most people are aware of the application server log, but many are not aware of the web server or load balancer access log, so it rarely comes to mind for the person investigating a problem.
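
To give a concrete sense of the fields involved, here is a minimal sketch that parses one line of the widely used "combined"-style access log format; the regular expression, field names, and sample line are illustrative and may need adjusting for your server's configured format.

```python
import re

# A minimal sketch: parse one line of a combined-style access log as written
# (with variations) by Apache and Nginx. Field names are illustrative.
COMBINED_LOG = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_access_log_line(line):
    """Return the request details from a single access log line, or None."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None

sample = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
print(parse_access_log_line(sample))
```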

Access logs are available for the servers below, though the format for each of them varies:

Getting Started With Observability for Distributed Systems

To realize the full benefits of a distributed system, an application's underlying architecture must support company-level objectives such as agility, velocity, and speed to market. Implementing a reliable observability strategy, plus the right tools for your specific business requirements, will give teams the insights needed to properly operate and manage their entire distributed ecosystem on an ongoing basis.

This Refcard covers the three pillars of observability — metrics, logs, and traces — and how they not only complement an organization's monitoring efforts but also work together to help profile, interpret, and optimize system-wide performance.
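
As a loose illustration of how the pillars reinforce one another (this example is not from the Refcard), the sketch below uses a single request ID to correlate a structured log event, a latency metric, and a trace-style span so that each signal can be read in the context of the others.

```python
import json
import logging
import time
import uuid

# Illustrative only: one request ID ties together a log event, a metric
# sample, and a span record. Field and logger names are hypothetical.
logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("checkout")

def handle_request(order_id):
    request_id = str(uuid.uuid4())           # shared correlation ID
    start = time.perf_counter()
    # ... the real request handling would happen here ...
    duration_ms = round((time.perf_counter() - start) * 1000, 2)

    log.info(json.dumps({"request_id": request_id, "event": "checkout.completed", "order_id": order_id}))   # log
    log.info(json.dumps({"request_id": request_id, "metric": "checkout.latency_ms", "value": duration_ms}))  # metric
    log.info(json.dumps({"request_id": request_id, "span": "handle_request", "duration_ms": duration_ms}))   # trace span

handle_request(order_id=42)
```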

Using Machine Learning for Log Analysis and Anomaly Detection: A Practical Approach to Finding the Root Cause

There are many articles on applying machine learning to log analysis. However, most of them are dated, academic in nature, or don’t focus on practical outcomes. On DZone, the last article covering how ML can be used for log analysis was published five years ago.

In this article, we want to share our real-life experience using ML/AI for log analysis and anomaly detection, with the specific purpose of automatically uncovering the root cause of software issues.
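
To make the idea of anomaly detection on logs concrete, here is a deliberately simple sketch (not the authors' pipeline): it flags minutes whose error-log volume is a statistical outlier against the observed baseline. Production approaches typically cluster log templates and model many signals, but the underlying principle is similar.

```python
from statistics import mean, stdev

# Toy anomaly detector: flag minutes whose error count deviates sharply from
# the baseline of all observed minutes. Threshold and data are illustrative.
def anomalous_minutes(error_counts_per_minute, threshold=2.5):
    """Return indices of minutes whose error count is a > threshold-sigma outlier."""
    baseline_mean = mean(error_counts_per_minute)
    baseline_std = stdev(error_counts_per_minute) or 1.0
    return [
        i for i, count in enumerate(error_counts_per_minute)
        if abs(count - baseline_mean) / baseline_std > threshold
    ]

# Example: a quiet service that suddenly starts throwing errors in minute 9.
counts = [2, 3, 1, 2, 2, 3, 2, 1, 2, 40]
print(anomalous_minutes(counts))   # -> [9]
```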

A Good Old-Fashioned Perl Log Analyzer

A recent Lobsters post lauding the virtues of AWK reminded me that although the language is powerful and lightning-fast, I usually find myself exceeding its capabilities and reaching for Perl instead. One such application is analyzing voluminous log files such as the ones generated by this blog. Yes, WordPress has stats, but I've never let reinvention of the wheel get in the way of a good programming exercise.

So I whipped this script up on Sunday night while watching RuPaul's Drag Race reruns. It parses my Apache web server log files and reports on hits from week to week.
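
The original analyzer is a Perl script; purely as a rough Python rendition of the same idea (not the author's code), the sketch below reads Apache access logs from standard input and counts hits per ISO week.

```python
import re
import sys
from collections import Counter
from datetime import datetime

# Rough sketch, not the original Perl script: count access log hits per ISO week.
TIMESTAMP = re.compile(r'\[(\d{2}/\w{3}/\d{4})')   # e.g. [10/Oct/2023:13:55:36 +0000]

def hits_per_week(lines):
    weeks = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue
        day = datetime.strptime(match.group(1), "%d/%b/%Y")
        year, week, _ = day.isocalendar()
        weeks[f"{year}-W{week:02d}"] += 1
    return weeks

if __name__ == "__main__":
    for week, hits in sorted(hits_per_week(sys.stdin).items()):
        print(f"{week}\t{hits}")
```

Something like `cat access.log* | python weekly_hits.py` (the filename is hypothetical) would then print one line per week with its hit count.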

Practical Grep Command Examples Useful in Real-World Debugging in Linux

In our daily debugging, we need to analyze the log files of various products. Reading those log files is not an easy task; it requires debugging skills that are usually gained through experience (or by god's grace). While debugging, we often need to extract data or otherwise manipulate a log file in ways that simple reading cannot accomplish, which is where these commands come in.

Linux offers many commands that debuggers rely on, such as grep, awk, sed, wc, taskset, ps, sort, uniq, cut, and xargs. A scripted rendition of one common pipeline is sketched below.
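
Purely as an illustration (and not a replacement for the shell tools themselves), the following Python sketch mirrors a pipeline along the lines of `grep -E PATTERN FILE | sort | uniq -c | sort -rn`, a frequent first step when hunting for repeated errors in a log.

```python
import re
import sys
from collections import Counter

# Illustrative only: count the distinct lines in a file that match a pattern,
# roughly equivalent to `grep -E PATTERN FILE | sort | uniq -c | sort -rn`.
def count_matching_lines(pattern, path):
    regex = re.compile(pattern)
    counts = Counter()
    with open(path, errors="replace") as handle:
        for line in handle:
            if regex.search(line):
                counts[line.strip()] += 1
    return counts

if __name__ == "__main__":
    pattern, path = sys.argv[1], sys.argv[2]
    for line, count in count_matching_lines(pattern, path).most_common(20):
        print(f"{count:6d}  {line}")
```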

Implementing Scalyr’s PowerQueries

Older log management solutions grew up with complex query languages, including huge libraries of “commands” to manipulate and visualize data. These complex languages make advanced tasks possible but are difficult and cumbersome even for everyday tasks. Only a handful of users ever really know how to use the language, and they typically have to undergo extensive training and certification in order to be productive.

With the benefit of experience, we were in a position to create a clean-sheet design that supports powerful data manipulation with a relatively simple language. The result is PowerQueries: a new set of commands for transforming and manipulating data on the fly. In this article, we’ll talk about how we were able to accomplish this without sacrificing performance.
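
The article goes on to describe how this was implemented; purely as a toy illustration of the general idea (this is not PowerQueries or Scalyr's code), the sketch below chains a couple of simple transform stages over rows of parsed log events, which is the basic shape of a small set of composable commands applied on the fly.

```python
# Toy pipeline, not PowerQueries: each stage takes an iterator of rows and
# yields transformed rows, so stages can be composed lazily on the fly.
def where(predicate):
    return lambda rows: (row for row in rows if predicate(row))

def columns(*names):
    return lambda rows: ({name: row.get(name) for name in names} for row in rows)

def run(rows, *stages):
    stream = iter(rows)
    for stage in stages:
        stream = stage(stream)
    return list(stream)

events = [
    {"status": 500, "path": "/checkout", "latency_ms": 812},
    {"status": 200, "path": "/health", "latency_ms": 3},
]
print(run(events, where(lambda r: r["status"] >= 500), columns("path", "latency_ms")))
# -> [{'path': '/checkout', 'latency_ms': 812}]
```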

Nginx Log Analytics With AWS Athena and Cube.js

Sometimes, existing commercial or out-of-the-box open-source tools like Grafana don’t fit requirements for Nginx log analytics. Whether it is pricing, privacy, or customization issues, it is always good to know how to build such a system internally.

In the following tutorial, I’ll show you how to build your own Nginx log analytics with Fluentd, Kinesis Data Firehose, Glue, Athena, and Cube.js. This stack also makes it easy to add data from other sources, such as Snowplow events, into the same S3 bucket and merge results in Athena. I’ll walk you through the whole pipeline, from data collection to visualization.
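
As a taste of the query layer, once Firehose has landed the Nginx logs in S3 and Glue has cataloged them, a report can be pulled from Athena with a few lines of Python; the database, table, result bucket, and region names below are hypothetical placeholders, not values from the tutorial.

```python
import time
import boto3

# Illustrative only: run an Athena query over cataloged Nginx access logs.
# Database, table, result bucket, and region are hypothetical placeholders.
athena = boto3.client("athena", region_name="us-east-1")

def run_query(sql):
    execution = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "nginx_logs"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )
    query_id = execution["QueryExecutionId"]
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(1)

print(run_query("SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status ORDER BY hits DESC"))
```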

Kafka Logging With the ELK Stack

Kafka and the ELK Stack — usually these two are part of the same architectural solution, Kafka acting as a buffer in front of Logstash to ensure resiliency. This article explores a different combination — using the ELK Stack to collect and analyze Kafka logs. 
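
As a rough illustration of the idea (this is not the article's actual setup), the sketch below reads a Kafka broker's server.log and indexes each line into Elasticsearch with the official Python client; the Elasticsearch URL, index name, and log path are hypothetical placeholders.

```python
from elasticsearch import Elasticsearch

# Illustrative sketch only: index Kafka broker log lines into Elasticsearch.
# The Elasticsearch URL, index name, and log path are hypothetical.
es = Elasticsearch("http://localhost:9200")

def ship_kafka_log(path="/var/log/kafka/server.log", index="kafka-logs"):
    shipped = 0
    with open(path, errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if line:
                es.index(index=index, document={"message": line})
                shipped += 1
    return shipped

print(ship_kafka_log())
```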

Knative Log Analysis With LogDNA on IBM Cloud

In this post, you will learn how to use the IBM Log Analysis with LogDNA service to configure cluster-level logging for an app named "Knative-node-app" published in IBM Cloud Kubernetes Service. Refer to this post to set up a Node.js app.

IBM Log Analysis with LogDNA offers administrators, DevOps teams, and developers advanced features to filter, search, and tail log data, define alerts, and design custom views to monitor application and system logs.

From the moment you provision a cluster with IBM Cloud Kubernetes Service, you want to know what is happening inside the cluster. You need to access logs to troubleshoot problems and pre-empt issues. At any time, you want to have access to different types of logs such as worker logs, pod logs, app logs, or network logs. In addition, you want to monitor different sources of log data in your Kubernetes cluster. Therefore, your ability to manage and access log records from any of these sources is critical. Your success in managing and monitoring logs depends on how you configure the logging capabilities for your Kubernetes platform.