performance tuning | The Blog Pros

May 16, 2022

Why Performance Projects Fail?

Projects involving performance testing and engineering fail for a variety of reasons. Most failures have complex causes that can arise in any phase of the development life cycle or the performance testing life cycle. Sometimes performance problems are outside the control of the project manager, technical architects, or performance engineers. In my experience, at both the business and personal level, most performance projects fail simply because of a lack of communication from the beginning among performance engineers, developers, DBAs, business teams, and stakeholders. That gap causes many other problems that directly impact application performance and ROI. The only objective of strategic, effective performance testing for any application or product is to achieve a satisfactory return on investment. Performance testing and engineering are risky and always require a lot of trial and error, with rigorous testing from the early stages of development.

Failures in performance testing projects must be treated like any other business problem. It is essential to understand what went wrong, why it went wrong, and what can be done to prevent it. In most scenarios, performance engineers have to run a one-man show to educate everyone about the performance challenges across the end-to-end life cycle. Working with Practice and COE teams, we kept seeing the same mistakes repeated across multiple teams and projects, so, based on my personal experience, I have compiled a list of reasons why performance projects fail.

April 13, 2022

Kubernetes Performance Tuning: Make the Most of Your Clusters

Why Is Kubernetes Performance Tuning Needed?

As Kubernetes becomes a basic infrastructure for many organizations, performance tuning for Kubernetes clusters is becoming more important. Kubernetes is a highly scalable open-source platform for orchestrating containerized workloads in server environments. It enables declarative configuration and automation of computing resources.

January 23, 2021

How to Trace Linux System Calls in Production (Without Breaking Performance)

If you need to dynamically trace Linux process system calls, you might first consider strace. strace is simple to use and works well for issues such as "Why can't the software run on this machine?" However, if you're running a trace in a production environment, strace is NOT a good choice. It introduces a substantial amount of overhead. According to a performance test conducted by Arnaldo Carvalho de Melo, a senior software engineer at Red Hat, the process traced using strace ran 173 times slower, which is disastrous for a production environment.

So are there any tools that excel at tracing system calls in a production environment? The answer is YES. This blog post introduces perf and traceloop, two commonly used command-line tools, to help you trace system calls in a production environment.

January 7, 2021 | January 13, 2021

Why We Disable Linux’s THP Feature for Databases

Linux's memory management system is transparent to the user. However, if you're not familiar with its working principles, you might run into unexpected performance issues. That's especially true for sophisticated software like databases. When databases are running on Linux, even small system variations might impact performance.

After an in-depth investigation, we found that Transparent Huge Page (THP), a Linux memory management feature, often slows down database performance. In this post, I'll describe how THP causes performance to fluctuate, the typical symptoms, and our recommended solutions.
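As a minimal sketch of how you might check THP on a host before investigating further: Linux exposes the current mode in the sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, where the active value is wrapped in square brackets (for example `always [madvise] never`). The helper names below are illustrative, not taken from the original post.

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location of the THP mode on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode from a string such as 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the THP mode, or return None if the kernel does not expose it."""
    try:
        return parse_thp_mode(path.read_text())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    mode = current_thp_mode()
    if mode is None:
        print("THP setting not available on this system")
    else:
        print(f"THP mode: {mode}")
```

A result of `never` means THP is disabled; `always` or `madvise` means the kernel may still back memory with huge pages.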

December 19, 2020

Why We Switched from bcc-tools to libbpf-tools for Linux BPF Performance Analysis

Distributed clusters might encounter performance problems or unpredictable failures, especially when they are running in the cloud. Of all the kinds of failures, kernel failures may be the most difficult to analyze and simulate.

A practical solution is Berkeley Packet Filter (BPF), a highly flexible, efficient virtual machine that runs in the Linux kernel. It allows bytecode to be safely executed in various hooks, which exist in a variety of Linux kernel subsystems. BPF is mainly used for networking, tracing, and security.

October 23, 2019

A Plan for Performance Bugs in 10 Steps: When Managers Want Answers Now

Performance bugs are never good.

You have probably experienced this: you ship a new release, everything works, but in production it turns out to be much slower than expected. It behaves completely differently. Customers complain. Managers want you to wave a magic wand to conjure the problem away. It's not that easy.

August 16, 2019

Application Scalability — How To Do Efficient Scaling

When you build a great product or application, sooner or later it will draw the attention of more and more users, who will expect it to keep working flawlessly as demand grows and it handles more and more requests per minute. If you are not prepared for this, application performance will start degrading, and you will lose your audience and business. In this article, we explain what you should pay attention to when building a scalable application.

What Is Application Scalability?

Application scalability is the potential of an application to grow over time while efficiently handling more and more requests per minute (RPM). It’s not a simple switch you can turn on or off; it’s a long-term process that touches almost every item in your stack, on both the hardware and software sides of the system.

June 10, 2019

Apache RocketMQ: Lessons Learned on How to Ensure Stable Capacity

In a previous article, we talked about how Apache RocketMQ fine-tuned the bottlenecks related to latency.

Remember Little’s law?
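As a quick refresher, Little's law says that the average number of items in a stable system equals the average arrival rate multiplied by the average time each item spends in the system. The sketch below applies it to a message broker; the function name and the numbers are hypothetical and only illustrate the arithmetic.

```python
def in_flight_requests(arrival_rate_per_s: float, mean_latency_s: float) -> float:
    """Little's law: L = lambda * W.

    arrival_rate_per_s: average number of requests arriving per second (lambda)
    mean_latency_s:     average time one request spends in the system (W)
    Returns the average number of requests in the system at once (L).
    """
    return arrival_rate_per_s * mean_latency_s


if __name__ == "__main__":
    # Hypothetical load: 2,000 messages per second at 5 ms average latency.
    print(in_flight_requests(2000, 0.005))
    # The same arrival rate with latency degraded to 50 ms needs
    # ten times as many requests in flight.
    print(in_flight_requests(2000, 0.050))
```

Read the other way around, the law shows why latency matters for capacity: at a fixed limit on concurrent requests, any increase in latency lowers the throughput the system can sustain.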

May 23, 2019

Snowflake Performance Tuning: Top 5 Best Practices

How do you tune the Snowflake data warehouse when there are no indexes, and few options available to tune the database itself?

Snowflake was designed for simplicity, with few performance tuning options. This article summarizes the top five best practices to maximize query performance.

March 14, 2019

Caching in: Lessons From Performance Engineering on Jira Cloud

Performance engineering is a big deal when you're serving millions of users from every corner of the globe. We previously wrote about a large engineering transformation program for Jira and Confluence, which we codenamed Vertigo – read more about the overall program here.

As part of the Vertigo program, we knew we were going to have to invest a lot of engineering effort into performance, and, in particular, the performance of Jira. While we have always spent time over the years improving Jira’s performance using the tools and architecture at hand, the Vertigo architecture brought a host of new opportunities to further improve the performance and reliability of Jira.

January 11, 2019

CQRS Replay Performance Tuning

Command Query Responsibility Segregation (CQRS) has become a popular pattern to avoid complex and, therefore, slow database queries in applications. With CQRS, queries are served from dedicated read models that are optimized for the query’s specific needs. This ensures that queries are as simple as possible and, therefore, as fast as possible. The read models need to be initialized and kept up to date with the primary storage (the command side). This is done most easily if the primary storage is event-based: the Event Sourcing (ES) concept. Axon offers a mature and popular implementation of the CQRS/ES concepts on the JVM, but several other implementations exist.

On the one hand, CQRS/ES presents a solution to a common performance problem. On the other hand, it introduces a couple of new potential performance challenges. First, the event store may not be able to deal efficiently with the number of events being stored. If the events are kept in a regular relational database table, performance will severely degrade once it grows to the point where the indices can’t be buffered in RAM. This is when built-for-purpose event stores like Axon Server can come to the rescue.
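To make the replay idea concrete, here is a minimal sketch of rebuilding a read model from stored events. It does not use the Axon API; the `Event` type, the event kinds, and the account-balance read model are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Event:
    """A hypothetical stored event on the command side."""
    account_id: str
    kind: str    # "deposited" or "withdrawn"
    amount: int


def replay(events: Iterable[Event]) -> Dict[str, int]:
    """Rebuild a balance-per-account read model by applying events in order."""
    balances: Dict[str, int] = {}
    for event in events:
        if event.kind == "deposited":
            delta = event.amount
        elif event.kind == "withdrawn":
            delta = -event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind!r}")
        balances[event.account_id] = balances.get(event.account_id, 0) + delta
    return balances


if __name__ == "__main__":
    event_store = [
        Event("acc-1", "deposited", 100),
        Event("acc-2", "deposited", 50),
        Event("acc-1", "withdrawn", 30),
    ]
    print(replay(event_store))
```

Queries then read from the resulting dictionary instead of recomputing balances from the event history. Because a replay has to visit every stored event, its cost grows with the size of the event store.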