Understanding How CALMS Extends to Observability

As our observability and DevOps practices continue to converge, our frameworks and goals align in similar ways. In fact, when one considers that observability is about richer, deeper data on our environments, it is evident that the frameworks aren’t changing so much as adapting to faster insights.

Let’s take a look at CALMS. CALMS (Culture, Automation, Lean, Measurement, Sharing) is a framework for assessing how an organization is adapting to DevOps practices. It grew out of the earlier CAMS model from Damon Edwards and John Willis, with Jez Humble credited for adding Lean. As we add observability, CALMS can extend to our observability practice as well.

Does Observability Throw You for a Loop?

Our new mantra for managing and maintaining the health and functionality of our apps and environments is observability. Observability is the quality of software, services, platforms, or products that allows us to understand how systems are behaving. Without the new sources of data giving us insights, our modern cloud-native applications would be quite a challenge to monitor. The deep data that observability provides is the new fuel for our developers and DevOps engineers.

The dual of observability is controllability. Observability is the ability to infer the internal state of a 'machine' from externally exposed signals. Controllability is the ability to control inputs to direct the internal state to the desired outcome. While driving, observing a red stoplight means controlling our vehicle by pressing the brakes (or in some modern vehicles, having the brakes applied automatically for us).
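To make the pairing concrete, here is a minimal Python sketch of an observe-then-control loop based on the stoplight example. Everything in it is hypothetical and for illustration only: the `Vehicle` class, the `observe_signal`, `control`, and `drive_until_stopped` functions, and the braking step of 10 km/h are assumptions, not part of any real system or library.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    """Hypothetical system whose internal state (speed) we want to steer."""
    speed_kmh: float


def observe_signal(light: str) -> bool:
    """Observability: infer from an external signal whether we must stop."""
    return light == "red"


def control(vehicle: Vehicle, must_stop: bool, brake_step_kmh: float = 10.0) -> None:
    """Controllability: apply an input (braking) to move the state toward the goal."""
    if must_stop:
        vehicle.speed_kmh = max(0.0, vehicle.speed_kmh - brake_step_kmh)


def drive_until_stopped(vehicle: Vehicle, light: str, max_steps: int = 100) -> int:
    """Repeat observe -> control until the vehicle stops; return the steps taken."""
    steps = 0
    while vehicle.speed_kmh > 0 and observe_signal(light) and steps < max_steps:
        control(vehicle, must_stop=True)
        steps += 1
    return steps
```

The point of the sketch is the split of responsibilities: one function only reads a signal, the other only applies an input, and the loop ties them together. Without the observation step, the control step would have nothing to act on.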

Survivorship Bias in Observability

During World War II, a mathematician named Abraham Wald worked on a problem: identifying where to add armor to planes based on the aircraft that returned from missions and their bullet puncture patterns. The obvious and accepted thought was that the bullet holes marked the problem areas for the planes. Wald pointed out that these weren’t actually the problem areas, because the planes that carried them had survived. He reasoned that the data from the planes that never returned was missing, and that those planes had likely been hit in other, more critical areas. In fact, the pattern on the surviving planes showed the areas where damage was not fatal.

Image credit: McGeddon, own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=53081927
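Wald's reasoning can be illustrated with a small simulation. The Python sketch below is a toy model, not a reconstruction of his actual analysis: the four areas, the per-hit loss probabilities, the number of planes, and the helper names `simulate` and `share` are all assumptions chosen for illustration. Hits land uniformly at random, but hits in some areas are far more likely to bring the plane down, so the hits recorded on returning planes are skewed toward the less critical areas.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit in each area brings the plane down.
LOSS_PROBABILITY = {"engine": 0.6, "cockpit": 0.5, "fuselage": 0.05, "wings": 0.05}


def simulate(num_planes: int = 10_000, hits_per_plane: int = 3, seed: int = 0):
    """Return (hits on all planes, hits on returned planes) as Counters."""
    rng = random.Random(seed)
    areas = list(LOSS_PROBABILITY)
    all_hits, returned_hits = Counter(), Counter()
    for _ in range(num_planes):
        hits = [rng.choice(areas) for _ in range(hits_per_plane)]
        all_hits.update(hits)
        # The plane returns only if none of its hits turns out to be fatal.
        survived = all(rng.random() >= LOSS_PROBABILITY[area] for area in hits)
        if survived:
            returned_hits.update(hits)
    return all_hits, returned_hits


def share(counter: Counter, area: str) -> float:
    """Fraction of recorded hits that landed in the given area."""
    return counter[area] / sum(counter.values())


if __name__ == "__main__":
    all_hits, returned_hits = simulate()
    for area in LOSS_PROBABILITY:
        print(f"{area:9s} all={share(all_hits, area):.2f} "
              f"returned={share(returned_hits, area):.2f}")
```

An analyst who only sees the returned planes would see relatively few engine and cockpit hits and might conclude those areas need no armor. The same trap applies to observability: the signals we do collect describe the systems and requests that survived long enough to report them.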