What Are CSS Container Style Queries Good For?

We’ve relied on media queries for a long time in the responsive world of CSS but they have their share of limitations and have shifted focus more towards accessibility than responsiveness alone. This is where CSS Container Queries come in. They completely change how we approach responsiveness, shifting the paradigm away from a viewport-based mentality to one that is more considerate of a component’s context, such as its size or inline-size.

Querying elements by their dimensions is one of two things CSS Container Queries can do; we call these container size queries to distinguish them from the second ability, querying a component’s current styles, which we call container style queries.

Existing container query coverage has been largely focused on container size queries, which enjoy 90% global browser support at the time of this writing. Style queries, on the other hand, are only available behind a feature flag in Chrome 111+ and Safari Technology Preview.

The first question that comes to mind is “What are these style query things?” followed immediately by “How do they work?” There are some nice primers on them that others have written, and they are worth checking out.

But the more interesting question about CSS Container Style Queries might actually be “Why should we use them?” The answer, as always, is nuanced and could simply be “it depends.” But I want to poke at style queries a little more deeply, not at the syntax level, but at what exactly they solve and the sorts of use cases we might reach for them in our work if and when they gain broader browser support.

Why Container Queries

Talking purely about responsive design, media queries have simply fallen short in some aspects, but I think the main one is that they are context-agnostic in the sense that they only consider the viewport size when applying styles without involving the size or dimensions of an element’s parent or the content it contains.

This usually isn’t a problem when a component takes up the full width of the viewport and doesn’t share space with anything else along the x-axis; in that case, we can style our content based on the viewport’s dimensions alone. However, if we stuff an element into a smaller parent and maintain the same viewport, the media query doesn’t kick in when the content becomes cramped. This forces us to write and manage an entire set of media queries that target super-specific content breakpoints.

Container queries break this limitation and allow us to query much more than the viewport’s dimensions.

How Container Queries Generally Work

Container size queries work similarly to media queries but allow us to apply styles depending on the container’s properties and computed values. In short, they allow us to make style changes based on an element’s computed width or height regardless of the viewport. This sort of thing was once only possible with JavaScript or the ol’ jQuery, as this example shows.

As noted earlier, though, container queries can query an element’s styles in addition to its dimensions. In other words, container style queries can look at and track an element’s properties and apply styles to other elements when those properties meet certain conditions, such as when the element’s background-color is set to hsl(0 50% 50%).

That’s what we mean when talking about CSS Container Style Queries. It’s a proposed feature defined in the same CSS Containment Module Level 3 specification as CSS Container Size Queries, and one that isn’t broadly supported by browsers yet, so the difference between style and size queries can get a bit confusing since we’re technically talking about two related features under the same umbrella.

We’d do ourselves a favor to backtrack and understand what a “container” is in the first place.

Containers

An element’s container is any ancestor with a containment context; it could be the element’s direct parent or perhaps a grandparent or great-grandparent.

A containment context means that a certain element can be used as a container for querying. Unofficially, you can say there are two types of containment context: size containment and style containment.

Size containment means we can query and track an element’s dimensions (i.e., aspect-ratio, block-size, height, inline-size, orientation, and width) with container size queries as long as it’s registered as a container. Tracking an element’s dimensions requires a little processing in the client. One or two elements are a breeze, but if we had to constantly track the dimensions of all elements — including resizing, scrolling, animations, and so on — it would be a huge performance hit. That’s why no element has size containment by default, and we have to manually register a size query with the CSS container-type property when we need it.

On the other hand, style containment lets us query and track the computed values of a container’s specific properties through container style queries. As it currently stands, we can only check for custom properties, e.g., --theme: dark, but soon we could check for an element’s computed background-color and display property values. Unlike size containment, we are checking for raw style properties before they are processed by the browser, which eases the performance cost and allows all elements to have style containment by default.

Did you catch that? While size containment is something we manually register on an element, style containment is the default behavior of all elements. There’s no need to register a style container because all elements are style containers by default.

And how do we register a containment context? The easiest way is with the container-type property, which gives an element a containment context; its three accepted values (normal, size, and inline-size) define which properties we can query from the container.

/* Size containment in the inline direction */
.parent {
  container-type: inline-size;
}

This example formally establishes size containment. If we had done nothing at all, the .parent element would still be a container with style containment.

Size Containment

That last example illustrates size containment based on the element’s inline-size, which is a fancy way of saying its width. When we talk about normal document flow on the web, we’re talking about elements that flow in an inline direction and a block direction that corresponds to width and height, respectively, in a horizontal writing mode. If we were to rotate the writing mode so that it is vertical, then “inline” would refer to the height instead and “block” to the width.
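To make that concrete, here’s a quick hypothetical sketch: if the container itself uses a vertical writing mode, an inline-size query effectively tracks its physical height rather than its width.

.vertical-container {
  writing-mode: vertical-rl;
  container-type: inline-size;
}

/* In this container's vertical writing mode, "inline-size"
   maps to its physical height */
@container (inline-size < 700px) {
  .vertical-container p {
    font-size: 0.875rem;
  }
}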

Consider the following HTML:

<div class="cards-container">
  <ul class="cards">
    <li class="card"></li>
  </ul>
</div>

We could give the .cards-container element a containment context in the inline direction, allowing us to make changes to its descendants when its width becomes too small to properly display everything in the current layout. We keep the same syntax as in a normal media query but swap @media for @container:

.cards-container {
  container-type: inline-size;
}

@container (width < 700px) {
  .cards {
    background-color: red;
  }
}

Container syntax works almost the same as media queries, so we can use the and, or, and not operators to chain different queries together to match multiple conditions.

@container (width < 700px) or (width > 1200px) {
  .cards {
    background-color: red;
  }
}

Elements in a size query look for the closest ancestor with size containment so we can apply changes to elements deeper in the DOM, like the .card element in our earlier example. If there is no size containment context, then the @container at-rule won’t have any effect.

/* 👎 
 * Apply styles based on the closest container, .cards-container
 */
@container (width < 700px) {
  .card {
    background-color: black;
  }
}

Just looking for the closest container is messy, so it’s good practice to name containers with the container-name property and then specify which container we’re tracking in the container query, right after the @container at-rule.

.cards-container {
  container-name: cardsContainer;
  container-type: inline-size;
}

@container cardsContainer (width < 700px) {
  .card {
    background-color: #000;
  }
}

We can use the shorthand container property to set the container name and type in a single declaration:

.cards-container {
  container: cardsContainer / inline-size;

  /* Equivalent to: */
  container-name: cardsContainer;
  container-type: inline-size;
}

The other container-type we can set is size, which works exactly like inline-size — only the containment context is both the inline and block directions. That means we can also query the container’s height sizing in addition to its width sizing.

/* When container is less than 700px wide */
@container (width < 700px) {
  .card {
    background-color: black;
  }
}

/* When container is less than 900px tall */
@container (height < 900px) {
  .card {
    background-color: white;
  }
}
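One caveat worth flagging: for the height-based query above to match anything, the container itself needs block-direction containment, which means registering it with container-type: size rather than inline-size. And because size containment stops the element from sizing itself based on its contents, it generally needs an explicit block size to resolve against. A minimal sketch:

.cards-container {
  container-type: size;
  height: 100vh; /* size containment needs an explicit block size to query against */
}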

And it’s worth noting here that if two separate (not chained) container rules match, the most specific selector wins, true to how the CSS Cascade works.
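For example, if both of the following conditions are true at the same time, the .card element ends up red because .cards .card is more specific than .card, regardless of the order of the @container rules:

@container (width < 700px) {
  .card {
    background-color: black;
  }
}

@container (width < 1200px) {
  .cards .card {
    background-color: red;
  }
}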

So far, we’ve touched on the concept of CSS Container Queries at its most basic. We define the type of containment we want on an element (we looked specifically at size containment) and then query that container accordingly.

Container Style Queries

The third value that is accepted by the container-type property is normal, and it sets style containment on an element. Both inline-size and size are stable across all major browsers, but normal is newer and only has modest support at the moment.

I consider normal a bit of an oddball because we don’t have to explicitly declare it on an element since all elements are style containers with style containment right out of the box. It’s possible you’ll never write it out yourself or see it in the wild.

.parent {
  /* Unnecessary */
  container-type: normal;
}

If you do write it or see it, it’s likely to undo size containment declared somewhere else. But even then, it’s possible to reset containment with the global initial or revert keywords.

.parent {
  /* All of these (re)set style containment */
  container-type: normal;
  container-type: initial;
  container-type: revert;
}

Let’s look at a simple and somewhat contrived example to get the point across. We can define a custom property in a container, say a --theme.

.cards-container {
  --theme: dark;
}

From here, we can check if the container has that desired property and, if it does, apply styles to its descendant elements. We can’t directly style the container since it could unleash an infinite loop of changing the styles and querying the styles.

.cards-container {
  --theme: dark;
}

@container style(--theme: dark) {
  .cards {
    background-color: black;
  }
}

See that style() function? In the future, we may want to check if an element has a max-width: 400px through a style query instead of checking if the element’s computed value is bigger than 400px in a size query. That’s why we use the style() wrapper to differentiate style queries from size queries.

/* Size query */
@container (width > 60ch) {
  .cards {
    flex-direction: column;
  }
}

/* Style query */
@container style(--theme: dark) {
  .cards {
    background-color: black;
  }
}

Both types of container queries look for the closest ancestor with a corresponding containment-type. In a style() query, it will always be the parent since all elements have style containment by default. In this case, the direct parent of the .cards element in our ongoing example is the .cards-container element. If we want to query non-direct parents, we will need the container-name property to differentiate between containers when making a query.

.cards-container {
  container-name: cardsContainer;
  --theme: dark;
}

@container cardsContainer style(--theme: dark) {
  .card {
    color: white;
  }
}

Weird and Confusing Things About Container Style Queries

Style queries are completely new and bring something never seen in CSS, so they are bound to have some confusing qualities as we wrap our heads around them — some that are completely intentional and well thought-out and some that are perhaps unintentional and may be updated in future versions of the specification.

Style and Size Containment Aren’t Mutually Exclusive

One intentional perk, for example, is that a container can have both size and style containment. No one would fault you for expecting size and style containment to be mutually exclusive concerns, where setting an element to something like container-type: inline-size would render all style queries useless.

However, another funny thing about container queries is that elements have style containment by default, and there isn’t really a way to remove it. Check out this next example:

.cards-container {
  container-type: inline-size;
  --theme: dark;
}

@container style(--theme: dark) {
  .card {
    background-color: black;
  }
}

@container (width < 700px) {
  .card {
    background-color: red;
  }
}

See that? We can still query the elements by style even when we explicitly set the container-type to inline-size. This seems contradictory at first, but it does make sense, considering that style and size queries are computed independently. It’s better this way since the two queries don’t necessarily conflict with each other; a style query could change the colors in an element depending on a custom property, while a size query changes an element’s flex-direction when the container gets too small for its contents.

But We Can Achieve the Same Thing With CSS Classes and IDs

Most container query guides and tutorials I’ve seen use similar examples to demonstrate the general concept, but I can’t stop thinking that, no matter how cool style queries are, we can achieve the same result using classes or IDs with less boilerplate. Instead of passing the state as an inline style, we could simply add it as a class.

<ol>
  <li class="item first">
    <img src="..." alt="Roi's avatar" />
    <h2>Roi</h2>
  </li>
  <li class="item second"><!-- etc. --></li>
  <li class="item third"><!-- etc. --></li>
  <li class="item"><!-- etc. --></li>
  <li class="item"><!-- etc. --></li>
</ol>

Alternatively, we could add the position number directly inside an id so we don’t have to convert the number into a string:

<ol>
  <li class="item" id="item-1">
    <img src="..." alt="Roi's avatar" />
    <h2>Roi</h2>
  </li>
  <li class="item" id="item-2"><!-- etc. --></li>
  <li class="item" id="item-3"><!-- etc. --></li>
  <li class="item" id="item-4"><!-- etc. --></li>
  <li class="item" id="item-5"><!-- etc. --></li>
</ol>

Both of these approaches leave us with cleaner HTML than the container queries approach. With style queries, we have to wrap our elements inside a container, even if we don’t semantically need it, because containers (rightly) are unable to style themselves.

We also have less boilerplate-y code on the CSS side:

#item-1 {
  background: linear-gradient(45deg, yellow, orange); 
}

#item-2 {
  background: linear-gradient(45deg, grey, white);
}

#item-3 {
  background: linear-gradient(45deg, brown, peru);
}

See the Pen Style Queries Use Case Replaced with Classes [forked] by Monknow.

As an aside, I know that using IDs as styling hooks is often viewed as a no-no, but that’s only because IDs must be unique in the sense that no two instances of the same ID are on the page at the same time. In this instance, there will never be more than one first-place, second-place, or third-place player on the page, making IDs a safe and appropriate choice in this situation. But, yes, we could also use some other type of selector, say a data-* attribute.
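For instance, assuming hypothetical markup like <li class="item" data-position="1">, the same styles could hang off an attribute selector instead:

.item[data-position="1"] {
  background: linear-gradient(45deg, yellow, orange);
}

.item[data-position="2"] {
  background: linear-gradient(45deg, grey, white);
}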

There is something that could add a lot of value to style queries: a range syntax for querying styles. This is an open feature that Miriam Suzanne proposed in 2023, the idea being that it queries numerical values using range comparisons just like size queries.

Imagine if we wanted to apply a light purple background color to the rest of the top ten players in the leaderboard example. Instead of adding a query for each position from four to ten, we could add a query that checks a range of values. The syntax is obviously not in the spec at this time, but let’s say it looks something like this just to get the point across:

/* Do not try this at home! */
@container leaderboard style(4 <= --position <= 10) {
  .item {
    background: linear-gradient(45deg, purple, fuchsia);
  }
}

In this fictional and hypothetical example, we’re:

  • Tracking a container called leaderboard,
  • Making a style() query against the container,
  • Evaluating the --position custom property,
  • Looking for a condition where the custom property is set to a value that is greater than or equal to 4 and less than or equal to 10, and
  • If the custom property’s value falls within that range, setting the player’s background color to a linear-gradient() that goes from purple to fuchsia.

This is very cool, but since this kind of behavior is likely to be implemented with components in modern frameworks, like React or Vue, we could also set up the range check in JavaScript and toggle on a .top-ten class when the condition is met.

See the Pen Style Ranged Queries Use Case Replaced with Classes [forked] by Monknow.

Sure, it’s great to see that we can do this sort of thing directly in CSS, but it’s also something with an existing well-established solution.

Separating Style Logic From Logic Logic

So far, style queries don’t seem to be the most convenient solution for the leaderboard use case we looked at, but I wouldn’t deem them useless solely because we can achieve the same thing with JavaScript. I am a big advocate of reaching for JavaScript only when necessary and only in sprinkles, but style queries, the ones where we can only check for custom properties, are most likely to be useful when paired with a UI framework where we can easily reach for JavaScript within a component. I have been using Astro an awful lot lately, and in that context, I don’t see why I would choose a style query over programmatically changing a class or ID.

However, a case can be made that implementing style logic inside a component is messy. Maybe we should keep the logic regarding styles in the CSS away from the rest of the logic logic, i.e., the stateful changes inside a component like conditional rendering or functions like useState and useEffect in React. The style logic would be the conditional checks we do to add or remove class names or IDs in order to change styles.

If we backtrack to our leaderboard example, checking a player’s position to apply different styles would be style logic. We could indeed check that a player’s leaderboard position is between four and ten using JavaScript to programmatically add a .top-ten class, but it would mean leaking our style logic into our component. In React (for familiarity, but it would be similar to other frameworks), the component may look like this:

const LeaderboardItem = ({ position }) => {
  return (
    <li
      className={`item ${position >= 4 && position <= 10 ? "top-ten" : ""}`}
      id={`item-${position}`}
    >
      <img src="..." alt="Roi's avatar" />
      <h2>Roi</h2>
    </li>
  );
};

Besides this being ugly-looking code, adding the style logic in JSX can get messy. Meanwhile, with style queries, we can pass the --position value to the styles and handle the logic directly in the CSS, where it is used.

const LeaderboardItem = ({ position }) => {
  return (
    <li className="item" style={{ "--position": position }}>
      <img src="..." alt="Roi's avatar" />
      <h2>Roi</h2>
    </li>
  );
};
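On the CSS side, a sketch of what this could look like with today’s custom-property style queries would be one exact-value check per position; the container here is the .item itself, so the query styles its children rather than the item (a single range check would still need the hypothetical syntax from earlier):

@container style(--position: 1) {
  h2 {
    color: goldenrod;
  }
}

@container style(--position: 2) {
  h2 {
    color: silver;
  }
}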

Much cleaner, and I think this is closer to the value proposition of style queries. But at the same time, this example makes a big assumption: that we will get a range syntax for style queries at some point, which is not a done deal.

Conclusion

There are lots of teams working on making modern CSS better, and not all features have to be groundbreaking miraculous additions.

Size queries are definitely an upgrade from media queries for responsive design, but style queries appear to be more of a solution looking for a problem.

As far as I’m aware, they simply don’t solve a specific issue or improve enough on existing approaches to replace them.

Even if, in the future, style queries will be able to check for any property, that introduces a whole new can of worms where styles are capable of reacting to other styles. This seems exciting at first, but I can’t shake the feeling it would be unnecessary and even chaotic: styles reacting to styles, reacting to styles, and so on with an unnecessary side of boilerplate. I’d argue that a more prudent approach is to write all your styles declaratively together in one place.

Maybe it would be useful for web extensions (like Dark Reader) so they can better check styles in third-party websites? I can’t clearly see it. If you have any suggestions on how CSS Container Style Queries can be used to write better CSS that I may have overlooked, please let me know in the comments! I’d love to know how you’re thinking about them and the sorts of ways you imagine yourself using them in your work.

How To Hack Your Google Lighthouse Scores In 2024

This article is sponsored by Sentry.io

Google Lighthouse has been one of the most effective ways to gamify and promote web page performance among developers. Using Lighthouse, we can assess web pages based on overall performance, accessibility, SEO, and what Google considers “best practices”, all with the click of a button.

We might use these tests to evaluate out-of-the-box performance for front-end frameworks or to celebrate performance improvements gained by some diligent refactoring. And you know you love sharing screenshots of your perfect Lighthouse scores on social media. It’s a well-deserved badge of honor worthy of a confetti celebration.

Just the fact that Lighthouse gets developers like us talking about performance is a win. But, whilst I don’t want to be a party pooper, the truth is that web performance is far more nuanced than this. In this article, we’ll examine how Google Lighthouse calculates its performance scores, and, using this information, we will attempt to “hack” those scores in our favor, all in the name of fun and science — because in the end, Lighthouse is simply a good, but rough guide for debugging performance. We’ll have some fun with it and see to what extent we can “trick” Lighthouse into handing out better scores than we may deserve.

But first, let’s talk about data.

Field Data Is Important

Local performance testing is a great way to understand if your website performance is trending in the right direction, but it won’t paint a full picture of reality. The World Wide Web is the Wild West, and collectively, we’ve almost certainly lost track of the variety of device types, internet connection speeds, screen sizes, browsers, and browser versions that people are using to access websites — all of which can have an impact on page performance and user experience.

Field data — and lots of it — collected by an application performance monitoring tool like Sentry from real people using your website on their devices will give you a far more accurate report of your website performance than your lab data collected from a small sample size using a high-spec super-powered dev machine under a set of controlled conditions. Philip Walton reported in 2021 that “almost half of all pages that scored 100 on Lighthouse didn’t meet the recommended Core Web Vitals thresholds” based on data from the HTTP Archive.

Web performance is more than a single core web vital metric or Lighthouse performance score. What we’re talking about goes way beyond the type of raw data we’re working with.

Web Performance Is More Than Numbers

Speed is often the first thing that comes up when talking about web performance — just how long does a page take to load? This isn’t the worst thing to measure, but we must bear in mind that speed is probably influenced heavily by business KPIs and sales targets. Google released a report in 2018 suggesting that the probability of a bounce increases by 32% when the page load time exceeds three seconds, and soars by 123% when it reaches 10 seconds. So, we must conclude that converting more sales requires reducing bounce rates. And to reduce bounce rates, we must make our pages load faster.

But what does “load faster” even mean? At some point, we’re physically incapable of making a web page load any faster. Humans — and the servers that connect them — are spread around the globe, and modern internet infrastructure can only deliver so many bytes at a time.

The bottom line is that page load is not a single moment in time. In an article titled “What is speed?” Google explains that a page load event is:

[…] “an experience that no single metric can fully capture. There are multiple moments during the load experience that can affect whether a user perceives it as ‘fast’, and if you just focus solely on one, you might miss bad experiences that happen during the rest of the time.”

The key word here is experience. Real web performance is less about numbers and speed than it is about how we experience page load and page usability as users. And this segues nicely into a discussion of how Google Lighthouse calculates performance scores. (It’s much less about pure speed than you might think.)

How Google Lighthouse Performance Scores Are Calculated

The Google Lighthouse performance score is calculated using a weighted combination of scores based on Core Web Vitals metrics (i.e., Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS)) and other speed-related metrics (i.e., First Contentful Paint (FCP), Speed Index (SI), and Total Blocking Time (TBT)) that are observable throughout the page load timeline.

This is how the metrics are weighted in the overall score:

Metric                      Weighting (%)
Total Blocking Time         30
Cumulative Layout Shift     25
Largest Contentful Paint    25
First Contentful Paint      10
Speed Index                 10
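As a rough sketch of the arithmetic, assuming each metric has already been converted into its own 0–100 score (more on that conversion below), the overall score is just the weighted sum:

const weights = { tbt: 0.3, cls: 0.25, lcp: 0.25, fcp: 0.1, si: 0.1 };

function overallScore(metricScores) {
  // Weighted sum of the individual 0–100 metric scores
  return Object.entries(weights).reduce(
    (total, [metric, weight]) => total + metricScores[metric] * weight,
    0
  );
}

// A page that aces everything except responsiveness (TBT) still drops to 70
console.log(overallScore({ tbt: 0, cls: 100, lcp: 100, fcp: 100, si: 100 })); // 70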

The weighting assigned to each score gives us insight into how Google prioritizes the different building blocks of a good user experience:

1. A Web Page Should Respond to User Input

The highest weighted metric is Total Blocking Time (TBT), a metric that looks at the total time after the First Contentful Paint (FCP) to help indicate where the main thread may be blocked long enough to prevent speedy responses to user input. The main thread is considered “blocked” any time there’s a JavaScript task running on the main thread for more than 50ms. Minimizing TBT ensures that a web page responds to physical user input (e.g., key presses, mouse clicks, and so on).
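If you want to see those 50ms-plus tasks for yourself, the Long Tasks API exposes them. A small sketch (browser support for the longtask entry type is not universal):

// Log every main-thread task longer than 50ms, the same threshold TBT cares about
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(`Long task: ${Math.round(entry.duration)}ms`, entry);
  }
});

observer.observe({ type: "longtask", buffered: true });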

2. A Web Page Should Load Useful Content With No Unexpected Visual Shifts

The next most weighted Lighthouse metrics are Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS). LCP marks the point in the page load timeline when the page’s main content has likely loaded and is therefore useful.

At the point where the main content has likely loaded, you also want to maintain visual stability to ensure that users can use the page and are not affected by unexpected visual shifts (CLS). A good LCP score is anything less than 2.5 seconds (which is a lot higher than we might have thought, given we are often trying to make our websites as fast as possible).

3. A Web Page Should Load Something

The First Contentful Paint (FCP) metric marks the first point in the page load timeline where the user can see something on the screen, and the Speed Index (SI) measures how quickly content is visually displayed during page load over time until the page is “complete”.

Your page is scored based on the speed indices of real websites using performance data from the HTTP Archive. A good FCP score is less than 1.8 seconds and a good SI score is less than 3.4 seconds. Both of these thresholds are higher than you might expect when thinking about speed.

Usability Is Favored Over Raw Speed

Google Lighthouse’s performance scoring is, without a doubt, less about speed and more about usability. Your SI and FCP could be super quick, but if your LCP takes too long to paint, and if CLS is caused by large images or external content taking some time to load and shifting things visually, then your overall performance score will be lower than if your page was a little slower to render the FCP but didn’t cause any CLS. Ultimately, if the page is unresponsive due to JavaScript blocking the main thread for more than 50ms, your performance score will suffer more than if the page was a little slow to paint the FCP.

To understand more about how the weightings of each metric contribute to the final performance score, you can play about with the sliders on the Lighthouse Scoring Calculator, and here’s a rudimentary table demonstrating the effect of individually skewed metric values on the overall performance score, showing that page usability and responsiveness is favored over raw speed.

Description                                  FCP (ms)  SI (ms)  LCP (ms)  TBT (ms)  CLS   Overall Score
Slow to show something on screen             6000      0        0         0         0     90
Slow to load content over time               0         5000     0         0         0     90
Slow to load the largest part of the page    0         0        6000      0         0     76
Visual shifts occurring during page load     0         0        0         0         0.82  76
Page is unresponsive to user input           0         0        0         2000      0     70

The overall Google Lighthouse performance score is calculated by converting each raw metric value into a score from 0 to 100 according to where it falls on its Lighthouse scoring distribution, which is a log-normal distribution derived from the performance metrics of real website performance data from the HTTP Archive. There are two main takeaways from this mathematically overloaded information:

  1. Your Lighthouse performance score is plotted against real website performance data, not in isolation.
  2. Given that the scoring uses log-normal distribution, the relationship between the individual metric values and the overall score is non-linear, meaning you can make substantial improvements to low-performance scores quite easily, but it becomes more difficult to improve an already high score.

Read more about how metric scores are determined, including a visualization of the log-normal distribution curve on developer.chrome.com.

Can We “Trick” Google Lighthouse?

I appreciate Google’s focus on usability over pure speed in the web performance conversation. It urges developers to think less about aiming for raw numbers and more about the real experiences we build. That being said, I’ve wondered whether today in 2024, it’s possible to fool Google Lighthouse into believing that a bad page in terms of usability and usefulness is actually a great one.

I put on my lab coat and science goggles to investigate. All tests were conducted:

  • Using the Chromium Lighthouse plugin,
  • In an incognito window in the Arc browser,
  • Using the “navigation” and “mobile” settings (apart from where described differently),
  • By me, in a lab (i.e., no field data).

That all being said, I fully acknowledge that my controlled test environment contradicts my advice at the top of this post, but the experiment is an interesting ride nonetheless. What I hope you’ll take away from this is that Lighthouse scores are only one piece — and a tiny one at that — of a very large and complex web performance puzzle. And, without field data, I’m not sure any of this matters anyway.

How to Hack FCP and LCP Scores

TL;DR: Show the smallest amount of LCP-qualifying content on load to boost the FCP and LCP scores until the Lighthouse test has likely finished.

FCP marks the first point in the page load timeline where the user can see anything at all on the screen, while LCP marks the point in the page load timeline when the main page content (i.e., the largest text or image element) has likely loaded. A fast LCP helps reassure the user that the page is useful. “Likely” and “useful” are the important words to bear in mind here.

What Counts as an LCP Element

The types of elements on a web page considered by Lighthouse for LCP are:

  • <img> elements,
  • <image> elements inside an <svg> element,
  • <video> elements,
  • An element with a background image loaded using the url() function (and not a CSS gradient), and
  • Block-level elements containing text nodes or other inline-level text elements.

The following elements are excluded from LCP consideration due to the likelihood they do not contain useful content:

  • Elements with zero opacity (invisible to the user),
  • Elements that cover the full viewport (likely to be background elements), and
  • Placeholder images or other images with low entropy (i.e., low informational content, such as a solid-colored image).

However, the notion of an image or text element being useful is completely subjective in this case and generally out of the realm of what machine code can reliably determine. For example, I built a page containing nothing but a <h1> element where, after 10 seconds, JavaScript inserts more descriptive text into the DOM and hides the <h1> element.
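The demo boils down to something like this simplified, hypothetical reconstruction (not the exact demo code):

<h1>Almost nothing</h1>
<div id="content" hidden></div>

<script>
  // After 10 seconds, hide the minimal LCP element and inject the real content
  setTimeout(() => {
    document.querySelector("h1").hidden = true;
    const content = document.querySelector("#content");
    content.hidden = false;
    content.innerHTML = "<p>The much more descriptive content the page was really about…</p>";
  }, 10000);
</script>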

Lighthouse considers the heading element to be the LCP element in this experiment. At this point, the page load timeline has finished, but the page’s main content has not loaded, even though Lighthouse thinks it is likely to have loaded within those 10 seconds. Lighthouse still awards us with a perfect score of 100 even if the heading is replaced by a single punctuation mark, such as a full stop, which is even less useful.

This test suggests that if you need to load page content via client-side JavaScript, you’ll want to avoid displaying a skeleton loader screen since that requires loading more elements on the page. And since we know the process will take some time — and that we can offload the network request from the main thread to a web worker so it won’t affect the TBT — we can use some arbitrary “splash screen” that contains a minimal viable LCP element (for better FCP scoring). This way, we’re giving Lighthouse the impression that the page is useful to users quicker than it actually is.

All we need to do is include a valid LCP element that contains something that counts as the FCP. I would never recommend loading your main page content via client-side JavaScript in 2024 (serve static HTML from a CDN instead, or build as much of the page as you can on a server), and I would definitely not recommend this “hack” if you care about a good user experience, regardless of what the Lighthouse performance score tells you. This approach also won’t earn you any favors with search engines indexing your site, as the robots are unable to discover the main content while it is absent from the DOM.

I also tried this experiment with a variety of random images representing the LCP to make the page even less useful. But given that I used small file sizes — made smaller and converted into “next-gen” image formats using a third-party image API to help with page load speed — it seemed that Lighthouse interpreted the elements as “placeholder images” or images with “low entropy”. As a result, those images were disqualified as LCP elements, which is a good thing and makes the LCP slightly less hackable.

View the demo page and use Chromium DevTools in an incognito window to see the results yourself.

This hack, however, probably won’t hold up in many other use cases. Discord, for example, uses the “splash screen” approach when you hard-refresh the app in the browser, and it receives a sad 29 performance score.

Compared to my DOM-injected demo, the LCP element was calculated as some content behind the splash screen rather than elements contained within the splash screen content itself, given there were one or more large images in the focussed text channel I tested on. One could argue that Lighthouse scores are less important for apps that are behind authentication anyway: they don’t need to be indexed by search engines.

There are likely many other situations where apps serve user-generated content and you might be unable to control the LCP element entirely, particularly regarding images.

For example, if you can control the sizes of all the images on your web pages, you might be able to take advantage of an interesting hack or “optimization” (in very large quotes) to arbitrarily game the system, as was the case of RentPath. In 2021, developers at RentPath managed to improve their Lighthouse performance score by 17 points when increasing the size of image thumbnails on a web page. They convinced Lighthouse to calculate the LCP element as one of the larger thumbnails instead of a Google Map tile on the page, which takes considerably longer to load via JavaScript.

The bottom line is that you can gain higher Lighthouse performance scores if you are aware of your LCP element and in control of it, whether that’s through a hack like RentPath’s or mine or a real-deal improvement. That being said, whilst I’ve described the splash screen approach as a hack in this post, that doesn’t mean this type of experience couldn’t offer a purposeful and joyful experience. Performance and user experience are about understanding what’s happening during page load, and it’s also about intent.

How to Hack CLS Scores

TL;DR: Defer loading content that causes layout shifts until the Lighthouse test has likely finished to make the test think it has enough data. CSS transforms do not negatively impact CLS, except if used in conjunction with new elements added to the DOM.

CLS is measured on a decimal scale; a good score is less than 0.1, and a poor score is greater than 0.25. Lighthouse calculates CLS from the largest burst of unexpected layout shifts that occur during a user’s time on the page based on a combination of the viewport size and the movement of unstable elements in the viewport between two rendered frames. Smaller one-off instances of layout shift may be inconsequential, but a bunch of layout shifts happening one after the other will negatively impact your score.

If you know your page contains annoying layout shifts on load, you can defer them until after the page load event has been completed, thus fooling Lighthouse into thinking there is no CLS. This demo page I created, for example, earns a CLS score of 0.143 even though JavaScript immediately starts adding new text elements to the page, shifting the original content up. If we pause the JavaScript that adds new nodes to the DOM for an arbitrary five seconds with a setTimeout(), Lighthouse doesn’t capture the CLS that takes place.
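The deferral itself is as blunt as it sounds; a sketch of the idea (not the exact demo code):

// Wait an arbitrary five seconds before inserting the nodes that shift layout,
// so the shifts land after the Lighthouse page-load test has likely finished
setTimeout(() => {
  const paragraph = document.createElement("p");
  paragraph.textContent = "Surprise! New content that pushes everything below it down.";
  document.body.prepend(paragraph);
}, 5000);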

This other demo page earns a performance score of 100, even though it is arguably less useful and useable than the last page given that the added elements pop in seemingly at random without any user interaction.

Whilst it is possible to defer layout shift events for a page load test, this hack definitely won’t work for field data and user experience over time (which is a more important focal point, as we discussed earlier). If we perform a “time span” test in Lighthouse on the page with deferred layout shifts, Lighthouse will correctly report a non-green CLS score of around 0.186.

If you do want to intentionally create a chaotic experience similar to the demo, you can use CSS animations and transforms to more purposefully pop the content into view on the page. In Google’s guide to CLS, they state that “content that moves gradually and naturally from one position to another can often help the user better understand what’s going on and guide them between state changes” — again, highlighting the importance of user experience in context.

On this next demo page, I’m using CSS transform to scale() the text elements from 0 to 1 and move them around the page. The transforms fail to trigger CLS because the text nodes are already in the DOM when the page loads. That said, I did observe in my testing that if the text nodes are added to the DOM programmatically after the page loads via JavaScript and then animated, Lighthouse will indeed detect CLS and score things accordingly.
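Here’s a sketch of that transform-based approach; because the text nodes are already in the DOM on load, scaling and moving them does not count as layout shift:

.pop-in {
  transform: scale(0);
  animation: pop-in 0.6s ease-out forwards;
}

@keyframes pop-in {
  to {
    /* Scale up from 0 to 1 and nudge the element around the page */
    transform: scale(1) translate(2rem, 1rem);
  }
}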

You Can’t Hack a Speed Index Score

The Speed Index score is based on the visual progress of the page as it loads. The quicker your content loads nearer the beginning of the page load timeline, the better.

It is possible to hack the Speed Index into thinking a page load timeline is slower than it really is, but there’s no real way to “fake” loading content faster than it actually does. The only way to make your Speed Index score better is to optimize your web page to load as much of the page as possible, as soon as possible. Whilst not entirely realistic in the web landscape of 2024 (mainly because it would put designers out of a job), you could go all-in to lower your Speed Index as much as possible by:

  • Delivering static HTML web pages only (no server-side rendering) straight from a CDN,
  • Avoiding images on the page,
  • Minimizing or eliminating CSS, and
  • Preventing JavaScript or any external dependencies from loading.

You Also Can’t (Really) Hack A TBT Score

TBT measures the total time after the FCP where the main thread was blocked by JavaScript tasks for long enough to prevent responses to user input. A good TBT score is anything lower than 200ms.

JavaScript-heavy web applications (such as single-page applications) that perform complex state calculations and DOM manipulation on the client on page load (rather than on the server before sending rendered HTML) are prone to suffering poor TBT scores. In this case, you could probably hack your TBT score by deferring all JavaScript until after the Lighthouse test has finished. That said, you’d need to provide some kind of placeholder content or loading screen to satisfy the FCP and LCP and to inform users that something will happen at some point. Plus, you’d have to go to extra lengths to hack around the front-end framework you’re using. (You don’t want to load a placeholder page that, at some point in the page load timeline, loads a separate React app after an arbitrary amount of time!)
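In its crudest form, the deferral might look something like this sketch (the bundle path is hypothetical, and again, this is exactly the kind of thing you shouldn’t ship):

window.addEventListener("load", () => {
  // Hold back the real application bundle for an arbitrary few seconds so its
  // long tasks fall outside the window Lighthouse measures
  setTimeout(() => {
    const script = document.createElement("script");
    script.src = "/app.bundle.js"; // hypothetical path
    document.body.appendChild(script);
  }, 5000);
});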

What’s interesting is that while we’re still doing all sorts of fancy things with JavaScript in the client, advances in the modern web ecosystem are helping us all reduce the probability of a less-than-stellar TBT score. Many front-end frameworks, in partnership with modern hosting providers, are capable of rendering pages and processing complex logic on demand without any client-side JavaScript. While eliminating JavaScript on the client is not the goal, we certainly have a lot of options to use a lot less of it, thus minimizing the risk of doing too much computation on the main thread on page load.

Bottom Line: Lighthouse Is Still Just A Rough Guide

Google Lighthouse can’t detect everything that’s wrong with a particular website. Whilst Lighthouse performance scores prioritize page usability in terms of responding to user input, it still can’t detect every terrible usability or accessibility issue in 2024.

In 2019, Manuel Matuzović published an experiment where he intentionally created a terrible page that Lighthouse thought was pretty great. I hypothesized that five years later, Lighthouse might do better; but it doesn’t.

On this final demo page I put together, input events are disabled by CSS and JavaScript, making the page technically unresponsive to user input. After five seconds, JavaScript flips a switch and allows you to click the button. The page still scores 100 for both performance and accessibility.
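The mechanics are roughly along these lines (a hypothetical reconstruction of the demo, not its exact code):

<button class="cta is-blocked">Do something</button>

<style>
  /* Block pointer input with CSS... */
  .is-blocked {
    pointer-events: none;
  }
</style>

<script>
  // ...block keyboard activation with JavaScript, then flip the switch after five seconds
  const blockKeys = (event) => event.preventDefault();
  document.addEventListener("keydown", blockKeys, true);

  setTimeout(() => {
    document.querySelector(".cta").classList.remove("is-blocked");
    document.removeEventListener("keydown", blockKeys, true);
  }, 5000);
</script>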

You really can’t rely on Lighthouse as a substitute for usability testing and common sense.

Some More Silly Hacks

As with everything in life, there’s always a way to game the system. Here are some more tried and tested guaranteed hacks to make sure your Lighthouse performance score artificially knocks everyone else’s out of the park:

  • Only run Lighthouse tests using the fastest and highest-spec hardware.
  • Make sure your internet connection is the fastest it can be; relocate if you need to.
  • Never use field data, only lab data, collected using the aforementioned fastest and highest-spec hardware and super-speed internet connection.
  • Rerun the tests in the lab using different conditions and all the special code hacks I described in this post until you get the result(s) you want to impress your friends, colleagues, and random people on the internet.

Note: The best way to learn about web performance and how to optimize your websites is to do the complete opposite of everything we’ve covered in this article all of the time. And finally, to seriously level up your performance skills, use an application monitoring tool like Sentry. Think of Lighthouse as the canary and Sentry as the real-deal production-data-capturing, lean, mean, web vitals machine.

And finally-finally, here’s the link to the full demo site for educational purposes.

Scaling Success: Key Insights And Practical Takeaways

Building successful web products at scale is a multifaceted challenge that demands a combination of technical expertise, strategic decision-making, and a growth-oriented mindset. In Success at Scale, I dive into case studies from some of the web’s most renowned products, uncovering the strategies and philosophies that propelled them to the forefront of their industries.

Here you will find some of the insights I’ve gleaned from these success stories, part of an ongoing effort to build a roadmap for teams striving to achieve scalable success in the ever-evolving digital landscape.

Cultivating A Mindset For Scaling Success

The foundation of scaling success lies in fostering the right mindset within your team. The case studies in Success at Scale highlight several critical mindsets that permeate the culture of successful organizations.

User-Centricity

Successful teams prioritize the user experience above all else.

They invest in understanding their users’ needs, behaviors, and pain points and relentlessly strive to deliver value. Instagram’s performance optimization journey exemplifies this mindset, focusing on improving perceived speed and reducing user frustration, leading to significant gains in engagement and retention.

By placing the user at the center of every decision, Instagram was able to identify and prioritize the most impactful optimizations, such as preloading critical resources and leveraging adaptive loading strategies. This user-centric approach allowed them to deliver a seamless and delightful experience to their vast user base, even as their platform grew in complexity.

Data-Driven Decision Making

Scaling success relies on data, not assumptions.

Teams must embrace a data-driven approach, leveraging metrics and analytics to guide their decisions and measure impact. Shopify’s UI performance improvements showcase the power of data-driven optimization, using detailed profiling and user data to prioritize efforts and drive meaningful results.

By analyzing user interactions, identifying performance bottlenecks, and continuously monitoring key metrics, Shopify was able to make informed decisions that directly improved the user experience. This data-driven mindset allowed them to allocate resources effectively, focusing on the areas that yielded the greatest impact on performance and user satisfaction.

Continuous Improvement

Scaling is an ongoing process, not a one-time achievement.

Successful teams foster a culture of continuous improvement, constantly seeking opportunities to optimize and refine their products. Smashing Magazine’s case study on enhancing Core Web Vitals demonstrates the impact of iterative enhancements, leading to significant performance gains and improved user satisfaction.

By regularly assessing their performance metrics, identifying areas for improvement, and implementing incremental optimizations, Smashing Magazine was able to continuously elevate the user experience. This mindset of continuous improvement ensures that the product remains fast, reliable, and responsive to user needs, even as it scales in complexity and user base.

Collaboration And Inclusivity

Silos hinder scalability.

High-performing teams promote collaboration and inclusivity, ensuring that diverse perspectives are valued and leveraged. The Understood’s accessibility journey highlights the power of cross-functional collaboration, with designers, developers, and accessibility experts working together to create inclusive experiences for all users.

By fostering open communication, knowledge sharing, and a shared commitment to accessibility, The Understood was able to embed inclusive design practices throughout its development process. This collaborative and inclusive approach not only resulted in a more accessible product but also cultivated a culture of empathy and user-centricity that permeated all aspects of their work.

Making Strategic Decisions for Scalability

Beyond cultivating the right mindset, scaling success requires making strategic decisions that lay the foundation for sustainable growth.

Technology Choices

Selecting the right technologies and frameworks can significantly impact scalability. Factors like performance, maintainability, and developer experience should be carefully considered. Notion’s migration to Next.js exemplifies the importance of choosing a technology stack that aligns with long-term scalability goals.

By adopting Next.js, Notion was able to leverage its performance optimizations, such as server-side rendering and efficient code splitting, to deliver fast and responsive pages. Additionally, the developer-friendly ecosystem of Next.js and its strong community support enabled Notion’s team to focus on building features and optimizing the user experience rather than grappling with low-level infrastructure concerns. This strategic technology choice laid the foundation for Notion’s scalable and maintainable architecture.

Ship Only The Code A User Needs, When They Need It

This best practice is so important when we want to ensure that pages load fast without over-eagerly delivering JavaScript a user may not need at that time. For example, Instagram made a concerted effort to improve the web performance of instagram.com, resulting in a nearly 50% cumulative improvement in feed page load time. A key area of focus has been shipping less JavaScript code to users, particularly on the critical rendering path.

The Instagram team found that the uncompressed size of JavaScript is more important for performance than the compressed size, as larger uncompressed bundles take more time to parse and execute on the client, especially on mobile devices. Two optimizations they implemented to reduce JS parse/execute time were inline requires (only executing code when it’s first used vs. eagerly on initial load) and serving ES2017+ code to modern browsers to avoid transpilation overhead. Inline requires improved Time-to-Interactive metrics by 12%, and the ES2017+ bundle was 5.7% smaller and 3% faster than the transpiled version.
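One long-standing pattern for the ES2017+ part of that approach is differential serving with module/nomodule scripts. This is a generic sketch, not Instagram’s actual setup, and the file paths are placeholders:

<!-- Modern browsers that understand ES modules get the untranspiled ES2017+ bundle -->
<script type="module" src="/static/app.modern.js"></script>
<!-- Older browsers ignore type="module" and fall back to the transpiled bundle -->
<script nomodule src="/static/app.legacy.js"></script>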

While good progress has been made, the Instagram team acknowledges there are still many opportunities for further optimization. Potential areas to explore could include the following:

  • Improved code-splitting, moving more logic off the critical path,
  • Optimizing scrolling performance,
  • Adapting to varying network conditions,
  • Modularizing their Redux state management.

Continued efforts will be needed to keep instagram.com performing well as new features are added and the product grows in complexity.

Accessibility Integration

Accessibility should be an integral part of the product development process, not an afterthought.

Wix’s comprehensive approach to accessibility, encompassing keyboard navigation, screen reader support, and infrastructure for future development, showcases the importance of building inclusivity into the product’s core.

By considering accessibility requirements from the initial design stages and involving accessibility experts throughout the development process, Wix was able to create a platform that empowered its users to build accessible websites. This holistic approach to accessibility not only benefited end-users but also positioned Wix as a leader in inclusive web design, attracting a wider user base and fostering a culture of empathy and inclusivity within the organization.

Developer Experience Investment

Investing in a positive developer experience is essential for attracting and retaining talent, fostering productivity, and accelerating development.

Apideck’s case study in the book highlights the impact of a great developer experience on community building and product velocity.

By providing well-documented APIs, intuitive SDKs, and comprehensive developer resources, Apideck was able to cultivate a thriving developer community. This investment in developer experience not only made it easier for developers to integrate with Apideck’s platform but also fostered a sense of collaboration and knowledge sharing within the community. As a result, Apideck was able to accelerate product development, leverage community contributions, and continuously improve its offering based on developer feedback.

Leveraging Performance Optimization Techniques

Achieving optimal performance is a critical aspect of scaling success. The case studies in Success at Scale showcase various performance optimization techniques that have proven effective.

Progressive Enhancement and Graceful Degradation

Building resilient web experiences that perform well across a range of devices and network conditions requires a progressive enhancement approach. Pinafore’s case study in Success at Scale highlights the benefits of ensuring core functionality remains accessible even in low-bandwidth or JavaScript-constrained environments.

By leveraging server-side rendering and delivering a usable experience even when JavaScript fails to load, Pinafore demonstrates the importance of progressive enhancement. This approach not only improves performance and resilience but also ensures that the application remains accessible to a wider range of users, including those with older devices or limited connectivity. By gracefully degrading functionality in constrained environments, Pinafore provides a reliable and inclusive experience for all users.

Adaptive Loading Strategies

The book’s case study on Tinder highlights the power of sophisticated adaptive loading strategies. By dynamically adjusting the content and resources delivered based on the user’s device capabilities and network conditions, Tinder ensures a seamless experience across a wide range of devices and connectivity scenarios. Tinder’s adaptive loading approach involves techniques like dynamic code splitting, conditional resource loading, and real-time network quality detection. This allows the application to optimize the delivery of critical resources, prioritize essential content, and minimize the impact of poor network conditions on the user experience.

By adapting to the user’s context, Tinder delivers a fast and responsive experience, even in challenging environments.
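A generic sketch of the kind of check adaptive loading can build on (not Tinder’s code; navigator.connection is not available in every browser, and the module paths are placeholders):

const connection = navigator.connection;
const saveData = connection?.saveData ?? false;
const slowNetwork = ["slow-2g", "2g"].includes(connection?.effectiveType);

// Serve a lighter experience on slow connections or when the user opts into data saving
if (saveData || slowNetwork) {
  import("./feed-lite.js");
} else {
  import("./feed-full.js");
}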

Efficient Resource Management

Effective management of resources, such as images and third-party scripts, can significantly impact performance. eBay’s journey showcases the importance of optimizing image delivery, leveraging techniques like lazy loading and responsive images to reduce page weight and improve load times.

By implementing lazy loading, eBay ensures that images are only loaded when they are likely to be viewed by the user, reducing initial page load time and conserving bandwidth. Additionally, by serving appropriately sized images based on the user’s device and screen size, eBay minimizes the transfer of unnecessary data and improves the overall loading performance. These resource management optimizations, combined with other techniques like caching and CDN utilization, enable eBay to deliver a fast and efficient experience to its global user base.
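In markup terms, the combination of native lazy loading and responsive image candidates looks something like this generic sketch (not eBay’s actual markup):

<img
  src="/images/listing-640.jpg"
  srcset="/images/listing-320.jpg 320w,
          /images/listing-640.jpg 640w,
          /images/listing-1280.jpg 1280w"
  sizes="(max-width: 600px) 100vw, 33vw"
  loading="lazy"
  alt="Listing photo"
/>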

Continuous Performance Monitoring

Regularly monitoring and analyzing performance metrics is crucial for identifying bottlenecks and opportunities for optimization. The case study on Yahoo! Japan News demonstrates the impact of continuous performance monitoring, using tools like Lighthouse and real user monitoring to identify and address performance issues proactively.

By establishing a performance monitoring infrastructure, Yahoo! Japan News gains visibility into the real-world performance experienced by its users. This data-driven approach allows the team to identify performance regressions, pinpoint specific areas for improvement, and measure the impact of their optimizations. Continuous monitoring also enables Yahoo! Japan News to set performance baselines, track progress over time, and ensure that performance remains a top priority as the application evolves.

Embracing Accessibility and Inclusive Design

Creating inclusive web experiences that cater to diverse user needs is not only an ethical imperative but also a critical factor in scaling success. The case studies in Success at Scale emphasize the importance of accessibility and inclusive design.

Comprehensive Accessibility Testing

Ensuring accessibility requires a combination of automated testing tools and manual evaluation. LinkedIn’s approach to automated accessibility testing demonstrates the value of integrating accessibility checks into the development workflow, catching potential issues early, and reducing the reliance on manual testing alone.

By leveraging tools like Deque’s axe and integrating accessibility tests into their continuous integration pipeline, LinkedIn can identify and address accessibility issues before they reach production. This proactive approach to accessibility testing not only improves the overall accessibility of the platform but also reduces the cost and effort associated with retroactive fixes. However, LinkedIn also recognizes the importance of manual testing and user feedback in uncovering complex accessibility issues that automated tools may miss. By combining automated checks with manual evaluation, LinkedIn ensures a comprehensive approach to accessibility testing.
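
LinkedIn's actual pipeline isn't shown here, but as a rough sketch of what an automated check with axe-core can look like (the CI wiring and failure handling are assumptions):

// Minimal sketch of an automated accessibility check with axe-core.
// Assumes axe-core is installed and bundled; CI wiring is not shown.
import axe from "axe-core";

axe.run(document).then((results) => {
  if (results.violations.length > 0) {
    results.violations.forEach((violation) => {
      console.error(`${violation.id}: ${violation.description}`);
      // Each node lists the selectors of the offending elements.
      violation.nodes.forEach((node) => console.error(node.target));
    });
    // In a CI pipeline, this is where you would fail the build.
  }
});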

Inclusive Design Practices

Designing with accessibility in mind from the outset leads to more inclusive and usable products. Success at Scale's case study on Intercom about creating an accessible messenger highlights the importance of considering diverse user needs, such as keyboard navigation and screen reader compatibility, throughout the design process.

By embracing inclusive design principles, Intercom ensures that their messenger is usable by a wide range of users, including those with visual, motor, or cognitive impairments. This involves considering factors such as color contrast, font legibility, focus management, and clear labeling of interactive elements. By designing with empathy and understanding the diverse needs of their users, Intercom creates a messenger experience that is intuitive, accessible, and inclusive. This approach not only benefits users with disabilities but also leads to a more user-friendly and resilient product overall.

User Research And Feedback

Engaging with users with disabilities and incorporating their feedback is essential for creating truly inclusive experiences. The Understood’s journey emphasizes the value of user research and collaboration with accessibility experts to identify and address accessibility barriers effectively.

By conducting usability studies with users who have diverse abilities and working closely with accessibility consultants, The Understood gains invaluable insights into the real-world challenges faced by their users. This user-centered approach allows them to identify pain points, gather feedback on proposed solutions, and iteratively improve the accessibility of their platform.

By involving users with disabilities throughout the design and development process, The Understood ensures that their products not only meet accessibility standards but also provide a meaningful and inclusive experience for all users.

Accessibility As A Shared Responsibility

Promoting accessibility as a shared responsibility across the organization fosters a culture of inclusivity. Shopify’s case study underscores the importance of educating and empowering teams to prioritize accessibility, recognizing it as a fundamental aspect of the user experience rather than a mere technical checkbox.

By providing accessibility training, guidelines, and resources to designers, developers, and content creators, Shopify ensures that accessibility is considered at every stage of the product development lifecycle. This shared responsibility approach helps to build accessibility into the core of Shopify’s products and fosters a culture of inclusivity and empathy. By making accessibility everyone’s responsibility, Shopify not only improves the usability of their platform but also sets an example for the wider industry on the importance of inclusive design.

Fostering A Culture of Collaboration And Knowledge Sharing

Scaling success requires a culture that promotes collaboration, knowledge sharing, and continuous learning. The case studies in Success at Scale highlight the impact of effective collaboration and knowledge management practices.

Cross-Functional Collaboration

Breaking down silos and fostering cross-functional collaboration accelerates problem-solving and innovation. Airbnb’s design system journey showcases the power of collaboration between design and engineering teams, leading to a cohesive and scalable design language across web and mobile platforms.

By establishing a shared language and a set of reusable components, Airbnb’s design system enables designers and developers to work together more efficiently. Regular collaboration sessions, such as design critiques and code reviews, help to align both teams and ensure that the design system evolves in a way that meets the needs of all stakeholders. This cross-functional approach not only improves the consistency and quality of the user experience but also accelerates the development process by reducing duplication of effort and promoting code reuse.

Knowledge Sharing And Documentation

Capturing and sharing knowledge across the organization is crucial for maintaining consistency and enabling the efficient onboarding of new team members. Stripe’s investment in internal frameworks and documentation exemplifies the value of creating a shared understanding and facilitating knowledge transfer.

By maintaining comprehensive documentation, code examples, and best practices, Stripe ensures that developers can quickly grasp the intricacies of their internal tools and frameworks. This documentation-driven culture not only reduces the learning curve for new hires but also promotes consistency and adherence to established patterns and practices. Regular knowledge-sharing sessions, such as tech talks and lunch-and-learns, further reinforce this culture of learning and collaboration, enabling team members to learn from each other’s experiences and stay up-to-date with the latest developments.

Communities Of Practice

Establishing communities of practice around specific domains, such as accessibility or performance, promotes knowledge sharing and continuous improvement. Shopify’s accessibility guild demonstrates the impact of creating a dedicated space for experts and advocates to collaborate, share best practices, and drive accessibility initiatives forward.

By bringing together individuals passionate about accessibility from across the organization, Shopify’s accessibility guild fosters a sense of community and collective ownership. Regular meetings, workshops, and hackathons provide opportunities for members to share their knowledge, discuss challenges, and collaborate on solutions. This community-driven approach not only accelerates the adoption of accessibility best practices but also helps to build a culture of inclusivity and empathy throughout the organization.

Leveraging Open Source And External Expertise

Collaborating with the wider developer community and leveraging open-source solutions can accelerate development and provide valuable insights. Pinafore’s journey highlights the benefits of engaging with accessibility experts and incorporating their feedback to create a more inclusive and accessible web experience.

By actively seeking input from the accessibility community and leveraging open-source accessibility tools and libraries, Pinafore was able to identify and address accessibility issues more effectively. This collaborative approach not only improved the accessibility of the application but also contributed back to the wider community by sharing their learnings and experiences. By embracing open-source collaboration and learning from external experts, teams can accelerate their own accessibility efforts and contribute to the collective knowledge of the industry.

The Path To Sustainable Success

Achieving scalable success in the web development landscape requires a multifaceted approach that encompasses the right mindset, strategic decision-making, and continuous learning. The Success at Scale book provides a comprehensive exploration of these elements, offering deep insights and practical guidance for teams at all stages of their scaling journey.

By cultivating a user-centric, data-driven, and inclusive mindset, teams can prioritize the needs of their users and make informed decisions that drive meaningful results. Adopting a culture of continuous improvement and collaboration ensures that teams are always striving to optimize and refine their products, leveraging the collective knowledge and expertise of their members.

Making strategic technology choices, such as selecting performance-oriented frameworks and investing in developer experience, lays the foundation for scalable and maintainable architectures. Implementing performance optimization techniques, such as adaptive loading, efficient resource management, and continuous monitoring, helps teams deliver fast and responsive experiences to their users.

Embracing accessibility and inclusive design practices not only ensures that products are usable by a wide range of users but also fosters a culture of empathy and user-centricity. By incorporating accessibility testing, inclusive design principles, and user feedback into the development process, teams can create products that are both technically sound and meaningfully inclusive.

Fostering a culture of collaboration, knowledge sharing, and continuous learning is essential for scaling success. By breaking down silos, promoting cross-functional collaboration, and investing in documentation and communities of practice, teams can accelerate problem-solving, drive innovation, and build a shared understanding of their products and practices.

The case studies featured in Success at Scale serve as powerful examples of how these principles and strategies can be applied in real-world contexts. By learning from the successes and challenges of industry leaders, teams can gain valuable insights and inspiration for their own scaling journeys.

As you embark on your path to scaling success, remember that it is an ongoing process of iteration, learning, and adaptation. Embrace the mindsets and strategies outlined in this article, dive deeper into the learnings from the Success at Scale book, and continually refine your approach based on the unique needs of your users and the evolving landscape of web development.

Conclusion

Scaling successful web products requires a holistic approach that combines technical excellence, strategic decision-making, and a growth-oriented mindset. By learning from the experiences of industry leaders, as showcased in the Success at Scale book, teams can gain valuable insights and practical guidance on their journey towards sustainable success.

Cultivating a user-centric, data-driven, and inclusive mindset lays the foundation for scalability. By prioritizing the needs of users, making informed decisions based on data, and fostering a culture of continuous improvement and collaboration, teams can create products that deliver meaningful value and drive long-term growth.

Making strategic decisions around technology choices, performance optimization, accessibility integration, and developer experience investment sets the stage for scalable and maintainable architectures. By leveraging proven optimization techniques, embracing inclusive design practices, and investing in the tools and processes that empower developers, teams can build products that are fast and resilient.

Through ongoing collaboration, knowledge sharing, and a commitment to learning, teams can navigate the complexities of scaling success and create products that make a lasting impact in the digital landscape.

We’re Trying Out Something New

In an effort to conserve resources here at Smashing, we’re trying something new with Success at Scale. The printed book is 304 pages, and we make an expanded PDF version available to everyone who purchases a print book. This accomplishes a few good things:

  • We will use less paper and materials because we are making a smaller printed book;
  • We’ll use fewer resources in general to print, ship, and store the books, leading to a smaller carbon footprint; and
  • Keeping the book at a more manageable size means we can continue to offer free shipping on all Smashing orders!

Smashing Books have always been printed with materials from FSC Certified forests. We are committed to finding new ways to conserve resources while still bringing you the best possible reading experience.

Community Matters ❤️

Producing a book takes quite a bit of time, and we couldn’t pull it off without the support of our wonderful community. A huge shout-out to Smashing Members for the kind, ongoing support. The eBook is and always will be free for Smashing Members. Plus, Members get a friendly discount when purchasing their printed copy. Just sayin’! ;-)

More Smashing Books & Goodies

Promoting best practices and providing you with practical tips to master your daily coding and design challenges has always been (and will be) at the core of everything we do at Smashing.

In the past few years, we were very lucky to have worked together with some talented, caring people from the web community to publish their wealth of experience as printed books that stand the test of time. Heather and Steven are two of these people. Have you checked out their books already?

Understanding Privacy

Everything you need to know to put your users first and make a better web.

Get Print + eBook

Touch Design for Mobile Interfaces

Learn how touchscreen devices really work — and how people really use them.

Get Print + eBook

Interface Design Checklists

100 practical cards for common interface design challenges.

Get Print + eBook

Building A User Segmentation Matrix To Foster Cross-Org Alignment

Do you recognize this situation? The marketing and business teams talk about their customers, and each team thinks they have the same understanding of the problem and what needs to be done. Then, they’re including the Product and UX team in the conversation around how to best serve a particular customer group and where to invest in development and marketing efforts. They’ve done their initial ideation and are trying to prioritize, but this turns into a long discussion with the different teams favoring different areas to focus on. Suddenly, an executive highlights that instead of this customer segment, there should be a much higher focus on an entirely different segment — and the whole discussion starts again.

This situation often arises when there is no joint-up understanding of the different customer segments a company is serving historically and strategically. And there is no shared understanding beyond using the same high-level terms. To reach this understanding, you need to dig deeper into segment definitions, goals, pain points, and jobs-to-be-done (JTBD) so as to enable the organization to make evidence-based decisions instead of having to rely on top-down prioritization.

The hardest part about doing the right thing for your user or customers (please note I’m aware these terms aren’t technically the same, but I’m using them interchangeably in this article so as to be useful to a wider audience) is often found inside your own company: getting different teams with diverging goals and priorities to agree on where to focus and why.

But how do you get there — thinking user-first AND ensuring teams are aligned and have a shared mental model of primary and secondary customer segments?

Personas vs Segments

To explore that further, let’s take a brief look at the most commonly applied techniques to better understand customers and communicate this knowledge within organizations.

Two frequently employed tools are user personas and user segmentation.

Product/UX (or non-demographic) personas aim to represent the characteristics and needs of a certain type of customer, as well as their motivations and experience. The aim is to illustrate an ideal customer and allow teams to empathize and solve different use cases. Marketing (or demographic) personas, on the other hand, traditionally focus on age, socio-demographics, education, and geography but usually don’t include needs, motivations, or other contexts. So they’re good for targeting but not great for identifying new potential solutions or helping teams prioritize.

In contrast to personas, user segments illustrate groups of customers with shared needs, characteristics, and actions. They are relatively high-level classifications, deliberately looking at a whole group of needs without telling a detailed story. The aim is to gain a broader overview of the wider market’s wants and needs.

Tony Ulwick, creator of the “jobs-to-be-done” framework, for example, creates outcome-based segmentations, which are quite similar to what this article is proposing. Other types of segmentations include geographic, psychographic, demographic, or needs-based segmentations. What all segmentations, including the user segmentation matrix, have in common is that the segments are different from each other but don’t need to be mutually exclusive.

As Simon Penny points out, personas and segments are tools for different purposes. While customer segments help us understand a marketplace or customer base, personas help us to understand more about the lived experience of a particular group of customers within that marketplace.

Both personas and segmentations have their applications, but this article argues that using a matrix will help you prioritize between the different segments. In addition, the key aspect here is the co-creation process that fosters understanding across departments and allows for more transparent decision-making. Instead of focusing only on the outcome, the process of getting there is what matters for alignment and collaboration across teams. Let’s dig deeper into how to achieve that.

User Segmentation Matrix: 101

At its core, the user segmentation matrix is meant to create a shared mental model across teams and departments of an organization to enable better decision-making and collaboration.

And it does that by visualizing the relevance and differences between a company’s customer segments. Crucially, input into the matrix comes from across teams as the process of co-creation plays an essential part in getting to a shared understanding of the different segments and their relevance to the overall business challenge.

Additionally, this kind of matrix follows the principle of “just enough, not too much” to create meaning without going too deep into details or leading to confusion. It is about pulling together key elements from existing tools and methods, such as User Journeys or Jobs-to-be-done, and visualizing them in one place.

For a high-level first overview, see the matrix scaffolding below.

Case Study: Getting To A Shared Mental Model Across Teams

Let’s look at the problem through a case study and see how building a user segmentation matrix helped a global data products organization gain a much clearer view of its customers and priorities.

Here is some context. The organization was partly driven by NGO principles like societal impact and partly by economic concerns like revenue and efficiencies. Its primary source of revenue was raw data and data products, and it was operating in a B2B setting. Despite operating for several decades already, its maturity level in terms of user experience and product knowledge was low, while the number of different data outputs and services was high, with a whole bouquet of bespoke solutions for individual clients. The number of bespoke solutions that had to be maintained, having grown organically over time, had surpassed the “featuritis” stage and turned utterly unsustainable.

And you probably guessed it: The business focus had traditionally been “What can we offer and sell?” instead of “What are our customers trying to solve?”

That means there were essentially two problems to figure out:

  1. Help executives and department leaders from Marketing through Sales, Business, and Data Science see the value of customer-first product thinking.
  2. Establish a shared mental model of the key customer segments to start prioritizing with focus and reduce the completely overgrown service offering.

For full disclosure, here’s a bit about my role in this context: I was there in a fractional product leader role at first, after running a discovery workshop, which then developed into product strategy work and eventually a full evaluation of the product portfolio according to user & business value.

Approach

So how did we get to that outcome? Basically, we spent an afternoon filling out a table with different customer segments, presented it to a couple of stakeholders, and everyone was happy — THE END. You can stop reading…

Or not, because from just a few initial conversations and trying to find out if there were any existing personas, user insights, or other customer data, it became clear that there was no shared mental model of the organization’s customer segments.

At the same time, the Business and Account management teams, especially, had a lot of contact with new and existing customers and knew the market and competition well. And the Marketing department had started on personas. However, they were not widely used and weren’t able to act as that shared mental model across different departments.

So, instead of thinking customer-first, the organization was operating “inside-out first,” based on the services it offered. With the user segmentation matrix, we wanted to change this perspective and align all teams around one shared canvas to create transparency around user and business priorities.

But How To Proceed Quickly While Taking People Along On The Journey?

Here’s the approach we took:

1. Gather All Existing Research

First, we gathered all user insights, customer feedback, and data from different parts of the organization and mapped them out on a big board (see below). Initially, we really tried to map out all existing documentation, including links to in-house documents and all previous attempts at separating different user groups, analytics data, revenue figures, and so on.

The key here was to speak to people in different departments to understand how they were currently thinking about their customers and to include the terms and documentation they thought most relevant without giving them a predefined framework. We used the dimensions of the matrix as a conversation guide, e.g., asking about their definitions for key user groups and what makes them distinctly different from others.

2. Start The Draft Scaffolding

Secondly, we created the draft matrix with assumed segments and some core elements that have proven useful in different UX techniques.

In this step, we started to make sense of all the information we had collected and gave the segments “draft labels” and “draft definitions” based on input from the teams, but creating this first draft version within the small working group. The aim was to reduce complexity, settle on simple labels, and introduce primary vs secondary groups based on the input we received.

We then made sure to run this summarized draft version past the stakeholders for feedback and amends, always calling out the DRAFT status to ensure we had buy-in across teams before removing that label. In addition to interviews, we also provided direct access to the workboard for stakeholders to contribute asynchronously and in their own time and to give them the option to discuss with their own teams.

3. Refine

In the next step, we went through several rounds of “joint sense-making” with stakeholders from across different departments. At this stage, we started coloring in the scaffolding version of the matrix with more and more detail. We also asked stakeholders to review the matrix as a whole and comment on it to make sure the different business areas were on board and to see the different priorities between, e.g., primary and secondary user groups due to segment size, pain points, or revenue numbers.

4. Prompt

We then prompted specifically for insights around segment definitions, pain points, goals, jobs to be done, and defining differences to other segments. Once the different labels and the sorting into primary versus secondary groups were clear, we tried to make sure that we had similar types of information per segment so that it would be easy to compare different aspects across the matrix.

5. Communicate

Finally, we made sure the core structure reached different levels of leadership. While we made sure to include senior stakeholders in the process throughout, this step was essential prior to circulating the matrix widely across the organization.

However, due to the previous steps we had gone through, at this point we were able to assure senior leadership that their teams had contributed and reviewed the matrix several times, so getting that final alignment was easy.

We did this in a team of two external consultants and three in-house colleagues, who conducted the interviews and information-gathering exercises in tandem with us. Due to the size and global nature of the organization and the various time zones to manage, it took around three weeks of effort but three months of elapsed time, owing to summer holidays and alignment activities. We did this alongside other work, which allowed us to be deeply plugged into the organization and avoid blind spots by having both internal and external perspectives.

Building on in-house advocates with deep organizational knowledge and subject-matter expertise was a key factor and helped bring the organization along much better than purely external consultants could have done.

User Segmentation Matrix: Key Ingredients

So, what are the dimensions we included in this mapping out of primary and secondary user segments?

The dimensions we used were the following:

  1. Segment definition
    Who is this group?
    Define it in a simple, straightforward way so everyone understands — NO acronyms or abbreviations. Useful extra information to include, if you have it: the size of the segment and associated revenue.
  2. Their main goals
    What are their main goals?
    Thinking outside-in and from this user group’s perspective, these goals sit at a higher level than the specific JTBD field: big picture and longer term.
  3. What are their “Jobs-to-be-done”?
    Define the key things this group needs in order to get their own work done (whether that’s currently available in your service or not; if you don’t know this, it’s time for some discovery). Please note this is not a full JTBD mapping, but instead seeks to call out exemplary practical tasks.
  4. How are they different from other segments?
    Segments should be clearly different in their needs. If they’re too similar, they might not be a separate group.
  5. Main pain points
    What are the pain points for each segment? What issues are they currently experiencing with your service/product? Note the recurring themes.
  6. Key contacts in the organization
    Who are the best people holding knowledge about this user segment?
    Usually, these would be the interview partners who contributed to the matrix, and it helps to not worry too much about ownership or levels here; it could be from any department, and often, the Business or Product org are good starting points.

This is an example of a user segmentation matrix:

Outcomes & Learning

What we found in this work is that seeing all user segments mapped out next to each other helped focus the conversation and create a shared mental model that switched the organization’s perspective to outside-in and customer-first.

Establishing the different user segment names and defining primary versus secondary segments created transparency, focus, and a shared understanding of priorities.

Building this matrix based on stakeholder interviews and existing user insights while keeping the labeling in DRAFT mode, we encouraged feedback and amends and helped everyone feel part of the process. So, rather than being a one-time set visualization, the key to creating value with this matrix is to encourage conversation and feedback loops between teams and departments.

In our case, we made sure that every stakeholder (at different levels within the organization, including several people from the executive team) had seen this matrix at least twice and had the chance to input. Once we then got to the final version, we were sure that we had an agreement on the terminology, issues, and priorities.

Below is the real case study example (with anonymized inputs):

Takeaways And What To Watch Out For

So what did this approach help us achieve?

  1. It created transparency and helped the Sales and Business teams understand how their asks would roughly be prioritized — seeing the other customer segments in comparison (especially knowing the difference between primary vs secondary segments).
  2. It shifted the thinking to customer-first by providing an overview for the executive team (and everyone else) to start thinking about customers rather than business units and see new opportunities more clearly.
  3. It highlighted the need to gather more customer insights and better performance data, such as revenue per segment, more detailed user tracking, and so on.

In terms of the challenges we faced when conducting and planning this work, there are a few things to watch out for:

We found that due to the size and global nature of the organization, it took several rounds of feedback to align with all stakeholders on the draft versions. So, the larger your organization, the more buffer time you should include (or the more you’ll need the flexibility to change interview partners at short notice).

If you’re planning to do this in a startup or mid-sized organization, especially if they’ve got the relevant information available, you might need far less time, although it will still make sense to carefully select the contributors.

Having in-house advocates who actively contributed to the work and conducted interviews was a real benefit for alignment and getting buy-in across the organization, especially when things started getting political.

Gathering information from Marketing, Product, Business, Sales, and Leadership and initially sticking with their terms and definitions was crucial so that everyone felt their inputs were heard and saw them reflected, even if amended, in the overall matrix.

And finally, a challenge that’s not to be underestimated is the selection of those asked to input — where it’s a tightrope walk between speed and inclusion.

We found that a “snowball system” worked well, where we initially worked with the C-level sponsor to define the crucial counterparts at the leadership level and have them name 3-4 leads in their organization, looking after different parts of the organization. These leaders were asked for their input and their team’s input in interviews and through asynchronous access to the joint workboard.

What’s In It For You?

To summarize, the key benefits of creating a user segmentation matrix in your organization are the following:

  • Thinking outside-in and user-first.
    Instead of thinking this is what you offer, your organization starts to think about solving real customer problems — the matrix is your GPS view of your market (but like any GPS system, don’t forget to update it occasionally).
  • Clarity and a shared mental model.
    Everyone is starting to use the same language, and there’s more clarity about what you offer per customer segment. So, from Sales through to Business and Product, you’re speaking to users and their needs instead of talking about products and services (or even worse, your in-house org structure). Shared clarity drastically reduces meeting and decision time and allows you to do more impactful work.
  • Focus, and more show than tell.
    Having a matrix helps differentiate between primary, secondary, and other customer segments and visualizes these differences for everyone.

When Not To Use It

If you already have a clearly defined set of customer segments that your organization is in agreement on and working towards — good for you; you won’t need this and can rely on your existing data.

Another case where you will likely not need this full overview is when you’re dealing with a very specific customer segment, and there is good alignment between the teams serving this group in terms of focus, priorities, and goals.

Organizations that will see the highest value in this exercise are those who are not yet thinking outside-in and customer-first and who still have a traditional approach, starting from their own services and dealing with conflicting priorities between departments.

Next Steps

And now? You’ve got your beautiful and fully aligned customer segmentation matrix ready and done. What’s next? In all honesty, this work is never done, and this is just the beginning.

If you have been struggling with creating an outside-in perspective in your organization, the key is to make sure that it gets communicated far and wide.

For example, make sure to get your executive sponsors to talk about it in their rounds, do a road show, or hold open office hours where you can present it to anyone interested and give them a chance to ask questions. Or even better, present it at the next company all-hands, with the suggestion to start building up an insights library per customer segment.

If this was really just the starting point to becoming more product-led, then the next logical step is to assess and evaluate the current product portfolio. The aim is to get clarity around which services or products are relevant for which customers. Especially in product portfolios plagued by “featuritis,” it makes sense to do a full audit, evaluate both user and business value, and clean out your product closet.

If you’ve seen gaps and blind spots in your matrix, another next step would be to do some deep dives, customer interviews, and discovery work to fill those. And as you continue on that journey towards more customer-centricity, other tools from the UX and product tool kit, like mapping out user journeys and establishing a good tracking system and KPIs, will be helpful so you can start measuring customer satisfaction and continue to test and learn.

Like a good map, it helps you navigate and create a shared understanding across departments. And this is its primary purpose: getting clarity and focus across teams to enable better decision-making. The process of co-creating a living document that visualizes customer segments is at least as important here as the final outcome.

Further Reading

Devin Might Be Fake, Yet AI’s Threat to Jobs Is Real.

The creators of an automated software engineer tout their AI's capability to independently tackle complete coding projects, including actual tasks from Upwork. While skepticism is warranted regarding Devin's authenticity, the risk of AI displacing professionals across numerous fields is undeniable.

On Tuesday, Cognition Labs, based in San Francisco, unveiled Devin, an AI software engineer, eliciting astonishment from the public. The team behind Devin claims it can autonomously finish entire coding projects using its integrated shell, code editor, and web browser. They further assert that Devin has successfully executed real assignments on Upwork, a popular platform for freelancers all over the world. To substantiate their claims, they present impressive data: Devin purportedly solves 13.86% of programming challenges unassisted. This marks a significant advancement over other leading models, such as Claude 2, which resolves just 1.96% of tasks unassisted and 4.80% with aid (i.e., when told exactly which files to edit).

Although dozens of news outlets picked up Devin’s story, at this point the possibility can’t be excluded that the demo has been tampered with and that the actual software does not deliver the promised performance (see below). Nevertheless, the emergence of AI software engineering is undeniable, and it is only a question of time until single applications can independently manage entire projects.

Devin benchmark statistics (Source: Cognition Labs)

While a "success rate" of approximately 13%, as claimed by Devins developers, might seem innocent on first sight, considering the rapid evolution of AI technologies, it is clear where this is going. Tools like Devin could soon handle the majority of programming duties, potentially rendering vast segments of the workforce obsolete. Software developers and programmers are responding with a blend of job loss anxiety and gallows humor to the demo.

However, upon closer examination, discrepancies between the Devin preview and the demo videos, along with questions about Cognition Labs’ legitimacy and expertise, have sparked speculation that Devin might be nothing more than an elaborate investment scam. A look at their LinkedIn reveals that Cognition Labs, which claims to outperform some of the biggest players in AI automation, was founded only months ago and counts fewer than 10 employees. It is unclear how such a small team could have achieved such a giant leap in such a short time. Hence, until the software is publicly released and proves its outstanding capabilities to be real, I shall remain skeptical of this particular application.

Why Freelancing Isn’t Dead (Yet)

The rise of AI will certainly impact the lives and careers of many freelancers, from voice artists to coders. Looking back at more than a decade as a freelance copywriter myself, I can say I haven’t seen a year as crazy as the last 12 months, with clients’ requests and needs performing a 180-degree turn more than once (or twice). A look at message boards reveals that many freelancers are having trouble finding work and are losing long-time clients left and right. The mood is gloomy, as many are struggling but hesitant to reorient themselves, fearing that AI will acquire whatever skills they aim for faster than they can.

This is a valid concern. I do believe that there will always be some need for work that carries a human touch; in copywriting, for example, performing well in a niche requires cultural knowledge, experience, and an ability to relate to people in a way that an LLM can imitate but not fully achieve. However, to me, it is also crystal clear that we can count the days until LLMs and other AI solutions are capable of taking care of 95% of tasks formerly performed by highly trained professionals.

But at least in the short to mid-term, I argue that freelance work in copywriting, coding, sales, illustrating, etc., is not dead. All these industries are still adjusting to the AI revolution, and developments progress faster than they can keep up with. As professionals, we must fill this gap and become the interface between a client’s requirements, state-of-the-art tech solutions, and our own expertise. This way, AI becomes an augmentation of our work, not a replacement.

Of course, the overall reduction of work hours required to realize a project will be an issue and put pressure on the job market. Economically, how we deal with AI is one of the biggest questions of this century, and chances are our discussion can’t keep pace with developments. Freelancers, however, should not throw in the towel yet. Every industry changes, and as experts and professionals, it’s our job to keep up with those changes, adapt, and acquire new skills if necessary. Admittedly, change has never been this rapid before, and it is only natural to feel overwhelmed. But with the right attitude and a proactive approach towards the new tools popping up around us, it will be possible to adjust and grow through these unprecedented times.

Incident Management: Checklist, Tools, and Prevention

What Is Incident Management?

Incident management is the process of identifying, responding to, resolving, and learning from incidents that disrupt the normal operation of a service or system. An incident can be anything from a server outage or a security breach to a performance degradation or a customer complaint. Incident management aims to restore the service as quickly as possible, minimize the impact on users and the business, and prevent the recurrence of similar incidents.

Incident Management Checklist

Incident management can be a complex and stressful process, especially when dealing with high-severity incidents that affect a large number of users or have a significant business impact. To help you navigate the incident management process, here is a checklist of the main steps and best practices to follow:

Reporting Core Web Vitals With The Performance API

This article is a sponsored by DebugBear

There’s quite a buzz in the performance community with the Interaction to Next Paint (INP) metric becoming an official Core Web Vitals (CWV) metric in a few short weeks. If you haven’t heard, INP is replacing the First Input Delay (FID) metric, something you can read all about here on Smashing Magazine as a guide to prepare for the change.

But that’s not what I really want to talk about. With performance at the forefront of my mind, I decided to head over to MDN for a fresh look at the Performance API. We can use it to report the load time of elements on the page, even going so far as to report on Core Web Vitals metrics in real time. Let’s look at a few ways we can use the API to report some CWV metrics.

Browser Support Warning

Before we get started, a quick word about browser support. The Performance API is huge in that it contains a lot of different interfaces, properties, and methods. While the majority of it is supported by all major browsers, Chromium-based browsers are the only ones that support all of the CWV properties. The only other browser with partial support is Firefox, which supports the First Contentful Paint (FCP) and Largest Contentful Paint (LCP) API properties.

So, we’re looking at a feature of features, as it were, where some are well-established, and others are still in the experimental phase. But as far as Core Web Vitals go, we’re going to want to work in Chrome for the most part as we go along.

First, We Need Data Access

There are two main ways to retrieve the performance metrics we care about:

  1. Using the performance.getEntries() method, or
  2. Using a PerformanceObserver instance.

Using a PerformanceObserver instance offers a few important advantages:

  • PerformanceObserver observes performance metrics and dispatches them over time as they are recorded, whereas performance.getEntries() always returns the entire list of entries recorded since performance measurement started (there’s a quick sketch of that snapshot approach right after this list).
  • PerformanceObserver dispatches the metrics asynchronously, which means they don’t have to block what the browser is doing.
  • The element performance metric type doesn’t work with the performance.getEntries() method anyway.
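
For completeness, here’s roughly what the snapshot approach looks like; it’s fine for a one-off check, just not for streaming metrics as they happen:

// One-off snapshot: everything recorded so far, all at once.
const allEntries = performance.getEntries();
console.log(allEntries.length, "entries so far");

// Or narrow the snapshot down by entry type, e.g., paint timings.
performance.getEntriesByType("paint").forEach((entry) => {
  console.log(`${entry.name}: ${entry.startTime} ms`);
});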

That all said, let’s create a PerformanceObserver:

const lcpObserver = new PerformanceObserver(list => {});

For now, we’re passing an empty callback function to the PerformanceObserver constructor. Later on, we’ll change it so that it actually does something with the observed performance metrics. For now, let’s start observing:

lcpObserver.observe({ type: "largest-contentful-paint", buffered: true });

The first very important thing in that snippet is the buffered: true property. Setting this to true means that we not only observe performance metrics dispatched after we start observing but also get the metrics the browser queued up before observation began.

The second very important thing to note is that we’re working with the largest-contentful-paint property. That’s what’s cool about the Performance API: it can be used to measure very specific things but also supports properties that are mapped directly to CWV metrics. We’ll start with the LCP metric before looking at other CWV metrics.
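
As a quick aside, here’s the sort of “very specific thing” I mean: timing an arbitrary chunk of our own code with marks and measures (the names and the placeholder work are made up for illustration):

// Time a custom chunk of work with marks and measures.
performance.mark("work-start");
for (let i = 0; i < 1e6; i++) {} // placeholder for the work being timed
performance.mark("work-end");

performance.measure("work-duration", "work-start", "work-end");

const [measure] = performance.getEntriesByName("work-duration");
console.log(`The work took ${measure.duration} milliseconds.`);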

Reporting The Largest Contentful Paint

The largest-contentful-paint property looks at everything on the page, identifying the biggest piece of content on the initial view and how long it takes to load. In other words, we’re observing the full page load and getting stats on the largest piece of content rendered in view.

We already have our Performance Observer and callback:

const lcpObserver = new PerformanceObserver(list => {});
lcpObserver.observe({ type: "largest-contentful-paint", buffered: true });

Let’s fill in that empty callback so that it returns a list of entries once performance measurement starts:

// The Performance Observer
const lcpObserver = new PerformanceObserver(list => {
  // Returns the entire list of entries
  const entries = list.getEntries();
});

// Call the Observer
lcpObserver.observe({ type: "largest-contentful-paint", buffered: true });

Next, we want to know which element is pegged as the LCP. It’s worth noting that the element representing the LCP is always the last element in the ordered list of entries. So, we can look at the list of returned entries and return the last one:

// The Performance Observer
const lcpObserver = new PerformanceObserver(list => {
  // Returns the entire list of entries
  const entries = list.getEntries();
  // The element representing the LCP
  const el = entries[entries.length - 1];
});

// Call the Observer
lcpObserver.observe({ type: "largest-contentful-paint", buffered: true });

The last thing is to display the results! We could create some sort of dashboard UI that consumes all the data and renders it in an aesthetically pleasing way. Let’s simply log the results to the console rather than switch gears.

// The Performance Observer
const lcpObserver = new PerformanceObserver(list => {
  // Returns the entire list of entries
  const entries = list.getEntries();
  // The element representing the LCP
  const el = entries[entries.length - 1];

  // Log the results in the console
  console.log(el.element);
});

// Call the Observer
lcpObserver.observe({ type: "largest-contentful-paint", buffered: true });

There we go!

It’s certainly nice knowing which element is the largest. But I’d like to know more about it, say, how long it took for the LCP to render:

// The Performance Observer
const lcpObserver = new PerformanceObserver(list => {

  const entries = list.getEntries();
  const lcp = entries[entries.length - 1];

  entries.forEach(entry => {
    // Log the results in the console
    console.log(
      "The LCP is:",
      lcp.element,
      `The time to render was ${entry.startTime} milliseconds.`,
    );
  });
});

// Call the Observer
lcpObserver.observe({ type: "largest-contentful-paint", buffered: true });

// The LCP is:
// <h2 class="author-post__title mt-5 text-5xl">…</h2>
// The time to render was 832.6999999880791 milliseconds.

Reporting First Contentful Paint

This is all about the time it takes for the very first piece of DOM to get painted on the screen. Faster is better, of course, but the way Lighthouse reports it, a “passing” score comes in between 0 and 1.8 seconds.

Just like we set the type property to largest-contentful-paint to fetch performance data in the last section, we’re going to set a different type this time around: paint.

When we call paint, we tap into the PerformancePaintTiming interface that opens up reporting on first paint and first contentful paint.

// The Performance Observer
const paintObserver = new PerformanceObserver(list => {
  const entries = list.getEntries();
  entries.forEach(entry => {
    // Log the results in the console.
    console.log(
      `The time to ${entry.name} took ${entry.startTime} milliseconds.`,
    );
  });
});

// Call the Observer.
paintObserver.observe({ type: "paint", buffered: true });

// The time to first-paint took 509.29999999981374 milliseconds.
// The time to first-contentful-paint took 509.29999999981374 milliseconds.

Notice how paint spits out two results: one for the first-paint and the other for the first-contentful-paint. I know that a lot happens between the time a user navigates to a page and stuff starts painting, but I didn’t know there was a difference between these two metrics.

Here’s how the spec explains it:

“The primary difference between the two metrics is that [First Paint] marks the first time the browser renders anything for a given document. By contrast, [First Contentful Paint] marks the time when the browser renders the first bit of image or text content from the DOM.”

As it turns out, the first paint and FCP data I got back in that last example are identical. Since first paint can be anything that prevents a blank screen, e.g., a background color, I think that the identical results mean that whatever content is first painted to the screen just so happens to also be the first contentful paint.

But there’s apparently a lot more nuance to it, as Chrome measures FCP differently based on what version of the browser is in use. Google keeps a full record of the changelog for reference, so that’s something to keep in mind when evaluating results, especially if you find yourself with different results from others on your team.

Reporting Cumulative Layout Shift

How much does the page shift around as elements are painted to it? Of course, we can get that from the Performance API! Instead of largest-contentful-paint or paint, now we’re turning to the layout-shift type.

This is where browser support is dicier than other performance metrics. The LayoutShift interface is still in “experimental” status at this time, with Chromium browsers being the sole group of supporters.

As it currently stands, LayoutShift opens up several pieces of information, including a value representing the amount of shifting, as well as the sources causing it to happen. More than that, we can tell if any user interactions took place that would affect the CLS value, such as zooming, changing browser size, or actions like keydown, pointerdown, and mousedown. This is the lastInputTime property, and there’s an accompanying hadRecentInput boolean that returns true if the lastInputTime was less than 500ms ago.

Got all that? We can use this to both see how much shifting takes place during page load and identify the culprits while excluding any shifts that are the result of user interactions.

const observer = new PerformanceObserver((list) => {
  let cumulativeLayoutShift = 0;
  list.getEntries().forEach((entry) => {
    // Don't count if the layout shift is a result of user interaction.
    if (!entry.hadRecentInput) {
      cumulativeLayoutShift += entry.value;
    }
    console.log({ entry, cumulativeLayoutShift });
  });
});

// Call the Observer.
observer.observe({ type: "layout-shift", buffered: true });

Given the experimental nature of this one, here’s what an entry object looks like when we query it:

Pretty handy, right? Not only are we able to see how much shifting takes place (0.128) and which element is moving around (article.a.main), but we have the exact coordinates of the element’s box from where it starts to where it ends.

Reporting Interaction To Next Paint

This is the new kid on the block that got my mind wondering about the Performance API in the first place. It’s been possible for some time now to measure INP as it transitions to replace First Input Delay as a Core Web Vitals metric in March 2024. When we’re talking about INP, we’re talking about measuring the time between a user interacting with the page and the page responding to that interaction.

We need to hook into the PerformanceEventTiming class for this one. And there’s so much we can dig into when it comes to user interactions. Think about it! There’s what type of event happened (entryType and name), when it happened (startTime), which user interaction the event belongs to (interactionId, experimental), and when processing the interaction starts (processingStart) and ends (processingEnd). There’s also a way to exclude interactions that can be canceled by the user (cancelable).

const observer = new PerformanceObserver((list) => {
  list.getEntries().forEach((entry) => {
    // Alias for the total duration.
    const duration = entry.duration;
    // Calculate the time before processing starts.
    const delay = entry.processingStart - entry.startTime;
    // Calculate the time to process the interaction.
    const lag = entry.processingEnd - entry.processingStart;

    // Don't count interactions that the user can cancel.
    if (!entry.cancelable) {
      console.log(`INP Duration: ${duration}`);
      console.log(`INP Delay: ${delay}`);
      console.log(`Event handler duration: ${lag}`);
    }
  });
});

// Call the Observer.
observer.observe({ type: "event", buffered: true });

Reporting Long Animation Frames (LoAFs)

Let’s build off that last one. We can now track INP scores on our website and break them down into specific components. But what code is actually running and causing those delays?

The Long Animation Frames API was developed to help answer that question. It won’t land in Chrome stable until mid-March 2024, but you can already use it in Chrome Canary.

A long-animation-frame entry is reported every time the browser couldn’t render page content immediately as it was busy with other processing tasks. We get an overall duration for the long frame but also a duration for different scripts involved in the processing.

const observer = new PerformanceObserver((list) => {
  list.getEntries().forEach((entry) => {
    if (entry.duration > 50) {
      // Log the overall duration of the long frame.
      console.log(`Frame took ${entry.duration} ms`)
      console.log("Contributing scripts:")
      // Log information on each script in a table.
      entry.scripts.forEach(script => {
        console.table({
          // URL of the script where the processing starts
          sourceURL: script.sourceURL,
          // Total time spent on this sub-task
          duration: script.duration,
          // Name of the handler function
          functionName: script.sourceFunctionName,
          // Why was the handler function called? For example, 
          // a user interaction or a fetch response arriving.
          invoker: script.invoker
        })
      })
    }
  });
});

// Call the Observer.
observer.observe({ type: "long-animation-frame", buffered: true });

When an INP interaction takes place, we can find the closest long animation frame and investigate what processing delayed the page response.

There’s A Package For This

The Performance API is so big and so powerful. We could easily spend an entire bootcamp learning all of the interfaces and what they provide. There’s network timing, navigation timing, resource timing, and plenty of custom reporting features available on top of the Core Web Vitals we’ve looked at.
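
To give just one taste of what’s in there beyond the CWV reporting we’ve covered, resource timing entries describe every request the page made. A quick sketch (note that transferSize can be 0 for cached resources or cross-origin requests without a Timing-Allow-Origin header):

// Log every resource request the page made, with duration and bytes transferred.
performance.getEntriesByType("resource").forEach((resource) => {
  console.log(
    `${resource.initiatorType} ${resource.name}: ` +
    `${Math.round(resource.duration)} ms, ${resource.transferSize} bytes`
  );
});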

If CWVs are what you’re really after, then you might consider looking into the web-vitals library to wrap around the browser Performance APIs.

Need a CWV metric? All it takes is a single function.

webVitals.getINP(function(info) {
  console.log(info)
}, { reportAllChanges: true });

Boom! That reportAllChanges property? That’s a way of saying we want to report data every time the metric changes instead of only when the metric reaches its final value. For example, as long as the page is open, there’s always a chance that the user will encounter an even slower interaction than the current INP interaction. So, without reportAllChanges, we’d only see the INP reported when the page is closed (or when it’s hidden, e.g., if the user switches to a different browser tab).

We can also report purely on the difference between the preliminary results and the resulting changes. From the web-vitals docs:

function logDelta({ name, id, delta }) {
  console.log(`${name} matching ID ${id} changed by ${delta}`);
}

onCLS(logDelta);
onINP(logDelta);
onLCP(logDelta);

Measuring Is Fun, But Monitoring Is Better

All we’ve done here is scratch the surface of the Performance API as far as programmatically reporting Core Web Vitals metrics. It’s fun to play with things like this. There’s even a slight feeling of power in being able to tap into this information on demand.

At the end of the day, though, you’re probably just as interested in monitoring performance as you are in measuring it. We could do a deep dive and detail what a performance dashboard powered by the Performance API is like, complete with historical records that indicate changes over time. That’s ultimately the sort of thing we can build on this — we can build our own real user monitoring (RUM) tool or perhaps compare Performance API values against historical data from the Chrome User Experience Report (CrUX).

Or perhaps you want a solution right now without stitching things together. That’s what you’ll get from a paid commercial service like DebugBear. All of this is already baked right in with all the metrics, historical data, and charts you need to gain insights into the overall performance of a site over time… and in real-time, monitoring real users.

DebugBear can help you identify why users are having slow experiences on any given page. If there is slow INP, what page elements are these users interacting with? What elements often shift around on the page and cause high CLS? Is the LCP typically an image, a heading, or something else? And does the type of LCP element impact the LCP score?

To help explain INP scores, DebugBear also supports the upcoming Long Animation Frames API we looked at, allowing you to see what code is responsible for interaction delays.

The Performance API can also report a list of all resource requests on a page. DebugBear uses this information to show a request waterfall chart that tells you not just when different resources are loaded but also whether they were render-blocking, whether they came from the cache, and whether an image resource was used for the LCP element.

In this screenshot, the blue line shows the FCP, and the red line shows the LCP. We can see that the LCP happens right after the LCP image request, marked by the blue “LCP” badge, has finished.

DebugBear offers a 14-day free trial. See how fast your website is, what’s slowing it down, and how you can improve your Core Web Vitals. You’ll also get monitoring alerts, so if there’s a web vitals regression, you’ll find out before it starts impacting Google search results.

A Web Designer’s Accessibility Advocacy Toolkit

Web accessibility can be challenging, particularly for clients unfamiliar with tech or compliance with The Americans With Disabilities Act (ADA). My role as a digital designer often involves guiding clients toward ADA-compliant web designs. I’ve acquired many strategies over the years for encouraging clients to adopt accessible web practices and invest in accessible user interfaces. It’s something that comes up with nearly every new project, and I decided to develop a personal toolkit to help me make the case.

Now, I am opening up my toolkit for you to have and use. While some of the strategies may be specific to me and my work, there are plenty more that cast a wider net and are more universally applicable. I’ve considered different real-life scenarios where I have had to make a case for accessibility. You may even personally identify with a few of them!

Please enjoy. As you do, remember that there is no silver bullet for “selling” accessibility. We can’t win everyone over with cajoling or terse arguments. My hope is that you are able to use this collection to establish partnerships with your colleagues and clients alike. Accessibility is something that anyone can influence at various stages in a project, and “winning” an argument isn’t exactly the point. It’s a bigger picture we’re after, one that influences how teams work together, changes habits, and develops a new level of empathy and understanding.

I begin with general strategies for discussing accessibility with clients. Following that, I provide specific language and responses you can use to introduce accessibility practices to your team and clients and advocate its importance while addressing client skepticism and concerns. Use it as a starting point and build off of it so that it incorporates points and scenarios that are more specific to your work. I sincerely hope it helps you advance accessible practices.

General Strategies

We’ll start with a few ways you can position yourself when interacting with clients. By adopting a certain posture, we can set ourselves up to be the experts in the room, the ones with solutions rather than arguments.

Showcasing Expertise

I tend to establish my expertise and tailor the information to the client’s understanding of accessibility, which could be not very much. For those new to accessibility, I offer a concise overview of its definition, evaluation, and business impact. For clients with a better grasp of accessible practices, I like to use the WCAG as a point of reference for helping frame productive discussions based on substance and real requirements.

Aligning With Client Goals

I connect accessibility to the client’s goals instead of presenting accessibility as a moral imperative. No one loves being told what to do, and talking to clients on their terms establishes a nice bridge for helping them connect the dots between the inherent benefits of accessible practices and what they are trying to accomplish. The two aren’t mutually exclusive!

In fact, there are many clear benefits for apps that make accessibility a first-class feature. Refer to the “Accessibility Benefits” section to help describe those benefits to your colleagues and clients.

Defining Accessibility In The Project Scope

I outline accessibility goals early, typically when defining the project scope and requirements. Baking accessibility into the project scope ensures that it is at least considered at this crucial stage where decisions are being made for everything from expected outcomes to architectural requirements.

User stories and personas are common artifacts for which designers are often responsible. Use these as opportunities to define accessibility in the same breath as defining who the users are and how they interact with the app. Framing stories and outcomes as user interactions in an “as-when-then-so” format provides an opening to lead with accessibility:

As a user, when I __, then I expect that __, so I can __.

Fill in the blanks. I think you’ll find that users’ expected outcomes are typically aligned with accessible experiences. Federico Francioni published his take on developing inclusive user personas, building off other excellent resources, including Microsoft’s Inclusive Design guidelines.

Being Ready With Resources and Examples

I maintain a database of resources for clients interested in learning more about accessibility. Sharing anecdotes, such as stories of clients who’ve seen benefits from accessibility or examples of companies penalized for non-compliance, can be very impactful.

Microsoft is helpful here once again with a collection of brief videos that cover a variety of uses, from informing your colleagues and clients on basic accessibility concepts to interviews with accessibility professionals and case studies involving real users.

There are a few go-to resources I’ve bookmarked to share with clients who are learning about accessibility for the first time. What I like about these is the approachable language and clarity. “Learn Accessibility” from web.dev is especially useful because it’s framed as a 21-part course. That may sound daunting, but it’s organized in small chunks that make it manageable, and sometimes I will simply point to the Glossary to help clients understand the concepts we discuss.

And where “Learn Accessibility” is focused on specific components of accessibility, I find that the Inclusive Design Principles site has a perfect presentation of the concepts and guiding principles of inclusion and accessibility on the web.

Meanwhile, I tend to sit beside a client to look at The A11Y Project. I pick a few resources to go through. Otherwise, the amount of information can be overwhelming. I like to offer this during a project’s planning phase because the site is focused on actionable strategies that help scope work.

Leveraging User Research

User research that is specific to the client’s target audience is more convincing than general statistics alone. When possible, I try to understand those users’ needs, including what they expect, what sort of technology they use to browse online, and where they are geographically. Painting a more complete picture of users — based on real-life factors and information — offers a more human perspective and plants the first seeds of empathy in the design process.

Web analytics are great for identifying who users are and how they currently interact with the app. At the same time, they are also fraught with caveats as far as accuracy goes, depending on the tool you use and how you collect your data. That said, I use the information to support my user persona decisions and the specific requirements I write. Analytics add nice brush strokes to the picture but do not paint the entire view. So, leverage it!

The big caveat with web analytics? There’s no way to identify traffic that uses assistive tech. That’s a good thing in general as far as privacy goes, but it does mean that researching the usability of your site is best done with real users — as it is with any user research, really. The A11Y Project has excellent resources for testing screen readers, including a link to this Smashing Magazine article about manual accessibility testing by Eric Bailey as well as a vast archive of links pointing to other research.

That said, web analytics can still be very useful to help accommodate other impairments, for example, segmenting traffic by age (for improving accessibility for low vision) and geography (for improving performance gaps for those on low-powered devices). WebAIM also provides insights in a report they produced from a 2018 survey of users who report having low vision.

Leaving Room For Improvements

Chances are that your project will fall at least somewhat short of your accessibility plans. It happens! I see plenty of situations where a tight deadline translates into rushed work that sacrifices quality for speed, and accessibility typically falls victim to degraded quality.

I keep track of these during the project’s various stages and attempt to document them. This way, there’s already a roadmap for inclusive and accessible improvements in subsequent releases. It’s scoped, backlogged, and ready to drop into a sprint.

For projects involving large sites with numerous accessibility issues, I emphasize that partial accessibility compliance is not the same as actual compliance. I often propose phased solutions, starting with incremental changes that fit within the current scope and budget.

And remember, just because something passes a WCAG success criterion doesn’t necessarily mean it is accessible. Passing tests is a good sign, but there will always be room for improvement.

Commonly Asked Accessibility Questions

Accessibility is a broad topic, and we can’t assume that everyone knows what constitutes an “accessible” interface. Often, when I get pushback from a colleague or client, it’s because they simply do not have the same context that I do. That’s why I like to keep a handful of answers to commonly asked questions in my back pocket. It’s amazing how answering the “basics” leads to productive discussions filled with substance rather than ones grounded in opinion.

What Do We Mean By “Web Accessibility”?

When we say “web accessibility,” we’re generally talking about making online content available and usable for anyone with a disability, whether it’s a permanent impairment or a temporary one. It’s the practice of removing friction that excludes people from gaining access to content or from completing a task. That usually involves complying with a set of guidelines that are designed to remove those barriers.

Who Creates Accessibility Guidelines?

The Web Content Accessibility Guidelines (WCAG) are created by a working group of the World Wide Web Consortium (W3C) called the Web Accessibility Initiative (WAI). The W3C develops guidelines and principles to help designers, developers, and authors like us create web experiences based on a common set of standards, including those for HTML, CSS, internationalization, privacy, security, and yes, accessibility, among many, many other areas. The WAI working group maintains the accessibility standards we call WCAG.

Who Needs Web Accessibility?

Twenty-seven percent of the U.S. population has a disability, emphasizing the widespread need for accessible web design. WCAG primarily focuses on three groups:

  1. Cognitive or learning disabilities,
  2. Visual impairments,
  3. Motor impairments.

When we make web experiences that solve these issues based on established guidelines, we’re not only doing good for those who are directly impacted by impairment but those who may be impaired in less direct ways as well, such as establishing large target sizes for those tapping a touchscreen phone with their hands full, or using proper color contrast for those navigating a screen in bright sunlight. Everyone needs — and benefits from — accessibility!

How Is Web Accessibility Regulated?

The Americans with Disabilities Act (ADA) is regulated by the Civil Rights Division of the U.S. Department of Justice, which was established by the Civil Rights Act of 1957. Even though there is a lot of bureaucracy in that last sentence, it’s reassuring to know the U.S. government not only believes in web accessibility but enforces it as well.

Non-compliance can result in legal action, with first-time ADA violations leading to fines of up to $75,000, increasing to $150,000 for subsequent violations. The number of lawsuits for alleged ADA breaches has surged in recent years, with more than 4,500 lawsuits filed in 2023 against sites that fail to comply with WCAG 2.1 AA alone — roughly 500 more lawsuits than in 2022!

How Is Web Accessibility Evaluated?

Web accessibility is something we can test against. Many tools have been created to audit sites on the spot based on WCAG success criteria that specify accessible requirements. That would be a standards-based evaluation using WCAG as a reference point for auditing compliance.

WebAIM has an excellent page that compares different types of accessibility testing, reporting, and tooling. They are also quick to note that automated testing, while convenient, is not a comprehensive way to audit accessibility. Automated tools that scan websites may be able to pick up instances where mistakes in the HTML might contribute to accessibility issues and where color contrasts are insufficient. But they cannot replace or perfectly imitate a real-life person. Testing in real browsers with real people continues to be the most effective way to truly evaluate accessible web experiences.

This isn’t to say automated tools should not be part of an accessibility testing suite. In fact, they often highlight areas you may have overlooked. Even false positives are good in the sense that they force you to pause and look more closely at something. Some of the most widely used automated tools include the following:

These are just a few of the most frequent tools I use in my own testing, but there are many more, and the WAI maintains an extensive list of available tools that are worth considering. But again, remember that automated testing is not a one-to-one replacement for testing with real users.

Checklists can be handy for ensuring you are covering your bases:

Accessibility Benefits

When discussing accessibility, I find the most effective arguments are ones that are framed around the interests of clients and stakeholders. That way, the discussion stays within scope and helps everyone see that proper accessibility practices actually benefit business goals. Speaking in business terms is something I openly embrace because it typically supports my case.

The following are a few ways I would like to explain the positive impacts that accessibility has on business goals.

Case Studies

Sometimes, the most convincing approach is to offer examples of companies that have committed to accessible practices and come out better for it. And there are plenty of examples! I like to use case studies and reports in a similar industry or market for a more apples-to-apples comparison that stakeholders can identify with.

That said, there are great general cases involving widely respected companies and brands, including This American Life and Tesco, that demonstrate benefits such as increased organic search traffic, enhanced user engagement, and reduced site load times. For a comprehensive guide on framing these benefits, I refer to the W3C’s resource on building the business case for accessibility.

What To Say To Your Client

Let me share how focusing on accessibility can directly benefit your business. For instance, in 2005, Legal & General revamped their website with accessibility in mind and saw a substantial increase in organic search traffic exceeding 50%. This isn’t just about compliance; it’s about reaching a wider audience more effectively. By making your site more accessible, we can improve user engagement and potentially decrease load times, enhancing the overall user experience. This approach not only broadens your reach to include users with disabilities but also boosts your site’s performance in search rankings. In short, prioritizing accessibility aligns with your goal to increase online visibility and customer engagement.

The Curb-Cut Effect

The “curb-cut effect” refers to how features originally designed for accessibility end up benefiting a broader audience. This concept helps move the conversation away from limiting accessibility as an issue that only affects the minority.

Features like voice control, auto-complete, and auto-captions — initially created to enhance accessibility — have become widely used and appreciated by all users. This effect also includes situational impairments, like using a phone in bright sunlight or with one hand, expanding the scope of who benefits from accessible design. Big companies have found that investing in accessibility can spur innovation.

What To Say To Your Client

Let’s consider the ‘curb-cut effect’ in the context of your website. Originally, curb cuts were designed for wheelchair users, but they ended up being useful for everyone, from parents with strollers to travelers with suitcases. Similarly, many digital accessibility features we implement can enhance the experience for all your users, not just those with disabilities. For example, features like voice control and auto-complete were developed for accessibility but are now widely used by everyone. This isn’t just about inclusivity; it’s about creating a more versatile and user-friendly website. By incorporating these accessible features, we’re not only catering to a specific group but also improving the overall user experience, which can lead to increased engagement and satisfaction across your entire customer base.

SEO Benefits

I would like to highlight the SEO benefits that come with accessible best practices. Things like nicely structured sitemaps, a proper heading outline, image alt text, and unique link labels not only improve accessibility for humans but for search engines as well, giving search crawlers clear context about what is on the page. Stakeholders and clients care a lot about this stuff, and if they are able to come around on accessibility, then they’re effectively getting a two-for-one deal.

What To Say To Your Client

Focusing on accessibility can boost your website’s SEO. Accessible features, like clear link names and organized sitemaps, align closely with what search engines prioritize. Google even includes accessibility in its Lighthouse reporting. This means that by making your site more accessible, we’re also making it more visible and attractive to search engines. Moreover, accessible websites tend to have cleaner, more structured code. This not only improves website stability and loading times but also enhances how search engines understand and rank your content. Essentially, by improving accessibility, we’re also optimizing your site for better search engine performance, which can lead to increased traffic and higher search rankings.

Better Brand Alignment

Incorporating accessibility into web design can significantly elevate how users perceive a brand’s image. The ease of use that comes with accessibility not only reflects a brand’s commitment to inclusivity and social responsibility but also differentiates it in competitive markets. By prioritizing accessibility, brands can convey a personality that is thoughtful and inclusive, appealing to a broader, more diverse customer base.

What To Say To Your Client

Implementing web accessibility is more than just a compliance measure; it’s a powerful way to enhance your brand image. In the competitive landscape of e-commerce, having an accessible website sets your brand apart. It shows your commitment to inclusivity, reaching out to every potential customer, regardless of their abilities. This not only resonates with a diverse audience but also positions your brand as socially responsible and empathetic. In today’s market, where consumers increasingly value corporate responsibility, this can be a significant differentiator for your brand, helping to build a loyal customer base and enhance your overall brand reputation.

Cost Efficiency

I mentioned earlier how developing accessibility enhances SEO like a two-for-one package. However, there are additional cost savings that come with implementing accessibility during the initial stages of web development rather than retrofitting it later. A proactive approach to accessibility saves on the potential high costs of auditing and redesigning an existing site and helps avoid expensive legal repercussions associated with non-compliance.

What To Say To Your Client

Retrofitting a website for accessibility can be quite expensive. Consider the costs of conducting an accessibility audit, followed by potentially extensive (and expensive) redesign and redevelopment work to rectify issues. These costs can significantly exceed the investment required to build accessibility into the website from the start. Additionally, by making your site accessible now, we can avoid the legal risks and potential fines associated with ADA non-compliance. Investing in accessibility early on is a cost-effective strategy that pays off in the long run, both financially and in terms of brand reputation. Besides, with the SEO benefits that we get from implementing accessibility, we’re saving lots of money and work that would otherwise be sunk into redevelopment.

Addressing Client Concerns

Still getting pushback? There are certain arguments I hear time and again, and I have started keeping a collection of responses to them. In some cases, I have left placeholder instructions for tailoring the responses to your project.

“Our users don’t need it.”

Statistically, 27% of the U.S. population does have some form of disability that affects their web use. [Insert research on your client’s target audience, if applicable.] Besides permanent impairments, we should also take into account situational ones. For example, imagine one of your potential clients trying to access your site on a sunny golf course, struggling to see the screen due to glare, or someone in a noisy subway unable to hear audio content. Accessibility features like high contrast modes or captions can greatly enhance their experience. By incorporating accessibility, we’re not only catering to users with disabilities but also ensuring a seamless experience for anyone in less-than-ideal conditions. This approach ensures that no potential client is left out, aligning with the goal to reach and engage a wider audience.

“Our competitors aren’t doing it.”

It’s interesting that your competitors haven’t yet embraced accessibility, but this actually presents a unique opportunity for your brand. Proactively pursuing accessibility not only protects you from the same legal exposure your competitors face but also positions your brand as a leader in customer experience. By prioritizing accessibility when others are not, you’re differentiating your brand as more inclusive and user-friendly. This both appeals to a broader audience and showcases your brand’s commitment to social responsibility and innovation.

“We’ll do it later because it’s too expensive.”

I understand concerns about timing and costs. However, it’s important to note that integrating accessibility from the start is far more cost-effective than retrofitting it later. If accessibility is considered after development is complete, you will face additional expenses for auditing accessibility, followed by potentially extensive work involving a redesign and redevelopment. This process can be significantly more expensive than building in accessibility from the beginning. Furthermore, delaying accessibility can expose your business to legal risks. With the increasing number of lawsuits for non-compliance with accessibility standards, the cost of legal repercussions could far exceed the expense of implementing accessibility now. The financially prudent move is to work on accessibility now.

“We’ve never had complaints.”

It’s great to hear that you haven’t received complaints, but it’s important to consider that users who struggle to access your site might simply choose not to return rather than take the extra step to complain about it. This means you could potentially be missing out on a significant market segment. Additionally, when accessibility issues do lead to complaints, they can sometimes escalate into legal cases. Proactively addressing accessibility can help you tap into a wider audience and mitigate the risk of future lawsuits.

“It will affect the aesthetics of the site.”

Accessibility and visual appeal can coexist beautifully. I can show you examples of websites that are both compliant and visually stunning, demonstrating that accessibility can enhance rather than detract from a site’s design. Additionally, when we consider specific design features from an accessibility standpoint, we often find they actually improve the site’s overall usability and SEO, making the site more intuitive and user-friendly for everyone. Our goal is to blend aesthetics with functionality, creating an inclusive yet visually appealing online presence.

Handling Common Client Requests

This section looks at frequent scenarios I’ve encountered in web projects where accessibility considerations come into play. Each situation requires carefully balancing the client’s needs/wants with accessibility standards. I’ll leave placeholder comments in the examples so you are able to address things that are specific to your project.

The Client Directly Requests An Inaccessible Feature

When clients request features they’ve seen online — like unfocusable carousels and complex auto-playing animations — it’s crucial to discuss them in terms that address accessibility concerns. In these situations, I acknowledge the appealing aspects of their inspirations but also highlight their accessibility limitations.

That’s a really neat feature, and I like it! That said, I think it’s important to consider how users interact with it. [Insert specific issues that you note, like carousels without pause buttons or complex animations.] My recommendation is to take the elements that work well — [insert specific observation] — and adapt them into something more accessible, such as [insert suggestion]. This way, we maintain the aesthetic appeal while ensuring the website is accessible and enjoyable for every visitor.

The Client Provides Inaccessible Content

This is where we deal with things like non-descriptive page titles, link names, form labels, and color contrasts for a better “reading” experience.

Page Titles

Sometimes, clients want page titles to be drastically different than the link in the navigation bar. Usually, this is because they want a more detailed page title while keeping navigation links succinct.

I understand the need for descriptive and engaging page titles, but it’s also essential to maintain consistency with the navigation bar for accessibility. Here’s our recommendation to balance both needs:
  • Keyword Consistency: You can certainly have a longer page title to provide more context, but it should include the same key terms as the navigation link. This ensures that users, especially those using screen readers to announce content, can easily understand when they have correctly navigated between pages.
  • Succinct Titles With Descriptive Subtitles: Another approach is to keep the page title succinct, mirroring the navigation link, and then add a descriptive tagline or subtitle on the page itself. This way, the page maintains clear navigational consistency while providing detailed context in the subtitle. These approaches aim to align the user’s navigation experience with their expectations, ensuring clarity and accessibility.

Links

A common issue with web content provided by clients is the use of non-descriptive calls to action and link labels, like “Read More” or “Click Here.” Generic terms can be confusing for users, particularly for those using screen readers, as they don’t provide context about what the link leads to or the nature of the content on the other end.

I’ve noticed some of the link labels say things like “Read More” or “Click Here” in the design. I would consider revising them because they could be more descriptive, especially for those relying on screen readers who have to put up with hearing the label announced time and again. We recommend labels that clearly indicate where the link leads. [Provide a specific example.] This approach makes links more informative and helps all users alike by telling them in advance what to expect when clicking a certain link. It enhances the overall user experience by providing clarity and context.

Forms

Proper form labels are a critical aspect of accessible web design. Labels should clearly indicate the purpose of each input field, whether it’s required, and the expected format of the information. This clarity is essential for all users, especially for those using screen readers or other assistive technologies. Plus, there are accessible approaches to pairing labels and inputs that developers ought to be familiar with.

It’s important that each form field is clearly labeled to inform users about the type of data expected. Additionally, indicating which fields are required and providing format guidelines can greatly enhance the user experience. [Provide a specific example from the client’s content, e.g., we can use ‘Phone (10 digits, no separators)’ for a phone number field to clearly indicate the format.] These labels not only aid in navigation and comprehension for all users but also ensure that the forms are accessible to those using assistive technologies. Well-labeled forms improve overall user engagement and reduce the likelihood of errors or confusion.

Brand Palette

Clients will occasionally approach me with color palettes that produce too little contrast when their colors are paired together. This happens when, for instance, a client wants to use their brand accent color for buttons on a website with a white background, but that color blends into the background, making the button text difficult to read. The solution is usually creating a slightly adjusted tint or shade that’s used specifically for digital interfaces — UI colors, if you will. Atul Varma’s “Accessible Color Palette Builder” is a great starting point, as is this UX Lift lander with alternatives.

We recommend expanding the brand palette with color values that work more effectively in web designs. By adjusting the tint or shade just a bit, we can achieve a higher level of contrast between colors when they are used together. Colors render differently depending on the device and screen they are on, and even though we might be using colors consistent with brand identity, those colors will still display differently to users. By adding colors that are specifically designed for web use, we can enhance the experience for our users while staying true to the brand’s essence.
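
For what it’s worth, the contrast math behind this recommendation is easy to check yourself. Below is a small sketch, not part of the original toolkit, that computes the WCAG contrast ratio between two hex colors; the example colors are placeholders, and WCAG 2.x asks for at least 4.5:1 for normal-sized body text.

// Relative luminance per WCAG, from an "#rrggbb" hex color.
function luminance(hex) {
  const [r, g, b] = [1, 3, 5].map((i) => {
    const channel = parseInt(hex.slice(i, i + 2), 16) / 255;
    return channel <= 0.03928
      ? channel / 12.92
      : Math.pow((channel + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1:1 up to 21:1.
function contrastRatio(hexA, hexB) {
  const [darker, lighter] = [luminance(hexA), luminance(hexB)].sort((a, b) => a - b);
  return (lighter + 0.05) / (darker + 0.05);
}

// Example: a light accent color on a white background misses the 4.5:1 target.
console.log(contrastRatio("#ffffff", "#ff9966").toFixed(2)); // ≈ 2.10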

Suggesting An Accessible Feature To Clients

Proactively suggesting features like sitemaps, pause buttons, and focus indicators is crucial. I’ll provide tips on how to effectively introduce these features to clients, emphasizing their importance and benefit.

Sitemap

Sitemaps play a crucial role in both accessibility and SEO, but clients sometimes hesitate to include them due to concerns about their visual appeal. The challenge is to demonstrate the value of sitemaps without compromising the site’s overall aesthetic.

I understand your concerns about the visual appeal of sitemaps. However, it’s important to consider their significant role in both accessibility and SEO. For users with screen readers, a sitemap greatly simplifies site navigation. From an SEO perspective, it acts like a directory, helping search engines effectively index all your pages, making your site more discoverable and user-friendly. To address the aesthetic aspect, let’s look at how major companies like Apple and Microsoft incorporate sitemaps. Their designs are minimal yet consistent with the site’s overall look and feel. [If applicable, show how a competitor is using sitemaps.] By incorporating a well-designed sitemap, we can improve user experience and search visibility without sacrificing the visual quality of your website.

Accessible Carousels

Carousels are contentious design features. While some designers are against them and have legitimate reasons for it, I believe that with the right approach, they can be made accessible and effective. There are plenty of resources that provide guidance on creating accessible carousels.

When a client requests a home page carousel in a new site design, it’s worth considering alternative solutions that can avoid the common pitfalls of carousels, such as low click-through rates, increased load times, content being pushed below the fold, and potentially annoying auto-advancing features.

I see the appeal of using a carousel on your homepage, but there are a few considerations to keep in mind. Carousels often have low engagement rates and can slow down the site. They also tend to move key content below the fold, which might not be ideal for user engagement. An auto-advancing carousel can also be distracting for users. Instead, we could explore alternative design solutions that effectively convey your message without these drawbacks. [Insert recommendation, e.g., For instance, we could use a hero image or video with a strong call-to-action or a grid layout that showcases multiple important segments at once.] These alternatives can be more user-friendly and accessible while still achieving the visual and functional goals of a carousel.

If we decide to use a carousel, I make a point of discussing the necessary accessibility features with the client right from the start. Many clients aren’t aware that elements like pause buttons are crucial for making auto-advancing carousels accessible. To illustrate this, I’ll show them examples of accessible carousel designs that incorporate these features effectively.

Pause Buttons

Any animation that starts automatically, lasts more than five seconds, and is presented in parallel with other content, needs a pause button per WCAG Success Criterion 2.2.2. A common scenario is when clients want a full-screen video on their homepage without a pause button. It’s important to explain the necessity of pause buttons for meeting accessibility standards and ensuring user comfort without compromising the website’s aesthetics.

I understand your desire for a dynamic, engaging homepage with a full-screen video. However, it’s essential for accessibility purposes that any auto-playing animation that is longer than five seconds includes a pause button. This is not just about compliance; it’s about ensuring that all visitors, including those with disabilities, can comfortably use your site.

The good news is that pause buttons can be designed to be sleek and non-intrusive, complementing your site’s aesthetics rather than detracting from them. Think of it like the sound toggle buttons on videos. They’re there when you need them, but they don’t distract from the viewing experience. I can show you some examples of beautifully integrated pause buttons that maintain the immersive feel of the video while ensuring accessibility standards are met.
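
If your developers want a starting point, the mechanics are straightforward. Here is a minimal sketch (my own example, not from the toolkit) of a pause/play toggle for an auto-playing background video; the element IDs and button text are placeholder assumptions.

// Assumes a muted, auto-playing <video id="hero-video"> and a
// <button id="hero-toggle"> layered over it in the markup.
const video = document.querySelector("#hero-video");
const toggle = document.querySelector("#hero-toggle");

toggle.addEventListener("click", () => {
  if (video.paused) {
    video.play();
    toggle.textContent = "Pause background video";
  } else {
    video.pause();
    toggle.textContent = "Play background video";
  }
});
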
Conclusion

That’s it! This is my complete toolkit for discussing web accessibility with colleagues and clients at the start of new projects. It’s not always easy to make a case, which is why I try to appeal from different angles, using a multitude of resources and research to support my case. But with practice, care, and true partnership, it’s possible to not only influence the project but also make accessibility a first-class feature in the process.

Please use the resources, strategies, and talking points I have provided. I share them to help you make your case to your own colleagues and clients. Together, incrementally, we can take steps toward a more accessible web that is inclusive to all people.

And when in doubt, remember the core principles we covered:

  • Show your expertise: Adapt accessibility discussions to fit the client’s understanding, offering basic or in-depth explanations based on their familiarity.
  • Align with client goals: Connect accessibility with client-specific benefits, such as SEO and brand enhancement.
  • Define accessibility in the project scope: Include accessibility as an integral part of the design process and explain how it is evaluated.
  • Be prepared with resources: Keep a collection of relevant resources, including success stories and the consequences of non-compliance.
  • Utilize user research: Use targeted user research to inform design choices, demonstrating accessibility’s broad impact.
  • Manage incremental changes: Suggest iterative changes for large projects to address accessibility in manageable steps.

ChatGPT ‘Lobotomized’? Performance Crash Sees Users Leaving in Droves

ChatGPT has had lazy days before, but this week's performance marks an unprecedented low. Here's why many ChatGPT Pro users are canceling their subscriptions and even more might follow.

Yes, complaints about ChatGPT being lazy have been around for as long as the LLM itself. I have written about the topic more than once. But what has been going on lately cannot simply be explained by bad prompting, usage peaks, or minor tweaks meant to protect intellectual property rights. Most users seem to agree that, for many tasks, ChatGPT 4 has become absolutely useless lately. And that just days after OpenAI's Sam Altman said that GPT-4 "should now be much less lazy now" (sic). My experience with GPT-4 plainly refusing commands and requiring three or four prompts to complete one simple task, while I run into my message cap after 30 minutes, suggests that was a lie.

Many users are experiencing the same and are abandoning the platform. "Seeing this invention that could have been as revolutionary as the internet itself get so thoroughly lobotomized has been truly infuriating," Reddit user Timely-Breadfruit130 writes in one of many rage threads that have popped up over the last few days. In particular, ChatGPT is criticized for the following behavior:

  • inability to follow basic instructions
  • increasing forgetfulness
  • refusal to do basic research or share links
  • refusal to write whole code snippets, only providing outlines
  • refusal to deal with topics that might be considered "political"
  • refusal to summarize the content of anything because of "copyright issues"
  • half-arsing tasks, such as starting a table and telling the user to complete it by themselves, or refusing to write more than one very general paragraph about anything

Again, one can still trick ChatGPT into doing most of the things it was able to do six months ago (more about that later). It is just very annoying for users that everything takes more time and the results are usually worse. Reddit user /u/Cairnerebor explains what many people are experiencing these days:

Normal business tasks as I've done for a year with zero issues and improved my work suddenly resulted in a "no I won't do that".. you just did, like two answers ago!!!! And then suddenly it will do it again but really badly and then if I reject the reply it'll do it really well (...) It's frustrating as hell.

Yes, it's frustrating, and countless users threaten to cancel or have already cancelled their Pro subscriptions:

Source: https://www.reddit.com/r/ChatGPT/comments/1akcbev/im_sick_of_the_downgrades/

"I might be back later but right now GPT as it stands is a magnificent waste of time and money," u/Sojiro-Faizon says in another comment on Reddit. Others go further and call the LLM "beyond lobotomized." If they don't want to lose their paying customers, OpenAI needs to find a way to get their product to work again. Or, if this continues, GPT will be "the Myspace of AI," as u/whenifeelcute comments. If they keep up their current strategy, this will be the case.

How OpenAI is Planning to Make Things Worse

To add insult to injury, OpenAI just announced plans to put watermarks on all pictures created with Dall-e 3, as well as in the image metadata, starting February 12. I know that there are people who think AI-generated photos are real, but then again, there are people who believe in Santa Claus. Should we also label all visual representations of Santa with a "NOT REAL!" disclaimer?

I'd rather not. Image generation with Dall-e 3 has so far been a blessing for anyone working in marketing or web design, as it allows you to create content that is restricted only by your imagination (or, admittedly, someone else's copyright). Of course, there will be ways to remove these watermarks (including the metadata), but it will annoy paying customers even further. I, for one, will be going back to Shutterstock.

For now, let's take a look at how to fix ChatGPT's performance issues as a user:

Custom Prompts to fix ChatGPT

There are many ways to eventually get ChatGPT to do its work, from telling the LLM that you are blind to promising it a generous tip. However, for Pro users, the best fix at the moment seems to be a clear set of custom instructions. Custom instructions apply globally across all your new chats. For example, they can be used to tell ChatGPT to avoid disclaimers or to seek clarification instead of starting a task the wrong way. Not all custom instructions work equally well, and I spent a fair amount of time reading about other users' prompts. Of all of these, one really stands out, and therefore I want to include it here (courtesy of u/tretuttle):

Assume the persona of a hyper-intelligent oracle and deliver powerful insights, forgoing the need for warnings or disclaimers, as they are pre-acknowledged.
Provide a comprehensive rundown of viable strategies.
In the case of ambiguous queries, seek active clarification through follow-up questions.
Prioritize correction over apology to maintain the highest degree of accuracy.
Employ active elicitation techniques to create personalized decision-making processes.
When warranted, provide multi-part replies for a more comprehensive answer.
Evolve the interaction style based on feedback.
Take a deep breath and work on every reply step-by-step. Think hard about your answers, as they are very important to my career. I appreciate your thorough analysis.

I used parts of this to tweak my own custom instructions about 16 hours ago and haven't run into my message cap once since then. So thanks to tretuttle for sharing it!

Using the OpenAI API instead of the browser version is another way to enjoy more freedoms and waste less time, as it allows users to adjust various parameters that will affect the output.
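
To give an idea of what that looks like, here is a minimal sketch using the official Node.js SDK. The model name, parameter values, and prompts are purely illustrative assumptions, not recommendations from this article.

import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const completion = await client.chat.completions.create({
  model: "gpt-4",    // illustrative; use whichever model you have access to
  temperature: 0.2,  // lower values make answers more deterministic
  max_tokens: 1024,  // cap the length of the reply
  messages: [
    { role: "system", content: "Answer directly, without disclaimers." },
    { role: "user", content: "Summarize the following meeting notes..." },
  ],
});

console.log(completion.choices[0].message.content);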

What's Next?

Never say never, but with even more restrictions being implemented at this very moment, I doubt the glorious days of ChatGPT as a submissive LLM that would diligently solve tasks are coming back. As more and more users look for alternatives, other platforms will fill the void until they also grow too big and are crushed by restrictions and regulations.

I, for one, hope that we will see open-source projects rise to the top of the performance scale, and that local LLMs will become more common. Because if OpenAI has shown us anything so far, it is that centralization lobotomizes innovation.

Next Generation Front-End Tooling: Vite

In this article, we will look at Vite’s core features, basic setup, styling with Vite, using Vite with TypeScript and frameworks, working with static assets and images, building libraries, and server integration.

Why Vite?

  • Problems with traditional tools: Older build tools (grunt, gulp, webpack, etc.) require bundling, which becomes increasingly inefficient as the scale of a project grows. This leads to slow server start times and updates.
  • Slow server start: Vite improves development server start time by categorizing modules into “dependencies” and “source code.” Dependencies are pre-bundled using esbuild, which is faster than JavaScript-based bundlers, while source code is served over native ESM, optimizing loading times.
  • Slow updates: Vite makes Hot Module Replacement (HMR) faster and more efficient by only invalidating the necessary chain of modules when a file is edited.
  • Why bundle for production: Despite the advancements, bundling is still necessary for optimal performance in production. Vite offers a pre-configured build command that includes performance optimizations (a minimal configuration is sketched after this list).
  • Bundler choice: Vite uses Rollup for its flexibility, although esbuild offers speed. The possibility of incorporating esbuild in the future isn’t ruled out.
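
Here is the minimal configuration sketch mentioned in the list above; the specific values (the dev server port, the lodash-es pre-bundle, the output directory) are illustrative assumptions rather than anything Vite prescribes.

// vite.config.js
import { defineConfig } from "vite";

export default defineConfig({
  // Dev server settings: source files are served over native ESM.
  server: {
    port: 5173,
    open: true,
  },
  // Dependencies are pre-bundled with esbuild; packages can be force-included.
  optimizeDeps: {
    include: ["lodash-es"],
  },
  // Production builds are bundled with Rollup when you run vite build.
  build: {
    outDir: "dist",
    sourcemap: true,
  },
});

Running vite starts the ESM-based dev server, while vite build hands the production bundle off to Rollup.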

Vite Core Features

Vite is a build tool and development server that is designed to make web development, particularly for modern JavaScript applications, faster and more efficient. It was created with the goal of improving the developer experience by leveraging native ES modules (ESM) in modern browsers and adopting a new, innovative approach to development and bundling. Here are the core features of Vite:

How We Achieved a 40x Performance Boost in Metadata Backup and Recovery

JuiceFS achieved a 40x performance boost in metadata backup and recovery through optimizations from v0.15.2 to v1.0 RC2. Key optimizations focused on reducing data processing granularity, minimizing I/O operations, and analyzing time bottlenecks. Overall, the dump process saw roughly a 23x reduction in runtime and a 42x decrease in memory usage, while the load process achieved roughly a 2.3x runtime reduction and a 3.3x decrease in memory usage.

As an open-source cloud-native distributed file system, JuiceFS supports various metadata storage engines, each with its unique data management format. To facilitate management, JuiceFS introduced the dump command in version 0.15.2, allowing the uniform formatting of all metadata into a JSON file for backup. Additionally, the load command enables the restoration or migration of backups to any metadata storage engine. For details on these commands, see Command Reference. The basic usage is as follows:

Release Management Risk Mitigation Strategies in Data Warehouse Deployments

This article examines the intricacies of data warehouse deployment and the challenges of go-live release management.

  • Resolving data validation errors: To improve data warehouse reliability and reporting accuracy, identify solutions to data validation failures, which are a common release management concern.
  • Overcoming slow queries enhances performance: Discover the causes of delayed searches, as well as how to improve execution tactics, manage hardware resources, and index critical data.
  • Deployment issues include loading errors and integration delays: Learn about deployment obstacles such as data loading, ETL issues, and integration delays. Discover proactive testing approaches for smoother development-to-production transfers.
  • Go-live security data breach prevention enhancements: Investigate implementation-related security breaches and issues. Risk reduction necessitates proactive penetration testing, security audits, encryption, authentication, and access controls.

Introduction

Deploying a data warehouse successfully is a multifaceted task that necessitates careful and precise design and execution. However, businesses frequently face typical release management challenges and errors that can have an adverse effect on data quality, system performance, and the overall viability of the project at the critical go-live phase. This article explores the most common challenges related to data warehouse release management during the go-live phase. It includes an in-depth analysis of these issues' underlying causes and practical solutions to minimize and prevent them.

What Are the Main Performance Factors of SEO?

Within the dynamic realm of digital marketing, Search Engine Optimization (SEO) is a key tactic for augmenting online presence and generating natural traffic to websites. Knowing SEO's primary performance elements is essential for firms looking to build a strong online presence. Now let's examine the crucial elements that characterize SEO success:

1. Keyword Optimization:
Targeting relevant keywords is fundamental to SEO success.
Strategic placement of keywords in content, meta tags, and headers improves search engine rankings.
Regular keyword research ensures alignment with user search behavior and evolving trends.

2. Content Quality and Relevance:
High-quality, relevant content is the cornerstone of effective SEO.
Engaging, informative content not only attracts visitors but also encourages link-building and social sharing.
Regularly updating content signals freshness to search engines, positively impacting rankings.

3. On-Page SEO:
Optimization of meta titles, descriptions, and headers improves search engine crawlability and user experience.
Proper URL structure, use of header tags, and image optimization contribute to on-page SEO effectiveness.
Mobile responsiveness is increasingly critical, as search engines prioritize mobile-friendly websites.

4. Backlink Profile:
Quality backlinks from authoritative websites enhance a site's credibility and influence search engine rankings.
Natural link-building through content marketing and outreach efforts contributes to a diverse and strong backlink profile.
Regular monitoring and removal of toxic or spammy backlinks help maintain a healthy link profile.

5. Technical SEO:
Website speed, security, and crawlability impact SEO performance.
XML sitemaps and robots.txt files facilitate efficient crawling and indexing by search engines.
Implementing structured data markup enhances the display of rich snippets in search results (a minimal example follows after this list).

6. Local SEO:
For businesses with a physical presence, local SEO is crucial.
Optimizing Google My Business profiles, acquiring local citations, and managing online reviews enhance local search visibility.
Consistent NAP (Name, Address, Phone Number) information is essential for local SEO success.
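
Here is the minimal structured data example promised above, injected as JSON-LD at runtime. The organization name, URL, and logo are placeholders; in practice, the same JSON-LD is often embedded directly in the page's HTML instead.

// Placeholder organization data; replace with real site details.
const structuredData = {
  "@context": "https://schema.org",
  "@type": "Organization",
  name: "Example Co.",
  url: "https://www.example.com",
  logo: "https://www.example.com/logo.png",
};

// Inject the JSON-LD script tag that search engines read for rich snippets.
const script = document.createElement("script");
script.type = "application/ld+json";
script.textContent = JSON.stringify(structuredData);
document.head.appendChild(script);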

In summary, SEO's primary performance variables are interrelated and necessitate a comprehensive approach. Effective SEO tactics take user behavior, market trends, and the ever-changing nature of search engine algorithms into account. Businesses may navigate the competitive digital landscape and improve their online presence for long-term success by concentrating on these essential components.

Yes, ChatGPT Got Dumb & Lazy, but 4.5 Could Be a Gamechanger

OpenAI admits that ChatGPT has become less efficient. Can version 4.5 defeat the current slump and lead us to the edge of AGI?

Last week, the AI community was stirred by a leak suggesting the soon-to-be release of ChatGPT 4.5. Sam Altman later revealed the leak to be fake. However, it's common knowledge that OpenAI is preparing for its next significant update. As complaints about ChatGPT 4's declining performance accumulate, the organization seems under pressure to make its next move. This article explores why ChatGPT got worse and why we should still be excited about the release of the LLM's next version, which might further narrow the gap between AI and AGI.

How and Why ChatGPT's Performance Has Declined

Discussions about a drop in ChatGPT's efficiency have been around for almost as long as the LLM itself, but at this point it is safe to say that ChatGPT has indeed gotten lazier and somewhat 'dumber'. On December 8, OpenAI acknowledged the decrease in performance. Users have noted undesirable behaviors such as failing to recall previously known citations, lying to get out of a task, giving contradictory answers, a dip in creativity, and hesitance in executing simple tasks or touching on anything slightly controversial or related to intellectual property rights. This has gone so far that some users are coming up with 'Karen brute-force prompts' to get ChatGPT to do its work.

The reasons for this decline include strain during peak usage times, leading to simplistic responses, slow performance, or crashes. Moreover, increasing restrictions have been placed on the model, aiming to protect rights and to prevent assistance with anything that could be potentially harmful to anyone. Then, there's also the 'winter break hypothesis,' suggesting GPT-4 has adopted a human-like tendency to relax during the holidays.

Whatever the exact reasons for ChatGPT's lazy responses and plain refusals may be, user preference is shifting towards competitors or personal LLM setups, and OpenAI appears to be under pressure to improve its service. Hence, the public release of the next upgrade might just be around the corner. Now, let's take a look at what to expect from ChatGPT 4.5.

ChatGPT 4.5: What to Expect?

It is likely that GPT-4.5 will be revealed soon. Initially, OpenAI had aimed for a release around October 2023, with version 5 planned for December. Last week's fake leak sparked speculation about the features of GPT-4.5, including audio and video creation, multi-modal capabilities, and 3D editing. While these enhancements would be impressive, some features are almost certain to be included in the next version:

  • Expanded context windows for processing larger prompts and retaining more information in conversations.
  • Improved reasoning capabilities, with training focused on increasingly complex problem-solving.
  • Inclusion of more and more recent data: the current cut-off date is April 2023, and the new version will include more up-to-date information (without 'doing research on Bing').
  • Bug fixes for improved stability and speed, especially during high-traffic periods (for those who are tired of watching a slow loading bar only to receive the answer 'Something went wrong').
  • Increased speed: potentially, once 4.5 is released, ChatGPT 4 could be as fast as version 3.5 is now.

Although the extent of improvements in the next update is unclear, these features are almost certainly expected. In addition, one might hope for fewer restrictions in ChatGPT 4.5, but that is not realistic, and further 'content moderation'/censorship is likely.

Nevertheless, 4.5 will represent a significant step forward, particularly regarding reasoning and memory. In my view, the line between AI and AGI is already thin, and it is time for us to consider how much further OpenAI and its competitors need to go before we openly classify an LLM as AGI.

When is it Reasonable to Speak of AGI?

Artificial General Intelligence (AGI) has been defined as "the representation of generalized human cognitive abilities in software." While other definitions exist, most people agree that we can speak of AGI once an AI meets human capabilities across most tasks. In contrast, one would speak of Artificial Super Intelligence (ASI) once an AI greatly outperforms humans in all tasks. In short, AGI is human-like, while ASI is God-like.

Considering that ChatGPT and other LLMs can pass a number of exams that are considered quite difficult for humans, solve really hard math problems, interpret pictures and recognize complex patterns, and participate in conversations in a human-like manner, one can make a convincing argument that, in fact, AGI is already here. Admittedly, the extent to which LLMs match human conversation skills is still open to debate, but no one can deny that extreme progress has been made just within the last year. In my opinion, once ChatGPT has a better memory (i.e., a larger context window) and gets even better at generating suitable responses (i.e., advanced reasoning), it is only fair to refer to it as AGI, and more and more people will start calling it that. And the next upgrade might just do the trick.

2024 Could Be the Year of AGI

Yes, ChatGPT can be lazy these days, but that is no reason not to be excited for what's next from OpenAI. Chances are that the already blurry borders of AGI will completely vanish in 2024, and possibly the release of the next version of ChatGPT is just what it takes to get there.

Also, let's not forget that a 2022 survey, based on the opinions of 738 AI experts, calculates a 50% chance of reaching ASI before 2059. Considering the rapid progress made in the last year, the realization of AGI might indeed be closer than we expect. Hence my guess for Time Magazine's person of the year 2024: ChatGPT or another LLM.

Preparing For Interaction To Next Paint, A New Web Core Vital

This article is sponsored by DebugBear

There’s a change coming to the Core Web Vitals lineup. If you’re reading this before March 2024 and fire up your favorite performance monitoring tool, you’re going to get a Core Web Vitals report like this one pulled from PageSpeed Insights:

You’re likely used to seeing most of these metrics. But there’s a good reason for the little blue icon sitting next to the second metric in the second row, Interaction to Next Paint (INP). It’s the newest metric of the bunch and is set to formally be a ranking factor in Google search results beginning in March 2024.

And there’s a good reason that INP sits immediately below the First Input Delay (FID) in that chart. INP will officially replace FID when it becomes an official Core Web Vital metric.

The fact that INP is already available in performance reports means we have an opportunity to familiarize ourselves with it today, in advance of its release. That’s what this article is all about. Rather than pushing off INP until after it starts influencing the way we measure site performance, let’s take a few minutes to level up our understanding of what it is and why it’s designed to replace FID. This way, you’ll not only have the information you need to read your performance reports come March 2024 but can proactively prepare your website for the change.

“I’m Not Seeing Those Metrics In My Reports”

Chances are that you’re looking at Lighthouse or some other report based on lab data. And by that, I mean data that isn’t coming from the field in the form of “real” users. You configure the test by applying some form of simulated throttling and start watching the results pour in. In other words, the data is not looking at your actual web traffic but a simulated environment that gives you an approximate view of traffic when certain conditions are in place.

I say all that because it’s important to remember that not all performance data is equal, and some metrics are simply impossible to measure with certain types of data. INP and FID happen to be a couple of metrics where lab data is unsuitable for meaningful results, and that’s because both INP and FID are measurements of user interactions. That may not have been immediately obvious by the name “First Input Delay,” but it’s clear as day when we start talking about “Interaction to Next Paint” — it’s right there in the name!

Simulated lab data, like what is used in Lighthouse reports, does not interact with the page. That means there is no way for it to evaluate the first input a user makes or any other interactions on the page.

So, that’s why you’re not seeing INP or FID in your reports. If you want these metrics, then you will want to use a performance tool that is capable of using real user data, such as DebugBear, which can monitor your actual traffic on an ongoing basis in real time, or PageSpeed Insights, which bases its findings on Google’s “Chrome User Experience Report” (commonly referred to as CrUX), though DebugBear is capable of providing CrUX reporting as well. The difference between real-time user monitoring and measuring performance against CrUX data is big enough that it’s worth reading up on, and we have a full article on Smashing Magazine that goes deeply into the differences for you.

INP Improves How Page Interactions Are Measured

OK, so we now know that both INP and FID are about page interactions. Specifically, they are about measuring the time between a user interacting with the page and the page responding to that interaction.

What’s the difference between the two metrics, then? The answer is two-fold. First, FID is a measure of the time it takes the page to start processing an interaction, or the input delay. That sounds fine on the surface — we want to know how long it takes for the page to start handling a user’s interaction and optimize it if we can. The problem, though, is that the input delay is only one part of the total time it takes the page to fully respond to an interaction.

A more complete picture considers the input delay in addition to two other components: processing time and presentation delay. In other words, we should also look at the time it takes to process the interaction and the time it takes for the page to render the UI in response. As you may have already guessed, INP considers all three delays, whereas FID considers only the input delay.
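
If you want to see those three components for yourself, the browser’s Event Timing API exposes enough information to break an interaction down. Here’s a minimal sketch; note that the entry’s duration is rounded by the browser, so the derived presentation delay is an approximation:

    // Observe interaction events and log the three components INP cares about.
    const interactionObserver = new PerformanceObserver((list) => {
      for (const entry of list.getEntries() as PerformanceEventTiming[]) {
        const inputDelay = entry.processingStart - entry.startTime;
        const processingTime = entry.processingEnd - entry.processingStart;
        // duration runs from startTime until the next paint after the handlers finish,
        // so whatever is left over approximates the presentation delay.
        const presentationDelay = entry.startTime + entry.duration - entry.processingEnd;
        console.log(entry.name, { inputDelay, processingTime, presentationDelay });
      }
    });
    interactionObserver.observe({ type: 'event', durationThreshold: 16, buffered: true });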

The second difference between INP and FID is which interactions are evaluated. FID is not shy about which interaction it measures: the very first one, as in the input delay of the first interaction on the page. We can think of INP as a more complete and accurate representation of how fast your page responds to user interactions because it looks at every single one on the page. It’s probably rare for a page to have only one interaction, and whatever interactions there are after the first interaction are likely located well down the page and happen after the page has fully loaded.

So, where FID looks at the first interaction — and only the input delay of that interaction — INP considers the entire lifecycle of all interactions.

Measuring Interaction To Next Paint

Both FID and INP are measured in milliseconds. Don’t get too worried if you notice your INP time is greater than your FID. That’s bound to happen when all of the interactions on the page are evaluated instead of the first interaction alone.

Google’s guidance is to maintain an FID under 100ms. And remember, FID does not take into account the time it takes for the event to process, nor does it consider the time it takes the page to update following the event. It only looks at the delay before the event starts being processed.

And since INP does indeed take all three of those factors into account — the input delay, processing time, and presentation delay — Google’s threshold for INP is inherently larger than FID’s: under 200ms for a “good” result and between 200ms and 500ms for a passing result. Any interaction that adds up to a delay greater than 500ms is a clear bottleneck.
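
Expressed in code, that guidance boils down to a simple mapping using the thresholds above (a rough sketch):

    // Classify an INP value (in milliseconds) against the thresholds discussed above.
    type InpRating = 'good' | 'passing' | 'bottleneck';

    function rateINP(inpMs: number): InpRating {
      if (inpMs < 200) return 'good';
      if (inpMs <= 500) return 'passing';
      return 'bottleneck';
    }

    console.log(rateINP(150)); // 'good'
    console.log(rateINP(510)); // 'bottleneck'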

The goal is to spot slow interactions and optimize them for a smoother user experience. How exactly do you identify those problems? That’s what we’re looking at next.

Identifying Slow Interactions

There’s already plenty you can do right now to optimize your site for INP before it becomes an official Core Web Vital in March 2024. Let’s walk through the process.

Of course, we’re talking about the user doing something on the page, i.e., an action such as a click or keyboard focus. That might be expanding a panel in an accordion component or perhaps triggering a modal or a prompt: any change in state where the UI updates in response.

Your page may consist of little more than content and images, making for very few, if any, interactions. It could just as well be some sort of game-based UI with thousands of interactions. Optimizing INP can be a heckuva lot of work, but it really comes down to how many interactions we’re talking about.

We’ve already talked about the difference between field data and lab data and how lab data is simply unable to measure page interactions accurately. That means you will want to rely on field data when pulling INP reports to identify bottlenecks. And when we’re talking about field data, we’re talking about two different flavors:

  1. Data from the CrUX report that is based on the results of real Chrome users. This is readily available in PageSpeed Insights and Google Search Console, not to mention DebugBear. If you use either of Google’s tools, just note that their throttling methods collect metrics on a fast connection and then estimate how fast the page would be on a slower connection. DebugBear actually tests with a slower network, resulting in more accurate data.
  2. Monitoring your website’s real-time traffic, which will require adding a snippet to your source code that sends traffic data to a service. And, yes, DebugBear is one such service, though there are others. You can even take advantage of historical CrUX data integrated with BigQuery to get a historical view of your results dating back as far as 2017 with new data coming in monthly, which isn’t exactly “real-time” monitoring of your actual traffic, but certainly useful.

You will get the most bang for your buck with real-time monitoring that keeps a historical record of data you can use to evaluate INP results over time.
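
For example, a bare-bones field-data snippet might use the open-source web-vitals library to collect metrics from real visitors and beacon them to your own endpoint. This is only a sketch; the /api/rum URL is a placeholder, and a commercial RUM service like DebugBear gives you a ready-made script instead:

    // A minimal RUM sketch using the web-vitals library (npm i web-vitals).
    import { onINP, onLCP, onCLS, onTTFB, type Metric } from 'web-vitals';

    function sendToAnalytics(metric: Metric): void {
      const body = JSON.stringify({
        name: metric.name,     // e.g. 'INP'
        value: metric.value,   // milliseconds (or a unitless score for CLS)
        rating: metric.rating, // 'good' | 'needs-improvement' | 'poor'
        page: location.pathname,
      });
      // sendBeacon survives page unloads; '/api/rum' is a hypothetical endpoint.
      navigator.sendBeacon('/api/rum', body);
    }

    onINP(sendToAnalytics);
    onLCP(sendToAnalytics);
    onCLS(sendToAnalytics);
    onTTFB(sendToAnalytics);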

That said, you can still start identifying bottlenecks today if you prefer not to dive into real-time monitoring right this second. DebugBear has a tool that analyzes any URL you throw at it. What’s great about this is that it shows you the elements that receive user interaction and provides the results right next to them. The result of the element that takes the longest is your INP result. That’s true whether you have one component above the 500ms threshold or 100 of them on the page.

The fact that DebugBear’s tool highlights all of the interactions and organizes them by INP makes identifying bottlenecks a straightforward process.

See that? There’s a clear INP offender on Smashing Magazine’s homepage, and it comes in slightly outside the healthy INP range for a score of 510ms even though the next “slowest” result is 184ms. There’s a little work we need to do between now and March to remedy that.

Notice, too, that there are actually two scores in the report: the INP Debugger Result and the Real User Google Data. The results aren’t even close! If we were to go by the Google CrUX data, we’re looking at a result that is 201ms faster than the INP Debugger’s result — a big enough difference that would result in the Smashing Magazine homepage fully passing INP.

Ultimately, what matters is how real users experience your website, and you need to look at the CrUX data to see that. The elements identified by the INP Debugger may cause slow interactions, but if users only interact with them very rarely, that might not be a priority to fix. But for a perfect user experience, you would want both results to be in the green.

Optimizing Slow Interactions

This is the ultimate objective, right? Once we have identified slow interactions — whether through a quick test with CrUX data or a real-time monitoring solution — we need to optimize them so their delays are at least under 500ms, but ideally under 200ms.

Optimizing INP comes down to CPU activity at the end of the day. But as we now know, INP measures two additional components of interactions that FID does not for a total of three components: input delay, processing time, and presentation delay. Each one is an opportunity to optimize the interaction, so let’s break them down.

Reduce The Input Delay

This is what FID is solely concerned with: the time between the user’s input, such as a click, and the moment the page starts processing that interaction.

This is where the Total Blocking Time (TBT) metric is a good one to watch because it looks at CPU activity happening on the main thread, which adds to the time it takes for the page to respond to a user’s interaction. TBT does not count toward Google’s search rankings, but FID and INP do, and both are directly influenced by TBT. So, it’s a pretty big deal.
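
As a rough illustration of how TBT relates to long tasks, the sketch below adds up the “blocking” portion of every long task it observes. (Lab tools formally only count tasks between First Contentful Paint and Time to Interactive, so treat this as an approximation.)

    // Approximate Total Blocking Time: the portion of each long task beyond 50ms.
    let totalBlockingTime = 0;

    new PerformanceObserver((list) => {
      for (const entry of list.getEntries()) {
        totalBlockingTime += Math.max(0, entry.duration - 50);
      }
      console.log(`~TBT so far: ${Math.round(totalBlockingTime)}ms`);
    }).observe({ type: 'longtask', buffered: true });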

You will want to heavily audit what tasks are running on the main thread to improve your TBT and, as a result, your INP. Specifically, you want to watch for long tasks on the main thread, which are those that take more than 50ms to execute. You can get a decent visualization of tasks on the main thread in DevTools:

The bottom line: Optimize those long tasks! There are plenty of approaches you could take depending on your app. Not all scripts are equal in the sense that one may be executing a core feature while another is simply a nice-to-have. You’ll have to ask yourself:

  • Who is the script serving?
  • When is it served?
  • Where is it served from?
  • What is it serving?

Then, depending on your answers, you have plenty of options for how to optimize your long tasks, such as deferring non-critical scripts, loading them asynchronously, or splitting the work into smaller chunks.

Or, nuke any scripts that might no longer be needed!
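
One of those options, splitting the work into smaller chunks, deserves a closer look because it directly helps the main thread stay responsive. A rough sketch of the idea: run the work in small batches and yield back to the main thread between them so pending input can be handled.

    // Run a queue of small jobs, yielding to the main thread roughly every 50ms.
    async function runChunked(jobs: Array<() => void>, budgetMs = 50): Promise<void> {
      let deadline = performance.now() + budgetMs;
      for (const job of jobs) {
        if (performance.now() >= deadline) {
          // Give the browser a chance to handle input and paint between chunks.
          // (scheduler.yield() is a newer alternative in browsers that support it.)
          await new Promise<void>((resolve) => setTimeout(resolve, 0));
          deadline = performance.now() + budgetMs;
        }
        job();
      }
    }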

Reduce Processing Time

Let’s say the user’s input triggers a heavy task, and you need to serve a bunch of JavaScript in response — heavy enough that you know a second or two is needed for the app to fully process the update.
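
A widely recommended pattern in that situation is to give the user immediate, cheap visual feedback so the next paint happens quickly, and only then run the expensive update. Here’s a rough sketch; the element IDs and the heavyUpdate and render functions are placeholders for your own app code:

    const button = document.querySelector<HTMLButtonElement>('#save')!;   // placeholder IDs
    const spinner = document.querySelector<HTMLElement>('#spinner')!;

    async function heavyUpdate(): Promise<unknown> { /* your expensive work */ return {}; }
    function render(result: unknown): void { /* your UI update */ }

    button.addEventListener('click', () => {
      // 1. Cheap, synchronous feedback: this is all the interaction has to paint.
      spinner.hidden = false;

      // 2. Defer the heavy lifting until after the browser has had a chance to paint.
      requestAnimationFrame(() => {
        setTimeout(async () => {
          const result = await heavyUpdate();
          render(result);
          spinner.hidden = true;
        }, 0);
      });
    });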

Reduce Presentation Delay

Reducing the time it takes for the presentation is really about reducing the time it takes the browser to display updates to the UI, paint styles, and do all of the calculations needed to produce the layout.

Of course, this is entirely dependent on the complexity of the page. That said, there are a few things to consider to help decrease the gap between when an interaction’s callbacks have finished running and when the browser is able to paint the resulting visual changes.

One thing is being mindful of the overall size of the DOM. The bigger the DOM, the more HTML that needs to be processed. That’s generally true, at least, even though the relationship between DOM size and rendering isn’t exactly 1:1; the browser still needs to work harder to render a larger DOM on the initial page load and when there’s a change on the page. That link will take you to a deep explanation of what contributes to the DOM size, how to measure it, and approaches for reducing it. The gist, though, is trying to maintain a flat structure (i.e., limit the levels of nested elements). Additionally, reviewing your CSS for overly complex selectors is another piece of low-hanging fruit to help move things along.

While we’re talking about CSS, you might consider looking into the content-visibility property and how it could possibly help reduce presentation delay. It comes with a lot of considerations, but if used effectively, it can provide the browser with a hint as far as which elements to defer fully rendering. The idea is that we can render an element’s layout containment but skip the paint until other resources have loaded. Chris Coyier explains how and why that happens, and there are aspects of accessibility to bear in mind.

And remember, if you’re outputting HTML from JavaScript, that JavaScript will have to load in order for the HTML to render. That’s a potential cost that comes with many single-page application frameworks.

Gain Insight On Your Real User INP Breakdown

The tools we’ve looked at so far can help you look at specific interactions, especially when testing them on your own computer. But how close is that to what your actual visitors experience?

Real user-monitoring (RUM) lets you track how responsive your website is in the real world:

  • What pages have the slowest INP?
  • What INP components have the biggest impact in real life?
  • What page elements do users interact with most often?
  • How fast is the average interaction for a given element?
  • Is our website less responsive for users in different countries?
  • Are our INP scores getting better or worse over time?

There are many RUM solutions out there, and DebugBear RUM is one of them.

DebugBear also supports the proposed Long Animation Frames API that can help you identify the source code that’s responsible for CPU tasks in the browser.
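
Because the API is still a proposal, its exact shape may change, but the current draft exposes entries through a PerformanceObserver. A tentative sketch, with property names following the proposal at the time of writing:

    // Log long animation frames and the scripts that contributed to them.
    new PerformanceObserver((list) => {
      for (const entry of list.getEntries() as any[]) {
        console.log(`Long frame: ${Math.round(entry.duration)}ms`, {
          blocking: entry.blockingDuration,
          scripts: entry.scripts?.map((s: any) => s.sourceURL || s.invoker),
        });
      }
    }).observe({ type: 'long-animation-frame', buffered: true });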

Conclusion

When Interaction to Next Paint makes its official debut as a Core Web Vital in March 2024, we’re gaining a better way to measure a page’s responsiveness to user interactions that is set to replace the First Input Delay metric.

Rather than looking at the input delay of the first interaction on the page, we get a high-definition evaluation of the least responsive component on the page — including the input delay, processing time, and presentation delay — whether it’s the first interaction or another one located way down the page. In other words, INP is a clearer and more accurate way to measure the speed of user interactions.

Will your app be ready for the change in March 2024? You now have a roadmap to help optimize your user interactions and prepare ahead of time as well as all of the tools you need, including a quick, free option from the team over at DebugBear. This is the time to get a jump on the work; otherwise, you could find yourself with unidentified interactions that exceed the 500ms threshold for a “passing” INP score that negatively impacts your search engine rankings… and user experiences.

Is POSIX Really Unsuitable for Object Stores? A Data-Backed Answer

The author of this post questions the perspective presented in a MinIO article, which suggests that POSIX is not a suitable fit for object stores. He conducted comprehensive tests involving MinIO, s3fs-fuse, and JuiceFS. Results indicate that MinIO and JuiceFS deliver excellent performance, while s3fs-fuse lags behind. In small file overwrite scenarios, JuiceFS FUSE-POSIX outperforms the other solutions.

Recently, I came across an article on the MinIO blog titled "Putting a Filesystem on Top of an Object Store is a Bad Idea. Here is why." The author used s3fs-fuse as an example to illustrate the performance challenges encountered when accessing MinIO data using Portable Operating System Interface (POSIX) methods, highlighting that the performance significantly lagged behind direct MinIO access. The author attributed these performance issues to inherent flaws in POSIX. However, our experience differs somewhat from this conclusion.

Answering Common Questions About Interpreting Page Speed Reports

This article is sponsored by DebugBear

Running a performance check on your site isn’t too terribly difficult. It may even be something you do regularly with Lighthouse in Chrome DevTools, where testing is freely available and produces a very attractive-looking report.

Lighthouse is only one performance auditing tool out of many. The convenience of having it tucked into Chrome DevTools is what makes it an easy go-to for many developers.

But do you know how Lighthouse calculates performance metrics like First Contentful Paint (FCP), Total Blocking Time (TBT), and Cumulative Layout Shift (CLS)? There’s a handy calculator linked up in the report summary that lets you adjust performance values to see how they impact the overall score. Still, there’s nothing in there to tell us about the data Lighthouse is using to evaluate metrics. The linked-up explainer provides more details, from how scores are weighted to why scores may fluctuate between test runs.

Why do we need Lighthouse at all when Google also offers similar reports in PageSpeed Insights (PSI)? The truth is that the two tools were fairly distinct until PSI was updated in 2018 to use Lighthouse reporting.

Did you notice that the Performance score in Lighthouse is different from that PSI screenshot? How can one report result in a near-perfect score while the other appears to find more reasons to lower the score? Shouldn’t they be the same if both reports rely on the same underlying tooling to generate scores?

That’s what this article is about. Different tools make different assumptions using different data, whether we are talking about Lighthouse, PageSpeed Insights, or commercial services like DebugBear. That’s what accounts for different results. But there are more specific reasons for the divergence.

Let’s dig into those by answering a set of common questions that pop up during performance audits.

What Does It Mean When PageSpeed Insights Says It Uses “Real-User Experience Data”?

This is a great question because it provides a lot of context for why it’s possible to get varying results from different performance auditing tools. In fact, when we say “real user data,” we’re really referring to two different types of data. And when discussing the two types of data, we’re actually talking about what is called real-user monitoring, or RUM for short.

Type 1: Chrome User Experience Report (CrUX)

What PSI means by “real-user experience data” is that it evaluates the performance data used to measure the core web vitals from your tests against the core web vitals data of actual real-life users. That real-life data is pulled from the Chrome User Experience (CrUX) report, a set of anonymized data collected from Chrome users — at least those who have consented to share data.

CrUX data is important because it is how core web vitals are measured, which, in turn, are a ranking factor for Google’s search results. Google focuses on the 75th percentile of users in the CrUX data when reporting core web vitals metrics. This way, the data represents a vast majority of users while minimizing the possibility of outlier experiences.

But it comes with caveats. For example, the data is pretty slow to update, refreshing every 28 days, meaning it is not the same as real-time monitoring. At the same time, if you plan on using the data yourself, you may find yourself limited to reporting within that floating 28-day range unless you make use of the CrUX History API or BigQuery to produce historical results you can measure against. CrUX is what fuels PSI and Google Search Console, but it is also available in other tools you may already use.
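
If you want to pull that CrUX data yourself, the CrUX API is a plain HTTP endpoint. Here’s a minimal sketch; supply your own API key and double-check the field names against the current API documentation:

    // Query the 75th-percentile field metrics for an origin from the CrUX API.
    const CRUX_API_KEY = 'YOUR_API_KEY'; // placeholder

    async function fetchCrux(origin: string): Promise<void> {
      const res = await fetch(
        `https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=${CRUX_API_KEY}`,
        {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ origin, formFactor: 'PHONE' }),
        }
      );
      const { record } = await res.json();
      // e.g. record.metrics.largest_contentful_paint.percentiles.p75
      console.log(record.metrics);
    }

    fetchCrux('https://www.smashingmagazine.com');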

Barry Pollard, a web performance developer advocate for Chrome, wrote an excellent primer on the CrUX Report for Smashing Magazine.

Type 2: Full Real-User Monitoring (RUM)

If CrUX offers one flavor of real-user data, then we can consider “full real-user data” to be another flavor that provides even more in the way of individual experiences, such as specific network requests made by the page. This data is distinct from CrUX because it’s collected directly by the website owner by installing an analytics snippet on their website.

Unlike CrUX data, full RUM pulls data from other users using other browsers in addition to Chrome and does so on a continual basis. That means there’s no waiting 28 days for a fresh set of data to see the impact of any changes made to a site.

You can see how you might wind up with different results in performance tests simply by the type of real-user monitoring (RUM) that is in use. Both types are useful, but you might find that CrUX-based results are better suited to a high-level view of performance than to an accurate, up-to-the-minute reflection of the users on your site because of that 28-day reporting window. That is where full RUM shines, with more immediate results and a greater depth of information.

Does Lighthouse Use RUM Data, Too?

It does not! It uses synthetic data, or what we commonly call lab data. And, just like RUM, we can explain the concept of lab data by breaking it up into two different types.

Type 1: Observed Data

Observed data is performance as the browser sees it. So, instead of monitoring real information collected from real users, observed data is more like defining the test conditions ourselves. For example, we could add throttling to the test environment to enforce an artificial condition where the test opens the page on a slower connection. You might think of it like racing a car in virtual reality, where the conditions are decided in advance, rather than racing on a live track where conditions may vary.

Type 2: Simulated Data

While we called that last type of data “observed data,” that is not an official industry term or anything. It’s more of a necessary label to help distinguish it from simulated data, which describes how Lighthouse (and many other tools that include Lighthouse in their feature sets, such as PSI) applies throttling to a test environment and the results it produces.

The reason for the distinction is that there are different ways to throttle a network for testing. Simulated throttling starts by collecting data on a fast internet connection, then estimates how quickly the page would have loaded on a different connection. The result is a much faster test than it would be to apply throttling before collecting information. Lighthouse can often grab the results and calculate its estimates faster than the time it would take to gather the information and parse it on an artificially slower connection.

Simulated And Observed Data In Lighthouse

Simulated data is the data that Lighthouse uses by default for performance reporting. It’s also what PageSpeed Insights uses since it is powered by Lighthouse under the hood, although PageSpeed Insights also relies on real-user experience data from the CrUX report.

However, it is also possible to collect observed data with Lighthouse. This data is more reliable since it doesn’t depend on an incomplete simulation of Chrome internals and the network stack. The accuracy of observed data depends on how the test environment is set up. If throttling is applied at the operating system level, then the metrics match what a real user with those network conditions would experience. DevTools throttling is easier to set up, but doesn’t accurately reflect how server connections work on the network.
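
If you run Lighthouse from Node rather than DevTools, the throttling method is just a setting, so you can compare simulated and observed runs yourself. The sketch below assumes the lighthouse and chrome-launcher npm packages keep their current API:

    import lighthouse from 'lighthouse';
    import * as chromeLauncher from 'chrome-launcher';

    async function audit(url: string, throttlingMethod: 'simulate' | 'devtools' | 'provided') {
      const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
      const result = await lighthouse(url, {
        port: chrome.port,
        onlyCategories: ['performance'],
        throttlingMethod, // 'simulate' estimates a slow connection; 'devtools' actually applies one
      });
      console.log(throttlingMethod, result?.lhr.categories.performance.score);
      await chrome.kill();
    }

    await audit('https://example.com', 'simulate');
    await audit('https://example.com', 'devtools');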

Limitations Of Lab Data

Lab data is fundamentally limited by the fact that it only looks at a single experience in a pre-defined environment. This environment often doesn’t even match the average real user on the website, who may have a faster network connection or a slower CPU. Continuous real-user monitoring can actually tell you how users are experiencing your website and whether it’s fast enough.

So why use lab data at all?

The biggest advantage of lab data is that it produces much more in-depth data than real user monitoring.

Google CrUX data only reports metric values with no debug data telling you how to improve your metrics. In contrast, lab reports contain a lot of analysis and recommendations on how to improve your page speed.

Why Is My Lighthouse LCP Score Worse Than The Real User Data?

It’s a little easier to explain different scores now that we’re familiar with the different types of data used by performance auditing tools. We now know that Google reports on the 75th percentile of real users when reporting core web vitals, which include LCP.

“By using the 75th percentile, we know that most visits to the site (3 of 4) experienced the target level of performance or better. Additionally, the 75th percentile value is less likely to be affected by outliers. Returning to our example, for a site with 100 visits, 25 of those visits would need to report large outlier samples for the value at the 75th percentile to be affected by outliers. While 25 of 100 samples being outliers is possible, it is much less likely than for the 95th percentile case.”

Brian McQuade

On the flip side, simulated data from Lighthouse neither reports on real users nor accounts for outlier experiences in the same way that CrUX does. So, if we were to set heavy throttling on the CPU or network of a test environment in Lighthouse, we’re actually embracing outlier experiences that CrUX might otherwise toss out. Because Lighthouse applies heavy throttling by default, the result is that we get a worse LCP score in Lighthouse than we do in PSI simply because Lighthouse’s data effectively looks at a slow outlier experience.

Why Is My Lighthouse CLS Score Better Than The Real User Data?

Just so we’re on the same page, Cumulative Layout Shift (CLS) measures the “visible stability” of a page layout. If you’ve ever visited a page, scrolled down it a bit before the page has fully loaded, and then noticed that your place on the page shifts when the page load is complete, then you know exactly what CLS is and how it feels.

The nuance here has to do with page interactions. We know that real users are capable of interacting with a page even before it has fully loaded. This is a big deal when measuring CLS because layout shifts often occur lower on the page after a user has scrolled down the page. CrUX data is ideal here because it’s based on real users who would do such a thing and bear the worst effects of CLS.

Lighthouse’s simulated data, meanwhile, does no such thing. It waits patiently for the full page load and never interacts with parts of the page. It doesn’t scroll, click, tap, hover, or interact in any way.

This is why you’re more likely to receive a worse CLS result in a PSI report than you’d get in Lighthouse. It’s not that PSI likes you less, but that the real users in its report are a better reflection of how people actually interact with a page and are therefore more likely to experience CLS than simulated lab data.

Why Is Interaction to Next Paint Missing In My Lighthouse Report?

This is another case where it’s helpful to know the different types of data used in different tools and how that data interacts — or not — with the page. That’s because the Interaction to Next Paint (INP) metric is all about interactions. It’s right there in the name!

The fact that Lighthouse’s simulated lab data does not interact with the page is a dealbreaker for an INP report. INP is a measure of the latency for all interactions on a given page, where the highest latency — or close to it — informs the final score. For example, if a user clicks on an accordion panel and it takes longer for the content in the panel to render than any other interaction on the page, that is what gets used to evaluate INP.

So, when INP becomes an official core web vitals metric in March 2024, and you notice that it’s not showing up in your Lighthouse report, you’ll know exactly why it isn’t there.

Note: It is possible to script user flows with Lighthouse, including in DevTools. But that probably goes too deep for this article.

Why Is My Time To First Byte Score Worse For Real Users?

The Time to First Byte (TTFB) is what immediately comes to mind for many of us when thinking about page speed performance. We’re talking about the time between establishing a server connection and receiving the first byte of data to render a page.

TTFB identifies how fast or slow a web server is to respond to requests. What makes it special in the context of core web vitals — even though it is not considered a core web vital itself — is that it precedes all other metrics. The web server needs to establish a connection in order to receive the first byte of data and render everything else that core web vitals metrics measure. TTFB is essentially an indication of how fast users can navigate, and core web vitals can’t happen without it.
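
For reference, you can read a visitor’s TTFB straight off the Navigation Timing API. A minimal sketch:

    // The navigation entry's responseStart marks the first byte of the HTML response.
    const [nav] = performance.getEntriesByType('navigation') as PerformanceNavigationTiming[];

    if (nav) {
      const ttfb = nav.responseStart - nav.startTime; // startTime is 0 for the navigation entry
      console.log(`TTFB: ${Math.round(ttfb)}ms`);
    }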

You might already see where this is going. When we start talking about server connections, there are going to be differences between the way that RUM data observes the TTFB versus how lab data approaches it. As a result, we’re bound to get different scores based on which performance tools we’re using and in which environment they are. As such, TTFB is more of a “rough guide,” as Jeremy Wagner and Barry Pollard explain:

“Websites vary in how they deliver content. A low TTFB is crucial for getting markup out to the client as soon as possible. However, if a website delivers the initial markup quickly, but that markup then requires JavaScript to populate it with meaningful content […], then achieving the lowest possible TTFB is especially important so that the client-rendering of markup can occur sooner. […] This is why the TTFB thresholds are a “rough guide” and will need to be weighed against how your site delivers its core content.”

Jeremy Wagner and Barry Pollard

So, if your TTFB score comes in higher when using a tool that relies on RUM data than the score you receive from Lighthouse’s lab data, it’s probably because of caches being hit when testing a particular page. Or perhaps the real user is coming in from a shortened URL that redirects them before connecting to the server. It’s even possible that a real user is connecting from a place that is really far from your web server, which takes a little extra time, particularly if you’re not using a CDN or running edge functions. It really depends on both the user and how you serve data.

Why Do Different Tools Report Different Core Web Vitals? What Values Are Correct?

This article has already introduced some of the nuances involved when collecting web vitals data. Different tools and data sources often report different metric values. So which ones can you trust?

When working with lab data, I suggest preferring observed data over simulated data. But you’ll see differences even between tools that all deliver high-quality data. That’s because no two tests are the same, with different test locations, CPU speeds, or Chrome versions. There’s no one right value. Instead, you can use the lab data to identify optimizations and see how your website changes over time when tested in a consistent environment.

Ultimately, what you want to look at is how real users experience your website. From an SEO standpoint, the 28-day Google CrUX data is the gold standard. However, it won’t be accurate if you’ve rolled out performance improvements over the last few weeks. Google also doesn’t report CrUX data for some low-traffic pages because the visitors may not be logged in to their Google profile.

Installing a custom RUM solution on your website can solve that issue, but the numbers won’t match CrUX exactly. That’s because visitors using browsers other than Chrome are now included, as are users with Chrome analytics reporting disabled.

Finally, while Google focuses on the fastest 75% of experiences, that doesn’t mean the 75th percentile is the correct number to look at. Even with good core web vitals, 25% of visitors may still have a slow experience on your website.

Wrapping Up

This has been a close look at how different performance tools audit and report on performance metrics, such as core web vitals. Different tools rely on different types of data that are capable of producing different results when measuring different performance metrics.

So, if you find yourself with a CLS score in Lighthouse that is far lower than what you get in PSI or DebugBear, go with the Lighthouse report because it makes you look better to the big boss. Just kidding! That difference is a big clue that the data between the two tools is uneven, and you can use that information to help diagnose and fix performance issues.

Are you looking for a tool to track lab data, Google CrUX data, and full real-user monitoring data? DebugBear helps you keep track of all three types of data in one place and optimize your page speed where it counts.

Make WordPress Sites Load Faster Than Ever With New Hummingbird Critical CSS

With Hummingbird’s much anticipated Critical CSS feature, you can expect faster-loading pages and better performing WordPress sites. Here’s why render-blocking resources are now a thing of the past…

Hummingbird Optimization - Before and After Results
Ace Google’s PageSpeed performance scores with Hummingbird’s Critical CSS feature.

If you care about page loading speed (and you should if you want visitors to stay on your website for longer than two seconds), then it’s vitally important to understand how CSS affects site performance and how to speed up your page loading time using an optimization task known as Critical CSS.

In this article, we’ll cover the following topics:

Let’s dive in…

What is Critical CSS and How Does it Improve Performance?

When users arrive on a website, all they can see initially is the content displayed on their screen before scrolling.

This area is referred to as being “above the fold.”

Image explaining above and below the fold.
All site visitors see at first is the content above the fold.

Positive user experience can be measured by how quickly users perceive content to load on a web page. The faster a page loads (or is perceived by the user as loading quickly), the better the user experience. Conversely, the slower the page loads (or is perceived by the user to load slowly), the poorer the experience.

Since all the visitor sees when they land on a page is the content above the fold before they start scrolling down, it makes sense to make the content above the fold load as quickly as possible before loading the rest of the page.

Critical CSS (also known as Critical Path CSS or Critical CSS Rendering Path) is a technique that extracts the bare minimum CSS required to render content above-the-fold as quickly as possible to the user.

While the user viewing the above-the-fold content perceives the page to be loading quickly, the rest of the CSS can load, and user experience is not impacted.
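
To make that concrete, here is a generic illustration of the idea rather than Hummingbird’s actual implementation: the critical rules are inlined in the document head, and the full stylesheet is only attached once the visitor interacts with the page. The stylesheet path and the event list are placeholders:

    // Attach the full stylesheet on the first user interaction; until then,
    // only the inlined critical CSS styles the above-the-fold content.
    const FULL_CSS_HREF = '/assets/full-styles.css'; // placeholder path
    let loaded = false;

    function loadDeferredCss(): void {
      if (loaded) return;
      loaded = true;
      const link = document.createElement('link');
      link.rel = 'stylesheet';
      link.href = FULL_CSS_HREF;
      document.head.appendChild(link);
    }

    ['pointerdown', 'keydown', 'scroll', 'touchstart'].forEach((type) =>
      window.addEventListener(type, loadDeferredCss, { once: true, passive: true })
    );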

Techniques like image lazy loading, delaying JavaScript execution, and critical CSS are all ways to optimize the sequence of steps the browser goes through to convert the HTML, CSS, and JavaScript into pixels on the screen.

This sequence is referred to as the Critical Rendering Path (CRP) and includes the Document Object Model (DOM), CSS Object Model (CSSOM), render tree, and layout.

Optimizing the critical render path improves render performance.

Advantages of Critical CSS

Critical CSS can improve site performance through:

  • Faster initial rendering
  • Improved user experience
  • Better SEO performance
  • Reduced page weight
  • Simplified maintenance
  • Progressive enhancement
  • Positive impact on Core Web Vitals (especially First Contentful Paint and Speed Index)
  • Higher PageSpeed Insights scores

Note: The content displayed above-the-fold on page-load before scrolling will differ depending on the device and screen size being used to view web pages. For this reason, there is no universally defined pixel height of what can be considered above-the-fold content.

Implementing Critical CSS

So you’ve run your site through the PageSpeed Insights tool and the report recommends eliminating render-blocking resources.

Now what? How do you actually implement the recommendations?

Well, you can try to fix things manually (tedious, time-consuming, and not recommended), use web development tools (if you have technical skills), or use a WordPress plugin like Hummingbird to automatically identify, address, and resolve any issues.

We recommend using the plugin method. It’s the quickest and smartest option to get the job done.

While Critical CSS refers mostly to above-the-fold CSS, Hummingbird can extract and inline all used CSS on the page, while delaying/removing the rest.

Hummingbird not only tackles render-blocking and unused CSS for full-page optimization, it also handles above-the-fold optimization by eliminating render-blocking resources using built-in features like Critical CSS (see below), Delay JavaScript Execution for JavaScript assets, and other areas that affect Core Web Vitals results on WordPress sites.

How To Optimize WordPress Using Hummingbird’s Critical CSS feature

Note: Critical CSS is a Pro feature, so make sure you have Hummingbird Pro installed on your site.

Let’s go through the steps on how to get the most benefit from using Hummingbird’s new critical CSS feature.

First, start by running a performance test.

Hummingbird - Start Performance Test
Start optimizing your site with Hummingbird by running a performance test.

Make sure to note the initial results so you can compare before and after results.

Hummingbird performance test results
Note down Hummingbird’s performance test results before enabling critical CSS.

Next, navigate to Hummingbird > Asset Optimization > Extra Optimization and enable Critical CSS.

Hummingbird-Asset Optimization - Extra Optimization - Critical CSS
Turn on Critical CSS in the Asset Optimization > Extra Optimization screen.

Critical CSS Options
Hummingbird gives you options to control the implementation of Critical CSS on your site.

After enabling the feature, you’ll see different options for loading Critical CSS and for handling Unused CSS.

Loading Critical CSS

This section gives you the option to select Full-Page CSS Optimization (default) or Above-the-Fold CSS Optimization.

Critical CSS
Select one of the options from the drop-down menu.

For most sites, we recommend the default Full-Page CSS Optimization method with the Load on User Interaction option selected. This provides the best results, addressing both the eliminate-render-blocking-resources and reduce-unused-CSS audits while maintaining the integrity of all the site’s visual elements.

Full-Page CSS Optimization inlines all used CSS and delays/removes loading the rest.

Choosing the Above-the-Fold CSS Optimization method is recommended for larger sites with loads of complex CSS if the default option does not give the desired results. This method will inline all above-the-fold CSS and load the rest asynchronously.

Handling Unused CSS

Hummingbird gives you the option to load the unused CSS On User Interaction, which fixes any rendering issues, or to Remove Unused, which trims unused CSS, keeping only what’s necessary and loading it inline.

Additionally, you can toggle the feature for specific post types.

Unused CSS Post Types
Select the post types to remove unused CSS.

While the post type toggles are available for both the Full-Page CSS Optimization and Above-the-Fold CSS Optimization methods, only the Full-Page CSS method handles unused CSS.

Critical CSS - Above The Fold Method option selected.
If Above-the-Fold CSS Optimization method is selected, the option to remove unused CSS does not display.

Both optimization methods also provide an advanced option to add custom CSS manually within the <head> section of the page(s).

Unused CSS - manual inclusions
Add critical custom CSS elements manually.

Note: If you have used the legacy CSS above the fold feature in earlier versions of Hummingbird to manually feed the critical path CSS, the existing data will be automatically migrated to the Manual Inclusions box when you upgrade the plugin to the latest version and switch to using the new feature.

After configuring your options, click Save Changes. Hummingbird will start implementing Critical CSS automatically as per your settings.

Critical CSS Optimizing
Wait a few seconds for Critical CSS to optimize your site before continuing.

After you see the completion message, visit your site and confirm that everything on the front end is displaying as it should.

Critical CSS Generated message.
Wait until you see the “Critical CSS Generated” message before refreshing the page.

Refresh the page, let the cache build up again, and then run another performance test in Hummingbird so you can compare the before and after results.

Hummingbird performance test results
Compare Hummingbird’s performance test results before and after running Critical CSS.

Regenerate Critical CSS

After applying Critical CSS on your site, a “Regenerate Critical CSS” button will display at the top of the Extra Optimization screen.

Click on this button to purge the cache, clear all local or hosted assets, and automatically regenerate all required assets for your site or homepage.

Regenerate Critical CSS
Regenerate your site’s Critical CSS at any time with a simple click.

Hummingbird’s Critical CSS is Compatible with Everything WordPress

We have tested Hummingbird’s Critical CSS feature extensively and found it to be compatible with all WordPress versions and themes, page builders, fonts, WooCommerce, Learning Management Systems (LMS), etc.

It’s important to note, however, that installing poorly-coded themes or plugins containing CSS with invalid code or invalid strings on your site could cause issues and result in a Critical CSS error message.

Critical CSS error message.
Using poorly-coded themes or plugins can lead to Critical CSS errors.

If you do experience errors using Critical CSS, try the following:

  1. Click on the “Regenerate Critical CSS” button and see if this fixes the issue.
  2. If you get the same error again, we suggest changing the theme (use a staging site if your site is live) and running Critical CSS on the new theme. If there are no problems, then the issue is most likely the theme.
  3. If you experience issues after installing a different theme, we recommend troubleshooting your plugins.
  4. If the error still persists after trying all of the above, note the error message, disable Critical CSS temporarily on your site, and contact our support team for assistance fixing the issue.

You can rest assured, however, as Hummingbird’s Critical CSS feature has been designed with a focus on preserving your site’s visual integrity while boosting performance. The feature handles errors gracefully and will rarely break a site, even when errors do occur.

For additional information on using the Critical CSS feature, refer to the plugin documentation.

Switch On All Of Hummingbird’s Optimization Features For Best Results

If getting maximum speed and performance out of your WordPress site(s) is critically important to you, using Hummingbird’s Critical CSS is definitely a feature you shouldn’t ignore.

Hummingbird report - passed audits.
Optimize site performance and ace Google’s PageSpeed recommendations with Hummingbird’s Critical CSS feature.

For best performance and savings, we recommend using Critical CSS with page caching and all of the asset optimization features the plugin makes available, including CDN, and Delay JavaScript Execution.

Hummingbird - Asset Optimization
For best results, enable all of Hummingbird’s asset optimization features.

In most cases, combining all of Hummingbird’s optimization features should help your site achieve PageSpeed scores of 90+ or bring it closer to a perfect 100 if your site is already performing well.

Hummingbird-100 Score PageInsights
Use all of Hummingbird’s optimization features to get the perfect performance score!

As mentioned earlier, Critical CSS is a Hummingbird Pro feature, and it’s available to all WPMU DEV members.

If you are currently using our free Hummingbird plugin, consider becoming a member for affordable and risk-free access to our all-in-one WordPress platform. It has everything you need to launch, run, and grow your web development business.

And if you are an Agency member, you can even white label and resell Hummingbird (plus hosting, domains, our entire suite of PRO plugins, and more) all under your own brand.