Smart Network Onboarding: Revolutionizing Connectivity With AI and Automation

Network onboarding, the process through which new devices gain access to an organization's network, is a cornerstone of IT operations, affecting everything from security to user satisfaction. Traditionally, this process has been fraught with challenges, particularly at scale. In environments with hundreds or thousands of devices, manual onboarding can consume disproportionate amounts of time and resources. According to a study by Cisco, IT teams spend approximately 20% of their time managing device connectivity issues, highlighting the operational burden of current practices.

Moreover, the scale of the problem becomes more apparent considering the proliferation of IoT devices. Gartner reports that by 2025, over 75 billion connected IoT devices will be in use worldwide. Each device, from the simplest sensor to complex industrial machinery, requires precise and secure network integration, a task that becomes exponentially more difficult as network environments grow in complexity and scale.

Mastering BGP Neighborship: Effective Troubleshooting Strategies

Border Gateway Protocol (BGP) is the cornerstone of the internet's routing architecture, enabling data exchange between different autonomous systems (ASes) and ensuring seamless communication across diverse networks. However, the complexity of BGP can make troubleshooting a daunting task, even for experienced network engineers. Whether you're dealing with connectivity issues, session establishment problems, or routing anomalies, a systematic approach to BGP troubleshooting is essential for maintaining network stability and performance.

Topology

The two routers have been configured with EBGP, but we see that the neighborship did not establish.

The Importance of Code Profiling in Performance Engineering

When we discuss code profiling with a team of developers, they often say, "We don't have time to profile our code; that's why we have performance testers," or, "If the application or system runs very slowly, the developers and performance testers can suggest that the infra team simply add another server to the server farm."

Developers usually see code profiling as additional work and a challenging process. Project teams typically turn to performance and memory profiling only when a serious performance problem appears in production. Without knowledge and experience of how to profile, and of how various profilers work with different profiling types, many of us will fail to identify and address performance problems. As 70 to 80 percent of performance problems are due to inefficient code, it is recommended to use code profiling tools to measure and analyze performance degradations at the early stages of development. This helps developers and performance engineers find and fix performance issues early, which can make a big difference overall, especially if all the developers test and profile their code as soon as they write it.

How To Hack Your Google Lighthouse Scores In 2024

This article is sponsored by Sentry.io

Google Lighthouse has been one of the most effective ways to gamify and promote web page performance among developers. Using Lighthouse, we can assess web pages based on overall performance, accessibility, SEO, and what Google considers “best practices”, all with the click of a button.

We might use these tests to evaluate out-of-the-box performance for front-end frameworks or to celebrate performance improvements gained by some diligent refactoring. And you know you love sharing screenshots of your perfect Lighthouse scores on social media. It’s a well-deserved badge of honor worthy of a confetti celebration.

Just the fact that Lighthouse gets developers like us talking about performance is a win. But, whilst I don’t want to be a party pooper, the truth is that web performance is far more nuanced than this. In this article, we’ll examine how Google Lighthouse calculates its performance scores, and, using this information, we will attempt to “hack” those scores in our favor, all in the name of fun and science — because in the end, Lighthouse is simply a good, but rough guide for debugging performance. We’ll have some fun with it and see to what extent we can “trick” Lighthouse into handing out better scores than we may deserve.

But first, let’s talk about data.

Field Data Is Important

Local performance testing is a great way to understand if your website performance is trending in the right direction, but it won’t paint a full picture of reality. The World Wide Web is the Wild West, and collectively, we’ve almost certainly lost track of the variety of device types, internet connection speeds, screen sizes, browsers, and browser versions that people are using to access websites — all of which can have an impact on page performance and user experience.

Field data — and lots of it — collected by an application performance monitoring tool like Sentry from real people using your website on their devices will give you a far more accurate report of your website performance than your lab data collected from a small sample size using a high-spec super-powered dev machine under a set of controlled conditions. Philip Walton reported in 2021 that “almost half of all pages that scored 100 on Lighthouse didn’t meet the recommended Core Web Vitals thresholds” based on data from the HTTP Archive.

Web performance is more than a single core web vital metric or Lighthouse performance score. What we’re talking about goes way beyond the type of raw data we’re working with.

Web Performance Is More Than Numbers

Speed is often the first thing that comes up when talking about web performance: just how long does a page take to load? This isn’t the worst thing to measure, but we must bear in mind that speed is probably influenced heavily by business KPIs and sales targets. Google released a report in 2018 suggesting that the probability of a bounce increases by 32% as page load time goes from one second to three seconds, and by 123% as it goes from one second to 10 seconds. So, we must conclude that converting more sales requires reducing bounce rates. And to reduce bounce rates, we must make our pages load faster.

But what does “load faster” even mean? At some point, we’re physically incapable of making a web page load any faster. Humans — and the servers that connect them — are spread around the globe, and modern internet infrastructure can only deliver so many bytes at a time.

The bottom line is that page load is not a single moment in time. In an article titled “What is speed?” Google explains that a page load event is:

[…] “an experience that no single metric can fully capture. There are multiple moments during the load experience that can affect whether a user perceives it as ‘fast’, and if you just focus solely on one, you might miss bad experiences that happen during the rest of the time.”

The key word here is experience. Real web performance is less about numbers and speed than it is about how we experience page load and page usability as users. And this segues nicely into a discussion of how Google Lighthouse calculates performance scores. (It’s much less about pure speed than you might think.)

How Google Lighthouse Performance Scores Are Calculated

The Google Lighthouse performance score is calculated using a weighted combination of scores based on core web vital metrics (i.e., First Contentful Paint (FCP), Largest Contentful Paint (LCP), Cumulative Layout Shift (CLS)) and other speed-related metrics (i.e., Speed Index (SI) and Total Blocking Time (TBT)) that are observable throughout the page load timeline.

This is how the metrics are weighted in the overall score:

Metric | Weighting (%)
Total Blocking Time | 30
Cumulative Layout Shift | 25
Largest Contentful Paint | 25
First Contentful Paint | 10
Speed Index | 10
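As a rough sketch of how these weightings combine into one number (this is not Lighthouse's source code), the Python below takes per-metric scores that have already been normalized to the range 0 to 1 and applies the weights from the table. The function name and dictionary keys are illustrative assumptions.

```python
# Weights from the table above, expressed as fractions of the overall score.
WEIGHTS = {
    "TBT": 0.30,
    "CLS": 0.25,
    "LCP": 0.25,
    "FCP": 0.10,
    "SI": 0.10,
}


def overall_score(metric_scores: dict[str, float]) -> int:
    """Combine per-metric scores (each 0.0 to 1.0) into a 0 to 100 score."""
    missing = WEIGHTS.keys() - metric_scores.keys()
    if missing:
        raise ValueError(f"missing metric scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * metric_scores[name] for name in WEIGHTS)
    return round(total * 100)
```

For instance, a page with perfect scores everywhere except a zero for Total Blocking Time cannot exceed 70 overall, whereas a zero for First Contentful Paint alone still leaves 90.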

The weighting assigned to each score gives us insight into how Google prioritizes the different building blocks of a good user experience:

1. A Web Page Should Respond to User Input

The highest weighted metric is Total Blocking Time (TBT), a metric that looks at the total time after the First Contentful Paint (FCP) to help indicate where the main thread may be blocked long enough to prevent speedy responses to user input. The main thread is considered “blocked” any time there’s a JavaScript task running on the main thread for more than 50ms. Minimizing TBT ensures that a web page responds to physical user input (e.g., key presses, mouse clicks, and so on).
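To make the 50ms rule concrete, here is a minimal Python sketch of the arithmetic. It assumes we already have a list of main-thread task durations recorded after FCP (how those are collected is out of scope), and it counts only the portion of each task beyond the 50ms threshold as blocking time. The function name is hypothetical.

```python
BLOCKING_THRESHOLD_MS = 50  # a task is "blocking" beyond this duration


def total_blocking_time(task_durations_ms: list[float]) -> float:
    """Sum the portion of each main-thread task that exceeds 50 ms."""
    return sum(
        duration - BLOCKING_THRESHOLD_MS
        for duration in task_durations_ms
        if duration > BLOCKING_THRESHOLD_MS
    )
```

With tasks of 30, 120, 50, and 250 milliseconds, only the second and fourth count, contributing 70 and 200 milliseconds respectively.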

2. A Web Page Should Load Useful Content With No Unexpected Visual Shifts

The next most weighted Lighthouse metrics are Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS). LCP marks the point in the page load timeline when the page’s main content has likely loaded and is therefore useful.

At the point where the main content has likely loaded, you also want to maintain visual stability to ensure that users can use the page and are not affected by unexpected visual shifts (CLS). A good LCP score is anything less than 2.5 seconds (which is a lot higher than we might have thought, given we are often trying to make our websites as fast as possible).

3. A Web Page Should Load Something

The First Contentful Paint (FCP) metric marks the first point in the page load timeline where the user can see something on the screen, and the Speed Index (SI) measures how quickly content is visually displayed during page load over time until the page is “complete”.

Your page is scored based on the speed indices of real websites using performance data from the HTTP Archive. A good FCP score is less than 1.8 seconds and a good SI score is less than 3.4 seconds. Both of these thresholds are higher than you might expect when thinking about speed.

Usability Is Favored Over Raw Speed

Google Lighthouse’s performance scoring is, without a doubt, less about speed and more about usability. Your SI and FCP could be super quick, but if your LCP takes too long to paint, and if CLS is caused by large images or external content taking some time to load and shifting things visually, then your overall performance score will be lower than if your page was a little slower to render the FCP but didn’t cause any CLS. Ultimately, if the page is unresponsive due to JavaScript blocking the main thread for more than 50ms, your performance score will suffer more than if the page was a little slow to paint the FCP.

To understand more about how the weightings of each metric contribute to the final performance score, you can play about with the sliders on the Lighthouse Scoring Calculator. Here’s a rudimentary table demonstrating the effect of skewed individual metric weightings on the overall performance score, showing that page usability and responsiveness are favored over raw speed.

Description | FCP (ms) | SI (ms) | LCP (ms) | TBT (ms) | CLS | Overall Score
Slow to show something on screen | 6000 | 0 | 0 | 0 | 0 | 90
Slow to load content over time | 0 | 5000 | 0 | 0 | 0 | 90
Slow to load the largest part of the page | 0 | 0 | 6000 | 0 | 0 | 76
Visual shifts occurring during page load | 0 | 0 | 0 | 0 | 0.82 | 76
Page is unresponsive to user input | 0 | 0 | 0 | 2000 | 0 | 70

The overall Google Lighthouse performance score is calculated by converting each raw metric value into a score from 0 to 100 according to where it falls on its Lighthouse scoring distribution, which is a log-normal distribution derived from the performance metrics of real website performance data from the HTTP Archive. There are two main takeaways from this mathematically overloaded information:

  1. Your Lighthouse performance score is plotted against real website performance data, not in isolation.
  2. Given that the scoring uses log-normal distribution, the relationship between the individual metric values and the overall score is non-linear, meaning you can make substantial improvements to low-performance scores quite easily, but it becomes more difficult to improve an already high score.
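The non-linear relationship can be sketched in Python. This is an approximation under stated assumptions, not Lighthouse's implementation: it assumes the curve is pinned by two control points, a value that scores 0.9 (called p10 here) and a value that scores 0.5 (the median). The control-point values used in the usage note are illustrative only.

```python
import math
from statistics import NormalDist

# z-value at which a standard normal CDF reaches 0.9.
Z_90 = NormalDist().inv_cdf(0.9)


def log_normal_score(value: float, p10: float, median: float) -> float:
    """Map a raw metric value to a 0..1 score on a log-normal curve.

    The curve is pinned so that `p10` scores 0.9 and `median` scores 0.5.
    """
    if not 0 < p10 < median:
        raise ValueError("expected 0 < p10 < median")
    if value <= 0:
        return 1.0
    # Spread of the curve in log space, derived from the two control points.
    sigma = math.log(median / p10) / Z_90
    z = math.log(value / median) / sigma
    return 1.0 - NormalDist().cdf(z)
```

Calling log_normal_score(2500, p10=2500, median=4000) returns roughly 0.9 by construction. Feeding in a range of values shows the curve flattening at both ends, which is why the last few points of a high score are the hardest to earn.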

Read more about how metric scores are determined, including a visualization of the log-normal distribution curve on developer.chrome.com.

Can We “Trick” Google Lighthouse?

I appreciate Google’s focus on usability over pure speed in the web performance conversation. It urges developers to think less about aiming for raw numbers and more about the real experiences we build. That being said, I’ve wondered whether today in 2024, it’s possible to fool Google Lighthouse into believing that a bad page in terms of usability and usefulness is actually a great one.

I put on my lab coat and science goggles to investigate. All tests were conducted:

  • Using the Chromium Lighthouse plugin,
  • In an incognito window in the Arc browser,
  • Using the “navigation” and “mobile” settings (apart from where described differently),
  • By me, in a lab (i.e., no field data).

All that being said, I fully acknowledge that my controlled test environment contradicts my advice at the top of this post, but the experiment is an interesting ride nonetheless. What I hope you’ll take away from this is that Lighthouse scores are only one piece (and a tiny one at that) of a very large and complex web performance puzzle. And, without field data, I’m not sure any of this matters anyway.

How to Hack FCP and LCP Scores

TL;DR: Show the smallest amount of LCP-qualifying content on load to boost the FCP and LCP scores until the Lighthouse test has likely finished.

FCP marks the first point in the page load timeline where the user can see anything at all on the screen, while LCP marks the point in the page load timeline when the main page content (i.e., the largest text or image element) has likely loaded. A fast LCP helps reassure the user that the page is useful. “Likely” and “useful” are the important words to bear in mind here.

What Counts as an LCP Element

The types of elements on a web page considered by Lighthouse for LCP are:

  • <img> elements,
  • <image> elements inside an <svg> element,
  • <video> elements,
  • Elements with a background image loaded using the url() function (and not a CSS gradient), and
  • Block-level elements containing text nodes or other inline-level text elements.

The following elements are excluded from LCP consideration due to the likelihood they do not contain useful content:

  • Elements with zero opacity (invisible to the user),
  • Elements that cover the full viewport (likely to be background elements), and
  • Placeholder images or other images with low entropy (i.e., low informational content, such as a solid-colored image).
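The inclusion and exclusion rules above can be modeled with a deliberately simplified Python sketch. The Element class and its fields are hypothetical stand-ins for what a browser knows about rendered elements, and the selection logic (largest eligible element wins) is an approximation rather than Lighthouse's actual algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical, simplified description of a rendered element."""
    name: str
    area_px: int               # visible size in the viewport
    opacity: float = 1.0
    covers_viewport: bool = False
    low_entropy: bool = False  # e.g. a placeholder or solid-colored image


def pick_lcp_candidate(elements: list[Element]) -> Optional[Element]:
    """Return the largest eligible element, or None if nothing qualifies."""
    eligible = [
        el for el in elements
        if el.opacity > 0 and not el.covers_viewport and not el.low_entropy
    ]
    return max(eligible, key=lambda el: el.area_px, default=None)
```

Nothing in this model asks whether the winning element is meaningful to a person, which is the gap the experiments below exploit.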

However, the notion of an image or text element being useful is completely subjective in this case and generally out of the realm of what code can reliably determine. For example, I built a page containing nothing but an <h1> element where, after 10 seconds, JavaScript inserts more descriptive text into the DOM and hides the <h1> element.

Lighthouse considers the heading element to be the LCP element in this experiment. At this point, the page load timeline has finished, but the page’s main content has not loaded, even though Lighthouse thinks it is likely to have loaded within those 10 seconds. Lighthouse still awards us with a perfect score of 100 even if the heading is replaced by a single punctuation mark, such as a full stop, which is even less useful.

This test suggests that if you need to load page content via client-side JavaScript, you’ll want to avoid displaying a skeleton loader screen since that requires loading more elements on the page. And since we know the process will take some time (and that we can offload the network request from the main thread to a web worker so it won’t affect the TBT), we can use some arbitrary “splash screen” that contains a minimal viable LCP element (for better FCP scoring). This way, we’re giving Lighthouse the impression that the page is useful to users quicker than it actually is.

All we need to do is include a valid LCP element that contains something that counts as the FCP. I would never recommend loading your main page content via client-side JavaScript in 2024 (serve static HTML from a CDN instead or build as much of the page as you can on a server), and I would definitely not recommend this “hack” for a good user experience, regardless of what the Lighthouse performance score tells you. This approach also won’t earn you any favors with search engines indexing your site, as the robots are unable to discover the main content while it is absent from the DOM.

I also tried this experiment with a variety of random images representing the LCP to make the page even less useful. But given that I used small file sizes — made smaller and converted into “next-gen” image formats using a third-party image API to help with page load speed — it seemed that Lighthouse interpreted the elements as “placeholder images” or images with “low entropy”. As a result, those images were disqualified as LCP elements, which is a good thing and makes the LCP slightly less hackable.

View the demo page and use Chromium DevTools in an incognito window to see the results yourself.

This hack, however, probably won’t hold up in many other use cases. Discord, for example, uses the “splash screen” approach when you hard-refresh the app in the browser, and it receives a sad 29 performance score.

Compared to my DOM-injected demo, the LCP element was calculated as some content behind the splash screen rather than elements contained within the splash screen content itself, given there were one or more large images in the focussed text channel I tested on. One could argue that Lighthouse scores are less important for apps that are behind authentication anyway: they don’t need to be indexed by search engines.

There are likely many other situations where apps serve user-generated content and you might be unable to control the LCP element entirely, particularly regarding images.

Conversely, if you can control the sizes of all the images on your web pages, you might be able to take advantage of an interesting hack or “optimization” (in very large quotes) to arbitrarily game the system, as was the case of RentPath. In 2021, developers at RentPath managed to improve their Lighthouse performance score by 17 points by increasing the size of image thumbnails on a web page. They convinced Lighthouse to calculate the LCP element as one of the larger thumbnails instead of a Google Map tile on the page, which takes considerably longer to load via JavaScript.

The bottom line is that you can gain higher Lighthouse performance scores if you are aware of your LCP element and in control of it, whether that’s through a hack like RentPath’s or mine or a real-deal improvement. That being said, whilst I’ve described the splash screen approach as a hack in this post, that doesn’t mean this type of experience couldn’t offer a purposeful and joyful experience. Performance and user experience are about understanding what’s happening during page load, and it’s also about intent.

How to Hack CLS Scores

TL;DR: Defer loading content that causes layout shifts until the Lighthouse test has likely finished to make the test think it has enough data. CSS transforms do not negatively impact CLS, except if used in conjunction with new elements added to the DOM.

CLS is measured on a decimal scale; a good score is less than 0.1, and a poor score is greater than 0.25. Lighthouse calculates CLS from the largest burst of unexpected layout shifts that occur during a user’s time on the page based on a combination of the viewport size and the movement of unstable elements in the viewport between two rendered frames. Smaller one-off instances of layout shift may be inconsequential, but a bunch of layout shifts happening one after the other will negatively impact your score.
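The “largest burst” idea can be sketched in Python. The sketch assumes the session-window rule described in Google's CLS documentation (shifts less than one second apart belong to the same burst, and a burst spans at most five seconds), which this article does not spell out. The input format of (timestamp, shift score) pairs and the function name are hypothetical.

```python
MAX_GAP_MS = 1000       # a gap this long or longer starts a new burst
MAX_WINDOW_MS = 5000    # a burst is capped at this total duration


def cumulative_layout_shift(shifts: list[tuple[float, float]]) -> float:
    """Return the largest burst total from (timestamp_ms, shift_score) pairs."""
    worst = 0.0
    window_total = 0.0
    window_start = None
    previous_time = None
    for time_ms, score in sorted(shifts):
        starts_new_window = (
            window_start is None
            or time_ms - previous_time >= MAX_GAP_MS
            or time_ms - window_start >= MAX_WINDOW_MS
        )
        if starts_new_window:
            window_start = time_ms
            window_total = 0.0
        window_total += score
        previous_time = time_ms
        worst = max(worst, window_total)
    return worst
```

Three small shifts in quick succession add up within one burst, while the same shifts spread several seconds apart would each be counted on their own.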

If you know your page contains annoying layout shifts on load, you can defer them until after the page load event has been completed, thus fooling Lighthouse into thinking there is no CLS. This demo page I created, for example, earns a CLS score of 0.143 even though JavaScript immediately starts adding new text elements to the page, shifting the original content up. If we delay the JavaScript that adds new nodes to the DOM by an arbitrary five seconds with a setTimeout(), Lighthouse doesn’t capture the CLS that takes place.

This other demo page earns a performance score of 100, even though it is arguably less useful and usable than the last page given that the added elements pop in seemingly at random without any user interaction.

Whilst it is possible to defer layout shift events for a page load test, this hack definitely won’t work for field data and user experience over time (which is a more important focal point, as we discussed earlier). If we perform a “time span” test in Lighthouse on the page with deferred layout shifts, Lighthouse will correctly report a non-green CLS score of around 0.186.

If you do want to intentionally create a chaotic experience similar to the demo, you can use CSS animations and transforms to more purposefully pop the content into view on the page. In Google’s guide to CLS, they state that “content that moves gradually and naturally from one position to another can often help the user better understand what’s going on and guide them between state changes” — again, highlighting the importance of user experience in context.

On this next demo page, I’m using CSS transform to scale() the text elements from 0 to 1 and move them around the page. The transforms fail to trigger CLS because the text nodes are already in the DOM when the page loads. That said, I did observe in my testing that if the text nodes are added to the DOM programmatically after the page loads via JavaScript and then animated, Lighthouse will indeed detect CLS and score things accordingly.

You Can’t Hack a Speed Index Score

The Speed Index score is based on the visual progress of the page as it loads. The quicker your content loads nearer the beginning of the page load timeline, the better.
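One way to see why early content matters is to compute the figure by hand. The Python sketch below assumes the common definition of Speed Index as the area above the visual-completeness curve, which this article does not state explicitly. The input is a hypothetical list of (timestamp, completeness) samples, and the page is treated as blank before the first sample.

```python
def speed_index(frames: list[tuple[float, float]]) -> float:
    """Integrate (1 - visual completeness) over time.

    `frames` holds (timestamp_ms, completeness) samples, completeness in 0..1;
    the page is assumed blank before the first sample and to hold each level
    until the next sample.
    """
    total = 0.0
    previous_time = 0.0
    previous_completeness = 0.0
    for time_ms, completeness in sorted(frames):
        # Time spent at the previous level, weighted by how incomplete it was.
        total += (time_ms - previous_time) * (1.0 - previous_completeness)
        previous_time, previous_completeness = time_ms, completeness
    return total
```

Two pages that both finish at two seconds can differ widely: one that paints most of its content at 500ms accumulates far less “incomplete” time than one that paints the same content at 1500ms.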

It is possible to trick the Speed Index into thinking a page load timeline is slower than it is. Conversely, there’s no real way to “fake” loading content faster than it does. The only way to make your Speed Index score better is to optimize your web page for loading as much of the page as possible, as soon as possible. Whilst not entirely realistic in the web landscape of 2024 (mainly because it would put designers out of a job), you could go all-in to lower your Speed Index as much as possible by:

  • Delivering static HTML web pages only (no server-side rendering) straight from a CDN,
  • Avoiding images on the page,
  • Minimizing or eliminating CSS, and
  • Preventing JavaScript or any external dependencies from loading.

You Also Can’t (Really) Hack A TBT Score

TBT measures the total time after the FCP where the main thread was blocked by JavaScript tasks for long enough to prevent responses to user input. A good TBT score is anything lower than 200ms.

JavaScript-heavy web applications (such as single-page applications) that perform complex state calculations and DOM manipulation on the client on page load (rather than on the server before sending rendered HTML) are prone to suffering poor TBT scores. In this case, you could probably hack your TBT score by deferring all JavaScript until after the Lighthouse test has finished. That said, you’d need to provide some kind of placeholder content or loading screen to satisfy the FCP and LCP and to inform users that something will happen at some point. Plus, you’d have to go to extra lengths to hack around the front-end framework you’re using. (You don’t want to load a placeholder page that, at some point in the page load timeline, loads a separate React app after an arbitrary amount of time!)

What’s interesting is that while we’re still doing all sorts of fancy things with JavaScript in the client, advances in the modern web ecosystem are helping us all reduce the probability of a less-than-stellar TBT score. Many front-end frameworks, in partnership with modern hosting providers, are capable of rendering pages and processing complex logic on demand without any client-side JavaScript. While eliminating JavaScript on the client is not the goal, we certainly have a lot of options to use a lot less of it, thus minimizing the risk of doing too much computation on the main thread on page load.

Bottom Line: Lighthouse Is Still Just A Rough Guide

Google Lighthouse can’t detect everything that’s wrong with a particular website. Whilst Lighthouse’s performance scoring prioritizes page usability in terms of responding to user input, it still can’t detect every terrible usability or accessibility issue in 2024.

In 2019, Manuel Matuzović published an experiment where he intentionally created a terrible page that Lighthouse thought was pretty great. I hypothesized that five years later, Lighthouse might do better, but it doesn’t.

On this final demo page I put together, input events are disabled by CSS and JavaScript, making the page technically unresponsive to user input. After five seconds, JavaScript flips a switch and allows you to click the button. The page still scores 100 for both performance and accessibility.

You really can’t rely on Lighthouse as a substitute for usability testing and common sense.

Some More Silly Hacks

As with everything in life, there’s always a way to game the system. Here are some more tried and tested guaranteed hacks to make sure your Lighthouse performance score artificially knocks everyone else’s out of the park:

  • Only run Lighthouse tests using the fastest and highest-spec hardware.
  • Make sure your internet connection is the fastest it can be; relocate if you need to.
  • Never use field data, only lab data, collected using the aforementioned fastest and highest-spec hardware and super-speed internet connection.
  • Rerun the tests in the lab using different conditions and all the special code hacks I described in this post until you get the result(s) you want to impress your friends, colleagues, and random people on the internet.

Note: The best way to learn about web performance and how to optimize your websites is to do the complete opposite of everything we’ve covered in this article all of the time. And finally, to seriously level up your performance skills, use an application monitoring tool like Sentry. Think of Lighthouse as the canary and Sentry as the real-deal production-data-capturing, lean, mean, web vitals machine.

And finally-finally, here’s the link to the full demo site for educational purposes.

Injecting Chaos: Easy Techniques for Simulating Network Issues in Redis Clusters

While comprehensive chaos testing tools offer a wide range of features, sometimes you just need a quick and easy solution for a specific scenario. This article focuses on a targeted approach: simulating network issues between Redis client and Redis Cluster in simple steps. These methods are ideal when you don't require a complex setup and want to focus on testing a particular aspect of your Redis cluster's behavior under simulated network issues.

Set-Up

This article assumes that you already have a Redis cluster and the client code for sending traffic to the cluster is set up and ready to use. If not, you can refer to the following steps:

Revolutionizing API Security in the Era of GenAI: An Architect’s Perspective

As a seasoned security architect, I have witnessed the transformative impact of AI and ML on the software development landscape, particularly in the context of API security. The advent of GenAI, with its ability to rapidly generate code and entire applications, has unlocked unprecedented opportunities for innovation. However, this acceleration in development speed has also introduced complex security challenges that demand advanced solutions.

In the past, I have encountered numerous real-world scenarios where inadequate API security measures led to devastating consequences. In one notable incident, a major retailer suffered a significant data breach due to an insecure API that allowed unauthorized access to sensitive customer information. The root cause was traced back to a combination of factors, including improper authentication mechanisms, lack of input validation, and insufficient monitoring and logging.

How to Fully Validate URLs in Java

Uniform Resource Locators (URLs) function as the address of unique resources on the internet. Entering a website URL into our browser retrieves the HTML/CSS files required to construct the page we’re visiting, and making API calls against an endpoint URL allows us to remotely access and/or modify important data — the list goes on. URLs effectively facilitate the interconnectivity we take for granted on the internet today.

When we capture URL string inputs in our web applications, it’s critical that we validate those inputs to ensure the URLs are useful. Retrieving and storing any form of address data (whether that's a URL address, an IP address, or even a physical street address) without immediately validating its utility is a waste of time; it’ll leave us empty-handed when we attempt to access important resources in the future.

The Meta-Retrospective

The Meta-Retrospective is an excellent exercise to foster collaboration within the extended team, create a shared understanding of the big picture, and immediately create valuable action items. It comprises team members of one or several product teams (or representatives from those) and stakeholders. Participants from the stakeholder side are people from the business as well as customers. Meta-retrospectives are useful both as a regular event, say once a quarter, and after achieving a particular milestone, for example, a specific release of the product.

The Benefits of Meta-Retrospectives

Your stakeholders are your allies, not an impediment! 

A Single API for All Your Conversational Generative AI Applications

You can now use the Converse API in Amazon Bedrock to create conversational applications like chatbots and support assistants. It is a consistent, unified API that works with all Amazon Bedrock models that support messages. The benefit is that you have a single code base (application) and use it with different models — this makes it preferable to use the Converse API over InvokeModel (or InvokeModelWithResponseStream) APIs.

I will walk you through how to use this API with the AWS SDK for Go v2.

KNIME’s Path To Empowering Developers in the Evolving Data Science Landscape

In the rapidly evolving world of data science, companies are constantly seeking tools and platforms that can help them harness the power of their data. KNIME, an open-source data science platform, has been at the forefront of this revolution, providing a comprehensive environment for data preparation, machine learning, and analysis. I recently had the opportunity to catch up with Michael Berthold, Founder and CEO of KNIME, at the Snowflake Data Cloud Summit, where we discussed the company's journey over the past five years and its vision for empowering developers, engineers, and architects in the data science landscape.

Evolving With the Times

Over the past five years, KNIME has undergone significant changes to stay ahead of the curve. "We completely changed both of our technologies," Berthold revealed. The analytics platform is now browser-ready, and the KNIME server has been replaced with a cloud-native business hub. The company also recently launched a SaaS offering, allowing users to access KNIME's powerful features without the need for on-premises installation.

GitHub Copilot Tutorial

This article describes the GitHub Copilot tool and the main guidelines and assumptions regarding its use in software development projects. The guidelines concern both the tool’s configuration and its application in everyday work and assume the reader will use GitHub Copilot with IntelliJ IDEA (via a dedicated plugin).

GitHub Copilot: What Is It?

GitHub Copilot is an AI developer assistant that uses a generative AI model trained on all programming languages available in GitHub repositories. The full description and documentation of the tool are available here.