Your Codebase Is a Cluttered Garage

Unused code adds time and burden to maintaining a codebase, and removing it is the only cure for this side of “more cowbell.” Unfortunately, it’s not always obvious whether developers can remove a given piece of code without breaking the application. As the codebase grows cluttered and unwieldy, development teams become mired in mystery code that slows development and lowers morale.

Do you remember the first time you walked into your garage, empty and sparkling, yawning with the promise of protecting your vehicles and power tools? How did it look the last time you walked in? If you’re like many of us, the clutter of long-closed boxes taunts you every time you walk around it, and you lose precious minutes getting to the objects you need while your car sits in the driveway. Sadly, development teams have a similar problem with source code that has grown into a cluttered mess.

Java’s New Threat Model

Over the last decade of cloud migration, the threat model for Java applications and the way we need to defend them have shifted. OpenJDK has already made one positive change in this area by deprecating the old SecurityManager, a relic built to protect a bygone era of AOL CDs and paper maps. The next positive change in security is to strengthen the supply chain of software components, know what’s running and what’s vulnerable, and communicate this information to the non-technical stakeholders whose data is at risk.

Part of this threat model is driven by vulnerabilities in widely used libraries, like last year’s Log4j flaw. Although Log4j is a great logging library and its maintainers patched quickly, many teams scrambled to identify where they needed to apply those patches. For individual Java developers or teams that knew their code and could deploy it, the fix was simple: update a library and that was it. The reality, though, is that software moves fast and far, often leaving the control of these technical experts and landing with stakeholders who lack the expertise to manage a problem at this level. In the scramble, teams that did not know Java specifics looked everywhere, including .NET software and Python forums. The government of Quebec shut services down until it knew where Log4j wasn’t. This scrambling was not effective and does not protect our data.
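As a rough illustration of what "knowing where Log4j is" can look like in practice, the sketch below walks a directory tree and reports every jar that directly contains the `JndiLookup` class from log4j-core 2.x. The class name `Log4jLocator` and its methods are hypothetical, made up for this example. It only locates copies of the library. It does not say whether a given copy is patched, and it does not look inside jars nested within other archives.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;
import java.util.zip.ZipFile;

public class Log4jLocator {
    // Entry name of a class that ships inside log4j-core 2.x (used here only as a marker).
    static final String MARKER_ENTRY =
            "org/apache/logging/log4j/core/lookup/JndiLookup.class";

    /** Returns every .jar file under root that directly contains the marker class. */
    static List<Path> findJarsWithMarker(Path root) throws IOException {
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                boolean isJar = Files.isRegularFile(p)
                        && p.getFileName().toString().endsWith(".jar");
                if (isJar && containsMarker(p)) {
                    hits.add(p);
                }
            }
        }
        return hits;
    }

    /** True if the archive has an entry with the marker name. */
    static boolean containsMarker(Path jar) {
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            return zip.getEntry(MARKER_ENTRY) != null;
        } catch (IOException e) {
            return false; // unreadable or corrupt archive: skipped in this sketch
        }
    }

    public static void main(String[] args) throws IOException {
        // Scan the directory given on the command line, or the current one.
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        findJarsWithMarker(root).forEach(System.out::println);
    }
}
```

A real inventory would more likely come from build metadata or a software bill of materials. A file scan like this is a fallback for deployments where that information is missing.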

The Trojan Source Is Not Your Mane Problem

A recently published paper provides a logo and slick polish for an old vulnerability: certain Unicode characters can make code render one way for human reviewers and read another way to the machines that execute it.

  • The code may be written so that a human reader misunderstands it, exploiting the gap between how the text is displayed and how the compiler reads the encoded characters (specifically Unicode control characters). The intended result is to execute something that an unconfused human would not allow.
  • A human code reviewer using a plain-text editor, or an editor with inaccurate syntax highlighting, may miss the impact of these control characters. Most IDEs and code editors use parse trees and make the Unicode characters visible, so the code is easier to understand.
  • Developers discussing this Trojan Source vulnerability may use the opportunity to saddle up on horse puns.

What Is the Trojan Source?

The Trojan Source is a combination of Unicode control characters intended to confuse a human into thinking the code does one thing while getting the machine to do another. Mainly it involves control characters that switch the direction of text, such as right-to-left overrides, or similar-looking letters drawn from different character sets.
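One way to take the mystery out of these characters is to look for them directly. The sketch below flags the Unicode bidirectional embedding, override, and isolate controls (U+202A through U+202E and U+2066 through U+2069) in a string of source text and reports where each one sits. The class and method names are hypothetical, and the sketch does not attempt to detect look-alike letters from other scripts.

```java
import java.util.ArrayList;
import java.util.List;

public class BidiScanner {
    /** True for the bidirectional embedding, override, and isolate control characters. */
    static boolean isBidiControl(int codePoint) {
        return (codePoint >= 0x202A && codePoint <= 0x202E)
            || (codePoint >= 0x2066 && codePoint <= 0x2069);
    }

    /** Returns one "line:column U+XXXX" entry for each bidi control in the text. */
    static List<String> findBidiControls(String source) {
        List<String> findings = new ArrayList<>();
        int line = 1;
        int column = 1; // counted in code points, starting at 1
        for (int i = 0; i < source.length(); ) {
            int cp = source.codePointAt(i);
            if (isBidiControl(cp)) {
                findings.add(String.format("%d:%d U+%04X", line, column, cp));
            }
            if (cp == '\n') {
                line++;
                column = 1;
            } else {
                column++;
            }
            i += Character.charCount(cp);
        }
        return findings;
    }

    public static void main(String[] args) {
        // The second line hides a right-to-left override inside a comment.
        String sample = "boolean isAdmin = false;\n/* \u202E begin admin check */";
        findBidiControls(sample).forEach(System.out::println); // prints 2:4 U+202E
    }
}
```

A check like this could run over changed files during code review, so that a reviewer sees a plain list of suspicious positions and does not have to rely on how an editor happens to render them.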