Duplicate Objects in Java: Not Just Strings

When a Java application consumes a lot of memory, that can be a problem in its own right, and it can also lead to increased GC pressure and long GC pauses. In one of my previous articles, I discussed one common source of memory waste in Java: duplicate strings. Two java.lang.String objects a and b are duplicates when a != b && a.equals(b). In other words, the JVM heap holds two (or more) separate string objects with the same contents. This problem occurs very frequently, especially in business applications. In such apps, strings represent a lot of real-world data, yet the respective data domains (e.g., customer names, country names, product names) are finite and often small. In my experience, duplicate strings in an unoptimized Java application typically waste between 5 and 30 percent of the heap. But have you ever considered that instances of other classes, including arrays, can be duplicates as well, and can waste a considerable amount of memory? If not, read on.
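To make the definition of a duplicate concrete, here is a minimal sketch of the a != b && a.equals(b) check. The class name DuplicateStringsDemo and the helper areDuplicates are hypothetical names chosen for illustration, and new String(...) merely stands in for strings that a real application would obtain from parsing input or reading a database.

```java
class DuplicateStringsDemo {

    // Returns true when a and b are two distinct objects with equal contents,
    // i.e. the duplicate condition a != b && a.equals(b).
    // Assumes a is non-null.
    static boolean areDuplicates(String a, String b) {
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        // new String(...) always allocates a separate object on the heap,
        // simulating values created independently at run time.
        String a = new String("United States");
        String b = new String("United States");
        System.out.println(areDuplicates(a, b)); // true: same contents, two objects

        // String.intern() returns one canonical instance per distinct value,
        // so the interned references point to the same object.
        String c = a.intern();
        String d = b.intern();
        System.out.println(areDuplicates(c, d)); // false: c and d are the same object
    }
}
```

Interning is only one way to eliminate duplicate strings, and it is used here just to show the contrast between two equal-but-separate objects and a single shared one.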

Object Duplication Scenarios

Object duplication in memory occurs whenever the number of distinct objects of a certain type is limited, but the app keeps creating such objects without trying to cache or reuse the existing ones. Here are just a few concrete examples of object duplication that I've seen: