The Hidden Impact of Cache Locality on Application Performance

My favorite technical experience from grad school was finding ways to squeeze every last bit of performance out of the IBM JVM (now open source as Eclipse OpenJ9, which builds on Eclipse OMR). Most of those optimizations required an intricate understanding of how CPUs and memory work under the hood. One example: why is there such an impressive performance gain from padding objects with blank space up to the next multiple of 64 bytes and ensuring they always start at addresses that are exactly divisible by 64?
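To make the padding and alignment arithmetic concrete, here is a minimal sketch in Java. It is not how the JVM itself lays out objects; the class name `CacheLineAlign` and its helper methods are hypothetical, and the 64-byte constant is simply the figure quoted above. It only shows the integer math behind rounding a size up to a multiple of 64 and checking that an address is divisible by 64.

```java
// Illustrative sketch only: names are hypothetical, not taken from any JVM source.
class CacheLineAlign {
    // Assumed line size in bytes; 64 is the value discussed in the text.
    static final long CACHE_LINE = 64;

    // Round a size in bytes up to the next multiple of CACHE_LINE.
    // The mask trick works because CACHE_LINE is a power of two.
    static long padToCacheLine(long size) {
        return (size + CACHE_LINE - 1) & ~(CACHE_LINE - 1);
    }

    // True when the address is exactly divisible by CACHE_LINE.
    static boolean isCacheLineAligned(long address) {
        return (address & (CACHE_LINE - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(padToCacheLine(24));         // 64
        System.out.println(padToCacheLine(64));         // 64
        System.out.println(padToCacheLine(65));         // 128
        System.out.println(isCacheLineAligned(0x1000)); // true
        System.out.println(isCacheLineAligned(0x1008)); // false
    }
}
```

A 24-byte object padded this way occupies 64 bytes, and a 65-byte object occupies 128, so every padded object spans a whole number of 64-byte blocks when it also starts at an aligned address.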

We’ll need a bit of background before we can answer that question.
