Secrets Detection: Optimizing Filter Processes

While improving both the precision and the recall of our secrets detection engine, we had to keep a close eye on speed. In a gearbox, if you want to increase torque, you need to decrease speed. So it wasn’t a surprise to find that our engine had the same problem: more power, less speed. With roughly 10,000 public documents scanned every minute, this trade-off eventually became a bottleneck.

In a previous article, we explained how we built benchmarks to keep track of these three metrics: precision, recall, and, most important here, speed. These benchmarks taught us a lot about how our engine actually behaves at runtime and led to our first improvements.