Developer-Led Code Security: Why False Positives Are Worse than False Negatives

Most SAST tools target security compliance auditors. Their goal is to raise an issue for anything even remotely suspicious. There's no fear of false positives for those tools because the auditors will figure it out; after all it's the auditors' job to sort the wheat from the chaff and the signal from the noise. But the industry should rally around efforts to kill all that noise. There's little tolerance among developers for crying wolf. SAST players should listen to developers and follow the guiding principle to prefer "reasonable" false negatives to raising false positives.

What does that mean in practical terms? Well, let's play with some numbers. Let's say you have a codebase with 12 Vulnerabilities. That's 12 things that absolutely need fixing. A typical SAST analysis might raise 500 issues in total, and then the auditors will spend X weeks sorting through that to bring you, the developer, the audit result maybe a month or so after you've moved on to other code. 

The Curious Case of False Positives in Application Security

Over the past year, data breaches, through web, business, and mobile application exploitation, have continued to run rampant. In 2018, major household names like Ticketmaster, the United States Postal Service (USPS), Air Canada, and British Airways were hit by application-based exploits. To minimize vulnerabilities — and identify existing ones before they can do this level of damage — application security solutions need to be fast, provide good coverage for capturing all classes of vulnerabilities, and more importantly, they need to be highly accurate, to be useful to DevOps application development teams. Providing results fast but less accurately is counter-productive to an efficient and successful application security program. Time wasted by engineers to triage the false positives far outweighs the speedier results provided.

Most automated application security testing solutions have the ability to scan thousands of applications containing millions of lines of code and can produce results containing millions of attack vectors. But every application is different — different functionality, different code, different size, and different complexity —resulting in significantly different security findings with different accuracy. More so, selecting any single scanned application with the best accuracy from many and claiming accuracy is misleading. Even taking averages would be misleading, because it would be a measure of only the limited set of applications that the vendor’s solution scanned, and hence, incomparable to the accuracy of other solutions.