The first terrible practice that we will examine in this series is breaking builds on false positives. Whenever I explain this to people who are new to DevOps, I remind them of the story of ‘the boy who cried wolf’. In the age-old story, a young boy lies and says a wolf has come into the village he lives in, and scares all the villagers. The villagers snap into action, to protect everyone from the wolf. After a brief search they realize there is no wolf, and that the child was lying. The boy plays this trick on the villagers more than once, and the villagers become very angry with the boy. At the end of the story, a wolf does enter the village, and the boy attempts to warn everyone, but no one is willing to listen to him anymore. He has been labeled a liar; someone to ignore. But this time the wolf was real, and people were hurt.
The takeaway of the story is that no one wins when trust is broken. People tell the story to children, to discourage them from lying. I tell the story to security professionals, so that we prioritize building trust with development teams, and thus avoid having our warnings ignored.
DevOps pipelines are built to model real-life, physical assembly lines. Each assembly line has something called an “Andon cord”, which is pulled when there is an emergency to stop the line. The button or pull cord can save lives, and millions of dollars (imagine cars accidentally piling on top of each other and the potential cost). The cord is only pulled if something extremely dangerous is happening. When we “break the build” in a DevOps pipeline, we are pulling a digital Andon cord, which stops the entire process from continuing. And when we do this, we had better have a good reason.
When a test fails in the CI/CD pipeline, it doesn’t always break the build (stop the pipeline from continuing). It depends on how important the finding is, how badly it failed the test, the risk profile of the app, etc. It breaks the build if the person who put the test into the pipeline feels it’s important enough to break the build: that it’s (literally) a show-stopper, and that they are willing to stop every other person’s work as a result of this test. It’s a big decision.
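To make that decision concrete, here is a minimal Python sketch of a build gate that weighs finding severity against the app’s risk profile. The severity names, the risk levels, and the `BREAK_THRESHOLD` policy table are all assumptions made up for illustration; your own tools and teams will have their own.

```python
# Hypothetical severity scale; real tools use their own labels.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}

# Hypothetical policy: the lowest severity that stops the pipeline,
# keyed by the app's risk profile. None means "never break, alert only".
BREAK_THRESHOLD = {"low": None, "medium": "critical", "high": "high"}


def should_break_build(finding_severities, app_risk):
    """Return True only if at least one finding meets the app's threshold."""
    threshold = BREAK_THRESHOLD[app_risk]
    if threshold is None:
        return False  # alerting mode: report the findings, keep the line moving
    limit = SEVERITY_RANK[threshold]
    return any(SEVERITY_RANK[s] >= limit for s in finding_severities)
```

The point of writing it down like this is that the threshold is an explicit, reviewable choice that the pipeline’s owners can see and argue about, not something a tool decides for them.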
Now imagine you have put a lot of thought into all the different tests in your pipeline, and into whether each one is important enough to break the build or should just let it continue and send notifications or alerts instead. You and your team use this pipeline 10+ times a day to test your work, and you depend on it to help you ensure your work is of extremely high quality.
Now imagine someone from the security team comes along and puts a new security tool into your carefully-tuned pipeline, and it starts throwing false positives. All the time. How would that make you feel? Probably not very good.
I have seen this situation more times than I care to count, and (embarrassingly) I have been the cause of it at least once in my life. While working on the OWASP DevSlop project I added a DAST to our Patty-the-pipeline module (an Azure DevOps pipeline with every AppSec tool I could get my hands on). One evening Abel had done an update to the code, and he messaged me to say my scanner had picked something up. I didn’t notice his email, then went to Microsoft to give a presentation for a meetup the next day and… Found out on stage.
When my build broke I thought “OH NO, HOW EMBARRASSING”. But then I had another thought, and proudly announced “wait, it did what it was supposed to do. It stopped a security bug from being released into the wild”. Then we started troubleshooting (40+ nerds in a room, of course we did!), and we figured out it was a false positive. Now that really was embarrassing… I had been trying to convince them that putting a DAST into a CI/CD was a good thing. I did not win my argument that day. Le sigh.
Fast forward a couple years, and I have seen this mistake over and over at various companies (not open source projects, made up of volunteer novices, but real, live, paid professionals). Vendors tell their customers that they can click a few buttons and voilà! They are all set! In fact, we should generally test tools and tune them before we put them into another team’s pipeline.
Tuning your tools means making lots of adjustments until they work ‘just right’. Sometimes this means suppressing false positives, sometimes this means configuration changes, and sometimes it means throwing it in the garbage and buying something else that works better for the way your teams do their everyday work.
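As one example of what suppressing false positives can look like, here is a small Python sketch that filters a scanner’s findings against a reviewed suppression list. The `Finding` fields, the rule ID, and the file path are hypothetical; most real tools have their own suppression format, so treat this as an illustration of the idea and not as any vendor’s feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str
    path: str
    message: str


# Hypothetical suppression list: (rule_id, path) -> why it is a false positive.
# Recording the reason means someone can re-check the decision later.
SUPPRESSIONS = {
    ("SQLI-001", "tests/fixtures/sample_queries.py"): "test fixture, never deployed",
}


def split_findings(findings, suppressions):
    """Separate findings to report from findings already triaged as false positives."""
    kept, suppressed = [], []
    for finding in findings:
        if (finding.rule_id, finding.path) in suppressions:
            suppressed.append(finding)
        else:
            kept.append(finding)
    return kept, suppressed
```

Keeping the suppressed findings in their own list, instead of silently dropping them, makes it possible to review them when the code or the tool changes.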
In 2020, I was doing consulting, helping with an AppSec program, and their only full-time AppSec person proudly told me that they had a well-known first-generation SAST tool running in CI/CD on every build, and that if it found anything that was high or above, it broke the build. I said “COOL! Show me!” Obviously I wanted to see this awesomeness.
We logged into the system and noticed something weird: the SAST tool was installed into the pipeline, but it was disabled. “That’s weird” we both said, and went on to the next one. It was uninstalled. HMMMMM. We opened a third, it was disabled. We sat there looking and looking. We found one that was installed and running, but it was just in alerting mode.
The next time I saw him his face was long. He told me that in almost 100% of the pipelines his tool had been uninstalled or disabled, except 2 or 3 where it was in alerting mode (running, but it couldn’t break the build). We investigated further and found out that the teams that had it in alerting mode were not checking the notifications; none of them had ever logged into the tool to see the bugs it had found.
To say the guy was heartbroken would be an understatement. He had been so proud to show me all the amazing work he had done. It had taken him over a year to get this tool installed all over his organization. Only to find out, with a peer watching, that behind his back the developers had undone his hard-earned security work. This was sad, uncomfortable, and I felt so much empathy for him. He did not deserve this.
We met with the management of the developer teams to discuss. They all said the right things, but meeting after meeting, nothing actually changed. After about 3 months the AppSec guy quit. I was sad, but not surprised at all. HE was great. But the situation was not.
I kept on consulting there for a while, and discovered a few things:
- The SAST tool constantly threw false positives. No matter what the AppSec guy had done, working very closely with the vendor, for over a year. It was not him, it was the tool.
- The SAST tool had been selected by the previous CISO, without consultation from the AppSec team (huge mistake), and was licensed for 3 years. So the AppSec guy HAD to use it.
- The AppSec guy had spent several hours a week just trying to keep the SAST server up and running, and it was a Windows Server 2012 machine (despite it being 2020, the SAST provider did not support newer operating systems). He also wasn’t allowed to add most patches, which meant he had to add a lot of extra security to keep it safe. It was not a great situation.
- The developers had been extremely displeased with the tool, having it report false positives over and over, and they turned it off in frustration. It was not malice or anger; they felt they couldn’t get their jobs done. They really liked the AppSec guy. When I talked to them about it, they all felt bad that he had quit. It was clear they had respected him quite a lot, and had given the tool more of a chance because of him.
It took over a year, but I eventually convinced them to switch from that original SAST to a next generation SAST (read more on the difference between first and second gen here). The new tool provided almost entirely true positives, which made the developers a lot happier. It was also able to run upon code check-in, which worked better for the way they liked to do their work in that shop. When I left, it was scanning every new check-in, then sending an email with a report to whoever checked the code in if any bugs were introduced. Although I didn’t have it breaking builds by the time I left, we went from zero SAST to SAST-on-every-new-commit. And devs were actually fixing the bugs! Not all the bugs, but quite a few, which was a giant improvement from when I arrived. To me this was a success.
Avoiding this fate…
To avoid this fate, carefully pick your toolset (make a list of requirements with the developers, and stick to it), then test it out first on your own, then with developers, before purchase. Next, test the tool manually with a friendly developer team and work out as many kinks as you can before putting it into a CI pipeline. Then put it in alerting mode in the CI with that team, again watching for issues. If it runs well, start adding it for more teams, a few at a time. Pause if you run into problems, work them out, then continue.
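The “a few at a time, pause on problems” part of that rollout can be sketched in a few lines of Python. The team names, the batch size of 2, and the idea of counting open problems as a single number are assumptions for illustration; in practice “problems” is a judgment call you make with the developers.

```python
def next_rollout_step(enrolled, waiting, open_problems, batch_size=2):
    """Return (enrolled, waiting) after one rollout step.

    If any problems are still open, nothing changes: pause and fix them first.
    Otherwise, move the next small batch of teams onto the tool.
    """
    if open_problems > 0:
        return enrolled, waiting
    return enrolled + waiting[:batch_size], waiting[batch_size:]
```

For example, with one friendly team enrolled and three waiting, a clean run adds two more teams, while a run with unresolved issues adds none.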
Tip: You can also set up most static tools (ones that look at written code, not running code) to automatically scan your code repository. This is further ‘left’ in the CI/CD, because it is even earlier in the system development life cycle (SDLC). You can scan the code as it is checked in, or on a daily, weekly or monthly basis, whatever works best for you and your developers!
The next post in this series is Untested Tools.