DevSecOps Worst Practices – Artificial Gates

Image of a gate. Photo by Alexei Maridashvili on Unsplash

The previous post in this series was Untested Tools.

The third bad practice we will examine is security professionals using the CI/CD to set up an artificial gating process. No one agreed to the gate; they set it up in secret, while pretending otherwise.

Back when I first started working in security we had gates. We had to approve every single thing that went to production. I remember being surprised by the attitude of the security people I worked with, who said that the developers had to jump through our hoops, or else. When DevOps came, things changed. The developers didn’t need to wait on us, and they just weren’t willing to anymore. They needed to move fast, release features quickly, and beat the competition. We couldn’t just make them wait for three weeks because we were busy, our tools were slow, or we couldn’t keep up. They were going to continue without us, whether we liked it or not.

As I started doing DevSecOps work in more places, a lot of security folks expressed concerns. I remember someone who I like very much, being extremely frustrated, and saying “They’re going too fast! They’re going way too fast! I don’t know what to do anymore! I can’t keep up! I want to stop them!” I explained that we had to work in different ways, use more modern tools, and adjust. The look on her face wasn’t anger, it was fear. She was upset because she was afraid she couldn’t keep up, and she would miss something important.

Image of a gate. Photo by Keith Hardy on Unsplash

After a while, I started seeing organizations that would use the CI/CD to create a pretend gate. The organization had agreed that there would not be a gating process anymore, and that things would be approved automatically, or with fast change management processes. But then one or more security people were intentionally configuring tools to break the pipeline, and stop the build. It took me a while to figure out what was happening, because they would tell me they had not created a gate and “we don’t do gates here”. But what it looked like they had done was break the build to add more time to the process, so they could catch up to the software developers. They couldn’t keep up, and rather than letting things go out to prod anyway, they felt it was safer to stop them. Safer to wait. Or at least that’s how it looked to me.

The problem here is obvious: this one person decided to change the policy of the entire organization without permission. They decided on behalf of everyone else, silently, that if they didn’t have enough time, every other person on the entire team would wait. This is against the first way of DevOps, which emphasizes the speed and efficiency of the entire system. Plus, it’s not OK for one adult to decide to control all the other adults and lie about it. And every time I’ve seen this, it looked like lying/hiding to me.

What’s so bad about gates?

You might be thinking: what’s so bad about gates? What’s so bad about waiting a couple of extra days? What’s so bad about the old way of doing things? Let me tell you!

First of all, someone adding artificial gates completely negates all of the hard work that all of the teams did in order to make DevOps work. So they have undone everyone else’s hard work, and obviously that’s not cool. Let’s continue!

  • Artificial gates will create a slower time to market, which ruins the #1 advantage of DevOps.
  • It can create bottlenecks within the development process, specifically waiting for the security team to get its act together. If everyone has to wait on them, it slows everything down. That means the teams can’t be agile, which stalls innovation and negates any sort of competitive advantage.
  • If one person is waiting on the security person, then they probably aren’t getting anything else done. And that means inefficiency and wasted time. This means that we’re wasting money, time, and quite frankly it probably also means people are less happy in their jobs.
  • Generally most people agree that gated security systems cost more. Read the book Accelerate to learn more.  
  • When we do a gated process, we create silos. The security team is the first silo, the development team is another silo, and sometimes the Ops team is yet another silo. This means less collaboration and less information sharing, and it certainly doesn’t build trust between teams.
  • The last one is something I hadn’t thought of, but a friend suggested it to me. A gated system often shifts the security responsibility onto the security team. It makes the developers, DevOps folks, and operational people feel that the security team is responsible for the security of their products and systems. This can lead to a culture where it seems like the security of the products they build is not part of their responsibility, which it is. When we weave security through DevOps, it becomes part of the main value stream, which means that it is part of every single person’s shared responsibility. Gating reverses this culture change.

As you can see, there are several reasons why we do not want to go back to a gated security model. If someone on the security team expresses that they need more time to get something done, the answer is not to create artificial gates that no one else agreed to. The answer is that we need to work together, collaborate, and innovate. Perhaps we need to switch out our tools and get more modern tools that work better with our systems? Perhaps we need to do our work in a new way? Perhaps we need to drop some of the things we’ve been doing, and adopt different ways of doing security? The answer is brainstorming, working together, and making a decision as an entire team. The answer is never that one person decides in silence and secrecy, undoing the hard work of everyone else.

Up next we’re going to talk about missing test results!

DevSecOps Worst Practices – Untested Tools

Photo by Roman Mager on Unsplash

The previous post in this series was The Boy Who Cried Wolf.

Although at first blush it may seem that this second ‘worst practice’ is the same as the first (breaking builds on false positives), it’s slightly different, though almost as problematic. Developers constantly test and retest everything, and so should we. Unfortunately, I find this is not the norm with security folks.

Security tools can cause more problems than just false positives. Some of them run a very, very long time. Some of them don’t generate a report, or it’s in the wrong format, or you can’t seem to find it anywhere… Some tests crash or seem to never complete.

If you’ve ever put a DAST or first generation SAST into a CI/CD, you’ve likely experienced at least one “test that never seems to finish”. There’s a well-known DAST tool, which will remain unnamed, that has a “kill this scan after 6 hours” checkbox in the config for each scan. I remember thinking “SIX HOURS?!?!?!? It will never run that long!” Ahem. Over the coming months of that consulting engagement I learned to always check that box….

Sometimes security tools boast that they will work with your developer tools but… that’s a matter of opinion. I once had a vendor suggest that I create a container in my pipeline, download their product live from the internet, install it, update it, run it against my code, save the report, then destroy the container. ON EVERY BUILD. Is it possible? Yes. Absolutely. Is it also a giant waste of time, resources, and bandwidth? Also yes. I gently said no, as their product was not yet compatible, and we bought a competing product, even though theirs is otherwise really great and had been my first choice before I discovered the mismatch with my client’s CI system. It just didn’t make sense to jerry-rig it like that.

At another company, we had a SAST tool that ran on its own server. Because the information on the server was sensitive (all of their bugs), plus it only worked on a terribly old and insecure operating system, they segmented it away from the rest of the network. For developers to see their bugs they had to remote desktop (RDP) over to a jump box, sign into it with MFA, then sign into the SAST tool, then see their bugs. The SAST tool, for security reasons, locked users out after 30 days of non-activity. This led to developers constantly being locked out, and getting really frustrated. So they stopped logging in, very quickly. The first time I logged into it I told the rest of the team “this will never work, this is too hard”, and when we saw that only 9% of the developers had logged in that year, we knew I was (unfortunately) correct.

All of this is because we didn’t test the tools first. Or “play with them” as I sometimes instruct my peers. It’s extremely important to set up a proof of concept with a new tool, run it a whole lot of times, invite others to play with it with you (especially developers), and get their feedback. By following this less-than-scientific-sounding process, I have ended up NOT getting tools I originally wanted, because I found a better fit. And sometimes the better fit is a surprise, even for me! 

Photo by Scott Graham on Unsplash

Proof of Concept

A proof of concept (POC) means setting something up and trying it out, in order to prove it actually works. For an AppSec tool, this means installing it, using it against one or more of your systems, ensuring it works with your CI/CD, your code repository, your IDE, your ticketing system, an ASOC (application security orchestration and correlation tool) if you have one, and anything else you may want to use it with. You’re checking to see if it works the way you need it to (not how the vendor told you that it works). Sometimes I’ve bought tools and had them disappoint, not performing the way I remember the vendor told me they would. Other times I’ve been able to get them to do so much more than the sales person explained to me, and been extremely pleased, getting value above and beyond what I had expected.

Note: actually connect it to your systems. Don’t just read the documentation, see that it should work, and call it a day. I’ve worked at lots of places where current systems have been customized, or configured in ways that are non-standard, or they are segmented off on the network for whatever reason, and then the integration I wanted won’t work. Verify that it does the things you need it to do, don’t just blindly trust.

– moi

Whenever I perform a POC for an AppSec tool, I work on it solo at first. I like to fail privately, whenever possible. Once I have decided it’s working respectably, then I show it to the other AppSec or InfoSec folks on the team and see what they think of it. Assuming they still think it’s good, then I meet with a friendly developer team (a team that is willing to give me their time and attention, and I’ve had good dealings with before). I give them accounts, and show them how to use it, and ask them their opinion after either an hour or a few days, depending. 

It’s really important to me that the developers do not hate any tool I try to bring in. I will not get good adoption if the super-friendly dev team rejects it. That’s not to say they have to LOVE the tool, I mean, it is reporting bugs and all of us would prefer to be told our work is perfect… But if they actually like the tool, that’s a huge win! I remember doing a feedback session with a bunch of devs and at the end one of them shyly asking “Hmmmm, Tanya. Can we keep it?” and I was overjoyed! I said “YEAH you can!” And the decision was made.

Avoiding this fate

The advice from this section is likely obvious already. Perform a proof of concept before purchasing any tool (unless the tool is free or ‘almost free’), and validate that it does everything you need it to. Get feedback from key stakeholders, generally the AppSec team, CISO or management (for reporting and coverage) and then the software developers and/or security champions. Test that the tool works as expected, that it’s not spitting false positives all the time, that it’s fast enough, that it’s interoperable with your toolset (works well with your other products) and that people are willing to use it.  

Roll out in stages once you’ve made your decision. Start with friendly teams, and alerting mode or commenting on Pull Requests (PRs) only. If you play your cards right, and you have a great culture where you work, you might never even need to turn on blocking mode. Or perhaps you will only block for certain key vuln types or specific custom rules that are very important for your organization. If everyone is already fixing bugs before the code gets to the CI (because they can test in their IDEs and when their code is checked in), validating again in the CI may be fully automated or minimal. This is an organizational decision, and lots of organizations in the world function just fine without ever enabling blocking on any of their tools. It’s about what’s right for your org, not what some person you saw at a conference proposes.
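If you do go down the “block only for certain key issues” route, one approach is to keep the scanner itself in report-only mode and make the pass/fail decision in a small script the whole team can read. Below is a minimal sketch in Python, assuming a hypothetical scanner that writes its findings to a JSON file with severity and rule_id fields (the file name, field names, and rule IDs are all invented for illustration):

```python
import json
import sys

# Hypothetical report format: a JSON list of findings, each with
# "severity", "rule_id", and "title" fields. Adjust to whatever your scanner emits.
BLOCKING_SEVERITIES = {"critical", "high"}
BLOCKING_RULES = {"sql-injection", "hardcoded-secret"}  # example rule IDs, not real ones

def should_block(findings):
    """Return only the findings the whole team agreed should stop a build."""
    return [
        f for f in findings
        if f.get("severity", "").lower() in BLOCKING_SEVERITIES
        or f.get("rule_id") in BLOCKING_RULES
    ]

def main(report_path):
    with open(report_path) as fh:
        findings = json.load(fh)

    blockers = should_block(findings)
    for f in findings:
        marker = "BLOCKING" if f in blockers else "fyi"
        print(f"[{marker}] {f.get('severity')}: {f.get('title')}")

    # Exit non-zero only for the agreed-upon blockers; everything else
    # is reported but lets the pipeline continue.
    sys.exit(1 if blockers else 0)

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else "scanner-report.json")
```

The point isn’t the script itself; it’s that the blocking criteria stay small, visible, and agreed upon by everyone, instead of being buried in one tool’s configuration.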

The next post in this series is Artificial Gates.

DevSecOps Worst Practices – The Boy Who Cried Wolf

Photo by Lenny Kuhne on Unsplash

The previous post in this series was DevSecOps Worst Practices – The Series.

The first terrible practice that we will examine in this series is breaking builds on false positives. Whenever I explain this to people who are new to DevOps, I remind them of the story of ‘the boy who cried wolf’. In the age-old story, a young boy lies and says a wolf has come into the village he lives in, and scares all the villagers. The villagers snap into action, to protect everyone from the wolf. After a brief search they realize there is no wolf, and that the child was lying. The boy plays this trick on the villagers more than once, and the villagers become very angry with the boy. At the end of the story, a wolf does enter the village, and the boy attempts to warn everyone, but no one is willing to listen to him anymore. He has been labeled a liar; someone to ignore. But this time the wolf was real, and people were hurt. 

The takeaway of the story is that no one wins when trust is broken. People tell the story to children, to discourage them from lying. I tell the story to security professionals, so that we prioritize building trust with development teams, and thus avoid having our warnings ignored.

Originating from the word for a paper lantern, Andon is a term that refers to an illuminated signal notifying others of a problem within the quality-control or production streams. Activation of the alert – usually by a pull-cord or button – automatically halts production so that a solution can be found.

Toyota

DevOps pipelines are built to model real-life, physical assembly lines. Each assembly line has something called an “Andon cord”, which is pulled to stop the line when there is an emergency. The button or pull cord can save lives, and millions of dollars (imagine cars accidentally piling on top of each other and the potential cost). The cord is only pulled if something extremely dangerous is happening. When we “break the build” in a DevOps pipeline, we are pulling a digital Andon cord, which stops the entire process from continuing. And when we do this, we had better have a good reason.

When a test fails in the CI/CD pipeline, it doesn’t always break the build (stop the pipeline from continuing). It depends on how important the finding is, how badly it failed the test, the risk profile of the app, etc. It breaks the build if the person who put the test into the pipeline feels it’s important enough to break the build. That it’s (literally) a show-stopper, and that they are willing to stop every other person’s work as a result of this test. It’s a big decision.

Now imagine you have put a lot of thought into all the different tests in your pipeline, and into whether each one is important enough to break the build, or should just let it continue and send notifications or alerts instead. You and your team use this pipeline 10+ times a day to test your work, and you depend on it to help you ensure your work is of extremely high quality.

Now imagine someone from the security team comes along and puts a new security tool into your carefully-tuned pipeline, and it starts throwing false positives. All the time. How would that make you feel? Probably not very good.

Photo by carlos aranda on Unsplash

I have seen this situation more times than I care to count, and (embarrassingly) I have been the cause of it at least once in my life. While working on the OWASP DevSlop project I added a DAST to our Patty-the-pipeline module (an Azure DevOps pipeline with every AppSec tool I could get my hands on). One evening Abel had done an update to the code, and he messaged me to say my scanner had picked something up. I didn’t notice his email, then went to Microsoft to give a presentation for a meetup the next day and… Found out on stage.

When my build broke I thought “OH NO, HOW EMBARRASSING”. But then I had another thought, and proudly announced “wait, it did what it was supposed to do. It stopped a security bug from being released into the wild”. Then we started troubleshooting (40+ nerds in a room, of course we did!), and we figured out it was a false positive. Now that really was embarrassing… I had been trying to convince them that putting a DAST into a CI/CD was a good thing. I did not win my argument that day. Le sigh.

Fast forward a couple years, and I have seen this mistake over and over at various companies (not open source projects, made up of volunteer novices, but real, live, paid professionals). Vendors tell their customers that they can click a few buttons and voilà! They are all set! When in fact, generally we should test tools and tune them before we put them into another team’s pipeline.

Tuning your tools means making lots of adjustments until they work ‘just right’. Sometimes this means suppressing false positives, sometimes this means configuration changes, and sometimes it means throwing it in the garbage and buying something else that works better for the way your teams do their everyday work. 

Photo by Birmingham Museums Trust

In 2020, I was doing consulting, helping with an AppSec program, and their only full-time AppSec person proudly told me that they had a well-known first-generation SAST tool running in the CI/CD on every build, and that if it found anything that was high severity or above it broke the build. I said “COOL! Show me!” Obviously I wanted to see this awesomeness.

We logged into the system and noticed something weird: the SAST tool was installed into the pipeline, but it was disabled. “That’s weird” we both said, and went on to the next one. It was uninstalled. HMMMMM. We opened a third, it was disabled. We sat there looking and looking. We found one that was installed and running, but it was just in alerting mode. 

The next time I saw him his face was long. He told me that in almost 100% of the pipelines his tool had been uninstalled or disabled, except 2 or 3 where it was in alerting mode (running, but it couldn’t break the build). We investigated further and found that the teams that had it in alerting mode were not checking the notifications; none of them had ever logged into the tool to see the bugs it had found.

To say the guy was heartbroken would be an understatement. He had been so proud to show me all the amazing work he had done. It had taken him over a year to get this tool installed all over his organization. Only to find out, with a peer watching, that behind his back the developers had undone his hard-earned security work. This was sad, uncomfortable, and I felt so much empathy for him. He did not deserve this.

We met with the management of the developer teams to discuss. They all said the right things, but meeting after meeting, nothing actually changed. After about 3 months the AppSec guy quit. I was sad, but not surprised at all. He was great. But the situation was not.

I kept on consulting there for a while, and discovered a few things:

  1. The SAST tool constantly threw false positives. No matter what the AppSec guy had done, working very closely with the vendor, for over a year. It was not him, it was the tool.
  2. The SAST tool had been selected by the previous CISO, without consultation from the AppSec team (huge mistake), and was licensed for 3 years. So the AppSec guy HAD to use it.
  3. The AppSec guy had spent several hours a week just trying to keep the SAST server up and running, and it was a Windows 2012 server (despite it being 2020, the SAST provider did not support newer operating systems). He also wasn’t allowed to apply most patches, which meant he had to add a lot of extra security to keep it safe. It was not a great situation.
  4. The developers had been extremely displeased with the tool, having it report false positives over and over, and they turned it off in frustration. It was not malice, or anger, they had felt they couldn’t get their jobs done. They really liked the AppSec guy. When I talked to them about it, they all felt bad that he had quit. It was clear they had respected him quite a lot, and had given the tool more of a chance because of him.  

It took over a year, but I eventually convinced them to switch from that original SAST to a next generation SAST (read more on the difference between first and second gen here). The new tool reported almost entirely true positives, which made the developers a lot happier. It was also able to run on code check-in, which worked better for the way they liked to do their work in that shop. By the time I left, it was scanning every new check-in, then sending an email with a report to whoever checked the code in if any bugs were introduced. Although I didn’t have it breaking builds by the time I left, we went from zero SAST to SAST-on-every-new-commit. And devs were actually fixing the bugs! Not all the bugs, but quite a few, which was a giant improvement from when I arrived. To me this was a success.

Photo by Mech-Mind Robotics on Unsplash

Avoiding this fate…

To avoid this fate, carefully pick your toolset (make a list of requirements with the developers, and stick to it), then test it out first on your own, then with developers, before purchase. Next, test the tool manually with a friendly developer team and work out as many kinks as you can before putting it into a CI. Then put it in alerting mode in the CI with that team, again, watching for issues. If it runs well, start adding it for more teams, a few at a time. Pause if you run into problems, work them out, then continue.

Tip: You can also set up most static tools (ones that look at written code, not running code) to automatically scan your code repository. This is further ‘left’ in the CI/CD, because it is even earlier in the system development life cycle (SDLC). You can scan the code as it is checked in, or on a daily, weekly or monthly basis, whatever works best for you and your developers!

The next post in this series is Untested Tools.

DevSecOps Worst Practices – The Series

Photo by Samuel Sascha on Unsplash

When I started working in DevSecOps, I hadn’t even done DevOps before (gasp!). I had been following the Waterfall software development methodology in the Canadian Public Service (with a tiny bit of agile thrown in) for over a decade, before I joined Microsoft and my entire world changed. I had been working on the OWASP DevSlop open-source project with my friend Nikky Becher at the time, muddling our way through, trying to figure out how to pentest in a DevOps environment, and suddenly I was supposed to be a DevOps expert. I thought “How can I catch up? And how can I do it FAST?”

I decided that I would create an app, using the Azure DevOps CI/CD, and I would ‘share’ my pipeline that I used to do it. Quickly I figured out there wasn’t a way in Azure DevOps (at the time) to open-source or ‘share’ a pipeline, so I decided I would film myself, live, as I built. I made a basic app, and a basic dev -> QA -> UAT -> Prod deployment, and I was off to the races. I started coding live on Twitch, and adding every security tool to my pipeline that I could get my hands on.

Not every episode of the OWASP DevSlop show went well, and neither did every tool. The DAST I chose threw a false positive while I was presenting at Microsoft in Ottawa, Canada (live on stage, naturally). There’s one episode with Nancy Gariché (one of the project leaders) where we just FAILED the whole time, and after 3 hours we never got the tool working at all. There was a 5 hour episode where I updated the .NET Core framework, and all my dependencies that had vulnerabilities in them, and that was exhausting. But I LEARNED. And I learned fast.

Person on the computer, learning

In 2018 I joined a company called IANS Research, doing ‘Ask an Expert’ calls, helping clients with Azure and AppSec problems. As I learned more about DevOps and DevSecOps, I started helping clients with those topics as well. I would often have to do research to figure out the solution before a call, but I didn’t mind. All of their questions helped give direction to my learning. Over the years I have helped hundreds of AppSec teams crush their problems, and learned a LOT along the way.

In 2020 I started coaching two different companies with their AppSec programs, for a few hours a week, building out their DevSecOps programs. When contrasted with the short types of assistance I was giving to IANS clients, these long term relationships helped me see programs over extended periods of time. And I got my hands dirty on a weekly basis, trying many different types of tools (for better or worse).

I also attended numerous conference talks and read lots of articles, to see what others were doing, learning best practices along the way.

All of this learning was A LOT of work. One day an IANS client told me they were going to start in DevSecOps, and asked if I could make them a “do not do” list. A list of pitfalls that they should work to avoid. Dear reader, I dove deep down this rabbit hole. It had never occurred to me to start with “what not to do”, rather than “this is the way”. And how many headaches could be avoided…

With this in mind, I wrote a conference talk (video below) on this topic. And this blog series is going to explore each of the 15 ‘worst practices’ that I cover in the talk.

Me, keynoting the #RSAC Conference, April 2023

The 15 items I will present in this series are as follows:

  1. Breaking Builds on False Positives
  2. Untested Tools
  3. Artificial Gates
  4. Missing Test Results
  5. Runaway Tests
  6. Impossible SLAs
  7. Untrained Staff
  8. Forgotten Bugs
  9. No Positive Reinforcement
  10. Only Worrying About Your Part
  11. Multiple Bug Trackers
  12. Insecure SDLC
  13. Overly Permissive CI/CD
  14. Automation Only in the CI/CD
  15. Hiding Mistakes and Errors

I hope that by sharing mistakes that I have seen and made, all of us can avoid these issues going forward.

~ The next article, ‘The Boy Who Cried Wolf’, is here. ~

API10:2019 Insufficient Logging & Monitoring

In the previous post we covered API9:2019 Improper Assets Management, which was the 9th post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization. You can see the formal document from the OWASP API Security Top Ten Team, here.

Photo by Ales Krivec on Unsplash

Years ago, I worked at Microsoft, as a developer advocate. A few of us were working together, creating demos, for ‘Microsoft Ignite The Tour’. I recall two of my colleagues asking me why Azure couldn’t detect an SQL injection attack they had done as part of their demo. I checked their settings and explained that logging and monitoring were turned off. If monitoring is off, no one is observing the application. How would an attack be noticed if they had removed Azure’s ability to watch what was happening? I also explained that since they had also turned off logging, an incident responder would have nothing to investigate. They would not only miss the attack happening, but they would also never be able to find out later what had happened if they investigated. It had been turned off to save money (we didn’t want to spend a small fortune just on demonstrations, we had a budget). They turned both logging and monitoring back on, tried the attack, and immediately Azure went into red alert. All was well for the developer advocates and our demos.

Imagine finding your data on the dark web for sale, and not even knowing how it got there. Obviously, this has never happened to me before at a client site… If it had, I would tell you how incredibly frustrating it is not to be able to explain what happened, and therefore ensure it could never happen again. You can’t do that if you have no logs AND no monitoring. Again, this is totally hypothetical and definitely did not happen to me or any of the clients that I have worked with.

– Not me

Back when I was a full-time developer, I remember asking to turn on logging. I had asked the client during the requirements phase, explaining why I wanted it (so we could provide better reliability, and investigate any outages, there was no security slant for me, at the time). The client had agreed immediately. Then we got into the costing phase, trying to calculate how much the final project would be. When the client saw how much logging was going to cost, it was cut immediately. I had this happen several times as a dev, always being told it was a cost-saving measure. Although I didn’t love this decision at the time, it wasn’t a hill I was going to die on.

Fast forward 8 years to when I got my first AppSec job. I recall us having a security incident, and me being able to search through the logs and find the attack in about 30 minutes (the logs were HUGE, and I didn’t have log viewing software, that’s why it took so long). I learned PowerShell that day, or at least the basics of PowerShell, and wrote a script to de-obfuscate, so I could see the exact attack commands. It took me quite a bit longer to figure out a perfect timeline, and where our AppSec program had broken down to allow this vulnerability into prod… But that said, I realized that logs are so incredibly valuable for investigating a security incident; there’s just no other way you can find out exactly what happened without them!

Over the years I have learned that 1) working in incident response is absolutely fascinating and 2) I become far, far too stimulated to do incident response work on a regular (full-time) basis. I have a lot of respect for people who do that type of work full time.

– Tanya’s Thoughts on Incident Response
Photo by Charlotte Harrison on Unsplash

Around this time, I also learned that sometimes attackers will modify the logs, erasing their tracks as part of the attack. One of the ways that logs can be manipulated is via attacks against user input fields, where the attacker bypasses the input validation and that input is logged; that type of attack is referred to as “log injection”. Attacks against the integrity of our logs are the reason I always go on and on about why we need to protect our logs, and back them up to a secure location. Ideally logs should be protected because they are *sensitive information*, they are literally evidence that could be used one day in court. We should treat them as the precious resource they are, a living record of all that has happened to our applications.

When an API, or any other IT system, has logging and monitoring turned off, or has improperly configured or insufficiently protected logs, this vulnerability applies. It can apply to any IT system, but APIs are the focus of this blog series, and thus we shall concentrate on how to find, avoid, and solve this problem in APIs specifically.

Let’s talk specifics!

  1. Turn on monitoring. Give your monitoring system contact info for the correct people (it should not go to an unmonitored inbox, or phone number that no one answers). Someone needs to receive the alerts, otherwise why bother to pay for monitoring…
  2. You should log every activity that has to do with a security control, even failed ones. Logins, log outs, changes to privilege, account creation or deletion, password changes, authentication and authorization, input validation, system errors (especially if the global exception handler gets called), changing the contact info for the account, etc.
  3. Do not log sensitive info. Examples of sensitive info: complete credit card numbers, name + home address, name + date of birth, SIN/SSN, the text entries for failed password attempts (those are often typos that would allow you to guess the password), anything that could identify the person (PII) from the log data alone, personal health data + name or other identifying info, anything else that qualifies for your specific organization, system or customers.
  4. Do log: user ID, timestamp, what the user was trying to do, whether they succeeded or not, their IP address, and any other identifying information you can get about the user’s computer (see the sketch after this list).
  5. Ideally your logs would be formatted so that your SIEM is able to consume them. It’s not very common for organizations to feed their custom app logs into the SIEM, but I hope this changes over time. It’s incredibly helpful.
  6. Your logs should not be stored on the web server, or whatever your app lives on. They should be in a different place, inside/behind the firewall/perimeter. That location should not have execution privileges (read only), and only the incident response team should have access to this system.
  7. Monitor where your logs are stored. If an inappropriate account attempts to access this file server, initiate the incident response process immediately. Part of this process should be stopping whoever is accessing it, but then also investigating whether these logs have been previously disturbed or altered in any other way. This might not be the first attempt to mess with your logs.
  8. Back up your logs! In a geographically different place from your app server.
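To make items 2 through 4 a little more concrete, here is a minimal sketch in Python of what recording a security-relevant event might look like. The structured-JSON format and the field names are my own illustration, not a standard; adapt them to whatever your SIEM or log management tool expects.

```python
import json
import logging
from datetime import datetime, timezone

# Log in a structured (JSON) format so a SIEM or log management
# tool can consume the entries later.
logger = logging.getLogger("security_events")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())  # in production, ship to a separate, protected log store

def log_security_event(user_id, action, success, source_ip, detail=""):
    """Record who tried to do what, when, from where, and whether it worked."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,    # an identifier, not a name or other PII
        "action": action,      # e.g. "login", "password_change", "role_change"
        "success": success,
        "source_ip": source_ip,
        "detail": detail,      # never put passwords or other sensitive values here
    }
    logger.info(json.dumps(event))

# Example: a failed login attempt. We log the attempt itself,
# not the password text that was typed.
log_security_event(
    user_id="u-10482",
    action="login",
    success=False,
    source_ip="203.0.113.7",
    detail="invalid credentials",
)
```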

Download a PDF with more specifics for logging and error handling, here.

Advice straight from the OWASP API Security Top Ten Project Team:

  • Log all failed authentication attempts, denied access, and input validation errors.
  • Logs should be written using a format suited to be consumed by a log management solution and should include enough detail to identify the malicious actor.
  • Logs should be handled as sensitive data, and their integrity should be guaranteed at rest and transit.
  • Configure a monitoring system to continuously monitor the infrastructure, network, and the API functioning.
  • Use a Security Information and Event Management (SIEM) system to aggregate and manage logs from all components of the API stack and hosts.
  • Configure custom dashboards and alerts, enabling suspicious activities to be detected and responded to earlier.

This concludes the We Hack Purple blog series on the OWASP API Security Top Ten! Thank you to the volunteers of that project for all of their hard work to create this list and share this information with the world. Hopefully soon they will release the next version, and we can write more posts about their amazing research!

API9:2019 Improper Assets Management

In the previous post we covered API8:2019 Injection, which was the 8th post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization. You can see the formal document from the OWASP API Security Top Ten Team, here.

Inventory. Photo by Petrebels on Unsplash

Taking inventory is the first thing I do whenever I start or join an AppSec program. Figuring out all the applications and APIs that an organization has built, bought (COTS), or are using (SaaS), then doing a fast evaluation of the state they are in, is the best way to figure out where an organization is at regarding their security posture. Doing this helps me know just how much work we have to do and sets the stage for future conversations with management and the developer teams on how we can get them on track for securing all of their apps.

This vulnerability is the reason I start with inventory. I would argue that the majority of organizations around the planet do not have a current and accurate inventory of all of their web assets, including APIs. If they aren’t in the inventory, that means these assets aren’t being managed, which usually also means the security team does not have them on their radar. If the security team doesn’t know about an asset, how can they secure it? Not being in the inventory generally also means no testing, monitoring, logging, or documentation, at a minimum.

Taking inventory (regularly or continuously), and ensuring we properly decommission old versions of APIs when we release new versions, is the way we avoid this vulnerability. Then document all of it, or update documentation as you update your inventory. I realize this is easier said than done!

I recall a penetration tester telling me years ago that one of his tricks for finding vulnerabilities during an engagement was to try to call earlier versions of any APIs that were in scope. If there was a version 2.x, he would try to call version 1.x. He told me that at least once every year he would get a response from a phantom API. And that API was always a complete security disaster. He would earn his entire paycheck with that one Postman call.

PenTester Name Redacted

APIs and web applications that are not part of your inventory are generally also unmonitored, meaning no one is watching them to see if something goes wrong. They are often not behind a WAF, API gateway, or any other shielding that might protect them from common threats. If they are not a part of your inventory, there’s also a good chance that there’s no team in charge of maintenance, meaning no bug fixing is happening and technical debt is accruing. Lastly, it’s very unlikely that they are receiving regular security testing, or any type of security scanning, which can lead to all sorts of problems building up, invisibly.

Cheese melting in the hot sun. Image compliments of https://drawception.com.

Software doesn’t age like wine, getting better over the years. Software ages like cheese in the hot sun; extremely badly! The longer we do not update, test, or patch our software, the more likely it is to have vulnerabilities found within it. Without proper care, software accrues technical debt, which can make it even more difficult to fix security vulnerabilities, because you have to update so many different components (framework, plugins, operating system patches, etc.) in order to fix the real problem at hand (the vulnerability).

The risks of having APIs (or web apps) that are not a part of your inventory and maintenance plans have no bounds. Any type of vulnerability could happen, as no one is watching or paying attention, except perhaps malicious actors. Attacks upon such resources could result in damage to the availability of the system, sensitive data exposure, changes to the data leading to poor integrity, and worse. This makes the risk of this vulnerability very high. On top of no one knowing that the API exists and is live in production, there’s very likely to be little or no documentation about this API. This situation brings me back to when I was a dev, and the DBA told me I wasn’t allowed to kill an old database server (I wanted to repurpose it), because there were a whole bunch of scripts on there. She said she had no idea which scripts did what, but she had turned the server off once and “everything broke” (including payroll being missed for the entire company, yikes!). She said, “Do not touch, I don’t care why, buy a new server!” The DBA lady meant business, so I got a new server. That said… What if there had been documentation? This situation could easily happen to a company with unknown APIs running wild over their network…

Advice From the OWASP Project Team: ‘How to Prevent’

  • Inventory all API hosts and document important aspects of each one of them, focusing on the API environment (e.g., production, staging, test, development), who should have network access to the host (e.g., public, internal, partners) and the API version.
  • Inventory integrated services and document important aspects such as their role in the system, what data is exchanged (data flow), and its sensitivity.
  • Document all aspects of your API such as authentication, errors, redirects, rate limiting, cross-origin resource sharing (CORS) policy and endpoints, including their parameters, requests, and responses.
  • Generate documentation automatically by adopting open standards. Include the documentation build in your CI/CD pipeline.
  • Make API documentation available to those authorized to use the API.
  • Use external protection measures such as API security firewalls for all exposed versions of your APIs, not just for the current production version.
  • Avoid using production data with non-production API deployments. If this is unavoidable, these endpoints should get the same security treatment as the production ones.
  • When newer versions of APIs include security improvements, perform risk analysis to make the decision of the mitigation actions required for the older version: for example, whether it is possible to backport the improvements without breaking API compatibility or you need to take the older version out quickly and force all clients to move to the latest version.

In the next blog post we will be talking about API10:2019 Insufficient Logging & Monitoring.

API8:2019 Injection

In the previous post we covered API7:2019 Security Misconfiguration, which was the 7th post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization.

Injection has been part of the original ‘OWASP Top Ten Risks to Web Apps’ list since the very beginning. Injection happens when an attacker is able to trick an application (or API) into executing malicious code. They do this by adding code to a place in the application where data belongs; the app becomes confused and then executes it.

Think of a search field at the top of any website. Imagine if instead of entering in your search term, you added a bunch of code. Then imagine the application executes the code you added. That code would execute with the full authority of that application, and all the same access, behind your firewall. It could result in damage to a database, a web service, your LDAP system, and potentially even worse, assuming there are other vulnerabilities the attacker can combine with this one.

Unfortunately, APIs are subject to this vulnerability, just like a regular web app. Having no front end does not protect us from injection.

Injection
Photo by Diana Polekhina on Unsplash

What types of injection exist?

If there’s code involved, someone will try to inject their own code. SQL, LDAP, or NoSQL queries, OS commands, XML parsers, and ORMs are all potentially problematic (list provided by the OWASP API Security project team). Even MongoDB databases, which don’t use the SQL language, are potentially vulnerable to NoSQL injection.

Special note on XSS: Cross Site Scripting (XSS) is also a form of code injection, but it has its own classification for the following reasons:

  • It occurs in the browser, as opposed to back on the server side like every other form of injection.
  • Only works with JavaScript (because that’s all browsers execute).
  • Is incredibly prevalent, so much so that OWASP felt it was necessary to give it its own category.
  • Has several defenses made just for this one vulnerability (cookie settings, and security headers).
  • Does not work on APIs, because they have no GUI front end, meaning no browser.

What do we do?

Hopefully by this point you agree that injection is dangerous and should be remediated as soon as possible if you find it in one of your apps. But how do we find it? How do we fix it? How can we ensure this never happens? Dear reader, secure coding is my favorite topic! Let’s go!

Finding Injection

First off, you want to go through your APIs and figure out if you have injection. The most expensive way to do this would be to hire a PenTester to find and then test all the APIs. A cheaper and more sustainable way to do this would be to:

  1. Buy a tool that can find all your APIs for you (no, I am not going to recommend one at this time; there are several on the market of various qualities and prices). This function is called “inventory” or “enumeration”.
  2. Run a SAST (static application security testing tool) on all of the APIs, ideally a next gen one, that has low false positives. Fix anything that says injection.
  3. Use a linter on your API, ensure you have completed your API definition file, as per the linter’s instructions. If you can find an API-specific linter, all the better.
  4. Run a DAST tool on the APIs that is made for APIs OR, use an old school DAST but first ensure you’ve linted your API perfectly, so it can hopefully do a good job. It will be easier and faster if you have an API-specific testing tool. Fix anything that says injection.

Fixing Injection

“That’s nice you told me to fix it. Exactly HOW do I do that?”

The first defense against injection is thorough input validation on any input to your app. This means data in parameters, in data fields, in hidden fields, from an API you called, or from the database: any input to your app needs to be validated to confirm it is what you are expecting. What type is it? What size? What’s the content? Is it what we are expecting? If not, reject.

This functionality is best performed using an approved list, on the server side. By ‘approved list’, we mean using a list of stuff you know is good, rather than a list of what you know is bad. It’s easy for malicious actors to get around a block list, using encoding, obfuscation, and other tactics. But if you give a regular expression and say “if it’s not in here, I’m just not having it”, bad things cannot get in.

As an example, imagine you have a username. It likely accepts numbers and letters. You could use a regular expression (REGEX) like this to say what is okay: [a-zA-Z0-9]+. That’s an approved list or ‘accept list’. If instead you try to block bad characters such as <, >, ‘, “ and more, you (and your app) are in for a world of hurt.
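Here’s a minimal sketch in Python of that username check, done on the server side with an approved-list regular expression. The exact pattern and length limits are placeholders; tailor them to your own requirements.

```python
import re

# Approved list: usernames may only contain letters and digits,
# and must be between 3 and 30 characters long.
USERNAME_PATTERN = re.compile(r"[a-zA-Z0-9]{3,30}")

def is_valid_username(value: str) -> bool:
    """Server-side check: accept only what we know is good, reject everything else."""
    return bool(USERNAME_PATTERN.fullmatch(value))

print(is_valid_username("tanya123"))        # True
print(is_valid_username("tanya; DROP--"))   # False: punctuation is not on the approved list
print(is_valid_username(""))                # False: empty input is rejected too
```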

The next thing you want to do is ensure you perform this check on the server side. Do not do it on the client side, and by this, I mean in the browser/JavaScript. Anyone with a web proxy can get behind your JavaScript in about 5 seconds, unfortunately. If you want to check in your JavaScript for speed, you can do that, in addition to checking on the server.

Input validation is defense #1. Other defenses include:

  • Always use parameterized queries when making requests to any database (even non-SQL databases like MongoDB). Parameterization takes away the ability for the input to be interpreted as code (there’s a sketch of this after the list).
  • Use output encoding when you put stuff onto the screen. Some frameworks do this for you by default. It takes away any superpowers of the characters before it puts them on screen, making XSS impossible. Okay, maybe this is only for XSS and doesn’t apply to injection in general, but I would still do it if I were you.
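Here is a small sketch of the parameterized query idea, using Python’s built-in sqlite3 module purely for illustration; the same pattern applies to whichever database driver or ORM you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES ('tanya')")

user_supplied = "tanya' OR '1'='1"  # attacker-controlled input

# BAD: string concatenation lets the input be interpreted as SQL.
# query = "SELECT * FROM users WHERE username = '" + user_supplied + "'"

# GOOD: a parameterized query treats the input strictly as data.
rows = conn.execute(
    "SELECT id, username FROM users WHERE username = ?",
    (user_supplied,),
).fetchall()

print(rows)  # [] because the injection attempt matches nothing
```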

Preventing Injection

If we want to prevent injection (and a myriad of other vulnerabilities), follow this advice:

  • Have your development team take a secure coding course. It can be free or paid, formal or informal, live or recorded, interactive or lecture, the only important part is that they learn. Do the type of training that works best for you and your team.
  • Follow a secure system development life cycle (S-SDLC). Add security steps to each part of your SDLC, such as security requirements, code review, or threat modelling.
  • Ensure your application has thorough testing, which can mean any or several of the following: static analysis, dynamic analysis, manual code review, penetration testing, stress testing, performance testing, unit testing or any other testing you can think of!
  • Whenever possible, use modern and up-to-date frameworks that have security features built in. JavaScript frameworks like Angular and React have so many cool features that help protect your users! They aren’t just nifty dev tools, they can help you build stronger, tougher apps.
  • Never stop learning. Keep reading, studying, learning and hacking.

How To Prevent: OWASP API Security Top Ten Team Advice!

Preventing injection requires keeping data separate from commands and queries.

  • Perform data validation using a single, trustworthy, and actively maintained library.
  • Validate, filter, and sanitize all client-provided data, or other data coming from integrated systems.
  • Special characters should be escaped using the specific syntax for the target interpreter.
  • Prefer a safe API that provides a parameterized interface.
  • Always limit the number of returned records to prevent mass disclosure in case of injection.
  • Validate incoming data using sufficient filters to only allow valid values for each input parameter.
  • Define data types and strict patterns for all string parameters.

In the next blog post we will be talking about API9:2019 Improper Assets Management.

API7:2019 Security Misconfiguration

In the previous post we covered API6:2019 Mass Assignment, which was the 6th post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization.

Security misconfiguration has been on the original OWASP Top Ten list (critical web app risks) for many, many years. It basically means lack of hardening, poor implementation, poor maintenance, mistakes, missing patches, and human error. There’s no difference between web apps and APIs for this; if the server and/or network has not been properly secured, your API may be in danger.

What can happen?

Someone has the giggles.

Because this category of vulnerability is so vague, the risk is anywhere from low to critical, depending upon what you misconfigured and how you misconfigured it. It could result in a complete system compromise, damage to the confidentiality, availability, and/or integrity of your system, and a plethora of other issues. It could also result in as little as embarrassing error messages for the attacker, with no actual impact. That said, this vulnerability should not be taken lightly; it’s on this list for a reason.

How do we avoid such a fate?

Prepare for me to sound like a broken record:

  • Follow a secure system development life cycle that includes extensive testing of not only the application layer, but also the network and infrastructure layers.
  • Follow the hardening guides for all infrastructure, middleware, COTS, and SaaS products
  • Scan (apps, network, infrastructure) continuously
  • Create and follow a fast and effective patching process
  • Monitor and log all apps, APIs and any other endpoints you have, for potential danger and/or attacks
  • Ensure access for configuring all of these systems is locked down, using the principle of least privilege
  • Have an up-to-date and effective incident response (IR) process, and a well-trained IR team

I realize that this blog post is probably not only a bit underwhelming, but you may feel that I have greatly simplified how to avoid this problem. If you feel this way… You’re right. Creating and implementing an effective patch management process in an enterprise is HARD. Continuous scanning is HARD. Getting people to fix misconfigurations (or any vulnerability) that you’ve found is REALLY HARD. None of the things on the list above are easy. Let’s see what the Project Team suggests.

How To Prevent


The API life cycle should include:

  • A repeatable hardening process leading to fast and easy deployment of a properly locked down environment.
  • A task to review and update configurations across the entire API stack. The review should include: orchestration files, API components, and cloud services (e.g., S3 bucket permissions).
  • A secure communication channel for all API interactions, including access to static assets (e.g., images).
  • An automated process to continuously assess the effectiveness of the configuration and settings in all environments.


Furthermore: (From the project team)

  • To prevent exception traces and other valuable information from being sent back to attackers, if applicable, define and enforce all API response payload schemas including error responses.
  • Ensure API can only be accessed by the specified HTTP verbs. All other HTTP verbs should be disabled (e.g. HEAD).
  • APIs expecting to be accessed from browser-based clients (e.g., WebApp front-end) should implement a proper Cross-Origin Resource Sharing (CORS) policy.
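As a quick illustration of the HTTP verb advice above, most web frameworks let you declare exactly which methods an endpoint accepts and will reject everything else for you. Here’s a minimal sketch using Flask (the route and response are made up for illustration):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Only GET and POST are declared; PUT, DELETE, PATCH, etc. are answered
# by the framework with 405 Method Not Allowed. Note that Flask still
# auto-answers OPTIONS and HEAD for GET routes; disable those separately
# if your policy requires it.
@app.route("/api/widgets", methods=["GET", "POST"])
def widgets():
    if request.method == "POST":
        return jsonify({"status": "created"}), 201
    return jsonify({"widgets": []})

if __name__ == "__main__":
    app.run()
```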

OWASP References (The best kind of references!)

In the next blog post we will be talking about API8:2019 Injection.

API6:2019 Mass Assignment

In the previous post we covered API5:2019 Broken Function Level Authorization, which was the 5th post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization.

Tanya proudly displaying her Sec4Dev T-Shirt

This vulnerability is quite poorly named; it is not at all intuitive. When you first hear it, you might think “uh-oh, the API is called in a way where it mass assigns/creates/deletes records, but it should have only done it to one record”. That is not at all what this vulnerability is about, so please set that idea aside.

Mass assignment refers to linking the incorrect fields within a record in a data model to the parameters in an API, such that when the API is called a user is able to update fields they should not be allowed to. This only applies to one single record. I’m not sure where the word ‘mass’ comes into play here, but ‘incorrect assignment’ might make a bit more sense if you want to think of it that way.

This situation can happen when we bind data that was provided by the user (and is therefore not trustworthy) to data models, without checking whether those fields should be accessible to the user. Should the user be allowed to specify ‘role=admin’? Probably not. But if you bind the entire record (first name, last name, username, password, role), rather than just the properties the user is allowed to update, this can happen.

How does this happen?

Malicious actors can send a GET request to an API and see if it returns more parameters than just the ones it accepts on the PUT, UPDATE, DELETE or POST request. If the developer sends everything (the entire record) back to the front end, someone proxying the web app can easily see “Hey, look at all the fields here! This is way more than I asked for! What a gold mine!”. Obviously, we do not want this.

Let’s Avoid This, Shall We?

You should never be updating any data or making decisions in your system with values provided by the user until after you have validated that the data is trustworthy/what you are expecting. Validating your inputs is OWASP 101, we all already know this!

You do not need to call default ‘get’ or ‘set’ constructor functions directly from the API. You can write your own functions to be called instead, or have code in between the original call and the data that only passes the parameters you need, after you have validated the input you got from the user.
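Here’s a minimal sketch in Python of that allow-list idea, where only the fields the user is permitted to change ever reach the data model. The field names and the commented-out persistence call are hypothetical:

```python
# Fields a regular user is allowed to update on their own profile.
# Note that "role" and "id" are deliberately NOT in this list.
UPDATABLE_FIELDS = {"first_name", "last_name", "email"}

def build_safe_update(payload: dict) -> dict:
    """Keep only the allow-listed fields from the client-supplied payload."""
    return {k: v for k, v in payload.items() if k in UPDATABLE_FIELDS}

# An attacker tries to sneak in a privilege change alongside a normal edit:
incoming = {"first_name": "Tanya", "role": "admin", "id": 42}

safe = build_safe_update(incoming)
print(safe)  # {'first_name': 'Tanya'} -- the role and id fields never reach the model

# update_user(user_id=current_user.id, **safe)  # hypothetical persistence call
```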
 

Let’s hear more advice from the OWASP Project Team!

  • If possible, avoid using functions that automatically bind a client’s input into code variables or internal objects.
  • Create an approved list of only the properties that should be updated by the client, then add checks against that list when performing data updates.
  • Use built-in features to block properties that should not be accessed by clients.
  • If applicable, explicitly define and enforce schemas for the input data payloads.

External Reference from the OWASP API Top Ten Project Team

CWE-915: Improperly Controlled Modification of Dynamically-Determined Object Attributes

In the next blog post we will be talking about API7:2019 Security Misconfiguration!

API5:2019 Broken Function Level Authorization

In the previous post we covered API4:2019 Lack of Resources & Rate Limiting, which was the fourth post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization. You can read the official OWASP listing for this article here.

Image of Tanya commemorating the infamous "Ottawa Sinkhole Van"

_________________________________

Different users within an application are often given different roles within that system. Roles can include administrator, auditor, approver, editor, etc. Often those roles can be organized in a hierarchy, referred to as levels, with the idea of one role being a level above or below another. For instance, the administrative user is usually at the top of the hierarchy, then a regular user, and then at the bottom perhaps a guest user of the system.

Each different system role has different areas of the application’s functionality available to them, such as the ability to create or delete users, look through someone’s financial records, approve a new blog post, or edit a document that belongs to a team member. Each one of those features within an application is often called a function. When the application grants or denies access to features within itself, this is called ‘function level authorization’. It is (or is not) authorizing that user to access a specific function.

When we hear the term escalation or elevation of privilege, we mean that a user has moved up one or more levels in the hierarchy of the system. If this happens in a system, we generally consider this to be a serious (critical) vulnerability that we would work hard to mitigate.

When function level authorization is broken within an API, as is the case with item #5 on this top ten list, it means users are able to successfully call functions that their user role should not have access to. Imagine someone who is an unauthenticated user on your system using an API to call a function that only administrators should have access to; they could cause all sorts of damage, such as deleting users, changing other users’ privileges (perhaps locking out the real administrators), or worse. Depending upon what the API is used for, they could negatively affect the confidentiality, availability, and integrity of the system itself and its data. It could even negatively affect other systems that it interacts with. This is dangerous stuff.

What can we do about it?

How do we prevent this you might ask? Let’s go over a mix of our recommendations and those from the OWASP API Top Ten Project Team.

Tanya’s Advice

  • You must test every single HTTP method, for every function, for every user. I know, this doesn’t sound that exciting. But guess what? This is one of the best ways you can ensure you avoid this type of problem. Make a grid with user roles on one axis, and functions they are/are not allowed to access on the other. For each entry in your grid, go through all of the HTTP methods that are enabled on the server. Maybe get a nice beverage or snack, then put on some music, so this task can be at least somewhat enjoyable.
  • Every single function should check in with the authorization mechanism for your app before it does anything, to verify you are still you (authentication) and that you are allowed to use that function (authorization); see the sketch after this list. This is the same as every page within a web app/GUI front end. Make it a habit.
  • Use one authentication and authorization system for all your APIs, if possible. If you use multiple authorization controls, especially within the same API, it’s very easy to make a mistake. Simplify your architecture whenever possible, you will be a much happier developer.
  • Deny access to everything, for every role, by default. Specifically grant permission only to roles that require such access. Apply least privilege (all the time, whenever possible, not just in this situation!).
  • If possible, buy a system that will perform authorization for you, and implement it carefully, following the advice from its makers. Whenever possible, we should follow the order of buy -> borrow -> build. Buy a system that is tried, tested, and trusted. If you can’t buy, then borrow: use open source or a 3rd party component that implements this functionality. As a last resort, write this complex functionality yourself, with a plan for extensive testing and long-term maintenance, as this is a system that many other systems will depend upon. If you don’t think you can maintain something like this in the long term, then you should be buying or borrowing, not building.
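
Here is a minimal sketch of the “check in with the authorization mechanism in every function” idea, in Python. The permission map, the Forbidden exception, and the current_user dictionary (already verified by your authentication layer) are all hypothetical:

```python
from functools import wraps

# Hypothetical mapping of functions to the roles allowed to call them.
# Anything not listed here is denied by default.
PERMISSIONS = {
    "delete_user": {"admin"},
    "view_invoice": {"admin", "auditor"},
}

class Forbidden(Exception):
    """Raised when the caller's role is not allowed to use the function."""

def requires_permission(function_name: str):
    def decorator(func):
        @wraps(func)
        def wrapper(current_user, *args, **kwargs):
            allowed_roles = PERMISSIONS.get(function_name, set())  # deny by default
            if current_user.get("role") not in allowed_roles:
                raise Forbidden(f"Role {current_user.get('role')!r} may not call {function_name}")
            return func(current_user, *args, **kwargs)
        return wrapper
    return decorator

@requires_permission("delete_user")
def delete_user(current_user: dict, user_id: int):
    ...  # only reached if the authorization check above passed

# delete_user({"role": "auditor"}, user_id=42) would raise Forbidden.
```

The important property is the default deny: a new function that nobody remembered to add to the permission map cannot be called at all, rather than being wide open.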

The OWASP Project Team’s advice:

  • The enforcement mechanism(s) should deny all access by default, requiring explicit grants to specific roles for access to every function.
  • Review your API endpoints against function level authorization flaws, while keeping in mind the business logic of the application and groups hierarchy.
  • Make sure that all of your administrative controllers inherit from an administrative abstract controller that implements authorization checks based on the user’s group/role.
  • Make sure that administrative functions inside a regular controller implement authorization checks based on the user’s group and role.

Helpful Links from OWASP!

In the next blog post we will be talking about API6:2019 Mass Assignment!

API4:2019 Lack of Resources & Rate Limiting

In the previous post we covered API3:2019 Excessive Data Exposure, which was the third post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization.

You can read the official document from the OWASP Project team here.

Tanya on stage
Who here has perfectly secure APIs? What? No hands? – OWASP Global AppSec, Ireland, Feb 2023

Before diving into this one, I want to briefly discuss bots online.

An Internet bot, web robot, robot or simply bot, is a software application that runs automated tasks over the Internet, usually with the intent to imitate human activity on the Internet, such as messaging, on a large scale. – Wikipedia

While bots can be great (I used to have an automated message for all new twitter followers, to greet them and make them feel welcome), they can also be quite bad (slowly eating away at our defenses, with automated requests).

Bots are one of the quiet enemies of APIs. They hike up our cloud bills, they make our APIs seem unresponsive or slow, and they can be used to brute force an API that is not properly protected. Because APIs can do so many things, it is possible for them to eat up all sorts of resources on your network, such as CPU, storage, and memory on the host of the API, or on whatever the API is calling.

There are all sorts of ways that APIs can have their limits tested, including: uploading very large files or amounts of data, making several requests at once, requesting huge amounts of data (above what the system or supporting infrastructure can handle), etc.

Setting Boundaries

The OWASP API Security top ten team recommends setting limits on the following settings:

  • Execution timeouts
  • Max allocable memory
  • Number of file descriptors
  • Number of processes
  • Request payload size (e.g., uploads)
  • Number of requests per client/resource (this is also called resource quotas)
  • Number of records per page to return in a single request response (see the sketch after this list for examples of the last two items)
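
As a minimal sketch of the last two items, here is how request payload size and page size might be capped in a hypothetical Flask (2.x) API; the endpoint, the limits, and the fetch_widgets() helper are made up for illustration:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Reject request bodies larger than 1 MB; Flask answers with 413 automatically.
app.config["MAX_CONTENT_LENGTH"] = 1 * 1024 * 1024

MAX_PAGE_SIZE = 100  # hypothetical cap on records returned per page

@app.get("/widgets")
def list_widgets():
    requested = request.args.get("per_page", default=20, type=int)
    per_page = min(max(requested, 1), MAX_PAGE_SIZE)  # clamp what the client asked for
    widgets = fetch_widgets(limit=per_page)  # hypothetical data-access helper
    return jsonify(widgets)
```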

So how do we avoid this happening to our APIs?

  • We can set throttling limits, to slow down requests that all come from the same source (see the sketch after this list).
  • We can add resource quotas: limits on how many requests someone can make before they have to wait a period of time to start making requests again.
  • Docker containers have several options built in for adding the limits described earlier in this article, as suggested by the team that maintains this great OWASP project.
  • Send messages back to whoever is calling the API ‘too much’, informing them they’ve reached the limit, and that now they must wait.
  • Design your API to ensure it takes into account whether requests are “too big”. This is something threat modelling could help with, but ideally you would start by looking at each function and thinking about this as a problem your API will face at some point. Design with this in mind.
  • Also design into your API maximum amounts of data that it can accept and that it can return to the caller. This might mean breaking a large request into multiple responses or blocking it altogether. This is something you should talk to your team about, ideally during the requirements or design phase(s) of your project.
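
Here is a minimal sketch of the throttling/quota idea from the first two points, using a simple fixed-window counter in plain Python; the limits and the client key are made up for illustration:

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each client key."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._counters = defaultdict(lambda: [0, 0.0])  # key -> [count, window_start]

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        count, start = self._counters[key]
        if now - start >= self.window:
            self._counters[key] = [1, now]  # start a fresh window for this client
            return True
        if count < self.limit:
            self._counters[key][0] += 1
            return True
        return False  # caller should answer with HTTP 429 Too Many Requests

limiter = FixedWindowLimiter(limit=100, window_seconds=60)
if not limiter.allow("client-ip-or-api-key"):
    print("429 Too Many Requests - please wait before trying again")
```

In production you would more likely lean on an API gateway, a reverse proxy, or a maintained library for this, so the counters are shared across instances and survive restarts.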

They provide several very helpful resources, which you can find here:

OWASP Resources!

Even more resources!

In the next blog post we will be talking about API5:2019 Broken Function Level Authorization!

API3:2019 Excessive Data Exposure

In the previous post we covered API2:2019 Broken User Authentication, which was the second post in this series. If you want to start from the beginning, go to the first post, API1:2019 Broken Object Level Authorization.

#RSAC 2022

Excessive data exposure is something that web applications can face, not just APIs. That said, because web-based APIs are basically services on the web, they can be abused even more easily to exfiltrate sensitive data than a regular web app. It’s easy for an attacker to find APIs (just connect to any web app or mobile app using a web proxy and see the API calls for yourself!), to call them, and then look at the responses to see if anything being sent looks potentially sensitive. For instance, if the data in a field passed back is named “password”, “sin” or “secret”, you’re most likely onto something.

Using a web proxy to watch the API calls go back and forth is sometimes called “sniffing”, but no matter what you call it, it’s easy to do! Anyone with the tiniest amount of web-app hacking training can do this on day one. This means this threat is prevalent (happens all the time) and very dangerous (because unsophisticated attackers can easily execute it).

Some APIs are *supposed* to return sensitive data. This vulnerability is when sensitive data is exposed to someone it should not be exposed to (for instance, someone who is not a valid user seeing another user’s sensitive data, or a valid user whose role within the system means that specific data should not be shown to them). Since whether data is sensitive in nature is not obvious to automated testing tools, it can be a bit more difficult to identify than other types of vulnerabilities.

* Note: occasionally the vulnerability rears its head via poorly-generated and/or overly-populated responses. For instance, the API delivers the entire table worth of data, which includes sensitive information, but then the client-side front-end sifts through it and only reveals the non-sensitive/appropriate data to the end user. Unfortunately, if the API call is not encrypted in transit, this means a malicious actor could see all of the data if they were sniffing the API at the time.

How do we avoid this?

Let’s look at some great advice from the project team (I may have added a bit onto their list):

  • Never rely on the client side to filter sensitive data. By this we mean, only return the data you need to return! Don’t send a ton of stuff you do not need to, then let the GUI/front end decide what to show the user. Make these important decisions on the server.
  • Classify, then label, all your data. If you know immediately when you look at something that it is sensitive, treating it appropriately becomes automatic. Educate your developers and other areas of IT on how to classify data, and to ask the security team if they aren’t sure.
  • In the design of your API, add user stories and/or threat models around this potential vulnerability. Make protecting sensitive data part of your design.
  • Review the responses from the API to make sure they contain only legitimate data, data that the specific user (or users with that role inside your system) are allowed to access.
  • Back-end engineers should always ask themselves “who is the consumer of the data?” before exposing a new API endpoint. Or better yet, perform threat modelling on your data flows, THEN design.
  • Avoid using generic methods such as to_json() and to_string(); instead, cherry-pick the specific properties you really want to return (see the sketch after this list). You do not need to return everything. In fact, it’s better for your cloud bills to return only what you need, even if it requires a bit more programming.
  • Classify sensitive and personally identifiable information (PII) that your application stores and works with, reviewing all API calls returning such information to see if these responses pose a security issue.
  • Implement a schema-based response validation mechanism as an extra layer of security. As part of this mechanism define and enforce data returned by all API methods, including errors.
  • Perform strict linting on your API definition file, to ensure you have input validation built-in, by default, for every variable.
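
A minimal sketch of the cherry-picking idea, in Python; the record and field names are hypothetical, made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class UserRecord:
    # Hypothetical full database record.
    id: int
    email: str
    password_hash: str  # sensitive: must never leave the server
    sin: str            # sensitive: must never leave the server

def to_public_dict(record: UserRecord) -> dict:
    # Cherry-pick only the properties the caller is allowed to see, instead of
    # calling a generic to_json()/to_dict() on the whole record.
    return {"id": record.id, "email": record.email}

record = UserRecord(id=7, email="alice@example.com", password_hash="x", sin="x")
print(to_public_dict(record))  # {'id': 7, 'email': 'alice@example.com'}
```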

In the next blog post we will be talking about API4:2019 Lack of Resources & Rate Limiting!