If you are reading this post, then there’s a good chance you understand the need for security surrounding AI systems. As more and more development teams look at ways to increase the intelligence of their applications, the surrounding teams, including security teams, are struggling to keep up with this new paradigm and this trend is only accelerating.
Security leaders need to start bridging the gap and engaging with teams that are developing applications within the spectrum of AI. AI is transforming the world around us, not only for enterprises but also for society, which means failures have a much more significant impact.
Why you should care about AI Security
Everything about AI development and implementation is risky. We as a society tend to overestimate the capabilities of AI, and these misconceptions fuel modern AI product development, turning something understandable into something complex, unexplainable, and, worst of all, unpredictable.
AI is not magic. To take this a step further, what we have isn’t all that intelligent either. We have a brute force approach to development that we run many times until we get a result that we deem acceptable. It’s the kind of thing that if a human developer iterated that many times, you’d find unacceptable.
Failures with these systems happen every day. Imagine being denied acceptance to a university without explanation or being arrested for a crime you didn’t commit. These are real impacts felt by real people. The science fiction perspective of AI failures is a distraction to the practical applications deployed and used daily. Still, most AI projects never make it into production.
Governments are taking notice as well. A slew of new regulations, both proposed and signed into law, are coming out for regions all over the world. These regulations mandate controls and explainability, as well as an assessment of risk for these published models. Some of those, like the EU’s proposed legislation on harmonized AI, even have prohibitions built-in as well.
From a security perspective, ML/DL applications can’t solely be treated as traditional applications. AI systems are a mixture of traditional platforms and new paradigms requiring security teams to adapt when evaluating these systems. This adaptation requires more access to systems than they have had traditionally.
There are requirements for safety, security, privacy, and many others in the development space. Still, there seems to be confusion about who is responsible for what when it comes to AI development. In many cases, the security team, typically the best equipped to provide security feedback, aren’t part of the conversation.
What makes AI Risky?
When it comes to risk in AI, it’s a combination of factors coming together to create risk. It’s important to realize that developers aren’t purposefully trying to create systems that cause harm. So, if this is the case, let’s look at a few issues creating risk.
Poorly defined problems and goals
Problems can crop up at the very beginning of a project. Often, there is a push for differentiators in application development, and this push for more “intelligence” may come from upper management. This puts additional pressure on application developers to use the technology whether it is a good fit for the problem or not.
Not all problems are good candidates for technologies like machine learning and deep learning, but still, you hear comments such as “We have some data, let’s do something cool” or “let’s sprinkle some ML on it.” These statements are not legitimate goals for an ML project and may cause more harm than good.
Requests for this technology can lead to unintended, negative consequences and build up unnecessary technical debt. ML implementations are rarely set it and forget it type of systems. The data may change and shift over time as well as the problem the system was meant to solve.
The velocity of development isn’t a problem constrained to the development of machine learning systems. The speed at which systems are developed has been a problem for safety and security for quite some time. When you add the potential unpredictability of machine learning and deep learning applications, it adds an additional layer of risk.
Developers may be reluctant to follow additional risk and security guidelines due to the perception that they get in the way of innovation or throw up unnecessary roadblocks. This reluctance is regardless of any legal requirements to perform certain activities. This perception needs to be managed by risk management and security teams.
The mantra, “move fast and break things,” is a luxury you have when the cost of failure is low. Unfortunately, the cost of failure for many systems is increasing. Even if the risk seems low at first, these systems can become a dependency for a larger system.
Increased attack surface
Machine learning code is just a small part at the center of a larger ecosystem of traditional and new platforms. A model does nothing on its own and requires supporting architecture and processes, including sensors, IoT devices, APIs, data streams, backends, and even other machine learning models chained together, to name a few. These components connected and working together create the larger system, and attacks can happen at any exposure point along the way.
Lack of Understanding Around AI Risks
In general, there is a lack of understanding surrounding AI risks. This lack of understanding extends from the stakeholders down to the developers. A recent survey from FICO showed there was confusion about who was responsible for risk and security steps. In addition, these leaders also ranked a decision tree as a higher risk than an artificial neural network, even though a decision tree is inherently explainable.
If you’ve attended an AI developer conference or webcast, if risk is mentioned, they are talking about risks involved in developing AI applications, not risks to or from the AI applications. Governance is also discussed in terms of maximizing ROI and not in ensuring that critical steps are adhered to during the development of the system.
Supply Chain Issues
In the world of AI development, model reuse is encouraged. This reuse means that developers don’t need to start from scratch in their development processes. Pre-trained models are available on many different platforms, including Github, model zoos, and even from cloud providers. The issue to keep in mind is that you inherit all of the issues with the previous model. So, if that model has issues with bias, by reusing the model, you are amplifying the issue.
It’s also possible to have backdoors in models where a particular pattern could be shown to a model to have it take a different action when that pattern is shown. Previously, I wrote a blog post on creating a simple neural network backdoor.
A common theme running across issues relating to risk is lack of visibility. In the previous survey, 65% of Data and AI leaders responded that they couldn’t explain how a model makes decisions or predictions. That’s not good, considering regulations like GDPR have a right to explanation.
Other Characteristics Inherent to AI Applications
There are other characteristics specific to AI systems that can lead to increased exposure and risk.
Fragility. Not every attack on AI has to be cutting edge. Machine learning systems can be fragile and break under the best of conditions.
Lack of explicit programming logic. In a traditional application, you specify from start to finish the application’s behavior. With machine learning systems, you give it some data, and it learns based on what it sees in the data. It then uses what it learned to apply to future decisions.
Model uncertainty. Many of the world’s machine learning and deep learning applications only know what they’ve been shown. For example, ImageNet candidates only know the world by the thousand labeled categories it’s been shown. If you show one of the candidates something outside of those thousand things, it goes with the closest thing it knows.
AI Threat Landscape
Threats to AI systems are a combination of traditional and new attacks. Attacks against the infrastructure surrounding these systems are well known. But, with intelligent systems, there are also new vectors of attack that need to be accounted for during testing. These attacks would have inputs specifically constructed and focused on the model being attacked.
Types of Machine Learning Attacks & Their Impacts
In this section, I’ll spell out a couple of the most impactful attacks against machine learning systems. These attacks are not an exhaustive list but they cover a couple of the most interesting attacks relating to AI risk.
Model Evasion Attacks are a type of machine learning attack where an attacker feeds the model an adversarial input, purposefully perturbed for the given model. Based on this perturbed input, the model makes a different decision. You can think of this type of attack as having the system purposefully misclassify one image as another or classify a message as not being spam when it actually is.
Impact: The model logic is bypassed by an attacker.
Model Poisoning Attacks feed “poisoned” data to the system to change the decision boundary to apply to future predictions. In these attacks, attackers are intentionally provided bad data to the model to re-train it.
Impact: Attackers degrade a model by poisoning it, reducing accuracy and confidence.
Membership Inference Attacks are an attack on privacy rather than an attempt to exploit the model. Supervised learning systems tend to overfit to their training data. This overfitting can be harmful when you’re training on sensitive data because the model will be more confident about things it’s seen before than things it hasn’t seen before. This attack means an attacker could determine if someone was part of a particular training set which could reveal sensitive information about the individual.
Impact: Sensitive data is recovered from your published model.
Model Theft / Functional Extraction happens when an attacker creates an offline copy of your model and uses that to create attacks against the model. Then they can use what they’ve learned back against your production system.
Impact: Your model may be stolen or used to create more accurate attacks against it.
Model inaccuracies. Inaccuracies inherent to the model could cause harm to people or groups. Your model may have biases that cause people to be declined for a loan, denied entry to school, or even arrested for a crime they didn’t commit.
Impact: Your inaccurate model causes people harm.
An inaccurate model becomes the truth. To take the previous point to an extreme, imagine a model with inaccuracies is now used as a source of truth. This condition can be much harder to identify and may exist for a long time before being identified. Usage of this model would not allow people any recourse, because the source of the decision was the model.
Impact: Your inaccurate model causes systemic and possibly societal harm.
Where to go from here
Now that we’ve covered some of the issues, it’s time to start building up your defenses. I’ll cover that in another post. In the meantime, check out some of these AI security resources below.
Until recently, the regulation of AI was left up to the organizations developing the technology, allowing these organizations to apply their own judgment and ethical guidelines to the products and services they create. Although this is still widely true, it may be about to change. New regulations are on the horizon, and some already signed into law, mandating requirements that could mean costly fines for non-compliance. In this post, we look at some themes across this legislation and give you some tips to begin your preparation.
When you think of regulations surrounding AI, your mind probably wanders to the use of the technology in things like weapons systems or public safety. The fact of the matter is, harms from these systems extend far beyond these narrow categories.
Many developers are just using the tools available to them. Developers create experiments and evaluate the final result on a simple set of metrics and shipping to production if it meets a threshold. They aren’t thinking specifically about issues related to risk, safety, and security.
AI systems can be unpredictable, which is ironic since often you are using them to predict something. Why unpredictability surfaces is beyond the scope of this post, but it has to do with both technical and human factors.
We’ve had laws indirectly relating to regulations of AI for quite some time and probably haven’t realized it. Not all these regulations specifically spell out AI. They may be part of other consumer safety legislation. For example, the Fair Credit Reporting Act may come into play when making automated decisions about creditworthiness and dictate the data used. In the context of machine learning, this applies to the data used to train a system. So, current regulation that prohibits specific pieces of information such as protected classifications (race, gender, religion, etc.) from being used or prohibits specific practices, then that also applies to AI, whether it spells it out or not.
Governments and elected officials are waking up to the dangers posed by the harm resulting from the unpredictability of AI systems. One early indicator of this is in GDPR Recital 71. In summary, this is the right to explanation. If there is an automated process for determining whether someone gets a loan or not, a person denied has a right to be told why they were rejected. Hint, telling someone one of the neurons in your neural network found them unworthy isn’t an acceptable explanation.
Recently, the EU released a proposal specifically targeting AI systems and system development. This proposed legislation outlines requirements for high-risk systems as well as prohibitions on specific technologies, such as those meant to influence mood as well as ones that create grades like a social score.
Although the US tried to pass similar legislation called the Algorithmic Accountability Act, it did not pass. The US Government did, however, release a draft memo on the regulation of AI. This document covers the evaluation of risks as well as issues specific to safety and security.
The US legislation not passing doesn’t mean the individual US States aren’t taking action on this issue. One example of this is Virginia’s Consumer Data Protection Act.
This is far from an exhaustive list and one thing is for sure, more pieces of regulation are coming, and organizations need to prepare. In the short term, these regulations will continue to lack cohesion and focus and will be hard to navigate.
Even though the specifics of these regulations vary across the geographic regions, some high-level themes tie them together.
The overarching goal of regulation is to inform and hold accountable. These new regulations push the responsibility for these systems onto the creators. Acting irresponsibly or unethically will cost you.
Each regulation has a scope and doesn’t apply universally to all applications across the board. Some have a broad scope, and some are very narrow. They can also lack common definitions making it hard to determine if your application is in scope or not. Regulations may specifically call out a use case or may imply it through a definition of data protection. Most lawmakers aren’t technologists so expect differences across the various legislation you encounter and determine common themes.
Risk Assessments and Mitigations
A major theme of all the proposed legislation is understanding risk and providing mitigations. This assessment should evaluate both risks to and from the system. None of the regulations dictate a specific approach or methodology, but you’ll have to show that you evaluated risks and what steps you took to mitigate those risks. So, in simple terms, how would your system cause harm if it is compromised or fails, and what did you do about it?
Rules aren’t much good without validation. You’ll have to provide proof of the steps you took to protect your systems. In some cases, this may mean algorithmic verification by providing ongoing testing. The output of the testing could be proof you show to the auditor.
Simply put, why did your system make the decision it did? What factors lead to the decision? Explainability also plays a role outside of regulation. Coming up with the right decision isn’t good enough. When your systems lack explainability, they may make the right decision but for the wrong reason. Based on issues with data, the system may “learn” a feature that has high importance but, in reality, isn’t relevant.
How Can Companies Prepare?
The time to start preparing is now, and you can use the themes of current and proposed regulation as a starting point. It will take some time, depending on your organization and the processes and culture currently in place.
AI Strategy and Governance
A key foundation in compliance is the implementation of a strategy and governance program tailored to AI. An AI strategy and governance program allows organizations to implement specific processes and controls and audit compliance.
This program will affect multiple stakeholders, so it shouldn’t be any single person’s sole responsibility. Assemble a collection of stakeholders into an AI governance working group and, at a minimum, include members from the business, development, and security team.
You can’t prepare or protect what you don’t know. Taking and maintaining a proper inventory of AI projects and their criticality levels to the business is a vital first step. Business criticality levels can feed into other phases, such as risk assessments. A byproduct of the inventory is that you communicate with the teams developing these systems and gain feedback for your AI strategy.
Implement Threat and Risk Assessments
A central theme across all of the new regulations is the specific calling out of risk assessments. Implementing an approach where you evaluate both threats and risks will give you a better picture of the protection mechanisms necessary to protect the system and mitigate potential risks and abuses.
At Kudelski Security we have a simple approach for evaluating threats and risks to AI systems consisting of five phases. This approach provides tactical feedback to stakeholders for quick mitigation.
KS AI Threat and Risk Assessment
If you are looking for a quick gut check on the risk of the system, ask a couple of questions.
- What does the system do?
- Does it support a critical business process?
- Was it trained on sensitive data?
- How exposed is the system going to be?
- What would happen if the system failed?
- Could the system be misused?
- Does it fall under any regulatory compliance?
If you would like to dive deeper, check out a webcast I did for Black Hat called Preventing Random Forest Fires: AI Risk and Security First Steps.
Develop AI Specific Testing
Testing and validation of systems implementing machine learning and deep learning technology require different approaches and tooling. An AI system combines traditional and non-traditional platforms, meaning that standard security tooling won’t be effective across the board. However, depending on your current tooling and environment, standard tooling could be a solid foundation.
Security testing for these systems should be more cooperative than some of the more traditional adversarial approaches. Testing should include working with developers to get more visibility and creating a security pipeline to test attacks and potential mitigations.
It may be better to think of security testing in the context of AI more as a series of experiments than as a one-off testing activity. Experiments from both testing and proposed protection mechanisms can be done alongside the regular development pipeline and integrated later. AI attack and defense is a rapidly evolving space, so having a separate area to experiment apart from the production pipeline ensures that experimentation can happen freely without affecting production.
Models aren’t useful on their own. They require supporting infrastructure and may be distributed across many devices. This distribution is why documentation is critical. Understanding data usage and how all of the components work together allows for a better determination of the threats and risks to your systems.
Focus on explainability
Explainability, although not always called out in the legislation, is implied. After all, you can’t tell someone why they were denied a loan if you don’t have an explanation from the system. Explainability is important in a governance context as well. Ensuring you are making the right decision for the right reasons is vital for the normal operation of a system.
Some models are more explainable than others. When performing benchmarking for model performance, it’s a good idea to benchmark the model against a simpler, more explainable model. The performance may not be that different and what you get in return is something more predictable and explainable.
Move fast and break things is a luxury you can afford when the cost of failure is low. More and more machine learning is making its way into high-risk systems. By implementing a strategy and governance program and AI-specific controls, you can reduce your risk and attack surface and comply with regulations. Win-win.
Artificial intelligence (AI) is being discussed quite a bit, in fact, maybe the term is used too much. After all, it means different things in different situations, and vendors are using the term particularly loosely to tie into a hot market and hopefully sell more product. When it comes to information and cyber security – a realm so vital to a company’s reputation – there’s a need to see through the hype and ask the questions that really matter.
For starters, to me, the main question is not whether AI will find its way in our daily life, but what will it mean to us as cybersecurity professional? What are the security risks when adopted by various business lines for different purposes? What are the benefits to our profession? And what are the risks of not considering the opportunity of AI to help us do our job better – or of failing to monitor closely what the business will use it for?
Like so many technology disruptions, AI will change part of the business landscape and it will also shape our own cybersecurity backyard. The logic is implacable, when there is a business opportunity, there are investments to be made and AI presents potential across many aspects of our modern life.
In its simplest essence, AI perceives and processes information from its environment and can use incomplete data to maximize the success rate of completing certain tasks. As you can guess, I just described tons of activities that we as human do every day in the medical, financial, industrial, psychological, analytical and military sectors.
At the moment, I don’t think we should overly focus on its potential to replace cognitive beings. Instead, we should appreciate that AI can leverage broader data input, discover patterns we can’t easily distinguish and is capable of sorting and matching data much faster than we can. Moreover, it never gets tired and can work 24×7. Ultimately, this will result in potentially better and faster decision making, where emotions or artistic sense might not be the primary quality by which we measure output.
That said, all of AI is not “rosy,” and when matched with robotics, it can be the stuff of nightmares. AI comes with challenges, and while it can autonomously trigger actions based on an algorithmic logic, the logic must be flawless. If not, it will create a lot of “mistakes” and very fast. The necessary algorithms rely on data, hence input quality must be tightly controlled, otherwise, garbage-in, garbage-out, right? So, it’s imperative organizations decide what should and shouldn’t be automated, and it’s an approach that needs to be validated by humans first. AI strategy done well can effectively address a skill shortage, but done wrong and with a “set and forget” mentality, it’ll backfire.
Still, keep in mind that AI can also reflect some of the flaws of its creator. Because humans come with their fair share of challenges, let’s focus on two examples.
Trust, either the lack of or too much of it, can make AI react in a way we did not foresee. Sometimes, emotionless decision-making might be best, sometimes not. The more AI we create, the more we will need to deploy a transposable trust model for this community to interact with each other. After all, in the AI world, there is often little-to-no space for human interference if you want to capture its full benefit.
Transparency is another issue. As a society, we are not ready to entrust to machines many of the things we currently make decisions on – and security is a particularly sensitive area. Without transparency and accountability of the AI, how can we start tackling the notion of responsibility when something goes wrong? And, we must consider at what point will use of AI be mandatory under certain conditions? Will tomorrow’s doctor be personally liable for not using an AI and misdiagnosing potentially cancerous cells construction? As happens with humans, what if the physician simply failed to update a codebase?
It’s no secret that there’s is a lack of qualified security personnel today. That said, I feel it is our responsibility to explore ways to use AI as soon as possible in order to remove any item from our task list that can be automated. As a rule of thumb, I think of the Pareto Principal – AI should do the 80 percent of the job so we can focus on the 20 percent where human interaction and decision-making is a must.
Pareto likely never saw this coming, yet, the formula applies to our profession. AI could allow us to free up time and deliver more value with the same salary cost structure.
And believe me, we will need time because part of that 20 percent will be required to analyze the new business risks of using AI in the real world, one fraught with real and increasing security challenges.