News, news analysis, and commentary on the latest trends in cybersecurity technology.
Microsoft Beefs Up Defenses in Azure AI
Microsoft adds tools to protect Azure AI from threats such as prompt injection, as well as to give developers the capabilities to ensure generative AI apps are more resilient to model and content manipulation attacks.
April 1, 2024
Microsoft has announced several new capabilities in Azure AI Studio that the company says should help developers build generative artificial intelligence (GenAI) apps that are more reliable and resilient against malicious model manipulation and other emerging threats.
In a March 29 blog post, Microsoft's chief product officer of responsible AI, Sarah Bird, pointed to growing concerns about threat actors using prompt injection attacks to get AI systems to behave in dangerous and unexpected ways as the primary driving factor for the new tools.
"Organizations are also concerned about quality and reliability," Bird said. "They want to ensure that their AI systems are not generating errors or adding information that isn’t substantiated in the application’s data sources, which can erode user trust."
Azure AI Studio is a hosted platform that organizations can use to build custom AI assistants, copilots, bots, search tools and other applications, grounded in their own data. Announced in November, the platform hosts Microsoft's machine learning models, as well as models from several other sources, including OpenAI, Meta, Hugging Face, and Nvidia. It allows developers to quickly integrate multimodal capabilities and responsible AI features into their models.
Other major players, such as Amazon and Google, have rushed to market with similar offerings over the past year to tap into the surging interest in AI technologies worldwide. A recent IBM-commissioned study found that 42% of organizations with more than 1,000 employees are already actively using AI in some fashion, with many of them planning to increase and accelerate investments in the technology over the next few years. And not all of them were telling IT beforehand about their AI use.
Protecting Against Prompt Engineering
The five new capabilities that Microsoft has added — or will soon add — to Azure AI Studio are Prompt Shields, groundedness detection, safety system messages, safety evaluations, and risk and safety monitoring. The features are designed to address some significant challenges that researchers have uncovered recently — and continue to uncover on a routine basis — with regard to the use of large language models (LLMs) and GenAI tools.
Prompt Shields, for instance, is Microsoft's mitigation for what are known as indirect prompt attacks and jailbreaks. The feature builds on existing mitigations in Azure AI Studio against jailbreak risk. In prompt-engineering attacks, adversaries use prompts that appear innocuous and not overtly harmful to try and steer an AI model into generating harmful and undesirable responses. Prompt engineering is among the most dangerous in a growing class of attacks that try to jailbreak AI models or get them to behave in a manner that is inconsistent with any filters and constraints that developers might have built into them.
Researchers have recently shown how adversaries can engage in prompt-engineering attacks to get GenAI models to spill their training data, spew out personal information, generate misinformation, and potentially harmful content, such as instructions on how to hot-wire a car.
With Prompt Shields, developers can integrate capabilities into their models that help distinguish between valid and potentially untrustworthy system inputs, set delimiters to help mark the beginning and end of input text, and use data marking to mark input texts. Prompt Shields is currently available in preview mode in Azure AI Content Safety and will become generally available soon, according to Microsoft.
Mitigations for Model Hallucinations and Harmful Content
With groundedness detection, Microsoft has added a feature to Azure AI Studio that it says can help developers reduce the risk of their AI models "hallucinating." Model hallucination is a tendency by AI models to generate results that appear plausible but are completely made up and not based — or grounded — on the training data. LLM hallucinations can be hugely problematic if an organization were to take the output as factual and act on it in some way. In a software development environment, for instance, LLM hallucinations could result in developers potentially introducing vulnerable code into their applications.
Azure AI Studio's new groundedness-detection capability is basically about helping detect — more reliably and at greater scale — potentially ungrounded GenAI outputs. The goal is to give developers a way to test their AI models against what Microsoft calls "groundedness metrics" before deploying the model into their products. The feature also highlights potentially ungrounded statements in LLM outputs, so users know to fact check the output before using it. Groundedness detection should be available in the near future, according to Microsoft.
The new system message framework offers a way for developers to clearly define their models' capabilities, profile, and limitations in their specific environments. Developers can use the capability to define the format of the output and provide examples of intended behavior, so it becomes easier for users to detect deviations from intended behavior. The feature isn't available yet but should be soon.
Azure AI Studio's newly announced safety evaluations capability and its risk and safety monitoring feature are both currently available in preview status. Organizations can use the former to assess the vulnerability of their LLM models to jailbreak attacks and generate unexpected content. The risk and safety monitoring capability allows developers to detect model inputs that are problematic and likely to trigger hallucinated or unexpected content, so they can implement mitigations against it.
"Generative AI can be a force multiplier for every department, company, and industry," Microsoft's Bird said. "At the same time, foundation models introduce new challenges for security and safety that require novel mitigations and continuous learning."
About the Author
You May Also Like