AI Red Team – NVIDIA Technical Blog

AI Red Team – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-07-10T19:19:07Z http://www.open-lab.net/blog/feed/ Joseph Lucas <![CDATA[Structuring Applications to Secure the KV Cache]]> http://www.open-lab.net/blog/?p=99425 2025-05-15T19:08:32Z 2025-04-29T22:43:01Z

When interacting with transformer-based models like large language models (LLMs) and vision-language models (VLMs), the structure of the input shapes the...]]>

When interacting with transformer-based models like large language models (LLMs) and vision-language models (VLMs), the structure of the input shapes the...

Structuring Applications to Secure the KV Cache blog

When interacting with transformer-based models like large language models (LLMs) and vision-language models (VLMs), the structure of the input shapes the model��s output. But prompts are often more than a simple user query. In practice, they optimize the response by dynamically assembling data from various sources such as system instructions, context data, and user input.

]]> 0 Leon Derczynski <![CDATA[Defining LLM Red Teaming]]> http://www.open-lab.net/blog/?p=96239 2025-04-23T02:37:15Z 2025-02-25T18:49:26Z

There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to...]]>

There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to... Decorative image.

Decorative image.

There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to deviate from acceptable standards. This use of LLMs began in 2023 and has rapidly evolved to become a common industry practice and a cornerstone of trustworthy AI. How can we standardize and define LLM red teaming?

]]> 0 Rich Harang <![CDATA[Agentic Autonomy Levels and Security]]> http://www.open-lab.net/blog/?p=96341 2025-04-23T02:36:53Z 2025-02-25T18:45:05Z

Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable...]]>

Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable... Decorative image.

Decorative image.

Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable AI models to use tools to access additional data or automate user actions, and enable AI models to operate autonomously, analyzing and performing complex tasks with a minimum of human involvement or interaction. Because of their power��

]]> 0 Joseph Lucas <![CDATA[Sandboxing Agentic AI Workflows with WebAssembly]]> http://www.open-lab.net/blog/?p=93975 2024-12-16T21:06:56Z 2024-12-16T20:33:46Z

Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this...]]>

Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this...

cybersecurity-graphic

Agentic AI workflows often involve the execution of large language model (LLM)-generated code to perform tasks like creating data visualizations. However, this code should be sanitized and executed in a safe environment to mitigate risks from prompt injection and errors in the returned code. Sanitizing Python with regular expressions and restricted runtimes is insufficient��

]]> 0 Becca Lynch <![CDATA[NVIDIA Presents AI Security Expertise at Leading Cybersecurity Conferences]]> http://www.open-lab.net/blog/?p=89054 2024-09-19T19:29:43Z 2024-09-18T17:03:46Z

Each August, tens of thousands of security professionals attend the cutting-edge security conferences Black Hat USA and DEF CON. This year, NVIDIA AI security...]]>

Each August, tens of thousands of security professionals attend the cutting-edge security conferences Black Hat USA and DEF CON. This year, NVIDIA AI security...

ai-security-graphic

Each August, tens of thousands of security professionals attend the cutting-edge security conferences Black Hat USA and DEF CON. This year, NVIDIA AI security experts joined these events to share our work and learn from other members of the community. This post provides an overview of these contributions, including a keynote on the rapidly evolving AI landscape��

]]> 0 Joseph Lucas <![CDATA[Defending AI Model Files from Unauthorized Access with Canaries]]> http://www.open-lab.net/blog/?p=85254 2025-02-04T19:45:15Z 2024-07-11T19:06:21Z

As AI models grow in capability and cost of creation, and hold more sensitive or proprietary data, securing them at rest is increasingly important....]]>

As AI models grow in capability and cost of creation, and hold more sensitive or proprietary data, securing them at rest is increasingly important.... An illustration showing a securit alert.

An illustration showing a securit alert.

As AI models grow in capability and cost of creation, and hold more sensitive or proprietary data, securing them at rest is increasingly important. Organizations are designing policies and tools, often as part of data loss prevention and secure supply chain programs, to protect model weights. While security engineering discussions focus on prevention (How do we prevent X?), detection (Did X��

]]> 1 Joseph Lucas <![CDATA[Secure LLM Tokenizers to Maintain Application Integrity]]> http://www.open-lab.net/blog/?p=84504 2024-07-10T15:28:33Z 2024-06-27T18:00:00Z

This post is part of the NVIDIA AI Red Team��s continuing vulnerability and technique research. Use the concepts presented to responsibly assess and increase...]]>

This post is part of the NVIDIA AI Red Team��s continuing vulnerability and technique research. Use the concepts presented to responsibly assess and increase...

digital-field

This post is part of the NVIDIA AI Red Team��s continuing vulnerability and technique research. Use the concepts presented to responsibly assess and increase the security of your AI development and deployment processes and applications. Large language models (LLMs) don��t operate over strings. Instead, prompts are passed through an often-transparent translator called a tokenizer that creates an��

]]> 0 Rich Harang <![CDATA[Best Practices for Securing LLM-Enabled Applications]]> http://www.open-lab.net/blog/?p=73609 2024-07-08T20:07:28Z 2023-11-15T18:00:00Z

Large language models (LLMs) provide a wide range of powerful enhancements to nearly any application that processes text. And yet they also introduce new risks,...]]>

Large language models (LLMs) provide a wide range of powerful enhancements to nearly any application that processes text. And yet they also introduce new risks,...

security-graphic

Large language models (LLMs) provide a wide range of powerful enhancements to nearly any application that processes text. And yet they also introduce new risks, including: This post walks through these security vulnerabilities in detail and outlines best practices for designing or evaluating a secure LLM-enabled application. Prompt injection is the most common and well-known��

]]> 0 Will Pearce <![CDATA[NVIDIA AI Red Team: Machine Learning Security Training]]> http://www.open-lab.net/blog/?p=71491 2024-07-08T20:05:26Z 2023-10-19T20:26:15Z

At Black Hat USA 2023, NVIDIA hosted a two-day training session that provided security professionals with a realistic environment and methodology to explore the...]]>

At Black Hat USA 2023, NVIDIA hosted a two-day training session that provided security professionals with a realistic environment and methodology to explore the... Picture of the ML security training classroom at Black Hat USA

Picture of the ML security training classroom at Black Hat USA

At Black Hat USA 2023, NVIDIA hosted a two-day training session that provided security professionals with a realistic environment and methodology to explore the unique risks presented by machine learning (ML) in today��s environments. In this post, the NVIDIA AI Red Team shares what was covered during the training and other opportunities to continue learning about ML security.

]]> 5 Joseph Lucas <![CDATA[Analyzing the Security of Machine Learning Research Code]]> http://www.open-lab.net/blog/?p=71113 2024-07-08T21:33:52Z 2023-10-04T18:00:00Z

The NVIDIA AI Red Team is focused on scaling secure development practices across the data, science, and AI ecosystems. We participate in open-source security...]]>

The NVIDIA AI Red Team is focused on scaling secure development practices across the data, science, and AI ecosystems. We participate in open-source security...

man-with-laptop

The NVIDIA AI Red Team is focused on scaling secure development practices across the data, science, and AI ecosystems. We participate in open-source security initiatives, release tools, present at industry conferences, host educational competitions, and provide innovative training. Covering 3 years and totaling almost 140GB of source code, the recently released Meta Kaggle for Code dataset is��

]]> 2 Joseph Lucas <![CDATA[Mitigating Stored Prompt Injection Attacks Against LLM Applications]]> http://www.open-lab.net/blog/?p=68917 2024-11-20T23:04:08Z 2023-08-04T16:05:50Z

Prompt injection attacks are a hot topic in the new world of large language model (LLM) application security. These attacks are unique due to how ?malicious...]]>

Prompt injection attacks are a hot topic in the new world of large language model (LLM) application security. These attacks are unique due to how ?malicious...

abstract-graphic

Prompt injection attacks are a hot topic in the new world of large language model (LLM) application security. These attacks are unique due to how ?malicious text is stored in the system. An LLM is provided with prompt text, and it responds based on all the data it has been trained on and has access to. To supplement the prompt with useful context, some AI applications capture the input from��

]]> 0 Rich Harang <![CDATA[Securing LLM Systems Against Prompt Injection]]> http://www.open-lab.net/blog/?p=68819 2024-07-08T20:08:30Z 2023-08-03T18:43:12Z

Prompt injection is a new attack technique specific to large language models (LLMs) that enables attackers to manipulate the output of the LLM. This attack is...]]>

Prompt injection is a new attack technique specific to large language models (LLMs) that enables attackers to manipulate the output of the LLM. This attack is...

security-graphic

Prompt injection is a new attack technique specific to large language models (LLMs) that enables attackers to manipulate the output of the LLM. This attack is made more dangerous by the way that LLMs are increasingly being equipped with ��plug-ins�� for better responding to user requests by accessing up-to-date information, performing complex calculations, and calling on external services through��

]]> 0 Will Pearce <![CDATA[NVIDIA AI Red Team: An Introduction]]> http://www.open-lab.net/blog/?p=66214 2024-07-08T20:06:41Z 2023-06-14T22:00:16Z

Machine learning has the promise to improve our world, and in many ways it already has. However, research and lived experiences continue to show this technology...]]>

Machine learning has the promise to improve our world, and in many ways it already has. However, research and lived experiences continue to show this technology... Two men working at a desktop computer in an office.

Two men working at a desktop computer in an office.

Machine learning has the promise to improve our world, and in many ways it already has. However, research and lived experiences continue to show this technology has risks. Capabilities that used to be restricted to science fiction and academia are increasingly available to the public. The responsible use and development of AI requires categorizing, assessing, and mitigating enumerated risks where��

]]> 0 Joseph Lucas <![CDATA[Evaluating the Security of Jupyter Environments]]> http://www.open-lab.net/blog/?p=60938 2024-07-09T15:25:19Z 2023-02-13T20:30:00Z

How can you tell if your Jupyter instance is secure? The NVIDIA AI Red Team has developed a JupyterLab extension to automatically assess the security of Jupyter...]]>

How can you tell if your Jupyter instance is secure? The NVIDIA AI Red Team has developed a JupyterLab extension to automatically assess the security of Jupyter...

computer-jupyter-code

How can you tell if your Jupyter instance is secure? The NVIDIA AI Red Team has developed a JupyterLab extension to automatically assess the security of Jupyter environments. jupysec is a tool that evaluates the user��s environment against almost 100 rules that detect configurations and artifacts that have been identified by the AI Red Team as potential vulnerabilities, attack vectors��

]]> 0 Joseph Lucas <![CDATA[Improving Machine Learning Security Skills at a DEF CON Competition]]> http://www.open-lab.net/blog/?p=57692 2024-07-09T16:36:32Z 2022-11-30T21:00:00Z

Machine learning (ML) security is a new discipline focused on the security of machine learning systems and the data they are built upon. It exists at the...]]>

Machine learning (ML) security is a new discipline focused on the security of machine learning systems and the data they are built upon. It exists at the... Letters, numbers, and padlocks on black background

Letters, numbers, and padlocks on black background

Machine learning (ML) security is a new discipline focused on the security of machine learning systems and the data they are built upon. It exists at the intersection of the information security and data science domains. While the state-of-the-art moves forward, there is no clear onboarding and learning path for securing and testing machine learning systems. How, then��

]]> 0 ��˳��97caoporen��