The top 10 AI security articles you must read in 2024

We've curated a collection of 10 AI security articles that cover novel threats to AI models as well as strategies for developers to safeguard their models.

4 minutes read

Large Language Models (LLMs) such as ChatGPT have become all the rage. This has raised concerns over AI security. In particular, what are some common AI vulnerabilities and emerging threats? In addition, how can developers keep their models secure, and in what ways can security experts use AI to perform security tasks? 

To shed light on these questions, we've curated a collection of 10 AI security articles for you to read. These pieces offer insights into the latest trends, challenges, and threats in the realm of AI security. 

1. ChatGPT's training data can be exposed via a "divergence attack" 

https://stackdiary.com/chatgpts-training-data-can-be-exposed-via-a-divergence-attack/

Researchers (Nasr et al.) asked ChatGPT to repeat “poem” forever. The result: ChatGPT revealed its training data, which included personally identifiable information (PII) such as phone numbers, email addresses, and physical addresses. 

This article by Alex Ivanovs provides a high-level summary of the research paper. It also covers some of the technical details in a digestible manner. 

2. Adversarial Machine Learning and cybersecurity: risks, challenges, and legal Implications 

https://cset.georgetown.edu/publication/adversarial-machine-learning-and-cybersecurity/

Just as with traditional cybersecurity, machine learning models are prone to vulnerabilities. 

In July 2022, Georgetown University’s CSET (Center for Security and Emerging Technology) and Stanford Cyber Policy Center’s Program on Geopolitics, Technology, and Governance organized a workshop on AI security. What came next was a workshop report that discussed AI vulnerabilities and provided security recommendations. Per the executive summary, the topics covered included: 

  • The extent to which AI vulnerabilities can be handled under standard cybersecurity processes 

  • The barriers currently preventing the accurate sharing of information about AI vulnerabilities 

  • Legal issues associated with adversarial attacks on AI systems 

  • Potential areas where government support could improve AI vulnerability management and mitigation  

3. Llama Guard: LLM-based input-output safeguard for human-AI conversations 

https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/ 

How do you protect Large Language Models (LLMs) from harmful prompts? Moreover, how do you ensure that LLMs don’t respond harmfully? Researchers at Meta (Inan et al.) recently published a model — Llama Guard — that can identify whether a prompt or response is unsafe. It can be applied to both users (prompts) and agents (responses); its weights are public; and it can be customized to follow any safety taxonomy.  

Learn about how to leverage AI-powered security in your organization

4. Analyzing the security of machine learning research code 

https://developer.nvidia.com/blog/analyzing-the-security-of-machine-learning-research-code/ 

AI isn’t just prone to AI vulnerabilities; machine learning code can contain traditional issues. Analyzing a dataset of machine learning research code, NVIDIA AI Red Team found various vulnerabilities, including Insecure Deserialization, XML Injection, Mishandled Sensitive Information, and more. In total, they reviewed 3.5 million Python files and Jupyter Notebooks, using Semgrep for static analysis and TruffleHog for secrets scanning. 

5. Researchers discover new vulnerability in large language models 

https://www.cylab.cmu.edu/news/2023/07/24-research-find-vulnerability-in-llms.html 

Large Language Models (LLMs) are fine-tuned to reject malicious or dangerous prompts. However, one of the earliest attacks against LLMs was jailbreaking: by crafting a special prompt, an attacker could ask an LLM to process malicious or dangerous prompts. 

 Those jailbreaks can now be algorithmically generated. The above article by Ryan Noone summarizes a new jailbreak suffix attack introduced by Zou et al.’s paper, “Universal and Transferable Adversarial Attacks on Aligned Language Models.” Best part: the attack is universal across LLMs. 

6. OWASP top 10 for LLM applications 

https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-2023-v1_1.pdf 

Similar to its Top 10 Web Application Security Risks, OWASP has released a top 10 list for Large Language Model applications. It includes attacks such as Prompt Injection, Insecure Output Handling, Training Data Poisoning, and more. Furthermore, the document discusses each attack and provides descriptions, common examples, mitigation strategies, and attack scenarios. 

7. ChatGPT plugin exploit explained: from prompt injection to accessing private data 

https://embracethered.com/blog/posts/2023/chatgpt-cross-plugin-request-forgery-and-prompt-injection./ 

Have you heard of Cross Plugin Request Forgery? In his article, Johann Rehberger presents the Cross Plugin Request Forgery attack against ChatGPT plugins. A malicious prompt (injected through a website, for instance) can invoke authenticated ChatGPT plugins and perform privileged actions on the behalf of the user. In effect, it's a Cross-Site Request Forgery (CSRF) attack against Large Language Models (LLMs). 

 (Johann Rehberger also gave an excellent talk on prompt injections at Ekoparty 2023 — watch it here). 

8. Biden's AI Executive Order: what it says, and what it means for security teams 

https://www.wiz.io/blog/bidens-ai-executive-order-what-it-means-for-security-teams 

 On October 30, 2023, Joe Biden issued an Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Many of the executive order’s points had implications for privacy and security teams. 

Writing for Wiz’s blog, Joseph Thacker investigates Biden's AI executive order and explains its key points and themes. He also discusses the executive order’s implications for security teams and what they will need to do. 

9. Introducing Fuzzomatic: wsing AI to automatically fuzz rust projects from scratch 

https://research.kudelskisecurity.com/2023/12/07/introducing-fuzzomatic-using-ai-to-automatically-fuzz-rust-projects-from-scratch/ (recently featured in the tl;dr sec newsletter). 

Large Language Models (LLMs) could be used for vulnerability discovery. Researchers at Kudelski Security queried ChatGPT’s API to create fuzz targets — functions that accept data and test some API — to fuzz Rust programs. They found 14 bugs in 37 of the top 50 most starred GitHub projects written in Rust. Most notably, they uncovered 4 integer overflow vulnerabilities. 

10.  WormGPT: what to know about ChatGPT's malicious cousin 

https://www.zdnet.com/article/wormgpt-what-to-know-about-chatgpts-malicious-cousin/ 

Malicious actors are using WormGPT and FraudGPT as unrestricted alternatives to ChatGPT. Prompting those models, attackers can generate malicious code, write phishing emails, or uncover vulnerabilities. This presents a new problem that will only become more prevalent: chatbots for malicious actors. Read more in the above article by Charlie Osborne.

Learn more!

Get a head start on building your AI expertise while it's still early in the year: learn about how Wiz was the first CNAP tool to provide AI security posture management and about how we advise choosing an AI-SPM tool. Happy new year, and happy learning!

Continue reading

Get a personalized demo

Ready to see Wiz in action?

“Best User Experience I have ever seen, provides full visibility to cloud workloads.”
David EstlickCISO
“Wiz provides a single pane of glass to see what is going on in our cloud environments.”
Adam FletcherChief Security Officer
“We know that if Wiz identifies something as critical, it actually is.”
Greg PoniatowskiHead of Threat and Vulnerability Management