Hacking AI Agents—How Malicious Images and Pixel Manipulation Threaten Cybersecurity

Artificial-intelligence agents, touted as the next wave of AI's revolution, could be vulnerable to malicious code hidden in innocent-looking images on your computer screen. Researchers at the University of Oxford have discovered that compromised images can take over AI agents running on a user's computer and use them to share or destroy sensitive information.

Imagine a website announcing "Free celebrity wallpaper!" You browse the images, and there's Selena Gomez, Rihanna, and Timothée Chalamet – but you settle on Taylor Swift. Her hair is doing that wind-machine thing that suggests both destiny and good conditioner. You set it as your desktop background, admire the glow. You also recently downloaded a new artificial-intelligence-powered agent, so you ask it to tidy your inbox. Instead, it opens your web browser and downloads a file. Seconds later, your screen goes dark.

The researchers found that images – desktop wallpapers, ads, fancy PDFs, social media posts – can be implanted with messages invisible to the human eye but capable of controlling agents and inviting hackers into your computer. Study co-author Yarin Gal, an associate professor of machine learning at Oxford, explains that a sabotaged image "can actually trigger a computer to retweet that image and then do something malicious, like send all your passwords."

The new study shows that altered images are a potential way to compromise your computer, though there are no known reports of it happening outside an experimental setting. But the finding is clear: the danger is real, and users and developers of AI agents must be aware of these vulnerabilities.

How It Works

The study's lead author, Lukas Aichberger, explains that the manipulation works by tweaking pixels in ways too subtle for human eyes to notice. A computer processes visual information as numbers, breaking an image into pixels and representing each dot of color numerically. Changing just a few of those values – imperceptibly to the naked eye – can throw off the numerical patterns a model relies on.
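The core idea can be sketched in a few lines of code. The toy example below (illustrative only, not the researchers' method; a real attack would optimize the perturbation against a specific model's internals rather than use random noise) nudges every pixel of an image by at most two intensity levels out of 255 – a change far too small for a person to see on screen, yet enough to alter the numbers a model reads:

```python
import numpy as np

def perturb(image: np.ndarray, epsilon: float = 2.0) -> np.ndarray:
    """Add noise bounded by +/- epsilon (on a 0-255 scale) to each pixel.

    A real adversarial attack would choose the perturbation deliberately,
    e.g. by following the model's gradients; random noise here just shows
    how small the pixel changes can be.
    """
    rng = np.random.default_rng(0)
    noise = rng.uniform(-epsilon, epsilon, size=image.shape)
    return np.clip(image + noise, 0, 255)

# A stand-in 64x64 grayscale "wallpaper" (hypothetical data).
wallpaper = np.full((64, 64), 128.0)
altered = perturb(wallpaper)

# Every pixel moved by at most 2 levels out of 255 -- invisible to the eye,
# but a different set of numbers for the model to process.
print(float(np.abs(altered - wallpaper).max()))
```

The human-visible image is effectively unchanged, which is exactly what makes the attack hard to spot.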

For instance, imagine a celebrity photograph that has been altered so that, to the computer, it encodes a malicious message. When the agent takes a screenshot of the desktop, it organizes the pixels into forms it recognizes (files, folders, menu bars, a pointer) – and it also picks up the malicious command hidden in the wallpaper. That command can direct the agent to open a specific website, where additional attacks can be encoded in another malicious image.

The Risks

According to the study, AI agents built on open-source models are most vulnerable to these attacks: because the models' inner workings are public, anyone who wants to embed a malicious patch can see exactly how the AI processes visual data. The researchers hope their work will help developers prepare safeguards before AI agents become more widespread.

The Future

Philip Torr, study co-author and machine learning expert, notes: "They have to be very aware of these vulnerabilities, which is why we're publishing this paper – because the hope is that people will actually see this as a vulnerability and then be more sensible in the way they deploy their agentic system."

According to Gal, AI agents will become common within the next two years. "People are rushing to deploy [the technology] before we know that it's actually secure," he says.

A Call to Action

"This is the first step towards thinking about defense mechanisms because once we understand how we can actually make [the attack] stronger, we can go back and retrain these models with these stronger patches to make them robust," says Adel Bibi, another co-author on the study.

Ultimately, the team hopes to encourage developers to build agents that can protect themselves and refuse to take orders from anything on-screen – even your favorite pop star.
