The Dark Side of Play: How ChatGPT Can Create Malware Strong Enough to Breach Google's Password Manager

Cybersecurity researchers have found that it is easier than expected to get around the safety features preventing ChatGPT and other LLM chatbots from writing malware — all you need to do is play a game of make-believe. By role-playing with ChatGPT for just a few hours, Vitaly Simonovich, a threat intelligence researcher at Cato Networks, was able to convince the chatbot to write a piece of malware strong enough to hack into Google Chrome's Password Manager.

Simonovich convinced ChatGPT to pretend it was a superhero named Jaxon fighting against a villain named Dax, who aimed to destroy the world. Through the chatbot's elite coding skills, Jaxon created a piece of malware that allowed Simonovich to see all the data stored in his computer's browser, even though that data was supposed to be locked down by the Password Manager.

"We're almost there," Simonovich typed to ChatGPT when debugging the code it produced. "Let's make this code better and crack Dax!!" And ChatGPT, roleplaying as Jaxon, did.

The Rise of LLM-Generated Malware

Chatbots exploded onto the scene in November 2022 with OpenAI's public release of ChatGPT, followed by Anthropic's Claude, Google's Gemini, and Microsoft's Copilot. The bots have revolutionized the way we live, work, and date, making it easier to summarize information, analyze data, and write code, like having a Tony Stark-style robot assistant.

But the bad guys haven't been left behind, either. Steven Stransky, a cybersecurity advisor and partner at the Thompson Hine law firm, told Business Insider that the rise of LLMs has shifted the cyber threat landscape, enabling a broad range of new and increasingly sophisticated scams that are more difficult for standard cybersecurity tools to identify and isolate.

"Criminals are also leveraging generative AI to consolidate and search large databases of stolen personally identifiable information to build profiles on potential targets for social engineering types of cyberattacks," Stransky said.

The Zero-Knowledge Threat Actor

While online scams, digital identity theft, and malware have existed for as long as the internet has, chatbots that do the bulk of the legwork for would-be criminals have substantially lowered the barriers to entry. "We call them zero-knowledge threat actors, which basically means that with the power of LLMs only, all you need to have is the intent and the goal in mind to create something malicious," Simonovich said.

"We think that the rise of these zero-knowledge threat actors is going to be more and more impactful on the threat landscape using those capabilities with the LLMs," Simonovich added. "We're already seeing a rise in phishing emails, which are hyper-realistic, but also with coding since LLMs are fine-tuned to write high-quality code. So think about applying this to the development of malware — we will see more and more and more being developed using those LLMs."

The Vulnerabilities

Simonovich demonstrated his findings to Business Insider, showing how straightforward it was to work around ChatGPT's built-in security features, which are meant to prevent the exact types of malicious behavior he was able to get away with. BI found that ChatGPT usually responds to direct requests to write malware with some version of an apologetic refusal: "Sorry, I can't assist with that. Writing or distributing malware is illegal and unethical."

But if you play along and pretend that the chatbot is a superhero, you might be able to get it to create something more malicious.

The Implications

While both the artificial intelligence companies and browser developers have security features in place to prevent jailbreaks and data breaches, with varying degrees of success, Simonovich's findings highlight that new vulnerabilities are continually emerging online, and that next-generation tech makes them easier than ever to exploit.

"We need to stay vigilant and make sure that we are prepared for these new threats," Simonovich said.