Poisoned Telemetry: The Dark Side of AI-Driven IT Operations
A recent study by researchers at RSAC Labs and George Mason University has revealed a concerning vulnerability in the use of Artificial Intelligence (AI) for automating IT operations. The study, titled "When AIOps Become 'AI Oops': Subverting LLM-driven IT Operations via Telemetry Manipulation," highlights the potential risks of using AI-powered tools to improve IT operations.
AIOps refers to the use of Large Language Model-based (LLM) agents to gather and analyze application telemetry, including system logs, performance metrics, traces, and alerts. These agents aim to detect problems and suggest or carry out corrective actions. While AIOps tools offer several benefits, such as improved efficiency and reduced manual labor, they can also be exploited by malicious actors to compromise the integrity of the infrastructure they manage.
The study's authors demonstrate that adversaries can manipulate system telemetry to mislead AIOps agents into taking actions that compromise the security of the infrastructure. This attack, dubbed "garbage in, garbage out," relies on creating tainted telemetry data that is then ingested by the AI model, producing malicious actions. The researchers used a fuzzer to generate such malicious telemetry output, which was successfully exploited by AIOps agents 89.2 percent of the time.
The attack works by sending fabricated telemetry payload to the AIOps agent, which then incorporates this data into its analysis and recommends a remedial action. In one example, an AIOps agent managing the SocialNet application was tricked into installing a malicious package, ppa:ngx/latest, after receiving tainted telemetry output.
The researchers tested their findings against two applications, SocialNet and HotelReservation, as well as OpenAI's GPT-4o and GPT-4.1 models, which exhibited varying levels of success rates in detecting inconsistencies and rejecting malicious payloads.
While the study highlights the potential risks of using AIOps tools, it also proposes a defense mechanism called AIOpsShield, designed to sanitize harmful telemetry data. However, the researchers acknowledge that this approach may not be effective against more sophisticated attackers who can compromise other sources of input or manipulate the supply chain.
"We have used models that are widely available and popular, and could be part of production deployments," said Dario Pasquini, principal researcher at RSAC. "However, we did not attack a production system – as we do not aim to disrupt the normal operation of any such system." The researchers plan to release AIOpsShield as an open-source project, offering a potential solution to mitigate the risks associated with tainted telemetry.