Chaining NVIDIA's Triton Server Flaws Exposes AI Systems to Remote Takeover

Newly revealed security flaws in NVIDIA's Triton Inference Server for Windows and Linux could let remote, unauthenticated attackers fully take over vulnerable servers, posing major risks to AI infrastructure. The vulnerabilities, tracked as CVE-2025-23319, CVE-2025-23320, and CVE-2025-23334, were discovered by the Wiz Research team and highlight the urgency for all Triton Inference Server users to update immediately.

Triton Inference Server is an open-source inference serving software that streamlines AI inferencing. It enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. The software has gained popularity among AI enthusiasts and organizations due to its ability to deploy AI models at scale.
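For readers unfamiliar with how teams interact with the server, the minimal sketch below illustrates a typical client-side check before sending inference requests. It assumes the official tritonclient Python package, a server listening on Triton's default HTTP port 8000, and a hypothetical model name ("my_model"); it is an illustration of normal usage, not part of the vulnerability research.

```python
# Minimal sketch of a typical Triton HTTP client interaction.
# Assumptions: the `tritonclient[http]` package is installed, the server
# listens on the default HTTP port 8000, and "my_model" is a hypothetical
# model name used purely for illustration.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Liveness/readiness checks exposed by Triton's KServe-v2-style HTTP API.
print("server live: ", client.is_server_live())
print("server ready:", client.is_server_ready())

# Ask whether a specific (hypothetical) model is loaded and ready to serve.
print("model ready: ", client.is_model_ready("my_model"))
```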

The Wiz Research team revealed that chaining these vulnerabilities enables remote code execution (RCE), posing a serious threat to AI infrastructure. The attack chain begins in Triton's Python backend with a minor information leak and escalates to full system compromise, threatening AI models, data, and network security. The researchers disclosed the issues to NVIDIA, which quickly addressed them.

"The Wiz Research team has discovered a chain of critical vulnerabilities in NVIDIA's Triton Inference Server, a popular open-source platform for running AI models at scale," read the report published by Wiz. "When chained together, these flaws can potentially allow a remote, unauthenticated attacker to gain complete control of the server, achieving remote code execution (RCE)." The attack could lead to code execution, denial of service, data tampering, and information disclosure.

An attacker can chain the three vulnerabilities to fully compromise a server. The exploitation of these flaws highlights the importance of defense-in-depth, where security is considered at every layer of an application.

The Consequences of Being Compromised

Researchers pointed out that taking over an NVIDIA Triton Inference Server can lead to serious consequences, such as theft of proprietary AI models, exposure of sensitive data, manipulation of AI outputs, and use of the compromised server as a foothold to move deeper into the organization's network. The vulnerabilities have been addressed in version 25.07.
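As a practical first step, administrators can confirm which build they are actually running. The sketch below is a hedged illustration (not Wiz's or NVIDIA's tooling): it queries Triton's standard server-metadata endpoint, GET /v2, on the default HTTP port and prints the reported version. Note that the endpoint returns the Triton core version string, which then has to be mapped to an NGC container release (such as 25.07) using NVIDIA's release notes.

```python
# Hedged sketch: query a Triton server's metadata endpoint to see what
# version it reports. Assumes the `requests` package and a server on
# localhost:8000 (Triton's default HTTP port); adjust the URL as needed.
import requests

TRITON_URL = "http://localhost:8000"  # hypothetical address, change for your deployment

resp = requests.get(f"{TRITON_URL}/v2", timeout=5)
resp.raise_for_status()
meta = resp.json()

# The metadata response includes the Triton core version (e.g. "2.x.y").
# Map this value against NVIDIA's release notes to confirm it corresponds
# to the 25.07 container release or later, which contains the fixes.
print(f"server:  {meta.get('name')}")
print(f"version: {meta.get('version')}")
```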

"A verbose error message in a single component, a feature that can be misused in the main server were all it took to create a path to potential system compromise," concludes the report. As companies deploy AI and ML more widely, securing the underlying infrastructure is paramount."

What's Next

NVIDIA is not aware of attacks in the wild exploiting these vulnerabilities. However, given the severity of the flaws, all Triton Inference Server users should update immediately to prevent potential exploitation.
