What if the AI model server you just deployed for your startup became a hacker's gateway into your AWS infrastructure — before you even finished your coffee?
On April 21, 2026, GitHub published advisory GHSA-6w67-hwm5-92mq, later assigned CVE-2026-33626, describing a Server-Side Request Forgery (SSRF) vulnerability in LMDeploy. LMDeploy is an open-source toolkit developed by Shanghai AI Laboratory for compressing, deploying, and serving large language models — the kind of tool startups and researchers spin up daily to run inference APIs.
The CVSS score: 7.5 (High). The time to first exploitation: 12 hours and 31 minutes. That's not a typo.
The Vulnerability: Trusting User Input With Network Access
LMDeploy versions before 0.12.3 contain a vulnerable load_image() implementation in the vision-language image loader. When the server processes an image URL from user input, it fetches the remote resource without validating hostnames, IP ranges, or URL schemes.
This means an attacker can feed the model server a URL like http://169.254.169.254/latest/meta-data/iam/security-credentials/ — the AWS Instance Metadata Service endpoint — and the server will happily fetch it, returning IAM role credentials to the attacker.
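To make the failure mode concrete, here is a minimal sketch, not LMDeploy's actual code, of why an unvalidated URL fetch is dangerous: the host comes straight from user input, and the metadata endpoint sits squarely in the link-local range that Python's standard library can identify in one call.

```python
import ipaddress
from urllib.parse import urlparse

def naive_fetch_target(url: str) -> str:
    # Hypothetical stand-in for the vulnerable pattern: the host is
    # taken straight from user input, and no allow/deny check happens
    # before the server would issue the outbound request.
    return urlparse(url).hostname

host = naive_fetch_target(
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
)
# The AWS metadata endpoint lives in the 169.254.0.0/16 link-local
# range; a single stdlib check would have flagged it.
print(host, ipaddress.ip_address(host).is_link_local)  # 169.254.169.254 True
```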
But it gets worse. Because the server makes outbound HTTP requests on behalf of the attacker, it acts as a generic SSRF primitive. The attacker can use it to:
- Port-scan internal networks behind the model server
- Access cloud metadata services (AWS, Azure, GCP)
- Probe internal services like Redis (port 6379), MySQL (port 3306), and admin interfaces
- Exfiltrate data via out-of-band DNS requests
The Attack: 8 Minutes of Methodical Exploration
At 03:35 UTC on April 22, 2026 — less than 13 hours after the advisory went public — Sysdig's threat research team observed the first exploitation attempt against a honeypot running a vulnerable LMDeploy instance.
The source: IP address 103.116.72.119 in Kowloon Bay, Hong Kong (AS400618 Prime Security Corp). Note: this is likely a proxy, VPN endpoint, or rented cloud instance — not the attacker's true origin.
What happened next was textbook reconnaissance, compressed into a single eight-minute session:
- AWS IMDSv1 probe: Request to http://169.254.169.254/latest/meta-data/iam/security-credentials/, attempting IAM credential exfiltration
- Local Redis scan: http://127.0.0.1:6379
- Local MySQL scan: http://127.0.0.1:3306
- Admin interface probes: Ports 8080 and 80 on localhost
- Out-of-band confirmation: Blind SSRF test via cw2mhnbd.requestrepo.com
The attacker wasn't just testing whether the bug existed. They were using LMDeploy's image loader as a generic HTTP proxy into the internal network, validating the SSRF primitive, most likely in preparation for a larger operation.
Why This Matters: AI Infrastructure Is the New Perimeter
Here's the uncomfortable truth: AI infrastructure is becoming a high-value target, and attackers are adapting faster than defenders.
LMDeploy isn't some obscure tool. It's part of a growing ecosystem of open-source LLM serving frameworks — alongside vLLM, TGI, and TensorRT-LLM — that developers deploy to run inference at scale. These tools often sit in cloud environments with access to sensitive data, internal APIs, and credentials.
The vulnerability reflects a broader pattern:
- Fast exploitation cycles: 12 hours from disclosure to active exploitation. Your monthly patch cycle? Irrelevant.
- AI as attack surface: Model servers process untrusted user input (prompts, images, documents) and often have network access. That's a perfect SSRF storm.
- Cloud metadata as crown jewels: IMDS credentials are the keys to the kingdom. Once an attacker has them, they can pivot to S3 buckets, databases, and lateral movement across your cloud estate.
How to Protect Yourself
If you're running LMDeploy or similar AI inference stacks, here's your action list:
- Upgrade immediately: Update to LMDeploy v0.12.3 or later. The patched release introduces strict URL safety checks blocking requests to link-local (169.254.0.0/16), loopback (127.0.0.0/8), and RFC 1918 private ranges.
- Enforce IMDSv2: On AWS instances, require session tokens for metadata access. IMDSv1 is unauthenticated and vulnerable to SSRF by design.
- Restrict egress: Limit outbound traffic from GPU and inference nodes to only necessary destinations (object storage, logging endpoints). If the server can't reach the metadata service, the SSRF chain breaks.
- Log and alert: Any inference server that fetches URLs from user input should log resolved IPs and alert on requests to link-local, loopback, or private ranges.
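The upgrade item above describes strict URL safety checks in the patched release. A minimal sketch of that kind of filter, using only the standard library, might look like the following; is_url_allowed and BLOCKED_NETS are illustrative names, not LMDeploy's patched API.

```python
import ipaddress
from urllib.parse import urlparse

BLOCKED_NETS = [
    ipaddress.ip_network("169.254.0.0/16"),  # link-local / cloud metadata
    ipaddress.ip_network("127.0.0.0/8"),     # loopback
    ipaddress.ip_network("10.0.0.0/8"),      # RFC 1918 private
    ipaddress.ip_network("172.16.0.0/12"),   # RFC 1918 private
    ipaddress.ip_network("192.168.0.0/16"),  # RFC 1918 private
]

def is_url_allowed(url: str) -> bool:
    # Reject non-HTTP(S) schemes and IP literals in blocked ranges.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostname, not an IP literal: a production filter must
        # resolve it and re-check every resolved address here.
        return True
    return not any(addr in net for net in BLOCKED_NETS)
```

A real implementation also has to pin the resolved IP for the actual fetch; otherwise a DNS-rebinding attacker can pass the check with one address and be fetched at another.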
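For the IMDSv2 item, AWS exposes this as an instance setting you can flip with one CLI call; the instance ID below is a placeholder.

```shell
# Require IMDSv2 session tokens on an existing instance.
# i-0123456789abcdef0 is a placeholder instance ID.
aws ec2 modify-instance-metadata-options \
    --instance-id i-0123456789abcdef0 \
    --http-tokens required \
    --http-endpoint enabled
```

With --http-tokens required, the metadata service rejects the plain GET requests an SSRF primitive can issue, since obtaining a session token needs a PUT with a custom header that typical SSRF gadgets cannot send.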
The Bottom Line
CVE-2026-33626 isn't just another SSRF bug. It's a signal that attackers are reconnoitering AI infrastructure with the same speed and precision they've long applied to web apps and enterprise networks.
The gap between disclosure and exploitation is now measured in hours, not days. If your AI deployment pipeline doesn't include emergency patching for inference infrastructure, you're not just behind — you're already a target.
Sources: Intrucept Labs, Sysdig Threat Research, GitHub Advisory GHSA-6w67-hwm5-92mq