The window between an AI-infrastructure CVE going public and that CVE getting weaponized has narrowed to a single working day. On April 22, 2026, Orca Security’s Igor Stepansky disclosed CVE-2026-33626, a server-side request forgery (SSRF) flaw in LMDeploy’s vision-language module. Per Sysdig telemetry reported by The Hacker News, the first exploitation attempt landed twelve hours and thirty-one minutes later — before most operations teams had finished reading the advisory.
The flaw, in one line
CVE-2026-33626 carries a CVSS of 7.5 and affects LMDeploy versions 0.12.0 and earlier with vision-language support enabled. The defect is narrow: the vision-language endpoint fetches arbitrary URLs without validating that the destination is not an internal address. That single missing check is enough to turn a public-facing inference endpoint into a pivot for the rest of the cloud account. Per the advisory, exploitation grants “access to cloud metadata services, internal networks, and sensitive resources” — the textbook SSRF blast radius applied to a tool that ML teams typically run alongside privileged cloud roles.
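The missing check is the standard SSRF guard: resolve the requested host and refuse anything that lands in private, loopback, or link-local address space. A minimal sketch of that guard in Python (a generic illustration of the class of check, not LMDeploy's actual code path):

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url: str) -> bool:
    """Reject URLs that resolve to internal addresses -- the check
    the vulnerable endpoint lacked."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        # Resolve the hostname; an attacker may also pass a raw IP.
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        # Link-local covers 169.254.169.254, the cloud metadata service.
        if (addr.is_private or addr.is_loopback
                or addr.is_link_local or addr.is_reserved):
            return False
    return True
```

Note the resolve-then-check order: validating the hostname string alone misses DNS records that point back inside the network.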
What the first campaigns went for
Sysdig’s honeypot telemetry shows the early campaigns were not exploratory. Across ten distinct requests, attackers ran port scans against the AWS Instance Metadata Service (IMDS), Redis, and MySQL, and tested DNS exfiltration paths to confirm outbound channels. That pattern is consistent with operators trying to pull short-term cloud credentials through IMDSv1 fallbacks, hop laterally to in-cluster data stores, and stand up a covert beacon, with each probe delivered through nothing more than an unauthenticated request to the vulnerable endpoint.
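The probe pattern is narrow enough to flag in inference-server logs. A rough detection sketch along those lines (the indicators mirror the reported targets; real rules, Sysdig's included, are more involved, and the out-of-band DNS pattern is a placeholder):

```python
import re

# Illustrative indicators matching the reported probe pattern:
# IMDS, Redis, MySQL, and out-of-band DNS test domains.
SSRF_INDICATORS = [
    re.compile(r"169\.254\.169\.254"),   # AWS IMDS endpoint
    re.compile(r":6379(/|$)"),           # Redis default port
    re.compile(r":3306(/|$)"),           # MySQL default port
    re.compile(r"\.oast\.", re.I),       # placeholder OOB-DNS domain
]

def flag_request(url: str) -> bool:
    """True if a URL the inference endpoint was asked to fetch
    matches an internal-probe or exfiltration indicator."""
    return any(p.search(url) for p in SSRF_INDICATORS)
```

Running this over historical access logs is also a cheap first-pass answer to "were we already hit" for hosts that were exposed before patching.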
Sysdig’s researchers framed the broader trend without hedging: “CVE-2026-33626 fits a pattern that we have observed repeatedly in the AI-infrastructure space over the past six months: critical vulnerabilities in inference servers, model gateways, and agent orchestration tools are being weaponized within hours of advisory publication.”
What this means
Three implications follow for defenders.
First, AI inference infrastructure now belongs on the same patch cadence as edge web infrastructure — not the slower “research-tooling” cycle most organizations still use. Inference servers, vector databases, and model gateways are production-tier from an attacker’s perspective regardless of how they are budgeted internally.
Second, IMDSv2 is no longer optional. Any LMDeploy host running on EC2 with IMDSv1 still reachable was a credential-grant-on-demand for whoever fired this exploit first. Forcing token-bound metadata access — and rejecting requests without the X-aws-ec2-metadata-token header at the host level — neutralizes the most damaging step of this exact attack chain.
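The difference is visible in the request shape. IMDSv1 answers a bare GET, which is exactly what an SSRF-relayed fetch looks like; IMDSv2 requires a session token obtained via PUT, a verb and header the vulnerable fetcher never sends. A sketch of the two-step flow using only the standard library (the endpoints are the documented IMDS paths; the requests are built here but not sent):

```python
import urllib.request

IMDS = "http://169.254.169.254"

def imdsv2_token_request() -> urllib.request.Request:
    # Step 1: PUT to the token endpoint. IMDSv1 has no such step,
    # which is why a simple SSRF GET could reach credentials directly.
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )

def imdsv2_metadata_request(token: str, path: str) -> urllib.request.Request:
    # Step 2: every metadata GET must carry the session token header.
    return urllib.request.Request(
        f"{IMDS}/latest/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

Enforcement happens on the instance side (e.g. `aws ec2 modify-instance-metadata-options --http-tokens required`), after which step 1 is mandatory and token-less GETs are refused.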
Third, the AI-infrastructure stack now follows the same compressed weaponization timeline that MCP servers, model gateways, and ML notebooks have endured over the past quarter. Treat every new advisory in this layer as an emergency patch ticket from the moment it ships, because somewhere a Sysdig-equivalent honeypot will see traffic before lunch.