The window between an AI-infrastructure CVE going public and that CVE getting weaponized has narrowed to a single working day. On April 22, 2026, Orca Security’s Igor Stepansky disclosed CVE-2026-33626, a server-side request forgery (SSRF) flaw in LMDeploy’s vision-language module. Per Sysdig telemetry reported by The Hacker News, the first exploitation attempt landed twelve hours and thirty-one minutes later — before most operations teams had finished reading the advisory.
The flaw, in one line
CVE-2026-33626 carries a CVSS of 7.5 and affects LMDeploy versions 0.12.0 and earlier with vision-language support enabled. The defect is narrow: the vision-language endpoint fetches arbitrary URLs without validating that the destination is not an internal address. That single missing check is enough to turn a public-facing inference endpoint into a pivot for the rest of the cloud account. Per the advisory, exploitation grants “access to cloud metadata services, internal networks, and sensitive resources” — the textbook SSRF blast radius applied to a tool that ML teams typically run alongside privileged cloud roles.
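The missing check is the standard SSRF guard: resolve the requested host and refuse anything that lands in private, loopback, or link-local address space. A minimal sketch of that guard in Python (a generic illustration of the class of check, not LMDeploy's actual code path):

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url: str) -> bool:
    """Reject URLs that resolve to internal addresses -- the check
    the vulnerable endpoint lacked."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        # Resolve the hostname; an attacker may also pass a raw IP.
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        # Link-local covers 169.254.169.254, the cloud metadata service.
        if (addr.is_private or addr.is_loopback
                or addr.is_link_local or addr.is_reserved):
            return False
    return True
```

Note the resolve-then-check order: validating the hostname string alone misses DNS records that point back inside the network.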
What the first campaigns went for
Sysdig’s honeypot telemetry shows the early campaigns were not exploratory. Across ten distinct requests, attackers ran port scans against the AWS Instance Metadata Service (IMDS), Redis, and MySQL, and tested DNS exfiltration paths to confirm outbound channels. That pattern is consistent with operators trying to pull short-term cloud credentials through IMDSv1 fallbacks, hop laterally to in-cluster data stores, and stand up a covert beacon, with each probe delivered through nothing more than an unauthenticated request to the vulnerable endpoint.
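The probe pattern is narrow enough to flag in inference-server logs. A rough detection sketch along those lines (the indicators mirror the reported targets; real rules, Sysdig's included, are more involved, and the out-of-band DNS pattern is a placeholder):

```python
import re

# Illustrative indicators matching the reported probe pattern:
# IMDS, Redis, MySQL, and out-of-band DNS test domains.
SSRF_INDICATORS = [
    re.compile(r"169\.254\.169\.254"),   # AWS IMDS endpoint
    re.compile(r":6379(/|$)"),           # Redis default port
    re.compile(r":3306(/|$)"),           # MySQL default port
    re.compile(r"\.oast\.", re.I),       # placeholder OOB-DNS domain
]

def flag_request(url: str) -> bool:
    """True if a URL the inference endpoint was asked to fetch
    matches an internal-probe or exfiltration indicator."""
    return any(p.search(url) for p in SSRF_INDICATORS)
```

Running this over historical access logs is also a cheap first-pass answer to "were we already hit" for hosts that were exposed before patching.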
Sysdig’s researchers framed the broader trend without hedging: “CVE-2026-33626 fits a pattern that we have observed repeatedly in the AI-infrastructure space over the past six months: critical vulnerabilities in inference servers, model gateways, and agent orchestration tools are being weaponized within hours of advisory publication.”
What this means
Three implications follow for defenders.
First, AI inference infrastructure now belongs on the same patch cadence as edge web infrastructure — not the slower “research-tooling” cycle most organizations still use. Inference servers, vector databases, and model gateways are production-tier from an attacker’s perspective regardless of how they are budgeted internally.
Second, IMDSv2 is no longer optional. Any LMDeploy host running on EC2 with IMDSv1 still reachable was a credential-grant-on-demand for whoever fired this exploit first. Forcing token-bound metadata access — and rejecting requests without the X-aws-ec2-metadata-token header at the host level — neutralizes the most damaging step of this exact attack chain.
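The difference is visible in the request shape. IMDSv1 answers a bare GET, which is exactly what an SSRF-relayed fetch looks like; IMDSv2 requires a session token obtained via PUT, a verb and header the vulnerable fetcher never sends. A sketch of the two-step flow using only the standard library (the endpoints are the documented IMDS paths; the requests are built here but not sent):

```python
import urllib.request

IMDS = "http://169.254.169.254"

def imdsv2_token_request() -> urllib.request.Request:
    # Step 1: PUT to the token endpoint. IMDSv1 has no such step,
    # which is why a simple SSRF GET could reach credentials directly.
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )

def imdsv2_metadata_request(token: str, path: str) -> urllib.request.Request:
    # Step 2: every metadata GET must carry the session token header.
    return urllib.request.Request(
        f"{IMDS}/latest/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

Enforcement happens on the instance side (e.g. `aws ec2 modify-instance-metadata-options --http-tokens required`), after which step 1 is mandatory and token-less GETs are refused.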
Third, the AI-infrastructure stack now follows the same compressed weaponization timeline that MCP servers, model gateways, and ML notebooks have endured over the past quarter. Treat every new advisory in this layer as an emergency patch ticket from the moment it ships, because somewhere a Sysdig-equivalent honeypot will see traffic before lunch.