
Cloud Vulnerability DB
A community-led vulnerabilities database
CVE-2025-62164 affects vLLM, an inference and serving engine for large language models, in versions 0.10.2 up to but not including 0.11.1. The vulnerability was discovered by the AXION Security Research Team and disclosed on November 20, 2025. It is a memory corruption vulnerability in the Completions API endpoint that can lead to denial of service (DoS) and potentially remote code execution (RCE) when the server processes user-supplied prompt embeddings (GitHub Advisory).
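The snippet below is a minimal sketch of how a client delivers prompt embeddings to the Completions API, which is the attack surface the advisory describes: a tensor serialized with torch.save, base64-encoded, and sent in the request body. The "prompt_embeds" field name, model name, and endpoint URL are assumptions for illustration and may differ across vLLM versions.

```python
import base64
import io

import requests
import torch

# One embedding vector per token position (shape and dtype are illustrative).
embeds = torch.randn(8, 4096)

# Serialize the tensor the way the server will later deserialize it: torch.save -> torch.load.
buf = io.BytesIO()
torch.save(embeds, buf)

resp = requests.post(
    "http://localhost:8000/v1/completions",  # assumed local vLLM server
    json={
        "model": "my-model",                                   # placeholder model name
        "prompt_embeds": base64.b64encode(buf.getvalue()).decode(),
        "max_tokens": 16,
    },
)
print(resp.json())
```

Because the payload is attacker-controlled bytes that the server unpickles and densifies, any gap in tensor validation on the server side becomes reachable from the network.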
The vulnerability stems from insufficient validation when deserializing user-provided tensors with torch.load() in the Completions API endpoint. A change in PyTorch 2.8.0 disabled sparse tensor integrity checks by default, which allows a maliciously crafted sparse tensor to bypass internal bounds checks and trigger an out-of-bounds memory write when it is converted to a dense tensor via to_dense(). The flaw is located in vllm/entrypoints/renderer.py:148, where the load_and_validate_embed function processes user-provided tensors without sufficient validation. The CVSS v3.1 base score is 8.8 (High) with vector CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H (GitHub Advisory).
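To make the failure mode concrete, the sketch below shows the class of bounds check that the vulnerable path relied on PyTorch to perform: verifying that a sparse COO tensor's indices fall inside its declared shape before densifying it. This is an illustration only, not the actual vLLM patch, and the helper name is hypothetical.

```python
import torch


def densify_checked(tensor: torch.Tensor) -> torch.Tensor:
    """Bounds-check a user-supplied sparse COO tensor before densifying it.

    Illustrative sketch: with PyTorch's sparse invariant checks disabled,
    out-of-range indices would otherwise flow straight into to_dense().
    """
    if tensor.layout != torch.sparse_coo:
        return tensor  # dense tensors are not affected by this code path

    indices = tensor._indices()  # raw (possibly uncoalesced) index matrix
    size = tensor.size()
    for dim in range(indices.size(0)):
        dim_idx = indices[dim]
        if dim_idx.numel() and (
            dim_idx.min().item() < 0 or dim_idx.max().item() >= size[dim]
        ):
            raise ValueError(f"sparse index out of bounds in dimension {dim}")

    return tensor.to_dense()
```

A tensor whose indices exceed its declared dimensions would be rejected here instead of producing an out-of-bounds write during densification.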
The vulnerability allows any user with access to the API to crash the vLLM server process (denial of service) and potentially execute code on the hosting server. It affects all deployments that run vLLM as a server, as well as any instance that deserializes untrusted or model-provided payloads, since a crafted payload corrupts memory inside the server process (GitHub Advisory).
The vulnerability has been patched in version 0.11.1. The fix was implemented in pull request #27204, which introduced the --enable-prompt-embeds and --enable-mm-embeds flags to gate the loading of user-provided text and multimodal embeddings (GitHub PR).
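The sketch below illustrates the opt-in gating pattern the fix describes: user-supplied embeddings are rejected unless the operator explicitly enables the feature at startup. The class and function names are hypothetical and do not reflect the actual vLLM implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerArgs:
    # Off by default after the patch; operators opt in with --enable-prompt-embeds.
    enable_prompt_embeds: bool = False


def render_prompt(args: ServerArgs, prompt_embeds_b64: Optional[str]) -> None:
    """Hypothetical request handler showing how the flag gates the embedding path."""
    if prompt_embeds_b64 is not None and not args.enable_prompt_embeds:
        # Reject user-supplied embeddings unless explicitly enabled by the operator.
        raise PermissionError(
            "prompt_embeds is disabled; start the server with "
            "--enable-prompt-embeds to allow it"
        )
    ...  # normal prompt rendering continues here
```

Deployments that do not need prompt or multimodal embeddings can simply leave the flags unset, which removes the untrusted-deserialization path entirely.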