
Cloud Vulnerability DB
A community-led vulnerabilities database
vLLM, a high-throughput and memory-efficient inference and serving engine for LLMs, contains a critical performance vulnerability (CVE-2025-46560) discovered on April 29, 2025. The vulnerability affects versions from 0.8.0 up to, but not including, 0.8.5, and exists in the input preprocessing logic of the multimodal tokenizer, where placeholder tokens are dynamically replaced with repeated tokens based on precomputed lengths (GitHub Advisory, NVD).
The vulnerability stems from inefficient list concatenation operations in the input_processor_for_phi4mm function. The affected code rebuilds the input_ids list with concatenation operations that copy the entire list, so each replacement costs O(n) for a sequence of n tokens. For k placeholders expanding to m tokens each, the total time complexity becomes O(kmn), which approaches quadratic time, O(n²), in worst-case scenarios. Test cases demonstrate quadratic time growth, where doubling the input size increases runtime by approximately 4x (GitHub Advisory).
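The quadratic behavior described above can be reproduced with a minimal sketch. This is not the vLLM source: the function name expand_by_concatenation and the token ids PLACEHOLDER and FILL are hypothetical, chosen only to illustrate the pattern of rebuilding a list once per placeholder. The sketch counts copied elements instead of measuring wall-clock time, so the result is deterministic.

```python
PLACEHOLDER = -1  # hypothetical placeholder token id (assumption)
FILL = 7          # hypothetical token id written in place of a placeholder


def expand_by_concatenation(input_ids, expansion_len):
    """Expand every placeholder by rebuilding the whole list each time.

    Returns the expanded list and the number of elements copied, which
    serves as a deterministic stand-in for runtime.
    """
    ids = list(input_ids)
    copied = 0
    i = 0
    while i < len(ids):
        if ids[i] == PLACEHOLDER:
            # Slicing and `+` build a brand-new list, so a single
            # replacement touches every element of the sequence.
            ids = ids[:i] + [FILL] * expansion_len + ids[i + 1:]
            copied += len(ids)
            i += expansion_len  # skip past the tokens just inserted
        else:
            i += 1
    return ids, copied


if __name__ == "__main__":
    for k in (1000, 2000):
        _, copied = expand_by_concatenation([PLACEHOLDER] * k, 2)
        print(f"{k} placeholders -> {copied} elements copied")
```

Under these assumptions, doubling the number of placeholders roughly quadruples the number of copied elements, which matches the 4x growth reported in the advisory's test cases.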
The vulnerability can lead to Denial-of-Service (DoS) through resource exhaustion. An attacker could submit inputs with numerous placeholders (e.g., 10,000 <|audio_1|> tokens), causing CPU and memory exhaustion. For example, 10,000 placeholders could result in approximately 100 million operations (GitHub Advisory, Wiz).
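The 100 million figure follows from simple arithmetic, sketched below. The helper name estimated_copy_operations is hypothetical, and the estimate assumes that each replacement copies a list at least as long as the number of placeholders, so the sequence length is taken to be equal to the placeholder count as a lower bound.

```python
def estimated_copy_operations(num_placeholders, sequence_length):
    """Rough count: one full-list copy of sequence_length per placeholder."""
    return num_placeholders * sequence_length


if __name__ == "__main__":
    k = 10_000  # placeholders in the attacker's input, as in the example above
    # Assumption: the sequence holds at least k tokens, so each copy costs >= k.
    print(estimated_copy_operations(k, k))  # 100000000
```

Because placeholders grow into longer runs of tokens as they are expanded, the real cost would likely be higher than this lower bound.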
The vulnerability has been patched in version 0.8.5. The recommended remediation involves precomputing all placeholder positions and expansion lengths upfront, and replacing dynamic list concatenation with a single preallocated array. This solution achieves O(n) time complexity instead of the previous O(n²) (GitHub Advisory).
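The remediation strategy can be sketched as follows. This is an illustration of the general technique (compute the final length first, allocate once, then fill), not the actual patch applied in vLLM. The function name expand_preallocated, the expansion_lens mapping, and the fill_id parameter are all hypothetical.

```python
def expand_preallocated(input_ids, expansion_lens, fill_id):
    """Expand placeholders in linear time using one preallocated list.

    expansion_lens maps a placeholder token id to the number of fill
    tokens that should replace it (a simplifying assumption; real
    expansion lengths may vary per occurrence).
    """
    # Pass 1: precompute the final length so the output is allocated once.
    total = sum(expansion_lens.get(tok, 1) for tok in input_ids)
    out = [0] * total

    # Pass 2: write each token, or its expansion, at its final position.
    pos = 0
    for tok in input_ids:
        n = expansion_lens.get(tok)
        if n is None:
            out[pos] = tok  # ordinary token: copy it through unchanged
            pos += 1
        else:
            out[pos:pos + n] = [fill_id] * n  # same-length slice write
            pos += n
    return out


if __name__ == "__main__":
    print(expand_preallocated([1, -1, 2, -1], {-1: 3}, 7))
```

Each input token is visited a constant number of times and each output slot is written once, so the total work is proportional to the input length plus the output length rather than to their product.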
Source: This report was generated using AI