
Cloud Vulnerability DB
A community-led vulnerabilities database
CVE-2025-62426 affects vLLM, an inference and serving engine for large language models (LLMs), in versions 0.5.5 up to, but not including, 0.11.1. The vulnerability exists in the /v1/chat/completions and /tokenize endpoints, where the chat_template_kwargs request parameter is processed before proper validation against the chat template. The issue was disclosed on November 20, 2025, and has been patched in version 0.11.1 (GitHub Advisory).
The vulnerability stems from the serving_engine.py component, where chat_template_kwargs is unpacked into keyword arguments and passed to apply_hf_chat_template in chat_utils.py without validation of its keys or values. This allows a request to override optional parameters such as tokenize, changing its default from False to True. The vulnerability has been assigned a CVSS v3.1 base score of 6.5 (Medium) with vector string CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H, indicating a network attack vector, low attack complexity, and high impact on availability (GitHub Advisory).
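The unsafe pattern can be sketched as follows. This is an illustrative reduction, not the actual vLLM code: the helper and handler names are hypothetical stand-ins showing how unpacking request-controlled kwargs lets a caller flip the tokenize default.

```python
def apply_template(messages, tokenize=False):
    """Stand-in for a chat-template helper; tokenize defaults to False."""
    prompt = "\n".join(m["content"] for m in messages)
    if tokenize:
        # The expensive, blocking path an attacker wants to force.
        return prompt.split()
    return prompt

def handle_request(messages, chat_template_kwargs):
    # Vulnerable pattern: request-supplied kwargs are unpacked without
    # key validation, so "tokenize" can be overridden by the client.
    return apply_template(messages, **chat_template_kwargs)

# A malicious request body flips tokenize from False to True:
result = handle_request([{"role": "user", "content": "hi there"}],
                        {"tokenize": True})
```

Because the unpacked dictionary is merged directly into the call, any keyword accepted by the template helper becomes attacker-controllable.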
The vulnerability can lead to a denial-of-service condition in which any authenticated user can cause the vLLM server to block processing for extended periods via Chat Completion or Tokenize requests. Because tokenization is a blocking operation, a sufficiently large input can stall the API server's event loop, preventing the handling of all other requests until the tokenization completes (GitHub Advisory).
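The event-loop impact can be demonstrated in miniature. This is a hedged sketch unrelated to vLLM's actual server code: a synchronous stand-in for tokenization runs directly on the asyncio event-loop thread, so a concurrent "health" coroutine cannot even start until the blocking call returns.

```python
import asyncio
import time

def tokenize_sync(text):
    # Stand-in for blocking tokenization: burns wall-clock time
    # on the event-loop thread.
    time.sleep(0.2)
    return text.split()

async def chat_endpoint(text):
    # Calling the blocking function directly stalls the whole loop;
    # no other coroutine runs until it returns.
    return tokenize_sync(text)

async def health_endpoint(start_times):
    start_times.append(time.monotonic())
    return "ok"

async def main():
    t0 = time.monotonic()
    start_times = []
    await asyncio.gather(chat_endpoint("a b c"),
                         health_endpoint(start_times))
    # How long the health check had to wait before it could even begin:
    return start_times[0] - t0

delay = asyncio.run(main())
```

In a real server the delay scales with input size, which is what turns this into a denial of service rather than a minor latency blip.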
The vulnerability has been patched in vLLM version 0.11.1. The fix enforces tokenize=False when applying chat templates and rejects the tokenize and chat_template parameters within chat_template_kwargs. Users should upgrade to version 0.11.1 or later to address this vulnerability (GitHub Advisory).
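The mitigation approach can be sketched as below. The names are illustrative and do not mirror the exact vLLM 0.11.1 patch; the point is the two-part defense: reject the disallowed keys, and pin tokenize server-side so the request cannot override it.

```python
def apply_template(messages, tokenize=False):
    """Stand-in chat-template helper (hypothetical, for illustration)."""
    prompt = "\n".join(m["content"] for m in messages)
    return prompt.split() if tokenize else prompt

# Parameters the patch disallows in request-supplied kwargs.
BLOCKED_KEYS = {"tokenize", "chat_template"}

def handle_request(messages, chat_template_kwargs):
    # 1) Reject requests that try to smuggle in blocked parameters.
    bad = BLOCKED_KEYS & chat_template_kwargs.keys()
    if bad:
        raise ValueError(f"disallowed chat_template_kwargs: {sorted(bad)}")
    # 2) Pin tokenize=False explicitly, so even a future validation gap
    #    cannot re-enable the blocking tokenization path.
    return apply_template(messages, tokenize=False, **chat_template_kwargs)
```

Explicitly pinning the keyword after filtering gives defense in depth: either check alone closes the reported attack, but together they also guard against new blocked keys being forgotten.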
Source: This report was generated using AI