CVE-2026-44018
Python vulnerability analysis and mitigation

Overview

CVE-2026-44018 is a moderate-severity vulnerability in the Docling Python library's METS-GBS backend that enables XML External Entity (XXE) attacks, decompression bomb (zip bomb) attacks, and unbounded archive extraction. It affects docling (pip) versions >= 2.45.0 and < 2.91.0. The advisory was published on June 2, 2026, by dolfim-ibm and added to the GitHub Advisory Database on June 3, 2026. It carries a CVSS v3.1 base score of 5.5 (Medium) (GitHub Advisory).
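
The affected range above can be checked mechanically. The following is a minimal sketch, assuming plain X.Y.Z release strings (pre-release suffixes such as "rc1" are not handled); the helper names parse_version, is_affected, and installed_docling_is_affected are hypothetical and not part of Docling.

```python
from __future__ import annotations

from importlib import metadata

# Bounds taken from the advisory: affected if >= 2.45.0 and < 2.91.0.
AFFECTED_MIN = (2, 45, 0)
FIXED = (2, 91, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain 'X.Y.Z' string into a tuple of integers."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_affected(version: str) -> bool:
    """Return True if the version falls inside the affected range."""
    return AFFECTED_MIN <= parse_version(version) < FIXED


def installed_docling_is_affected() -> bool | None:
    """Check the installed docling distribution; None if it is not installed."""
    try:
        return is_affected(metadata.version("docling"))
    except metadata.PackageNotFoundError:
        return None
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.9.0" would sort after "2.45.0" and be misclassified.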

Technical details

The vulnerability stems from three distinct weaknesses in Docling's METS-GBS backend: CWE-611 (Improper Restriction of XML External Entity Reference), CWE-409 (Improper Handling of Highly Compressed Data), and CWE-776 (Improper Restriction of Recursive Entity References in DTDs). The XML parser lacked hardened settings — specifically, it did not disable entity resolution, DTD loading, or network access — allowing an attacker to embed malicious external entity references in METS-GBS XML to read local files or trigger denial of service. Separately, the archive extraction logic imposed no size or member-count limits, enabling decompression bombs and unbounded resource consumption. Exploitation requires a user to open or process a maliciously crafted METS-GBS archive, making user interaction a prerequisite (GitHub Advisory, Docling Security Advisory).
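
To illustrate the XML side of the weakness, here is a minimal sketch of a pre-check that refuses any document carrying a DOCTYPE declaration, which rules out both external entities and recursive entity expansion. It uses only the standard library's expat bindings and is not Docling's actual fix; the names DTDForbidden and reject_dtd are hypothetical, and the approach assumes legitimate METS files do not need a DTD.

```python
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def reject_dtd(xml_bytes: bytes) -> None:
    """Parse xml_bytes and raise DTDForbidden at the first DOCTYPE declaration."""
    parser = expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        # Entities can only be declared inside a DTD, so stopping here
        # prevents any entity from being defined or resolved.
        raise DTDForbidden(f"DOCTYPE declaration not allowed: {name!r}")

    parser.StartDoctypeDeclHandler = on_doctype
    # The second argument marks this chunk as the final one.
    parser.Parse(xml_bytes, True)
```

An exception raised inside an expat handler aborts parsing and propagates to the caller, so the entity declarations that follow the DOCTYPE are never processed. Malformed XML still surfaces as expat.ExpatError.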

Impact

Successful exploitation can result in high availability impact through memory and disk exhaustion caused by decompression bombs or unbounded archive extraction, potentially crashing the application or the host system. XXE attacks additionally expose sensitive local files readable by the process, posing a confidentiality risk not fully reflected in the CVSS score (which scores confidentiality as None under the local-attack-vector scenario). Systems that automatically process user-supplied METS-GBS archives — such as document ingestion pipelines — are at greatest risk of resource exhaustion and data exposure (GitHub Advisory).
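
The resource-exhaustion risk comes from the gap between compressed and expanded size. The sketch below, which assumes a raw zlib stream rather than Docling's actual archive handling, shows how a cap on decompressed output bounds memory use; bounded_decompress is a hypothetical helper name.

```python
import zlib


def bounded_decompress(data: bytes, limit: int) -> bytes:
    """Decompress a zlib stream, refusing to produce more than limit bytes."""
    decomp = zlib.decompressobj()
    # Ask for at most limit + 1 bytes: receiving the extra byte proves the
    # stream expands past the limit without inflating the rest of it.
    out = decomp.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError(f"decompressed size exceeds {limit} bytes")
    return out


# Ten million zero bytes compress to a payload hundreds of times smaller.
bomb = zlib.compress(b"\0" * 10_000_000)
```

Passing bomb to bounded_decompress with a 1 MB limit raises ValueError after inflating only about 1 MB, whereas an unbounded zlib.decompress call would allocate the full 10 MB.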

Exploitation steps

  1. Craft a malicious METS-GBS archive: Create a tar archive containing a METS XML file with an XXE payload (e.g., <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><mets>&xxe;</mets>) to read local files, or embed a decompression bomb (e.g., a highly compressed file that expands to hundreds of megabytes) to exhaust system resources.
  2. Deliver the archive to the target: Social-engineer a user or automated pipeline running a vulnerable version of Docling (>= 2.45.0, < 2.91.0) into processing the malicious archive — for example, by submitting it through a document upload interface or sharing it as a legitimate-looking METS-GBS file.
  3. Trigger processing: The target user or system invokes Docling's METS-GBS backend to parse the archive. The unsanitized XML parser resolves the external entity or the extraction logic decompresses the bomb without limits.
  4. Achieve objective: For XXE, the resolved entity content (e.g., /etc/passwd) is embedded in the parser output, leaking sensitive file contents. For decompression bombs, memory and disk are exhausted, causing application crash or denial of service (GitHub Advisory, Docling Security Advisory).

Indicators of compromise

  • File System: Unexpected large temporary files or directories created during archive extraction; disk space rapidly consumed in the Docling working or temp directory.
  • Process: Docling process exhibiting abnormally high memory consumption or CPU usage during METS-GBS file processing; process crash or OOM-killer events in system logs.
  • Logs: Application error logs showing XML parsing exceptions related to external entity resolution or DTD loading; extraction errors referencing oversized files or excessive member counts. Vulnerable versions do not raise these errors, so their presence on a patched version indicates attempted exploitation.
  • Network: Unexpected outbound DNS or HTTP requests originating from the Docling process to external hosts during XML parsing (indicative of XXE with network-based entity URIs) (GitHub Advisory).

Mitigation and workarounds

Upgrade docling to version 2.91.0 or later, which introduces secure XML parsing (resolve_entities=False, load_dtd=False, no_network=True) and enforces extraction limits (300 MB total, 10 MB per file, 1000 members). As a temporary workaround, avoid processing METS-GBS archives from untrusted sources; if processing is necessary, pre-validate archives in an isolated environment with strict resource limits (e.g., containers with memory/disk quotas). The fix was released on April 23, 2026, as part of Docling v2.91.0 (Docling v2.91.0 Release, GitHub Advisory).
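
For pre-validating archives on a version that cannot yet be upgraded, the limits quoted above can be approximated with the standard tarfile module. This is a sketch under stated assumptions, not Docling's implementation: it assumes the archive is a tar file, that "MB" means 1024 * 1024 bytes, and the name check_archive_limits is hypothetical.

```python
import tarfile

# Limits quoted in the advisory; treating MB as 1024 * 1024 bytes is an assumption.
MAX_TOTAL_BYTES = 300 * 1024 * 1024
MAX_MEMBER_BYTES = 10 * 1024 * 1024
MAX_MEMBERS = 1000


def check_archive_limits(fileobj) -> int:
    """Walk the tar headers without extracting; return the member count.

    Raises ValueError as soon as any limit is exceeded.
    """
    total = 0
    count = 0
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            count += 1
            if count > MAX_MEMBERS:
                raise ValueError(f"more than {MAX_MEMBERS} members")
            if member.size > MAX_MEMBER_BYTES:
                raise ValueError(f"{member.name!r} exceeds the per-file limit")
            total += member.size
            if total > MAX_TOTAL_BYTES:
                raise ValueError("total size exceeds the archive limit")
    return count
```

The check reads only member headers and stops at the first violation, so an oversized member is rejected before any of its data is written to disk. It complements, rather than replaces, running the pre-validation inside a container with memory and disk quotas.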

This report was generated using AI.
