
Cloud Vulnerability DB
A community-led vulnerabilities database
NLTK (Natural Language Toolkit) versions prior to 3.6.6 contain a Regular Expression Denial of Service (ReDoS) vulnerability, tracked as CVE-2021-43854. The vulnerability affects the PunktSentenceTokenizer class and the sent_tokenize and word_tokenize functions. The issue was discovered in October 2021 and patched in version 3.6.6 (GitHub Advisory).
The vulnerability stemmed from an inefficient regular expression in the PunktSentenceTokenizer implementation. Because the expression began with '\S*', the Python regex engine started a match attempt at the beginning of the input and recognized failure only after consuming characters up to the next whitespace character or the end of the input. It then repeated the attempt from each successive start position. On a long run of non-whitespace characters this gives quadratic time complexity, O(n^2), where n is the input length: a malicious input of length n requires (n^2 + n) / 2 steps to process (GitHub PR). The vulnerability received a CVSS v3.1 score of 7.5 (High), with vector string CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H (GitHub Advisory).
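The step count quoted above can be reproduced with a small model. The sketch below does not run NLTK or the real regex. It only simulates, under simplifying assumptions, a leading '\S*' that is retried at every start offset and consumes non-whitespace characters each time. The function names count_scan_steps and closed_form are hypothetical, chosen for illustration.

```python
def count_scan_steps(text: str) -> int:
    # Simplified model (an assumption, not the actual regex engine):
    # at every start offset, a leading greedy \S* consumes characters
    # until it hits whitespace or the end of the input. Each consumed
    # character counts as one step.
    steps = 0
    for start in range(len(text)):
        pos = start
        while pos < len(text) and not text[pos].isspace():
            pos += 1
            steps += 1
    return steps


def closed_form(n: int) -> int:
    # The (n^2 + n) / 2 figure for an input of n non-whitespace characters.
    return (n * n + n) // 2


if __name__ == "__main__":
    # Compare the simulated count with the closed form for a few lengths.
    for n in (10, 100, 1000):
        print(n, count_scan_steps("a" * n), closed_form(n))
```

In this model, whitespace in the input resets the scan, which is why the worst case is a long run of characters with no whitespace at all.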
When exploited, the vulnerability could cause a significant denial of service through CPU exhaustion. Testing showed that processing a malicious input of 100,000 characters took over 56 seconds, with processing time growing quadratically for longer inputs. Any application that passes untrusted input to the affected functions is therefore exposed to denial of service attacks (GitHub Advisory).
The vulnerability was patched in NLTK version 3.6.6 by removing the problematic '\S*' pattern from the regex and implementing a new matching approach. For users unable to upgrade, the recommended workaround is to enforce a maximum length on inputs passed to the vulnerable functions. After the fix, processing time scales linearly with input length, a significant performance improvement (GitHub Advisory).
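For users who cannot upgrade, the length-limit workaround might look like the following minimal sketch. The wrapper name tokenize_with_limit and the MAX_INPUT_CHARS value are assumptions for illustration and are not part of NLTK. The tokenizer is passed in as a callable so the example runs without NLTK installed.

```python
from typing import Callable, List

# Hypothetical cap; choose a value that fits the application's real inputs.
MAX_INPUT_CHARS = 10_000


def tokenize_with_limit(
    text: str,
    tokenizer: Callable[[str], List[str]],
    max_chars: int = MAX_INPUT_CHARS,
) -> List[str]:
    # Reject oversized input before it reaches the vulnerable tokenizer,
    # so the quadratic cost stays bounded by max_chars.
    if len(text) > max_chars:
        raise ValueError(
            f"input of {len(text)} characters exceeds limit of {max_chars}"
        )
    return tokenizer(text)
```

With NLTK installed, the callable would be one of the affected functions, for example tokenize_with_limit(user_text, nltk.sent_tokenize). Rejecting the input outright, rather than truncating it, avoids silently tokenizing a partial document.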
Source: This report was generated using AI