In 2016, the popular NPM package `moment.js` patched a vulnerability.
In 2019, Cloudflare suffered a massive global outage that took down millions of websites.
The root cause of both of these catastrophic failures? A single, poorly written Regular Expression.
When writing regex, developers focus heavily on making sure it matches the right strings. But a much more dangerous problem lurks in the shadows: what happens when it tries to match the wrong strings?
Welcome to the world of Catastrophic Backtracking and ReDoS (Regular Expression Denial of Service).
What is Backtracking?
Most regex engines (like the ECMAScript engine used in Node.js and browsers) are "NFA" (Nondeterministic Finite Automaton) engines.
This means that when a pattern has multiple ways to match a string (like using a greedy `*` quantifier or an alternation `|`), the engine will make a guess. If that guess ultimately leads to a failure later in the pattern, the engine will "backtrack"—it rewinds its position and tries the next possible permutation.
Normally, backtracking happens in microseconds. But when you nest greedy quantifiers, the number of permutations the engine must check grows exponentially.
The Anatomy of a ReDoS Attack
Let's look at a classic vulnerable pattern:
```regex
^(a+)+$
```
Why is this dangerous?
- `a+` matches one or more "a"s.
- `( ... )+` matches that entire group one or more times.
If you give it the string `aaaaa`, it matches instantly.
But what if you give it the string `aaaaaaaaaaaaaaaaaaaaaaaaa!`?
The engine reads the 25 "a"s perfectly. Then it hits the `!`. Because of the `$` anchor at the end of the regex, the engine realizes the string is invalid. But before it can return `false`, it has to backtrack and try every single mathematical permutation of `(a+)+` to be absolutely certain there isn't a combination that works.
For 25 characters, it checks millions of combinations. If you give it 30 characters, the server will freeze for a minute. If you give it 50 characters, your Node.js event loop is blocked forever. Your server is effectively dead.
Real-World Vulnerabilities
You probably aren't writing `^(a+)+$` in production. But you might write something like this to validate an email list:
```regex
^([a-zA-Z0-9]+\s?)+$
```
This pattern suffers from the exact same vulnerability. The inner group `[a-zA-Z0-9]+` is greedy, and the outer group `(...)+` is also greedy.
If a malicious user submits a string with 50 valid characters followed by one invalid symbol (e.g., `abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz!`), the regex engine will spin out of control trying to match it.
How to Prevent Catastrophic Backtracking
1. Avoid Nested Quantifiers
Never put a `+` or `` directly inside a group that also has a `+` or ``.
Instead of `^([a-zA-Z0-9]+\s?)+$`, rewrite it to be mutually exclusive:
```regex
^[a-zA-Z0-9]+(?:\s[a-zA-Z0-9]+)*$
```
2. Use Possessive Quantifiers (If Supported)
Languages like PHP (PCRE) and Java support possessive quantifiers like `++`. This tells the engine "grab as much as you can, and NEVER backtrack to try other options." Unfortunately, JavaScript does not support possessive quantifiers natively.
3. Use the RE2 Engine
If you are building an application where users can supply their own regular expressions (like a search tool), never run them using the native Node.js `RegExp` object.
Instead, use Google's RE2 engine. RE2 uses a deterministic DFA architecture that strictly prohibits backtracking, guaranteeing linear execution time. (Note: RE2 achieves this by removing support for lookaheads and backreferences).
Test your patterns for performance and ReDoS vulnerabilities using the FluxToolkit Regex Sandbox. If your browser freezes while testing a string, you know you have a backtracking issue!




