Regex Cheat Sheet: Patterns You'll Actually Use

A practical reference of the regular expressions worth memorising — with the greedy-vs-lazy trap explained.

By Ben Praveen J · May 22, 2026

Regular expressions look like line noise until they click, and then they become one of the most useful skills a developer, analyst, or writer can have. This is not a theory lesson — it is a practical cheat sheet of patterns you will reach for again and again, with plain-English explanations of what each piece does and a note on the traps that catch people.

The building blocks, in one minute

Almost every pattern is assembled from a small vocabulary:

. matches any single character. \d a digit, \w a word character (letter, digit, or underscore), \s any whitespace.
* means “zero or more”, + “one or more”, ? “zero or one”. Add ? after any of them to make it lazy (match as little as possible).
^ anchors to the start of the input and $ to the end; switch on multiline mode and they match at the start and end of every line instead.
[abc] is a character class — any one of a, b, or c. [^abc] means anything except those.
(...) groups and captures; (?:...) groups without capturing.
{2,4} means “between two and four times”.
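
The vocabulary above can be tried out in a few lines. This is a minimal sketch using Python's re module; the sample string and variable names are invented for illustration.

```python
import re

# Hypothetical sample line, invented for illustration.
sample = "Order 66 shipped to bay_7 at 09:30"

digit_runs = re.findall(r"\d+", sample)          # + means one or more digits
word_runs = re.findall(r"\w+", sample)           # \w includes the underscore
short_numbers = re.findall(r"\d{2,4}", sample)   # two to four digits in a row
b_then_vowel = re.findall(r"b[aeiou]", sample)   # literal b, then one char from the class
starts_with_order = re.search(r"^Order", sample) is not None  # ^ anchors to the start

print(digit_runs)     # ['66', '7', '09', '30']
print(short_numbers)  # ['66', '09', '30']
```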

Patterns you will actually use

1. Trim trailing whitespace

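\s+$ matches a run of whitespace at the end of the text, or at the end of every line once multiline mode is on. Because \s also matches newlines, it can swallow blank lines along the way, so [ \t]+$ is the safer choice when you only want trailing spaces and tabs gone. Replace with nothing to clean up sloppy text.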
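
A short sketch of the clean-up in Python, assuming the text arrives as one multi-line string. It also shows why a class limited to spaces and tabs is gentler than \s when blank lines matter. The sample strings are made up.

```python
import re

text = "first line   \nsecond line\t\nthird line"

# MULTILINE makes $ match at the end of every line, not just the end of the string.
cleaned = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
print(repr(cleaned))  # 'first line\nsecond line\nthird line'

# \s also matches "\n", so in multiline mode it eats the blank line as well.
with_blank = "a  \n\nb"
collapsed = re.sub(r"\s+$", "", with_blank, flags=re.MULTILINE)
kept = re.sub(r"[ \t]+$", "", with_blank, flags=re.MULTILINE)
print(repr(collapsed))  # 'a\nb'
print(repr(kept))       # 'a\n\nb'
```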

2. Collapse multiple spaces into one

[ ]{2,} matches two or more spaces in a row. A bare space before {2,} works too; the brackets just make the space visible. Replace with a single space to normalise text pasted from PDFs.
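
In Python the replacement is a one-liner. The sample string below is invented; note that the pattern starts with a literal space.

```python
import re

pasted = "Text   pasted  from a    PDF"

# A literal space followed by {2,}: two or more spaces in a row.
normalised = re.sub(r" {2,}", " ", pasted)
print(normalised)  # Text pasted from a PDF
```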

3. Match an email address (pragmatic version)

[\w.+-]+@[\w-]+\.[\w.-]+ — good enough for finding emails in a blob of text. Do not try to write the “perfect” email regex; the official one is hundreds of characters and still not worth it.
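
As a rough sketch, here is the pattern pulling addresses out of a made-up blob of text in Python. The addresses use reserved example domains and the variable names are illustrative only.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Hypothetical text blob for illustration.
blob = "Contact ana@example.com or sales+eu@mail.example.org for details."

found = EMAIL.findall(blob)
print(found)  # ['ana@example.com', 'sales+eu@mail.example.org']
```

One thing to watch: the final class allows dots, so an address that ends a sentence will pick up the full stop. Stripping trailing punctuation afterwards is usually enough.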

4. Find URLs

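https?://[^\s]+ reads as “http” with an optional “s”, then everything up to the next whitespace character. Simple and effective for pulling links out of documents, though a URL followed by a comma or full stop will drag that punctuation along, so trim it afterwards.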
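
A minimal Python sketch of link extraction, with a follow-up step that trims punctuation the pattern tends to pick up. The document text and the set of characters to strip are assumptions for illustration.

```python
import re

# Hypothetical document text.
doc = "See https://example.com/docs and http://example.org/a?b=1, then reply."

urls = re.findall(r"https?://[^\s]+", doc)
print(urls)  # ['https://example.com/docs', 'http://example.org/a?b=1,']

# The second match kept its trailing comma; strip common sentence punctuation.
trimmed = [u.rstrip(".,;:!?)") for u in urls]
print(trimmed)  # ['https://example.com/docs', 'http://example.org/a?b=1']
```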

5. Match a date like 2026-01-31

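\d{4}-\d{2}-\d{2} matches four digits, dash, two digits, dash, two digits. Tighten it with \d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]) if you need to reject out-of-range values such as month 13 or day 32. It still accepts calendar-impossible dates like 2026-02-31; catching those takes a real date parser.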
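
The difference between the two patterns, and the gap that remains, can be sketched in Python. The helper name is_real_date is hypothetical; it leans on the standard library's date.fromisoformat for the calendar check.

```python
import re
from datetime import date

LOOSE = re.compile(r"\d{4}-\d{2}-\d{2}")
STRICT = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

def is_real_date(text):
    """Hypothetical helper: regex for the shape, datetime for the calendar."""
    if STRICT.fullmatch(text) is None:
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True

print(LOOSE.fullmatch("2026-13-45") is not None)   # True: right shape, wrong values
print(STRICT.fullmatch("2026-13-45") is not None)  # False: month 13 rejected
print(STRICT.fullmatch("2026-02-31") is not None)  # True: the regex cannot count days per month
print(is_real_date("2026-02-31"))                  # False
```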

6. Capture a number, including decimals

-?\d+(?:\.\d+)? — an optional minus sign, digits, then an optional decimal part. The (?:...) keeps the decimal grouped without creating a separate capture.
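
To see why the non-capturing group matters, compare the two forms in Python, where findall returns group contents rather than whole matches as soon as a capturing group is present. The sample line is invented.

```python
import re

# Hypothetical sample line.
line = "Temp -3.5 rose to 12 then 0.25"

# Non-capturing group: findall returns the whole matches.
numbers = re.findall(r"-?\d+(?:\.\d+)?", line)
print(numbers)  # ['-3.5', '12', '0.25']

# Same pattern with a capturing group: findall returns only the group.
fragments = re.findall(r"-?\d+(\.\d+)?", line)
print(fragments)  # ['.5', '', '.25']
```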

7. Validate a simple slug

^[a-z0-9]+(?:-[a-z0-9]+)*$ — lowercase letters and digits, with single hyphens between words. Rejects leading, trailing, and doubled hyphens.
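
A small Python sketch of the validation, with a hypothetical is_slug helper. It uses fullmatch because Python's $ also matches just before a trailing newline, which would let "my-post\n" through with a plain match.

```python
import re

SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

def is_slug(text):
    """Hypothetical helper: True only if the entire string is a valid slug."""
    return SLUG.fullmatch(text) is not None

print(is_slug("regex-cheat-sheet"))  # True
print(is_slug("-leading"))           # False
print(is_slug("trailing-"))          # False
print(is_slug("double--hyphen"))     # False
print(is_slug("my-post\n"))          # False, although SLUG.match would accept it
```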

8. Extract content between quotes

"([^"]*)" — a quote, then a captured run of anything that is not a quote, then the closing quote. The negated class is what stops it from greedily swallowing across multiple quoted strings.

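The effect of the negated class in the quotes pattern is easiest to see side by side with a greedy dot. A minimal Python sketch with a made-up sentence:

```python
import re

# Hypothetical sentence containing two quoted strings.
text = 'He said "hello" and then "goodbye".'

# Negated class: each match stops at the next quote.
quoted = re.findall(r'"([^"]*)"', text)
print(quoted)  # ['hello', 'goodbye']

# Greedy dot: one match running from the first quote to the last.
swallowed = re.findall(r'"(.*)"', text)
print(swallowed)  # ['hello" and then "goodbye']
```
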
The greedy-vs-lazy trap

This is the single most common regex bug. Suppose you want to match an HTML tag with <.+> in the text <b>hi</b>. Because + is greedy, it matches from the first < all the way to the last > — the whole string. Make it lazy with <.+?> and it stops at the first >, matching just <b>. Whenever a pattern “matches too much”, suspect greediness first.
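
The same example, run in Python. The last line shows a negated class as an alternative to the lazy quantifier; that variant is a common idiom rather than something the text above prescribes.

```python
import re

html = "<b>hi</b>"

greedy = re.search(r"<.+>", html).group()
lazy = re.search(r"<.+?>", html).group()
print(greedy)  # <b>hi</b>
print(lazy)    # <b>

# Every tag, two ways: lazy quantifier, or "anything that is not >".
lazy_tags = re.findall(r"<.+?>", html)
class_tags = re.findall(r"<[^>]+>", html)
print(lazy_tags)   # ['<b>', '</b>']
print(class_tags)  # ['<b>', '</b>']
```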

Capture groups and backreferences

Parentheses do double duty: they group for repetition and they capture for reuse. In a find-and-replace you can refer to captured groups as $1, $2 (or \1, \2 in some tools). To swap “Last, First” into “First Last”, match (\w+),\s*(\w+) and replace with $2 $1. Backreferences also work inside the pattern itself: \b(\w+)\s+\1\b finds doubled words like “the the”, and the \b word boundaries stop it from also firing on “the theory”.
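
Both uses can be sketched in Python, which is one of the tools that writes replacement references as \1 and \2 rather than $1 and $2. The names and sentences are invented for illustration.

```python
import re

# Swap "Last, First" into "First Last" using captured groups.
swapped = re.sub(r"(\w+),\s*(\w+)", r"\2 \1", "Lovelace, Ada")
print(swapped)  # Ada Lovelace

# Find doubled words with a backreference inside the pattern.
doubled = re.findall(r"\b(\w+)\s+\1\b", "It was the the best of of times")
print(doubled)  # ['the', 'of']

# Without word boundaries, "the theory" is a false positive.
false_positive = re.search(r"(\w+)\s+\1", "the theory") is not None
bounded = re.search(r"\b(\w+)\s+\1\b", "the theory") is not None
print(false_positive, bounded)  # True False
```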

Test before you trust

Regex is unforgiving: a single misplaced . or a forgotten escape changes what matches. Never deploy a pattern you have only read; run it against real samples, including the awkward ones — empty strings, values with commas, lines with trailing spaces. A live regex tester that highlights matches as you type turns an hour of guesswork into a couple of minutes.
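
If you would rather keep the samples next to the pattern, a table of cases in code does the same job. This is a rough sketch in Python; the failing_cases helper and the sample list are hypothetical, and the date pattern is just a stand-in for whatever you are testing.

```python
import re

DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

# (sample, should_match) pairs; the samples are made up for illustration.
CASES = [
    ("2026-01-31", True),
    ("", False),                       # empty string
    ("2026-01-31 ", False),            # trailing space
    ("2026-1-31", False),              # missing zero padding
    ("2026-01-31,2026-02-01", False),  # two values joined by a comma
]

def failing_cases(pattern, cases):
    """Return the samples whose full-match result differs from the expectation."""
    return [
        sample
        for sample, should_match in cases
        if (pattern.fullmatch(sample) is not None) != should_match
    ]

print(failing_cases(DATE, CASES))  # []
```

An empty list means every sample behaved as expected; anything else names the samples to investigate.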

Final advice

Keep patterns as simple as the job allows. A regex that is “clever” today is unreadable in six months. When a pattern grows past a line or two, that is often a sign the problem is better solved with real parsing rather than one heroic expression. And always escape the characters that have special meaning — . * + ? ( ) [ ] { } ^ $ | \ — when you mean them literally. Once these patterns are muscle memory, cleaning data, validating input, and bulk-editing text becomes genuinely fast.
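
When the literal text comes from a variable, escaping by hand is error-prone. In Python, re.escape does it for you; the sketch below uses a made-up search string to show what goes wrong without it.

```python
import re

needle = "3.50"  # hypothetical literal text to search for

# Unescaped, the dot matches any character, so "3x50" counts as a hit.
loose_hit = re.search(needle, "part 3x50") is not None

# re.escape backslash-escapes the metacharacters, so the dot becomes literal.
escaped = re.escape(needle)
strict_hit = re.search(escaped, "part 3x50") is not None
literal_hit = re.search(escaped, "total 3.50") is not None

print(escaped)                             # 3\.50
print(loose_hit, strict_hit, literal_hit)  # True False True
```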
