Regular Expressions in Python

Have you ever needed to search for a pattern in text, like finding all emails, phone numbers, or dates in a file?
That’s exactly what Regular Expressions in Python — or Regex — help you do!

They’re like supercharged search tools that let you match patterns instead of typing exact words.
If you’ve ever used Ctrl + F to find something, think of Regex as its smarter, more powerful cousin.

What Is a Regular Expression?

A Regular Expression (Regex) is a special sequence of characters that helps you match, find, or manipulate strings using patterns.

Python provides this functionality through the re module.

Let’s start by importing it:

import re

Why Use Regular Expressions in Python?

Regular expressions are used for:

Validating user inputs (like email, phone number, or password)
Searching and extracting text patterns
Replacing or formatting data
Data cleaning in text analytics or data science

Basic Functions in `re` Module

Let’s explore the most useful functions from the re module.

Function	Description	Example
`re.match()`	Checks for a match only at the beginning of the string	`re.match(r"Hello", "Hello World")`
`re.search()`	Searches the entire string and returns the first match	`re.search(r"World", "Hello World")`
`re.findall()`	Returns a list of all non-overlapping matches	`re.findall(r"\d+", "There are 12 apples and 5 mangoes")`
`re.split()`	Splits a string by the matched pattern	`re.split(r"\s", "Python is fun")`
`re.sub()`	Replaces all matches with a new string	`re.sub(r"\d", "#", "A1B2C3")`
`re.compile()`	Compiles a regex pattern for reuse	`pattern = re.compile(r"\d+")`

Regex Meta Characters (The Building Blocks)

Meta characters are symbols with special meanings in Regex.

Symbol	Description	Example
`.`	Matches any character (except newline)	`re.search("P.thon", "Python")`
`^`	Matches start of string	`re.match("^Hello", "Hello World")`
`$`	Matches end of string	`re.search("World$", "Hello World")`
`*`	Matches 0 or more occurrences	`re.findall("ab*", "a ab abb abbb")`
`+`	Matches 1 or more occurrences	`re.findall("ab+", "a ab abb abbb")`
`?`	Matches 0 or 1 occurrence	`re.findall("ab?", "a ab abb abbb")`
`{n}`	Exactly n repetitions	`re.findall("a{3}", "aa aaaa aaa")`
`{n,}`	At least n repetitions	`re.findall("a{2,}", "aa aaaa aaa")`
`{n,m}`	Between n and m repetitions	`re.findall("a{2,4}", "a aa aaa aaaa")`
`[]`	Matches any one character in brackets	`[aeiou]` matches any vowel
`|`	Acts as OR operator	`cat|dog` matches “cat” or “dog”
`()`	Groups expressions	`(ab)+` matches repeated “ab” patterns
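The quantifiers are easiest to compare when they run against the same input. The sketch below reuses the sample strings from the table; the `cat|dog` and `(ab)+` inputs are made-up illustrations.

```python
import re

sample = "a ab abb abbb"

star = re.findall(r"ab*", sample)       # 'a' followed by zero or more 'b'
plus = re.findall(r"ab+", sample)       # 'a' followed by one or more 'b'
optional = re.findall(r"ab?", sample)   # 'a' followed by at most one 'b'
print(star)      # ['a', 'ab', 'abb', 'abbb']
print(plus)      # ['ab', 'abb', 'abbb']
print(optional)  # ['a', 'ab', 'ab', 'ab']

# Counted repetition: between 2 and 4 'a' characters in a row.
bounded = re.findall(r"a{2,4}", "a aa aaa aaaa")
print(bounded)   # ['aa', 'aaa', 'aaaa']

# A character class matches exactly one character from the set.
vowels = re.findall(r"[aeiou]", "Python")
print(vowels)    # ['o']

# Alternation picks either side of the | symbol.
pets = re.findall(r"cat|dog", "cat, dog, bird")
print(pets)      # ['cat', 'dog']

# Grouping lets a quantifier apply to a whole sub-pattern.
repeated = re.search(r"(ab)+", "xxababab").group()
print(repeated)  # ababab
```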

Special Sequences in Regex

Code	Description	Example
`\d`	Any digit (0–9)	`re.findall(r"\d", "A1B2C3") → ['1', '2', '3']`
`\D`	Non-digit characters	`re.findall(r"\D", "A1B2") → ['A', 'B']`
`\s`	Whitespace (space, tab, newline)	`re.findall(r"\s", "Python is fun")`
`\S`	Non-whitespace	`re.findall(r"\S", "Python is fun")`
`\w`	Word character (letters, digits, _)	`re.findall(r"\w", "A_B1!")`
`\W`	Non-word character	`re.findall(r"\W", "A_B1!")`
`\b`	Word boundary	`re.findall(r"\bword\b", "word world sword")`
`\B`	Non-word boundary	`re.findall(r"\Bword\B", "passwords")`
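Word boundaries and raw strings are two places where beginners often get surprised, so here is a short sketch. The inputs are illustrative; the key point is that `\b` inside a normal (non-raw) Python string means a backspace character, not a word boundary.

```python
import re

# \b matches only where a word starts or ends.
whole_word = re.findall(r"\bword\b", "word world sword")
print(whole_word)   # ['word']

# \B needs a word character on that side, so "word" must sit inside a longer word.
inside = re.findall(r"\Bword\B", "passwords")
at_end = re.findall(r"\Bword\B", "password")
print(inside)       # ['word']
print(at_end)       # []  (the end of the string counts as a boundary)

# \w and \W split the characters of a string into two groups.
word_chars = re.findall(r"\w", "A_B1!")
non_word_chars = re.findall(r"\W", "A_B1!")
print(word_chars)       # ['A', '_', 'B', '1']
print(non_word_chars)   # ['!']

# Without the r prefix, "\b" is a single backspace character.
print(len("\b"), len(r"\b"))                    # 1 2
not_raw = re.findall("\bword\b", "word")
print(not_raw)                                  # []
```

This is why it is a good habit to write every regex pattern as a raw string (`r"..."`).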

Examples of Common Use Cases

1. Validate an Email Address

import re

email = "user123@gmail.com"
pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-z]{2,}$'

if re.match(pattern, email):
    print("✅ Valid email address")
else:
    print("❌ Invalid email")

Output:

✅ Valid email address
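One detail worth knowing: `$` also matches just before a newline at the very end of the string, so the `^...$` version accepts an email with a trailing newline. A minimal sketch of a stricter check, assuming we want the whole string to match, uses `re.fullmatch()`. The helper name `looks_like_email` is hypothetical.

```python
import re

# Same character classes as the pattern above, without the ^ and $ anchors.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-z]{2,}")

def looks_like_email(candidate):
    """Return True only if the entire string matches the pattern."""
    return EMAIL.fullmatch(candidate) is not None

print(looks_like_email("user123@gmail.com"))     # True
print(looks_like_email("user123@gmail"))         # False
print(looks_like_email("user123@gmail.com\n"))   # False

# With ^...$ and re.match(), the trailing newline slips through.
anchored = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-z]{2,}$"
slipped = re.match(anchored, "user123@gmail.com\n") is not None
print(slipped)                                   # True
```

Keep in mind that this pattern is a simple format check, not a guarantee that the address exists.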

2. Extract All Phone Numbers from Text

text = "Call me at 9876543210 or 9123456789"
phones = re.findall(r'\b\d{10}\b', text)
print(phones)

Output:

['9876543210', '9123456789']

3. Find All Uppercase (All-Caps) Words

sentence = "Python is Fun and POWERFUL"
caps = re.findall(r'\b[A-Z]{2,}\b', sentence)
print(caps)

Output:

['POWERFUL']

4. Replace All Digits with “#”

data = "Order ID: 12345"
cleaned = re.sub(r'\d', '#', data)
print(cleaned)

Output:

Order ID: #####

5. Split a String by Multiple Delimiters

text = "apple,banana;grape orange"
fruits = re.split(r'[;,\s]+', text)
print(fruits)

Output:

['apple', 'banana', 'grape', 'orange']

6. Extract Dates from a Paragraph

text = "Meeting on 25-12-2025 and 01/01/2026."
dates = re.findall(r'\b\d{2}[-/]\d{2}[-/]\d{4}\b', text)
print(dates)

Output:

['25-12-2025', '01/01/2026']
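If you need the individual parts of each date, groups can pull them out. The sketch below is one possible approach using named groups; it assumes the dates are written in day-month-year order, and the group names `day`, `month`, and `year` are our own choice.

```python
import re

text = "Meeting on 25-12-2025 and 01/01/2026."

# Same date pattern, with each part wrapped in a named group.
date_pattern = re.compile(
    r"\b(?P<day>\d{2})[-/](?P<month>\d{2})[-/](?P<year>\d{4})\b"
)

# finditer() yields match objects, so each group can be read by name.
parts = [match.groupdict() for match in date_pattern.finditer(text)]
print(parts)
# [{'day': '25', 'month': '12', 'year': '2025'},
#  {'day': '01', 'month': '01', 'year': '2026'}]

# When a pattern contains groups, findall() returns tuples of the groups.
tuples = date_pattern.findall(text)
print(tuples)   # [('25', '12', '2025'), ('01', '01', '2026')]
```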

Using `re.compile()` for Reusability

Instead of writing the pattern every time, you can compile it once:

pattern = re.compile(r'\d{10}')

if pattern.search("My number is 9876543210"):
    print("Phone number found!")

Output:

Phone number found!

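✅ Why use it?
It keeps code cleaner when you reuse the same regex multiple times, because the pattern is defined once and gets a readable name. The speed gain is usually modest: the re module already caches recently used patterns, so compiling mainly saves the cache lookup on each call.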
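As a rough sketch of what reuse looks like in practice, the example below applies one compiled pattern to several strings. The message list is made up for illustration, and the pattern adds `\b` word boundaries so that only standalone 10-digit numbers are matched.

```python
import re

# Compile once, give the pattern a descriptive name, and reuse it.
PHONE = re.compile(r"\b\d{10}\b")

messages = [
    "My number is 9876543210",
    "No digits here",
    "Office: 9123456789, home: 9000000001",
]

# A compiled pattern has the same methods as the re module functions.
found = [PHONE.findall(message) for message in messages]
print(found)
# [['9876543210'], [], ['9123456789', '9000000001']]

# The original pattern text stays available on the compiled object.
print(PHONE.pattern)   # \b\d{10}\b
```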

Flags in RegEx

Regex flags let you modify the behavior of pattern matching.

Flag	Description	Example
`re.IGNORECASE` or `re.I`	Case-insensitive match	`re.findall(r'python', 'PYTHON rocks', re.I)`
`re.MULTILINE` or `re.M`	`^` and `$` match start/end of each line	`re.findall('^Hello', text, re.M)`
`re.DOTALL` or `re.S`	`.` matches newline too	`re.findall('a.*b', 'a\nb', re.S)`
`re.VERBOSE` or `re.X`	Allows multi-line regex with comments	For complex patterns
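Here is a short sketch of the flags in action. The sample text is invented for the demonstration; flags can be combined with the `|` operator.

```python
import re

text = "Hello world\nhello again\nHELLO there"

# re.I ignores case, re.M lets ^ match at the start of every line.
line_starts = re.findall(r"^hello", text, re.I | re.M)
only_first = re.findall(r"^hello", text, re.I)
print(line_starts)   # ['Hello', 'hello', 'HELLO']
print(only_first)    # ['Hello']

# By default . stops at a newline; re.S lets it match across lines.
with_dotall = re.findall(r"a.*b", "a\nb", re.S)
without_dotall = re.findall(r"a.*b", "a\nb")
print(with_dotall)      # ['a\nb']
print(without_dotall)   # []

# re.X ignores whitespace in the pattern and allows # comments.
date = re.compile(r"""
    \b\d{2}    # day
    [-/]       # separator
    \d{2}      # month
    [-/]       # separator
    \d{4}\b    # year
""", re.X)
verbose_dates = date.findall("Meeting on 25-12-2025 and 01/01/2026.")
print(verbose_dates)    # ['25-12-2025', '01/01/2026']
```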

Project: Text Data Extractor

Let’s make a mini project using everything we learned.

Problem:

We have a text file with messy data (emails, phone numbers, and dates).
We need to extract and clean this information using Regex.

Code:

import re

# Read text from file
with open("data.txt", "r") as file:
    text = file.read()

# Define patterns
email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-z]{2,}'
phone_pattern = r'\b\d{10}\b'
date_pattern = r'\b\d{2}[-/]\d{2}[-/]\d{4}\b'

# Extract data
emails = re.findall(email_pattern, text)
phones = re.findall(phone_pattern, text)
dates = re.findall(date_pattern, text)

# Display results
print("Emails Found:", emails)
print("Phone Numbers:", phones)
print("Dates Found:", dates)

Code Explanation

re.findall() → Scans the text read from the file and returns every non-overlapping match as a list of strings.
We used a separate regex pattern for each kind of data: emails, phone numbers, and dates.
This kind of text extraction is common in data cleaning, NLP, and web scraping.
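If you want to try the extractor without creating data.txt, one option is to wrap the same three patterns in a function and feed it a string. This is only a sketch: the function name `extract_all` and the sample sentence are hypothetical.

```python
import re

# The same three patterns as in the project, compiled once for reuse.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-z]{2,}")
PHONE = re.compile(r"\b\d{10}\b")
DATE = re.compile(r"\b\d{2}[-/]\d{2}[-/]\d{4}\b")

def extract_all(text):
    """Return the emails, phone numbers, and dates found in text."""
    return {
        "emails": EMAIL.findall(text),
        "phones": PHONE.findall(text),
        "dates": DATE.findall(text),
    }

# Made-up sample text standing in for the contents of data.txt.
sample = "Contact user123@gmail.com or call 9876543210 before 25-12-2025."
result = extract_all(sample)
print(result)
# {'emails': ['user123@gmail.com'], 'phones': ['9876543210'], 'dates': ['25-12-2025']}
```

Returning a dictionary keeps the three result lists together, which makes it easier to pass them on to later cleaning steps.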

Quick Regex Cheat Sheet

Pattern	Description	Example
`\d`	Digit	`5`
`\w`	Word character	`A`, `b`, `3`
`\s`	Whitespace	Space, tab, newline
`.`	Any character except newline	`a.b` → matches `acb`
`[a-z]`	Lowercase letters	`a` to `z`
`[A-Z]`	Uppercase letters	`A` to `Z`
`[0-9]`	Digits	`5`, `9`
`[^abc]`	Not a, b, or c	`d`, `e`
`^pattern`	Pattern at start	`^Hello`
`pattern$`	Pattern at end	`World$`
`pattern1|pattern2`	Either pattern	`cat|dog`

Final Thoughts

Regular Expressions in Python may look tricky at first — but once you understand their logic, they become an indispensable tool for data validation, cleaning, and automation.

You’ve just learned how to:

Search, match, and replace text using patterns
Validate emails, phone numbers, and dates
Use flags, groups, and special sequences
Build a real-world data extraction project

Keep practicing with real examples — the more you use Regex, the more natural it becomes.

What’s Next?

In the next post, we’ll learn about the Requests Module in Python.
