Regular Expressions in Python

Have you ever needed to search for a pattern in text, like finding all emails, phone numbers, or dates in a file?
That’s exactly what regular expressions (regex for short) help you do in Python!

They’re like supercharged search tools that let you match patterns instead of typing exact words.
If you’ve ever used Ctrl + F to find something, think of Regex as its smarter, more powerful cousin.


What Is a Regular Expression?

A Regular Expression (Regex) is a special sequence of characters that helps you match, find, or manipulate strings using patterns.

Python provides this functionality through the re module.

Let’s start by importing it:

import re

Why Use Regular Expressions in Python?

Regular expressions are used for:

  • Validating user inputs (like email, phone number, or password)
  • Searching and extracting text patterns
  • Replacing or formatting data
  • Data cleaning in text analytics or data science

Basic Functions in the re Module

Let’s explore the most useful functions from the re module.

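| Function | Description | Example |
|---|---|---|
| re.match() | Checks for a match only at the beginning of the string | re.match("Hello", "Hello World") |
| re.search() | Searches the entire string and returns the first match | re.search("World", "Hello World") |
| re.findall() | Returns a list of all non-overlapping matches | re.findall(r"\d+", "There are 12 apples and 5 mangoes") |
| re.split() | Splits a string by the matched pattern | re.split(r"\s", "Python is fun") |
| re.sub() | Replaces all matches with a new string | re.sub(r"\d", "#", "A1B2C3") |
| re.compile() | Compiles a regex pattern for reuse | pattern = re.compile(r"\d+") |

Note the r prefix on patterns that contain a backslash. A raw string stops Python from treating sequences like \d or \b as string escapes before the regex engine sees them.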
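To see what each function actually hands back, here is a small sketch that runs the examples from the table. The variable names are only for illustration.

```python
import re

text = "Hello World"

# re.match() only tries the very start of the string.
at_start = re.match(r"World", text)   # None: "World" is not at index 0
# re.search() scans the whole string and returns the first match object.
first = re.search(r"World", text)

print(at_start)                       # None
print(first.group(), first.span())    # World (6, 11)

# findall, split and sub return plain Python values, not match objects.
numbers = re.findall(r"\d+", "There are 12 apples and 5 mangoes")
words = re.split(r"\s", "Python is fun")
masked = re.sub(r"\d", "#", "A1B2C3")

print(numbers)   # ['12', '5']
print(words)     # ['Python', 'is', 'fun']
print(masked)    # A#B#C#
```

Because re.match() and re.search() return None when nothing matches, it is safest to check the result before calling .group() on it.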


Regex Meta Characters (The Building Blocks)

Meta characters are symbols with special meanings in Regex.

| Symbol | Description | Example |
|---|---|---|
| . | Matches any character (except newline) | re.search("P.thon", "Python") |
| ^ | Matches start of string | re.match("^Hello", "Hello World") |
| $ | Matches end of string | re.search("World$", "Hello World") |
| * | Matches 0 or more occurrences | re.findall("ab*", "a ab abb abbb") |
| + | Matches 1 or more occurrences | re.findall("ab+", "a ab abb abbb") |
| ? | Matches 0 or 1 occurrence | re.findall("ab?", "a ab abb abbb") |
| {n} | Exactly n repetitions | re.findall("a{3}", "aa aaaa aaa") |
| {n,} | At least n repetitions | re.findall("a{2,}", "aa aaaa aaa") |
| {n,m} | Between n and m repetitions | re.findall("a{2,4}", "a aa aaa aaaa") |
| [] | Matches any one character in brackets | [aeiou] matches any vowel |
| \| | Acts as OR operator | pattern1\|pattern2 matches either pattern |
| () | Groups expressions | (ab)+ matches repeated “ab” patterns |
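The quantifier rows are easier to compare side by side. The sketch below runs the table's examples and adds two small illustrative strings of my own ("ababab" and "cat dog bird") for grouping and alternation.

```python
import re

s = "a ab abb abbb"

star = re.findall(r"ab*", s)   # zero or more b's
plus = re.findall(r"ab+", s)   # at least one b
opt = re.findall(r"ab?", s)    # at most one b

print(star)  # ['a', 'ab', 'abb', 'abbb']
print(plus)  # ['ab', 'abb', 'abbb']
print(opt)   # ['a', 'ab', 'ab', 'ab']

exact = re.findall(r"a{3}", "aa aaaa aaa")
at_least = re.findall(r"a{2,}", "aa aaaa aaa")
between = re.findall(r"a{2,4}", "a aa aaa aaaa")

print(exact)     # ['aaa', 'aaa']
print(at_least)  # ['aa', 'aaaa', 'aaa']
print(between)   # ['aa', 'aaa', 'aaaa']

# With one capturing group, findall returns the group, not the whole match.
captured = re.findall(r"(ab)+", "ababab")
# A non-capturing group (?:...) returns the whole match instead.
whole = re.findall(r"(?:ab)+", "ababab")
either = re.findall(r"cat|dog", "cat dog bird")

print(captured)  # ['ab']
print(whole)     # ['ababab']
print(either)    # ['cat', 'dog']
```

The (ab)+ result surprises many people: when a pattern contains a capturing group, re.findall() returns what the group captured rather than the full match.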

Special Sequences in Regex

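| Code | Description | Example |
|---|---|---|
| \d | Any digit (0–9) | re.findall(r"\d", "A1B2C3") → ['1', '2', '3'] |
| \D | Non-digit characters | re.findall(r"\D", "A1B2") → ['A', 'B'] |
| \s | Whitespace (space, tab, newline) | re.findall(r"\s", "Python is fun") |
| \S | Non-whitespace | re.findall(r"\S", "Python is fun") |
| \w | Word character (letters, digits, _) | re.findall(r"\w", "A_B1!") |
| \W | Non-word character | re.findall(r"\W", "A_B1!") |
| \b | Word boundary | re.findall(r"\bword\b", "word world sword") → ['word'] |
| \B | Non-word boundary | re.findall(r"\Bword\B", "passwords") → ['word'] |

Always write these patterns as raw strings (r"..."). In a normal string, "\b" is a backspace character, not a word boundary.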
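Boundaries are the trickiest entries here, so a quick sketch may help. It also shows why raw strings matter for \b. The string "passwords" is my own addition for contrast.

```python
import re

# In a normal string "\b" is one backspace character; in a raw string it is
# the two characters (backslash, b) that the regex engine expects.
print(len("\b"), len(r"\b"))   # 1 2

word_chars = re.findall(r"\w", "A_B1!")
non_word = re.findall(r"\W", "A_B1!")
print(word_chars)  # ['A', '_', 'B', '1']
print(non_word)    # ['!']

# \b needs a boundary on both sides, so only the standalone "word" matches.
standalone = re.findall(r"\bword\b", "word world sword")
print(standalone)  # ['word']

# \B needs a NON-boundary on both sides. In "password" the end of the string
# is a boundary, so nothing matches; in "passwords" the match succeeds.
at_end = re.findall(r"\Bword\B", "password")
inside = re.findall(r"\Bword\B", "passwords")
print(at_end)  # []
print(inside)  # ['word']
```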

Examples of Common Use Cases

1. Validate an Email Address

import re

email = "user123@gmail.com"
pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-z]{2,}$'

if re.match(pattern, email):
    print("✅ Valid email address")
else:
    print("❌ Invalid email")

Output:

✅ Valid email address
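One possible refinement, offered as a sketch rather than a complete validator: re.fullmatch() requires the whole string to match, so the ^ and $ anchors are not needed, and a trailing newline is rejected (with re.match() and $, it would be accepted). The helper name is_valid_email and the case-insensitive top-level domain are my own assumptions, not part of the original pattern.

```python
import re

# Same idea as the pattern above, but the final part also allows uppercase
# letters (an assumption: addresses like user@Example.COM should pass).
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def is_valid_email(address):
    """Return True if the whole string looks like an email address."""
    return EMAIL_RE.fullmatch(address) is not None

print(is_valid_email("user123@gmail.com"))    # True
print(is_valid_email("user@Example.COM"))     # True
print(is_valid_email("user123@gmail"))        # False
print(is_valid_email("user123@gmail.com\n"))  # False
```

Real-world email rules are far more permissive than any short regex, so treat this as a quick sanity check only.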

2. Extract All Phone Numbers from Text

text = "Call me at 9876543210 or 9123456789"
phones = re.findall(r'\b\d{10}\b', text)
print(phones)

Output:

['9876543210', '9123456789']
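The \b on each side is doing real work here. As a small sketch with a made-up input string, compare what happens when a longer number appears in the text:

```python
import re

# Illustrative text: a 12-digit order number plus a real 10-digit phone number.
text = "Order 123456789012, call 9876543210"

loose = re.findall(r"\d{10}", text)       # no boundaries
strict = re.findall(r"\b\d{10}\b", text)  # exactly 10 digits as a whole word

print(loose)   # ['1234567890', '9876543210']
print(strict)  # ['9876543210']
```

Without the boundaries, the first ten digits of the order number are wrongly reported as a phone number.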

3. Find All Uppercase Words

sentence = "Python is Fun and POWERFUL"
caps = re.findall(r'\b[A-Z]{2,}\b', sentence)
print(caps)

Output:

['POWERFUL']
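If you want words that merely start with a capital letter, the character classes need to change. A minimal sketch on the same sentence:

```python
import re

sentence = "Python is Fun and POWERFUL"

# Words written entirely in capitals (two or more letters).
all_caps = re.findall(r"\b[A-Z]{2,}\b", sentence)
# Words with one capital letter followed only by lowercase letters.
capitalized = re.findall(r"\b[A-Z][a-z]+\b", sentence)

print(all_caps)     # ['POWERFUL']
print(capitalized)  # ['Python', 'Fun']
```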

4. Replace All Digits with “#”

data = "Order ID: 12345"
cleaned = re.sub(r'\d', '#', data)
print(cleaned)

Output:

Order ID: #####
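re.sub() has a few extras worth knowing. The sketch below shows the count argument, re.subn() (which also reports how many replacements were made), and a function used as the replacement. The fruit sentence is just an illustrative input.

```python
import re

data = "Order ID: 12345"

# Replace only the first three digits.
first_three = re.sub(r"\d", "#", data, count=3)
print(first_three)  # Order ID: ###45

# subn returns a tuple: (new string, number of replacements).
cleaned, n = re.subn(r"\d", "#", data)
print(cleaned, n)   # Order ID: ##### 5

# The replacement can be a function that receives each match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2),
                 "3 apples and 10 mangoes")
print(doubled)      # 6 apples and 20 mangoes
```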

5. Split a String by Multiple Delimiters

text = "apple,banana;grape orange"
fruits = re.split(r'[;,\s]+', text)
print(fruits)

Output:

['apple', 'banana', 'grape', 'orange']
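One thing to watch for: if the text starts or ends with a delimiter, re.split() returns empty strings at the edges. A short sketch with a deliberately messy, made-up input:

```python
import re

messy = ",apple, banana;;grape "

parts = re.split(r"[;,\s]+", messy)
print(parts)  # ['', 'apple', 'banana', 'grape', '']

# Drop the empty strings left by leading and trailing delimiters.
clean = [p for p in parts if p]
print(clean)  # ['apple', 'banana', 'grape']

# maxsplit limits how many splits are made.
head_tail = re.split(r"[;,\s]+", "apple,banana;grape orange", maxsplit=1)
print(head_tail)  # ['apple', 'banana;grape orange']
```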

6. Extract Dates from a Paragraph

text = "Meeting on 25-12-2025 and 01/01/2026."
dates = re.findall(r'\b\d{2}[-/]\d{2}[-/]\d{4}\b', text)
print(dates)

Output:

['25-12-2025', '01/01/2026']
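The pattern above would also accept a mixed date such as 25-12/2025. Here is one way to tighten it, as a sketch: capture the first separator and require the same one again with a backreference (\2). The conversion to year-month-day assumes the dates are written day first, which the sample text suggests but does not guarantee.

```python
import re

text = "Meeting on 25-12-2025 and 01/01/2026."

# Groups: 1 = day, 2 = separator, 3 = month, 4 = year.
# \2 means "the same separator that group 2 matched".
DATE_RE = re.compile(r"\b(\d{2})([-/])(\d{2})\2(\d{4})\b")

# Assumption: input is DD-MM-YYYY; output is YYYY-MM-DD.
iso_dates = [f"{m.group(4)}-{m.group(3)}-{m.group(1)}"
             for m in DATE_RE.finditer(text)]
print(iso_dates)  # ['2025-12-25', '2026-01-01']

mixed = DATE_RE.findall("Bad date: 25-12/2025")
print(mixed)      # []
```

Note that this still does not check that the day and month are in a valid range; for that, the datetime module is a better fit than a regex.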

Using re.compile() for Reusability

Instead of writing the pattern every time, you can compile it once:

pattern = re.compile(r'\d{10}')

if pattern.search("My number is 9876543210"):
    print("Phone number found!")

Output:

Phone number found!

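Why use it?
Compiling gives the pattern a name and lets you reuse it through methods like pattern.search() and pattern.findall(). Python already caches recently used patterns, so the speed gain is usually modest; the bigger win is cleaner code when the same regex appears in several places.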
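As a sketch of what that reuse looks like, here is one compiled pattern applied to several strings. The messages list is made up for the example, and the pattern adds \b boundaries so that only standalone 10-digit numbers match.

```python
import re

# Compile once, near the top of the module, and give it a descriptive name.
PHONE_RE = re.compile(r"\b\d{10}\b")

messages = [
    "My number is 9876543210",
    "No number here",
    "Try 9123456789 or 9876543210",
]

# The compiled object has the same methods as the re module functions.
found = [PHONE_RE.findall(msg) for msg in messages]
print(found)  # [['9876543210'], [], ['9123456789', '9876543210']]

# The original pattern text stays available for logging or debugging.
print(PHONE_RE.pattern)
```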


Flags in Regex

Regex flags let you modify the behavior of pattern matching.

| Flag | Description | Example |
|---|---|---|
| re.IGNORECASE or re.I | Case-insensitive match | re.findall(r'python', 'PYTHON rocks', re.I) |
| re.MULTILINE or re.M | ^ and $ match start/end of each line | re.findall('^Hello', text, re.M) |
| re.DOTALL or re.S | . matches newline too | re.findall('a.*b', 'a\nb', re.S) |
| re.VERBOSE or re.X | Allows multi-line regex with comments | For complex patterns |
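Here is a short sketch of each flag in action. The two-line text and the verbose date pattern are illustrative; the date pattern is the same one used in the examples above, just spread over several lines.

```python
import re

ignore_case = re.findall(r"python", "PYTHON rocks", re.I)
print(ignore_case)  # ['PYTHON']

text = "Hello Alice\nHello Bob"
default_lines = re.findall(r"^Hello", text)      # ^ matches only at the start
multi_lines = re.findall(r"^Hello", text, re.M)  # ^ matches at each line start
print(default_lines)  # ['Hello']
print(multi_lines)    # ['Hello', 'Hello']

no_dotall = re.findall(r"a.*b", "a\nb")          # . stops at the newline
dotall = re.findall(r"a.*b", "a\nb", re.S)       # . matches the newline too
print(no_dotall)  # []
print(dotall)     # ['a\nb']

# With re.VERBOSE, whitespace in the pattern is ignored and # starts a comment.
DATE_RE = re.compile(r"""
    \b\d{2}    # day
    [-/]       # separator
    \d{2}      # month
    [-/]       # separator
    \d{4}\b    # year
""", re.VERBOSE)
verbose_dates = DATE_RE.findall("Meeting on 25-12-2025")
print(verbose_dates)  # ['25-12-2025']
```

Flags can be combined with the | operator, for example re.I | re.M.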

Project: Text Data Extractor

Let’s make a mini project using everything we learned.

Problem:

We have a text file with messy data (emails, phone numbers, and dates).
We need to extract and clean this information using Regex.

Code:

import re

# Read text from file (assumes data.txt exists and is UTF-8 encoded)
with open("data.txt", "r", encoding="utf-8") as file:
    text = file.read()

# Define patterns
email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-z]{2,}'
phone_pattern = r'\b\d{10}\b'
date_pattern = r'\b\d{2}[-/]\d{2}[-/]\d{4}\b'

# Extract data
emails = re.findall(email_pattern, text)
phones = re.findall(phone_pattern, text)
dates = re.findall(date_pattern, text)

# Display results
print("Emails Found:", emails)
print("Phone Numbers:", phones)
print("Dates Found:", dates)


Code Explanation

  • re.findall() → Scans the text read from the file and returns a list of all non-overlapping matches.
  • We used different regex patterns for emails, phones, and dates.
  • This kind of text extraction is common in data cleaning, NLP, and web scraping.
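If you don't have a data.txt file handy, the same idea can be tried on a string. The sketch below is one possible variation, not the only way to structure it: the extract_all helper, the PATTERNS dictionary, and the sample text are all made up for illustration. It compiles each pattern once and removes duplicates from the results.

```python
import re

# Hypothetical helper structure: one compiled pattern per kind of data.
PATTERNS = {
    "emails": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "phones": re.compile(r"\b\d{10}\b"),
    "dates": re.compile(r"\b\d{2}[-/]\d{2}[-/]\d{4}\b"),
}

def extract_all(text):
    """Return a dict of sorted, de-duplicated matches for each pattern."""
    return {name: sorted(set(pattern.findall(text)))
            for name, pattern in PATTERNS.items()}

# Made-up sample text standing in for the contents of data.txt.
sample = ("Contact user123@gmail.com or call 9876543210 before 25-12-2025. "
          "Backup: user123@gmail.com, 9123456789.")

result = extract_all(sample)
print("Emails Found:", result["emails"])   # ['user123@gmail.com']
print("Phone Numbers:", result["phones"])  # ['9123456789', '9876543210']
print("Dates Found:", result["dates"])     # ['25-12-2025']
```

To run it on a real file, pass the string returned by file.read() to extract_all() instead of sample.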

Quick Regex Cheat Sheet

| Pattern | Description | Example |
|---|---|---|
| \d | Digit | 5 |
| \w | Word character | A, b, 3 |
| \s | Whitespace | Space, tab, newline |
| . | Any character except newline | a.b → matches acb |
| [a-z] | Lowercase letters | a to z |
| [A-Z] | Uppercase letters | A to Z |
| [0-9] | Digits | 5, 9 |
| [^abc] | Not a, b, or c | d, e |
| ^pattern | Pattern at start | ^Hello |
| pattern$ | Pattern at end | World$ |
| pattern1\|pattern2 | Either pattern | |

Final Thoughts

Regular Expressions in Python may look tricky at first — but once you understand their logic, they become an indispensable tool for data validation, cleaning, and automation.

You’ve just learned how to:

  • Search, match, and replace text using patterns
  • Validate emails, phone numbers, and dates
  • Use flags, groups, and special sequences
  • Build a real-world data extraction project

Keep practicing with real examples — the more you use Regex, the more natural it becomes.

What’s Next?

In the next post, we’ll learn about the Requests Module in Python.
