How can I find all matches to a regular expression in Python?
In Python, you typically use re.findall() or re.finditer() to retrieve all matches of a regular expression in a string. Below are the common ways and their differences.
1. re.findall() for a List of Matches
import re
text = "Hello 123, goodbye 456"
pattern = r"\d+"
matches = re.findall(pattern, text)
print(matches)
# Output: ['123', '456']
- re.findall(pattern, string) returns a list of matched substrings.
- If the pattern has capturing groups, findall returns either:
  - a list of strings (if there's exactly one capturing group), or
  - a list of tuples (if there are multiple capturing groups).
Example with Capturing Group
text = "Name: John, Age: 30"
pattern = r"(\w+):\s(\w+)"
# Each match has two groups -> result is a list of tuples
matches = re.findall(pattern, text)
print(matches)
# Output: [('Name', 'John'), ('Age', '30')]
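The single-group case is easy to overlook, so here is a minimal sketch of it, reusing the same sample text. The variable names `values` and `pairs` are illustrative only. It also shows that a non-capturing group `(?:...)` does not change what findall returns.

```python
import re

text = "Name: John, Age: 30"

# Exactly one capturing group: findall returns only that group's text,
# as a flat list of strings rather than a list of tuples.
values = re.findall(r"\w+:\s(\w+)", text)
print(values)  # ['John', '30']

# Non-capturing groups (?:...) group without capturing, so findall
# falls back to returning the whole match for each hit.
pairs = re.findall(r"(?:\w+):\s(?:\w+)", text)
print(pairs)  # ['Name: John', 'Age: 30']
```

If you need both the full match and the groups at the same time, finditer is usually the more convenient choice.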
2. re.finditer() for an Iterator of Match Objects
import re
text = "Hello 123, goodbye 456"
pattern = r"\d+"
for match in re.finditer(pattern, text):
    print("Match:", match.group(0), "at", match.span())
- re.finditer(pattern, string) returns an iterator of match objects (re.Match in Python 3.7+).
- Each match object gives you start/end indices (.span()), the full match (.group(0)), and any capturing groups (e.g. .group(1), .group(2), etc.).
- Ideal if you need detailed info about the positions or groups for each match.
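As a rough illustration of reading groups and positions from each match object, the sketch below uses named groups. The group names `key` and `value` and the `records` list are assumptions made for this example, not part of the original snippet.

```python
import re

text = "Name: John, Age: 30"
# Named groups (?P<name>...) can be read back by name instead of by index.
pattern = r"(?P<key>\w+):\s(?P<value>\w+)"

records = []
for match in re.finditer(pattern, text):
    # group("key") is equivalent to group(1) here; start()/end() give
    # the same numbers as span().
    records.append((match.group("key"), match.group("value"),
                    match.start(), match.end()))

print(records)
# [('Name', 'John', 0, 10), ('Age', '30', 12, 19)]
```

Because finditer is lazy, matches are produced one at a time, which can help on large inputs where building the full list up front is unnecessary.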
3. Other Tips & Flags
3.1 Regex Flags
import re
text = "HELLO\nhello"
pattern = r"hello"
# re.IGNORECASE -> case-insensitive
# re.DOTALL -> '.' matches newline
# re.MULTILINE -> '^' and '$' match start/end of lines
matches = re.findall(pattern, text, flags=re.IGNORECASE)
print(matches) # ['HELLO', 'hello']
Common flags include:
- re.IGNORECASE or re.I: case-insensitive matching.
- re.DOTALL or re.S: '.' matches newline.
- re.MULTILINE or re.M: ^ and $ match start/end of each line, not just the entire string.
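The snippet above only exercises re.IGNORECASE, so the following sketch shows what re.MULTILINE and re.DOTALL change, and how flags can be combined. The sample text and variable names are made up for illustration.

```python
import re

text = "first line\nsecond line"

# Without re.MULTILINE, '^' matches only at the very start of the string.
start_only = re.findall(r"^\w+", text)
print(start_only)   # ['first']

# With re.MULTILINE, '^' also matches right after each newline.
each_line = re.findall(r"^\w+", text, flags=re.MULTILINE)
print(each_line)    # ['first', 'second']

# Without re.DOTALL, '.' stops at the newline; with it, '.' crosses it.
one_line = re.findall(r"first.*", text)
both_lines = re.findall(r"first.*", text, flags=re.DOTALL)
print(one_line)     # ['first line']
print(both_lines)   # ['first line\nsecond line']

# Flags can be combined with the bitwise OR operator.
combined = re.findall(r"^SECOND", text, flags=re.IGNORECASE | re.MULTILINE)
print(combined)     # ['second']
```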
3.2 Overlapping Matches
findall() and finditer() find non-overlapping matches. If you need overlapping matches, you have to devise a custom loop (e.g. adjusting the search start index on each iteration) or use a regex trick like lookahead. For example:
import re
text = "aaaa"
pattern = r"(?=(aa))" # lookahead-based approach
matches = re.findall(pattern, text)
print(matches) # ['aa', 'aa', 'aa']
This captures overlapping occurrences of "aa".
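The custom-loop alternative can be sketched roughly as follows. `find_overlapping` is a hypothetical helper written for this example, not a function from the re module. It relies on the optional `pos` argument of a compiled pattern's search() method to restart the search one character after the start of the previous match.

```python
import re

def find_overlapping(pattern, text):
    """Hypothetical helper: return (start, matched_text) pairs,
    including matches that overlap earlier ones."""
    compiled = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(text):
        match = compiled.search(text, pos)
        if match is None:
            break
        results.append((match.start(), match.group(0)))
        # Resume one character after the START of this match (not its
        # end), so later matches may overlap it.
        pos = match.start() + 1
    return results

overlaps = find_overlapping(r"aa", "aaaa")
print(overlaps)  # [(0, 'aa'), (1, 'aa'), (2, 'aa')]
```

This sketch reports at most one match per starting position (the one the regex engine prefers there), which is the same behavior as the lookahead trick.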
4. Summary
- re.findall(pattern, string): Returns a list of all matched substrings (or a list of tuples if multiple capturing groups).
- re.finditer(pattern, string): Returns an iterator of match objects, offering more control (like match positions, individual groups, etc.).
- Non-overlapping: By default, both skip overlapping matches unless you use lookaheads or specialized logic.
Conclusion: Use re.findall() or re.finditer() to retrieve all regex matches in Python. findall gives you a list of matches or tuples, while finditer yields match objects for more detailed info.