Python Fundamentals Tutorial: Regular Expressions

12. Regular Expressions (re)

While regular expression handling is very complete in Python, regular expressions are not a first-class language element as they are in Perl or JavaScript. Regular expression handling is found in the re module.

At the simplest level, there are module-level functions in re that can be used to search for regular expressions. In many cases, calling the search() function and checking for the presence of a return value is enough: search() returns a Match object if the pattern was found anywhere in the text, and None if it was not.


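It is customary in Python regular expressions to pass the patterns as raw strings (r'pattern'). In a raw string, backslashes are not treated as string escapes, so the special sequences that are likely included in the pattern (such as \s or \b) do not have to be escaped a second time.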
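As a minimal sketch of why raw strings matter, the following compares the same pattern written as a normal string and as a raw string. The variable names are illustrative and not part of the original tutorial.

```python
import re

text = 'All your base are belong to us.'

# In a normal string literal, '\b' is the backspace character (one character).
# In a raw string literal, r'\b' is a backslash followed by 'b' (two characters),
# which the re module interprets as a word boundary.
plain = '\bbase\b'
raw = r'\bbase\b'

plain_result = re.search(plain, text)   # looks for backspace + 'base' + backspace
raw_result = re.search(raw, text)       # looks for the whole word 'base'

print(len('\b'), len(r'\b'))
print(plain_result)
print(raw_result.group())
```

The normal-string version fails silently: it is a valid pattern, but not the one that was intended.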

>>> import re
>>> text = 'All your base are belong to us.'

>>> re.search(r'o\s?u', text)
<_sre.SRE_Match object at 0x10041f718>
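Because search() returns either a Match object or None, its result can be tested directly. The sketch below wraps that check in a small helper; contains() is a hypothetical name used only for illustration.

```python
import re

text = 'All your base are belong to us.'


def contains(pattern, string):
    """Return True if pattern occurs anywhere in string (hypothetical helper)."""
    # re.search() returns a Match object on success and None on failure.
    return re.search(pattern, string) is not None


found = contains(r'o\s?u', text)      # 'ou' in 'your' matches
missing = contains(r'z\s?z', text)    # no such sequence in the text
print(found, missing)
```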

Take note of the match() function, which matches only at the beginning of the text and does not search through the rest of it for the pattern.

>>> re.match(r'o\s?u', text)
>>> re.match('All', text)
<_sre.SRE_Match object at 0x10041f718>
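The difference between match() and search() can be sketched side by side. The \A anchor used at the end is a standard re feature but is not mentioned in the original text; it is shown here only to illustrate that an anchored search() behaves like match().

```python
import re

text = 'All your base are belong to us.'

# match() only succeeds if the pattern matches at position 0.
at_start = re.match(r'All', text)
not_at_start = re.match(r'base', text)   # None: 'base' is not at the start

# search() scans the whole string for the first occurrence.
anywhere = re.search(r'base', text)

# Anchoring a search() pattern with \A restricts it to the start of the text.
anchored = re.search(r'\Abase', text)    # None, same as match() above

print(at_start.group(), not_at_start, anywhere.start(), anchored)
```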

Regular expressions can also be used to split strings in more advanced ways than the str.split() method, since the separator is a pattern rather than a fixed string.

>>> re.split(r'o\s?u', text)
['All y', 'r base are belong t', 's.']
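As a further sketch, re.split() can handle several kinds of separator at once, which str.split() cannot do in a single call. The sample record and variable names below are made up for illustration.

```python
import re

record = 'alpha, beta;gamma   delta'

# Split on a comma or semicolon (with optional surrounding whitespace),
# or on a run of whitespace.
separator = r'\s*[,;]\s*|\s+'
fields = re.split(separator, record)

# maxsplit limits the number of splits; the remainder is returned unsplit.
head, rest = re.split(separator, record, maxsplit=1)

print(fields)
print(head, '|', rest)
```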

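Using the findall() or finditer() functions, it is possible to process every match in the text rather than only the first. findall() returns a list of the matched strings (or of the group contents, if the pattern defines groups), while finditer() returns an iterator of Match objects.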

>>> re.findall(r'o\s?u', text)
['ou', 'o u']

>>> re.finditer(r'o\s?u', text)
<callable-iterator object at 0x100516610>
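The iterator returned by finditer() is usually consumed in a loop. The sketch below prints the position and text of each match, and also shows how findall() changes its result when the pattern contains a group.

```python
import re

text = 'All your base are belong to us.'

# Each item produced by finditer() is a Match object.
for m in re.finditer(r'o\s?u', text):
    print(m.start(), m.end(), repr(m.group()))

# Collect (start position, matched text) pairs.
spans = [(m.start(), m.group()) for m in re.finditer(r'o\s?u', text)]

# With one group in the pattern, findall() returns the group contents
# instead of the full matches.
gaps = re.findall(r'o(\s?)u', text)
```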

If you need to use the same pattern multiple times, you can improve performance by compiling the regex and then using the methods of the regex object, rather than the module-level functions.
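A minimal sketch of reusing a compiled pattern follows. The list of lines is invented for illustration; the compiled object offers the same search(), match(), split(), findall() and finditer() operations as the module, without the pattern argument.

```python
import re

lines = ['All your base', 'are belong', 'to us.']

# Compile once, then reuse the resulting pattern object for every line.
pattern = re.compile(r'o\s?u')
matching = [line for line in lines if pattern.search(line)]

# Flags such as re.IGNORECASE are supplied at compile time.
ci_pattern = re.compile(r'all', re.IGNORECASE)
ci_found = ci_pattern.search('ALL YOUR BASE') is not None

print(matching, ci_found)
```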

If groups are defined in the pattern, they can be accessed using the group() method of the returned Match object. Note that groups are 1-indexed, as in most other regex utilities; group(0) returns the entire match.

>>> text = 'All your base are belong to us.'

>>> pattern = re.compile(r'you[r]?\s*(\S*)\s*are belong to us')

>>> match = pattern.search(text)
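The following sketch shows group access on the pattern above. It assumes the compiled pattern is applied with search(), since the pattern does not match at the start of the text; the named group in the second half is an illustrative variation that is not in the original.

```python
import re

text = 'All your base are belong to us.'
pattern = re.compile(r'you[r]?\s*(\S*)\s*are belong to us')

match = pattern.search(text)   # assumption: search(), not match()
if match:
    whole = match.group(0)     # the entire matched text
    first = match.group(1)     # contents of the first parenthesized group
    groups = match.groups()    # tuple of all groups, starting at group 1
    print(whole, '|', first, '|', groups)

# Groups can also be named with (?P<name>...) and retrieved by name.
named = re.compile(r'you[r]?\s*(?P<thing>\S*)\s*are belong to us')
thing = named.search(text).group('thing')
```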


12.1. Lab

  1. Rename to
  2. Implement to pass the doctests
  3. Create a second file that accepts command line arguments and calls the function in
  4. Accept the following command-line args:

    1. -v, --invert-match select non-matching lines
    2. -E, --extended-regexp PATTERN is an extended regular expression
    3. And a list of files