diff --git a/contrib/advanced-python/index.md b/contrib/advanced-python/index.md index bf377fc..d91f4b2 100644 --- a/contrib/advanced-python/index.md +++ b/contrib/advanced-python/index.md @@ -2,3 +2,4 @@ - [Decorators/\*args/**kwargs](decorator-kwargs-args.md) - [Working with Dates & Times in Python](dates_and_times.md) +- [Regular Expressions in Python](regular_expressions.md) diff --git a/contrib/advanced-python/regular_expressions.md b/contrib/advanced-python/regular_expressions.md new file mode 100644 index 0000000..65ff2c2 --- /dev/null +++ b/contrib/advanced-python/regular_expressions.md @@ -0,0 +1,96 @@ +## Regular Expressions in Python +Regular expressions (regex) are a powerful tool for pattern matching and text manipulation. +Python's re module provides comprehensive support for regular expressions, enabling efficient text processing and validation. + +## 1. Introduction to Regular Expressions +A regular expression is a sequence of characters defining a search pattern. Common use cases include validating input, searching within text, and extracting +specific patterns. + +## 2. Basic Syntax +Literal Characters: Match exact characters (e.g., abc matches "abc"). +Metacharacters: Special characters like ., *, ?, +, ^, $, [ ], and | used to build patterns. + +**Common Metacharacters:** + +* .: Any character except newline. +* ^: Start of the string. +* $: End of the string. +* *: 0 or more repetitions. +* +: 1 or more repetitions. +* ?: 0 or 1 repetition. +* []: Any one character inside brackets (e.g., [a-z]). +* |: Either the pattern before or after. + +## 3. Using the re Module + +**Key functions in the re module:** + +* re.match(): Checks for a match at the beginning of the string. +* re.search(): Searches for a match anywhere in the string. +* re.findall(): Returns a list of all matches. +* re.sub(): Replaces matches with a specified string. + +Examples: +```bash +import re + +# Match at the beginning +print(re.match(r'\d+', '123abc').group()) # Output: 123 + +# Search anywhere +print(re.search(r'\d+', 'abc123').group()) # Output: 123 + +# Find all matches +print(re.findall(r'\d+', 'abc123def456')) # Output: ['123', '456'] + +# Substitute matches +print(re.sub(r'\d+', '#', 'abc123def456')) # Output: abc#def# +``` + +## 4. Compiling Regular Expressions +Compiling regular expressions improves performance for repeated use. + +Example: +```bash +import re + +pattern = re.compile(r'\d+') +print(pattern.match('123abc').group()) # Output: 123 +print(pattern.search('abc123').group()) # Output: 123 +print(pattern.findall('abc123def456')) # Output: ['123', '456'] +``` + +## 5. Groups and Capturing +Parentheses () group and capture parts of the match. + +Example: +```bash +import re + +match = re.match(r'(\d{3})-(\d{2})-(\d{4})', '123-45-6789') +if match: + print(match.group()) # Output: 123-45-6789 + print(match.group(1)) # Output: 123 + print(match.group(2)) # Output: 45 + print(match.group(3)) # Output: 6789 +``` + +## 6. Special Sequences +Special sequences are shortcuts for common patterns: + +* \d: Any digit. +* \D: Any non-digit. +* \w: Any alphanumeric character. +* \W: Any non-alphanumeric character. +* \s: Any whitespace character. +* \S: Any non-whitespace character. +Example: +```bash +import re + +print(re.search(r'\w+@\w+\.\w+', 'Contact: support@example.com').group()) # Output: support@example.com +``` + +## Summary +Regular expressions are a versatile tool for text processing in Python. The re module offers powerful functions and metacharacters for pattern matching, +searching, and manipulation, making it an essential skill for handling complex text processing tasks.