mirror of https://github.com/animator/learn-python
Create regular_expressions.md
Added content for regular expressions in Python
pull/488/head
parent 406004d9c9
commit 2967206c16
@@ -0,0 +1,96 @@
## Regular Expressions in Python
Regular expressions (regex) are a powerful tool for pattern matching and text manipulation.
Python's `re` module provides comprehensive support for regular expressions, enabling efficient text processing and validation.

## 1. Introduction to Regular Expressions
A regular expression is a sequence of characters that defines a search pattern. Common use cases include validating input, searching within text, and extracting specific pieces of text.

## 2. Basic Syntax
* Literal characters: Match exact characters (e.g., `abc` matches "abc").
* Metacharacters: Special characters such as `.`, `*`, `?`, `+`, `^`, `$`, `[ ]`, and `|` that are used to build patterns.

**Common Metacharacters:**

* `.`: Any character except a newline.
* `^`: Start of the string.
* `$`: End of the string.
* `*`: 0 or more repetitions of the preceding element.
* `+`: 1 or more repetitions of the preceding element.
* `?`: 0 or 1 repetition of the preceding element.
* `[]`: Any one character inside the brackets (e.g., `[a-z]`).
* `|`: Either the pattern before or the pattern after.

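The metacharacters listed above can be tried out one at a time with `re.findall()`. The following is a minimal sketch; the sample patterns and strings are made up for illustration and are not part of the original material.

```python
import re

# Each pair is (pattern, text); all sample strings are invented for illustration.
samples = [
    (r'c.t', 'cat cot cut'),         # . matches any single character except newline
    (r'^Hello', 'Hello world'),      # ^ anchors the match at the start of the string
    (r'world$', 'Hello world'),      # $ anchors the match at the end of the string
    (r'ab*', 'a ab abbb'),           # * allows zero or more 'b'
    (r'ab+', 'a ab abbb'),           # + requires at least one 'b'
    (r'colou?r', 'color colour'),    # ? makes the 'u' optional
    (r'[aeiou]', 'regex'),           # [] matches any one of the listed characters
    (r'cat|dog', 'cat, dog, bird'),  # | matches either alternative
]

# Map each pattern to the list of matches found in its sample text.
results = {pattern: re.findall(pattern, text) for pattern, text in samples}

for pattern, found in results.items():
    print(f'{pattern!r:12} -> {found}')
```

Comparing the results for `ab*` and `ab+` on the same text shows the practical difference between "zero or more" and "one or more".
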
## 3. Using the re Module
**Key functions in the `re` module:**

* `re.match()`: Checks for a match at the beginning of the string.
* `re.search()`: Searches for the first match anywhere in the string.
* `re.findall()`: Returns a list of all non-overlapping matches.
* `re.sub()`: Replaces matches with a specified string.

Examples:

```python
import re

# Match at the beginning
print(re.match(r'\d+', '123abc').group())  # Output: 123

# Search anywhere
print(re.search(r'\d+', 'abc123').group())  # Output: 123

# Find all matches
print(re.findall(r'\d+', 'abc123def456'))  # Output: ['123', '456']

# Substitute matches
print(re.sub(r'\d+', '#', 'abc123def456'))  # Output: abc#def#
```

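Calling `.group()` directly on the result works only when the pattern actually matches: `re.match()` and `re.search()` return `None` otherwise, and `None.group()` raises an `AttributeError`. A minimal sketch of a safer pattern follows; `first_number` is a hypothetical helper name used only for illustration.

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none.

    first_number is a hypothetical helper used only for illustration.
    """
    match = re.search(r'\d+', text)
    # re.search() returns None when the pattern is not found,
    # so check the result before calling .group().
    return match.group() if match else None

print(first_number('abc123'))     # Output: 123
print(first_number('no digits'))  # Output: None

# re.match() only looks at the start of the string, so it finds nothing here.
print(re.match(r'\d+', 'abc123'))  # Output: None
```
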
## 4. Compiling Regular Expressions
Compiling a pattern once with `re.compile()` returns a pattern object that can be reused. This avoids looking the pattern up again on every call and can improve performance when the same pattern is used many times.

Example:

```python
import re

pattern = re.compile(r'\d+')
print(pattern.match('123abc').group())  # Output: 123
print(pattern.search('abc123').group())  # Output: 123
print(pattern.findall('abc123def456'))  # Output: ['123', '456']
```

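A compiled pattern is most useful when the same expression is applied to many inputs. The sketch below compiles a case-insensitive pattern once and reuses it in a loop; the log lines and the name `error_pattern` are invented for illustration.

```python
import re

# Compile once, with a flag, then reuse the pattern object for every line.
# The sample log lines are invented for illustration.
error_pattern = re.compile(r'error', re.IGNORECASE)

lines = [
    'INFO: service started',
    'Error: disk almost full',
    'ERROR: connection lost',
]

# Keep only the lines in which the pattern is found somewhere.
error_lines = [line for line in lines if error_pattern.search(line)]
print(error_lines)  # Output: ['Error: disk almost full', 'ERROR: connection lost']

# Pattern objects also offer sub(), like the module-level re.sub().
print(error_pattern.sub('warning', 'Error and ERROR'))  # Output: warning and warning
```

Flags such as `re.IGNORECASE` are passed once at compile time, so they do not need to be repeated on every call.
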
## 5. Groups and Capturing
Parentheses `()` group parts of a pattern and capture the text they match. `match.group()` returns the whole match, and `match.group(n)` returns the text captured by the n-th group.

Example:

```python
import re

match = re.match(r'(\d{3})-(\d{2})-(\d{4})', '123-45-6789')
if match:
    print(match.group())   # Output: 123-45-6789
    print(match.group(1))  # Output: 123
    print(match.group(2))  # Output: 45
    print(match.group(3))  # Output: 6789
```

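Groups can also be given names with the `(?P<name>...)` syntax, which tends to be easier to read than numeric indexes. The sketch below reuses the three-part pattern from this section; the group names `first`, `middle`, and `last` are chosen here only for illustration.

```python
import re

# Same three-part pattern, with each group named via (?P<name>...).
# The names first, middle, and last are chosen only for illustration.
pattern = re.compile(r'(?P<first>\d{3})-(?P<middle>\d{2})-(?P<last>\d{4})')

match = pattern.match('123-45-6789')
if match:
    print(match.group('first'))  # Output: 123
    print(match.groups())        # Output: ('123', '45', '6789')
    print(match.groupdict())     # Output: {'first': '123', 'middle': '45', 'last': '6789'}
```

`groups()` returns all captured groups as a tuple, and `groupdict()` returns the named groups as a dictionary.
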
## 6. Special Sequences
Special sequences are shortcuts for common patterns:

* `\d`: Any digit.
* `\D`: Any non-digit.
* `\w`: Any word character (letter, digit, or underscore).
* `\W`: Any non-word character.
* `\s`: Any whitespace character.
* `\S`: Any non-whitespace character.

Example:

```python
import re

print(re.search(r'\w+@\w+\.\w+', 'Contact: support@example.com').group())  # Output: support@example.com
```

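The special sequences combine naturally with the functions from section 3. The following sketch applies a few of them to one sample string; the text and variable names are made up for illustration.

```python
import re

text = 'Order 66 shipped on 2024-05-17 to Bob_Smith'  # made-up sample text

digits = re.findall(r'\d+', text)  # runs of digits
words = re.findall(r'\w+', text)   # runs of word characters (underscore included)
fields = re.split(r'\s+', text)    # split on runs of whitespace
date_digits = re.sub(r'\D', '', '2024-05-17')  # drop every non-digit

print(digits)       # Output: ['66', '2024', '05', '17']
print(words)        # Output: ['Order', '66', 'shipped', 'on', '2024', '05', '17', 'to', 'Bob_Smith']
print(fields)       # Output: ['Order', '66', 'shipped', 'on', '2024-05-17', 'to', 'Bob_Smith']
print(date_digits)  # Output: 20240517
```

Note that `\w+` keeps `Bob_Smith` together because the underscore counts as a word character, while the hyphens in the date split it into three pieces.
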
## Summary
Regular expressions are a versatile tool for text processing in Python. The `re` module provides functions for matching, searching, and substitution, and knowing its metacharacters and special sequences is essential for handling complex text processing tasks.