pull/1262/head
Damini2004 2024-06-23 00:04:05 +05:30
parent 9cb2523ae7
commit 417e3c81a3
1 changed file with 106 additions and 4 deletions


@ -1,6 +1,7 @@
## Regular Expressions in Python
Regular expressions (regex) are a powerful tool for pattern matching and text manipulation.
Python's re module provides comprehensive support for regular expressions, enabling efficient text processing and validation.
Regular expressions (regex) are a versatile tool for matching patterns in strings. In Python, the `re` module provides support for working with regular expressions.
## 1. Introduction to Regular Expressions
A regular expression is a sequence of characters defining a search pattern. Common use cases include validating input, searching within text, and extracting
@ -20,6 +21,80 @@ Metacharacters: Special characters like ., *, ?, +, ^, $, [ ], and | used to bui
* ?: 0 or 1 repetition.
* []: Any one character inside brackets (e.g., [a-z]).
* |: Either the pattern before or after.
* \ : Drops the special meaning of the character that follows it.
* {} : Indicates the number of occurrences of the preceding regex to match.
* () : Encloses a group of regex.
Examples:
```python
import re
# 1. `.`
pattern = r'c.t'
text = 'cat cot cut cit'
matches = re.findall(pattern, text)
print(matches) # Output: ['cat', 'cot', 'cut', 'cit']
# 2. `^`
pattern = r'^Hello'
text = 'Hello, world!'
match = re.search(pattern, text)
print(match.group() if match else 'No match') # Output: Hello
# 3. `$`
pattern = r'world!$'
text = 'Hello, world!'
match = re.search(pattern, text)
print(match.group() if match else 'No match') # Output: world!
# 4. `*`
pattern = r'ab*'
text = 'a ab abb abbb'
matches = re.findall(pattern, text)
print(matches) # Output: ['a', 'ab', 'abb', 'abbb']
# 5. `+`
pattern = r'ab+'
text = 'a ab abb abbb'
matches = re.findall(pattern, text)
print(matches) # Output: ['ab', 'abb', 'abbb']
# 6. `?`
pattern = r'ab?'
text = 'a ab abb abbb'
matches = re.findall(pattern, text)
print(matches) # Output: ['a', 'ab', 'ab', 'ab']
# 7. `[]`
pattern = r'[aeiou]'
text = 'hello world'
matches = re.findall(pattern, text)
print(matches) # Output: ['e', 'o', 'o']
# 8. `|`
pattern = r'cat|dog'
text = 'I have a cat and a dog.'
matches = re.findall(pattern, text)
print(matches) # Output: ['cat', 'dog']
# 9. `\`
pattern = r'\$100'
text = 'The price is $100.'
match = re.search(pattern, text)
print(match.group() if match else 'No match') # Output: $100
# 10. `{}`
pattern = r'\d{3}'
text = 'My number is 123456'
matches = re.findall(pattern, text)
print(matches) # Output: ['123', '456']
# 11. `()`
pattern = r'(cat|dog)'
text = 'I have a cat and a dog.'
matches = re.findall(pattern, text)
print(matches) # Output: ['cat', 'dog']
```
## 3. Using the re Module
@ -29,7 +104,8 @@ Metacharacters: Special characters like ., *, ?, +, ^, $, [ ], and | used to bui
* re.search(): Searches for a match anywhere in the string.
* re.findall(): Returns a list of all matches.
* re.sub(): Replaces matches with a specified string.
* re.split(): Returns a list where the string has been split at each match.
* re.escape(): Escapes special characters in a string so that they match literally.
Examples:
```python
import re
@ -45,6 +121,12 @@ print(re.findall(r'\d+', 'abc123def456')) # Output: ['123', '456']
# Substitute matches
print(re.sub(r'\d+', '#', 'abc123def456')) # Output: abc#def#
# Split the string at each whitespace match
txt = 'The Donkey in the Town'
print(re.split(r'\s', txt)) # Output: ['The', 'Donkey', 'in', 'the', 'Town']
# Escape special characters
print(re.escape("We are good to go")) # Output: We\ are\ good\ to\ go
```
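As a minimal sketch of how `re.escape()` and `re.split()` can work together, the snippet below builds a split pattern from literal delimiters. The `record` string and the `delimiters` list are hypothetical values chosen for illustration only.

```python
import re

# Hypothetical input: fields separated by a literal "." or "|".
record = "alpha.beta|gamma"
delimiters = [".", "|"]

# Both delimiters are regex metacharacters, so escape each one
# before joining them into an alternation.
pattern = "|".join(re.escape(d) for d in delimiters)

fields = re.split(pattern, record)
print(pattern)  # Output: \.|\|
print(fields)   # Output: ['alpha', 'beta', 'gamma']
```

Without the escaping step, `.` would match any character and `|` would act as alternation, so the split would not behave as intended.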
## 4. Compiling Regular Expressions
@ -58,6 +140,7 @@ pattern = re.compile(r'\d+')
print(pattern.match('123abc').group()) # Output: 123
print(pattern.search('abc123').group()) # Output: 123
print(pattern.findall('abc123def456')) # Output: ['123', '456']
```
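A compiled pattern is most useful when the same expression is applied to many strings. The sketch below assumes a small, made-up list of lines; the names `digits` and `lines` are illustrative only.

```python
import re

# Compile once, then reuse the pattern object for every line.
digits = re.compile(r"\d+")

# Hypothetical sample data.
lines = ["order 66", "no numbers here", "room 101, floor 3"]

found = [digits.findall(line) for line in lines]
print(found)  # Output: [['66'], [], ['101', '3']]
```

The `re` module also caches recently used patterns internally, so the gain from compiling is often more about readability than raw speed.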
## 5. Groups and Capturing
@ -78,19 +161,38 @@ if match:
## 6. Special Sequences
Special sequences are shortcuts for common patterns:
* \A: Matches only if the specified characters are at the beginning of the string.
* \b: Matches where the specified characters are at the beginning or at the end of a word.
* \B: Matches where the specified characters are present, but NOT at the beginning (or at the end) of a word.
* \d: Any digit.
* \D: Any non-digit.
* \w: Any word character (letters, digits, and the underscore).
* \W: Any non-word character.
* \s: Any whitespace character.
* \S: Any non-whitespace character.
* \Z: Matches only if the specified characters are at the end of the string.
Example:
```python
import re
print(re.search(r'\w+@\w+\.\w+', 'Contact: support@example.com').group()) # Output: support@example.com
```
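The anchoring sequences `\A`, `\Z`, `\b`, and `\B` are easier to tell apart by looking at where they match. The following sketch uses a made-up sample string and reports match positions rather than the matched text.

```python
import re

text = "cat concat cats"  # hypothetical sample string

# \b...\b: "cat" only as a whole word.
whole_word_starts = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \B: "cat" only where it is NOT at the start of a word (inside "concat").
inside_word_starts = [m.start() for m in re.finditer(r"\Bcat", text)]

# \A anchors at the start of the string, \Z at the end.
starts_with_cat = re.search(r"\Acat", text) is not None
ends_with_cats = re.search(r"cats\Z", text) is not None

print(whole_word_starts)   # Output: [0]
print(inside_word_starts)  # Output: [7]
print(starts_with_cat, ends_with_cats)  # Output: True True
```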
## 7. Sets
A set is a group of characters inside a pair of square brackets [] with a special meaning:
* [arn] : Returns a match where one of the specified characters (a, r, or n) is present.
* [a-n] : Returns a match for any lower case character, alphabetically between a and n.
* [^arn] : Returns a match for any character EXCEPT a, r, and n.
* [0123] : Returns a match where any of the specified digits (0, 1, 2, or 3) are present.
* [0-9] : Returns a match for any digit between 0 and 9.
* [0-5][0-9] : Returns a match for any two-digit number from 00 to 59.
* [a-zA-Z] : Returns a match for any character alphabetically between a and z, lower case OR upper case.
* [+] : In sets, +, *, ., |, (), $, and {} have no special meaning, so [+] means: return a match for any + character in the string.
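
A few of the sets listed above can be tried out as follows. This is a minimal sketch; the sample strings are made up for illustration.

```python
import re

# [arn]: any one of a, r, or n.
listed = re.findall(r"[arn]", "rain")
# [^arn]: any character except a, r, and n.
negated = re.findall(r"[^arn]", "rain")
# [0-5][0-9]: two-digit numbers from 00 to 59.
two_digit = re.findall(r"[0-5][0-9]", "09:45 and 75")
# [a-zA-Z]: any letter, lower case or upper case.
letters = re.findall(r"[a-zA-Z]", "Py3")
# [+]: a literal plus sign; + loses its special meaning inside a set.
plus_signs = re.findall(r"[+]", "1+1=2")

print(listed)      # Output: ['r', 'a', 'n']
print(negated)     # Output: ['i']
print(two_digit)   # Output: ['09', '45']
print(letters)     # Output: ['P', 'y']
print(plus_signs)  # Output: ['+']
```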
## Summary
Regular expressions are a versatile tool for text processing in Python. The re module offers powerful functions and metacharacters for pattern matching,
searching, and manipulation, making it an essential skill for handling complex text processing tasks.
Regular expressions (regex) are a powerful tool for text processing in Python, offering a flexible way to match, search, and manipulate text patterns. The re module provides a comprehensive set of functions and metacharacters to tackle complex text processing tasks.
With regex, you can:
1. Match patterns: Use metacharacters like ., *, ?, and {} to match specific patterns in text.
2. Search text: Employ functions like re.search() and re.match() to find occurrences of patterns in text.
3. Manipulate text: Utilize functions like re.sub() to replace patterns with new text.