Merge branch 'main' into kRiShNa-429407-patch-1

pull/1157/head
Ashita Prasad 2024-06-22 18:53:21 +05:30 committed by GitHub
commit ac873305a4
No key matching this signature was found in the database
GPG key ID: B5690EEEBB952194
34 changed files with 2928 additions and 15 deletions


@ -0,0 +1,110 @@
## Asynchronous Context Managers and Generators in Python
Asynchronous programming in Python allows for more efficient use of resources by enabling tasks to run concurrently. Python provides support for asynchronous
context managers and generators, which help manage resources and perform operations asynchronously.
### Asynchronous Context Managers
Asynchronous context managers are similar to regular context managers but are designed to work with asynchronous code. They use the `async with` statement and
typically implement the `__aenter__` and `__aexit__` methods.
### Creating an Asynchronous Context Manager
Here's a simple example of an asynchronous context manager:
```python
import asyncio

class AsyncContextManager:
    async def __aenter__(self):
        print("Entering context")
        await asyncio.sleep(1)  # Simulate an async operation
        return self

    async def __aexit__(self, exc_type, exc, tb):
        print("Exiting context")
        await asyncio.sleep(1)  # Simulate cleanup

async def main():
    async with AsyncContextManager() as acm:
        print("Inside context")

asyncio.run(main())
```
Output:
```
Entering context
Inside context
Exiting context
```
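For simple cases, the standard library also offers the `contextlib.asynccontextmanager` decorator, which builds an asynchronous context manager from a single generator function. A minimal sketch, equivalent to the class above (the function name `managed_context` is illustrative):
```python
import asyncio
from contextlib import asynccontextmanager

@asynccontextmanager
async def managed_context():
    print("Entering context")
    await asyncio.sleep(1)  # Simulate async setup
    try:
        yield  # Control passes to the body of the async with block
    finally:
        print("Exiting context")
        await asyncio.sleep(1)  # Simulate cleanup

async def main():
    async with managed_context():
        print("Inside context")

asyncio.run(main())
```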
### Asynchronous Generators
Asynchronous generators allow you to yield values within an asynchronous function. They use the `async def` syntax along with the `yield` statement and are
iterated over using the `async for` loop.
### Creating an Asynchronous Generator
Here's a basic example of an asynchronous generator:
```python
import asyncio

async def async_generator():
    for i in range(5):
        await asyncio.sleep(1)  # Simulate an async operation
        yield i

async def main():
    async for value in async_generator():
        print(value)

asyncio.run(main())
```
Output:
```
0
1
2
3
4
```
### Combining Asynchronous Context Managers and Generators
You can combine asynchronous context managers and generators to create more complex and efficient asynchronous workflows.
Example: Fetching Data with an Async Context Manager and Generator
Consider a scenario where you need to fetch data from an API asynchronously and manage the connection using an asynchronous context manager:
```python
import aiohttp
import asyncio

class AsyncHTTPClient:
    def __init__(self, url):
        self.url = url

    async def __aenter__(self):
        self.session = aiohttp.ClientSession()
        self.response = await self.session.get(self.url)
        return self.response

    async def __aexit__(self, exc_type, exc, tb):
        await self.response.release()
        await self.session.close()

async def async_fetch(urls):
    for url in urls:
        async with AsyncHTTPClient(url) as response:
            data = await response.text()
            yield data

async def main():
    urls = ["http://example.com", "http://example.org", "http://example.net"]
    async for data in async_fetch(urls):
        print(data)

asyncio.run(main())
```
### Benefits of Asynchronous Context Managers and Generators
1. Efficient Resource Management: They help manage resources like network connections or file handles more efficiently by releasing them as soon as they are no longer needed.
2. Concurrency: They enable concurrent operations, improving performance in I/O-bound tasks such as network requests or file I/O.
3. Readability and Maintainability: They provide a clear and structured way to handle asynchronous operations, making the code easier to read and maintain.
### Summary
Asynchronous context managers and generators are powerful tools in Python that enhance the efficiency and readability
of asynchronous code. By using `async with` for resource management and `async for` for iteration, you can write more performant and maintainable asynchronous
programs.


@ -12,8 +12,11 @@
- [Protocols](protocols.md)
- [Exception Handling in Python](exception-handling.md)
- [Generators](generators.md)
- [Match Case Statement](match-case.md)
- [Closures](closures.md)
- [Filter](filter-function.md)
- [Reduce](reduce-function.md)
- [List Comprehension](list-comprehension.md)
- [Eval Function](eval_function.md)
- [Magic Methods](magic-methods.md)
- [Asynchronous Context Managers & Generators](asynchronous-context-managers-generators.md)


@ -0,0 +1,151 @@
# Magic Methods
Magic methods, also known as dunder (double underscore) methods, are special methods in Python that start and end with double underscores (`__`).
These methods allow you to define the behavior of objects for built-in operations and functions, enabling you to customize how your objects interact with the
language's syntax and built-in features. Magic methods make your custom classes integrate seamlessly with Python's built-in data types and operations.
**Commonly Used Magic Methods**
1. **Initialization and Representation**
- `__init__(self, ...)`: Called when an instance of the class is created. Used for initializing the object's attributes.
- `__repr__(self)`: Returns a string representation of the object, useful for debugging and logging.
- `__str__(self)`: Returns a human-readable string representation of the object.
**Example** :
```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person({self.name}, {self.age})"

    def __str__(self):
        return f"{self.name}, {self.age} years old"

p = Person("Alice", 30)
print(repr(p))
print(str(p))
```
**Output** :
```
Person(Alice, 30)
Alice, 30 years old
```
2. **Arithmetic Operations**
- `__add__(self, other)`: Defines behavior for the `+` operator.
- `__sub__(self, other)`: Defines behavior for the `-` operator.
- `__mul__(self, other)`: Defines behavior for the `*` operator.
- `__truediv__(self, other)`: Defines behavior for the `/` operator.
**Example** :
```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

v1 = Vector(2, 3)
v2 = Vector(1, 1)
v3 = v1 + v2
print(v3)
```
**Output** :
```
Vector(3, 4)
```
3. **Comparison Operations**
- `__eq__(self, other)`: Defines behavior for the `==` operator.
- `__lt__(self, other)`: Defines behavior for the `<` operator.
- `__le__(self, other)`: Defines behavior for the `<=` operator.
**Example** :
```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        return self.age == other.age

    def __lt__(self, other):
        return self.age < other.age

p1 = Person("Alice", 30)
p2 = Person("Bob", 25)
print(p1 == p2)
print(p1 < p2)
```
**Output** :
```
False
False
```
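Writing every rich comparison method by hand is repetitive. The standard library's `functools.total_ordering` decorator can derive the remaining comparison operators from `__eq__` plus one ordering method; a small sketch reusing the `Person` class above:
```python
from functools import total_ordering

@total_ordering
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        return self.age == other.age

    def __lt__(self, other):
        return self.age < other.age

p1 = Person("Alice", 30)
p2 = Person("Bob", 25)
print(p1 >= p2)  # True, derived automatically from __lt__ and __eq__
print(p1 <= p2)  # False
```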
4. **Container and Sequence Methods**
- `__len__(self)`: Defines behavior for the `len()` function.
- `__getitem__(self, key)`: Defines behavior for indexing (`self[key]`).
- `__setitem__(self, key, value)`: Defines behavior for item assignment (`self[key] = value`).
- `__delitem__(self, key)`: Defines behavior for item deletion (`del self[key]`).
**Example** :
```python
class CustomList:
    def __init__(self, *args):
        self.items = list(args)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]

    def __setitem__(self, index, value):
        self.items[index] = value

    def __delitem__(self, index):
        del self.items[index]

    def __repr__(self):
        return f"CustomList({self.items})"

cl = CustomList(1, 2, 3)
print(len(cl))
print(cl[1])
cl[1] = 5
print(cl)
del cl[1]
print(cl)
```
**Output** :
```
3
2
CustomList([1, 5, 3])
CustomList([1, 3])
```
Magic methods provide powerful ways to customize the behavior of your objects and make them work seamlessly with Python's syntax and built-in functions.
Use them judiciously to enhance the functionality and readability of your classes.


@ -0,0 +1,251 @@
# Match Case Statements
## Introduction
Match and case statements were introduced in Python 3.10 for structural pattern matching of patterns with associated actions. They make code more readable and
cleaner as opposed to traditional `if-else` statements. They also offer destructuring, pattern matching, and checks for specific properties in
addition to what traditional `switch-case` statements provide in other languages, which makes them more versatile.
## Syntax
```
match <statement>:
    case <pattern_1>:
        <do_task_1>
    case <pattern_2>:
        <do_task_2>
    case _:
        <do_task_wildcard>
```
A `match` statement takes an expression and compares its value against the patterns in the successive `case` blocks. If a pattern matches successfully, the corresponding task is performed. If no exact match is found, the last case, a wildcard `_`, if provided, is used as the matching case.
## Pattern Matching
As discussed earlier, match case statements use pattern matching where the patterns consist of sequences, mappings, primitive data types, as well as class instances. Structural pattern matching uses a declarative approach, and it explicitly states the conditions for the patterns to match with the data.
### Patterns with a Literal
#### Generic Case
`sample text` is passed as a literal in the `match` block. There are two cases and a wildcard case mentioned.
```python
match 'sample text':
    case 'sample text':
        print('sample text')
    case 'sample':
        print('sample')
    case _:
        print('None found')
```
The `sample text` case is satisfied as it matches with the literal `sample text` described in the `match` block.
O/P:
```
sample text
```
#### Using OR
Taking another example, `|` can be used as OR to include multiple patterns in a single case statement where the multiple patterns all lead to the same task.
The two snippets below can be used interchangeably and generate the same output. The latter is more concise and readable.
```python
match 'e':
    case 'a':
        print('vowel')
    case 'e':
        print('vowel')
    case 'i':
        print('vowel')
    case 'o':
        print('vowel')
    case 'u':
        print('vowel')
    case _:
        print('consonant')
```
```python
match 'e':
    case 'a' | 'e' | 'i' | 'o' | 'u':
        print('vowel')
    case _:
        print('consonant')
```
O/P:
```
vowel
```
#### Without wildcard
When no wildcard case is present in a `match` block, there are two possibilities: either a match is found, or it is not. If no match exists, the behaviour is a no-op.
```python
match 'c':
    case 'a' | 'e' | 'i' | 'o' | 'u':
        print('vowel')
```
The output will be blank as a no-op occurs.
### Patterns with a Literal and a Variable
Pattern matching can also unpack sequences and bind their elements to variables.
```python
def get_names(names: tuple) -> None:
    match names:
        case ('Bob', y):
            print(f'Hello {y}')
        case (x, 'John'):
            print(f'Hello {x}')
        case (x, y):
            print(f'Hello {x} and {y}')
        case _:
            print('Invalid')
```
Here, `names` is a tuple containing two names. The `match` block unpacks the tuple and binds `x` and `y` according to the patterns. The wildcard case prints `Invalid` if no pattern is satisfied.
O/P:
Calling the snippet above with different values of the parameter `names` produces the following output.
```
>>> get_names(('Bob', 'Max'))
Hello Max
>>> get_names(('Rob', 'John'))
Hello Rob
>>> get_names(('Rob', 'Max'))
Hello Rob and Max
>>> get_names(('Rob', 'Max', 'Bob'))
Invalid
```
### Patterns with Classes
Class structures can be used in a `match` block for pattern matching. Class attributes can also be bound to variables to perform certain operations. For the class structure:
```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
```
The match case example below illustrates the generic working as well as the binding of variables to the class attributes.
```python
def get_class(cls: Person) -> None:
    match cls:
        case Person(name='Bob', age=18):
            print('Hello Bob with age 18')
        case Person(name='Max', age=y):
            print(f'Age is {y}')
        case Person(name=x, age=18):
            print(f'Name is {x}')
        case Person(name=x, age=y):
            print(f'Name and age is {x} and {y}')
        case _:
            print('Invalid')
```
O/P:
```
>>> get_class(Person('Bob', 18))
Hello Bob with age 18
>>> get_class(Person('Max', 21))
Age is 21
>>> get_class(Person('Rob', 18))
Name is Rob
>>> get_class(Person('Rob', 21))
Name and age is Rob and 21
```
Now, suppose a new class is introduced:
```python
class Pet:
    def __init__(self, name, animal):
        self.name = name
        self.animal = animal
```
Its instances will not match any of the class patterns and will trigger the wildcard case in the original `get_class` function.
```
>>> get_class(Pet('Tommy', 'Dog'))
Invalid
```
### Nested Patterns
Patterns can be nested in various ways, mixing the kinds of patterns mentioned earlier. Below, a basic nested pattern over a list combines literals and variables; classes and iterables can also be included.
```python
def get_points(points: list) -> None:
    match points:
        case []:
            print('Empty')
        case [x]:
            print(f'One point {x}')
        case [x, y]:
            print(f'Two points {x} and {y}')
        case _:
            print('More than two points')
```
O/P:
```
>>> get_points([])
Empty
>>> get_points([1])
One point 1
>>> get_points([1, 2])
Two points 1 and 2
>>> get_points([1, 2, 3])
More than two points
```
### Complex Patterns
Complex patterns are also supported in pattern matching. "Complex" here does not mean complex numbers, but rather patterns whose structure makes them appear more involved.
#### Wildcard
The wildcard has so far been used as `case _`, the fallback when no match is found. Furthermore, the wildcard `_` can also be used as a placeholder inside larger patterns.
```python
def wildcard(value: tuple) -> None:
    match value:
        case ('Bob', age, 'Mechanic'):
            print(f'Bob is mechanic of age {age}')
        case ('Bob', age, _):
            print(f'Bob is not a mechanic of age {age}')
```
O/P:
The value in the above snippet is a tuple of `(Name, Age, Job)`. If the name is Bob and the job is Mechanic, the first case is triggered. If the job is anything else, the second case is triggered, with the wildcard standing in for the job.
```
>>> wildcard(('Bob', 18, 'Mechanic'))
Bob is mechanic of age 18
>>> wildcard(('Bob', 21, 'Engineer'))
Bob is not a mechanic of age 21
```
#### Guard
A `guard` is an `if` condition added to a pattern; the case matches only if the pattern matches and the guard evaluates to true.
Below, `nums` is a tuple containing two integers. The first case has a guard that checks whether the first number is greater than or equal to the second. If the guard is false, matching moves on to the second case, which concludes that the first number is smaller than the second.
```python
def guard(nums: tuple) -> None:
    match nums:
        case (x, y) if x >= y:
            print(f'{x} is greater or equal than {y}')
        case (x, y):
            print(f'{x} is smaller than {y}')
        case _:
            print('Invalid')
```
O/P:
```
>>> guard((1, 2))
1 is smaller than 2
>>> guard((2, 1))
2 is greater or equal than 1
>>> guard((1, 1))
1 is greater or equal than 1
```
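The introduction also mentioned that patterns can match mappings. As a short illustrative sketch (the function and dictionary keys here are made up for the example), mapping patterns match on the presence of keys and can bind the corresponding values:
```python
def handle_event(event: dict) -> None:
    match event:
        case {'type': 'click', 'x': x, 'y': y}:
            print(f'Click at ({x}, {y})')
        case {'type': 'keypress', 'key': key}:
            print(f'Key {key} pressed')
        case _:
            print('Unknown event')

handle_event({'type': 'click', 'x': 10, 'y': 20})  # Click at (10, 20)
handle_event({'type': 'keypress', 'key': 'a'})     # Key a pressed
```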
## Summary
Match case statements provide an elegant and readable way to perform pattern matching as compared to `if-else` statements. They are also more versatile, offering additional functionality such as unpacking, class matching, and support for iterables, and they can use positional arguments when checking patterns. Overall, they provide a powerful and concise way to handle multiple conditions.
## Further Reading
This article provides a brief introduction to match case statements and an overview of pattern matching operations. The articles below offer a more in-depth understanding of the topic.
- [PEP 634 – Structural Pattern Matching: Specification](https://peps.python.org/pep-0634/)
- [PEP 636 – Structural Pattern Matching: Tutorial](https://peps.python.org/pep-0636/)


@ -0,0 +1,185 @@
# AVL Tree
In Data Structures and Algorithms, an **AVL Tree** is a self-balancing binary search tree (BST) where the difference between heights of left and right subtrees cannot be more than one for all nodes. It ensures that the tree remains balanced, providing efficient search, insertion, and deletion operations.
## Points to be Remembered
- **Balance Factor**: The difference in heights between the left and right subtrees of a node. It should be -1, 0, or +1 for all nodes in an AVL tree.
- **Rotations**: Tree rotations (left, right, left-right, right-left) are used to maintain the balance factor within the allowed range.
## Real Life Examples of AVL Trees
- **Databases**: AVL trees can be used to maintain large indexes for database tables, ensuring quick data retrieval.
- **File Systems**: Some file systems use AVL trees to keep track of free and used memory blocks.
## Applications of AVL Trees
AVL trees are used in various applications in Computer Science:
- **Database Indexing**
- **Memory Allocation**
- **Network Routing Algorithms**
Understanding these applications is essential for Software Development.
## Operations in AVL Tree
Key operations include:
- **INSERT**: Insert a new element into the AVL tree.
- **SEARCH**: Find the position of an element in the AVL tree.
- **DELETE**: Remove an element from the AVL tree.
## Implementing AVL Tree in Python
```python
class AVLTreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1

class AVLTree:
    def insert(self, root, key):
        if not root:
            return AVLTreeNode(key)
        if key < root.key:
            root.left = self.insert(root.left, key)
        else:
            root.right = self.insert(root.right, key)

        root.height = 1 + max(self.getHeight(root.left), self.getHeight(root.right))
        balance = self.getBalance(root)

        # Left-Left case
        if balance > 1 and key < root.left.key:
            return self.rotateRight(root)
        # Right-Right case
        if balance < -1 and key > root.right.key:
            return self.rotateLeft(root)
        # Left-Right case
        if balance > 1 and key > root.left.key:
            root.left = self.rotateLeft(root.left)
            return self.rotateRight(root)
        # Right-Left case
        if balance < -1 and key < root.right.key:
            root.right = self.rotateRight(root.right)
            return self.rotateLeft(root)
        return root

    def search(self, root, key):
        if not root or root.key == key:
            return root
        if key < root.key:
            return self.search(root.left, key)
        return self.search(root.right, key)

    def delete(self, root, key):
        if not root:
            return root
        if key < root.key:
            root.left = self.delete(root.left, key)
        elif key > root.key:
            root.right = self.delete(root.right, key)
        else:
            if root.left is None:
                temp = root.right
                root = None
                return temp
            elif root.right is None:
                temp = root.left
                root = None
                return temp
            temp = self.getMinValueNode(root.right)
            root.key = temp.key
            root.right = self.delete(root.right, temp.key)
        if root is None:
            return root

        root.height = 1 + max(self.getHeight(root.left), self.getHeight(root.right))
        balance = self.getBalance(root)

        if balance > 1 and self.getBalance(root.left) >= 0:
            return self.rotateRight(root)
        if balance < -1 and self.getBalance(root.right) <= 0:
            return self.rotateLeft(root)
        if balance > 1 and self.getBalance(root.left) < 0:
            root.left = self.rotateLeft(root.left)
            return self.rotateRight(root)
        if balance < -1 and self.getBalance(root.right) > 0:
            root.right = self.rotateRight(root.right)
            return self.rotateLeft(root)
        return root

    def rotateLeft(self, z):
        y = z.right
        T2 = y.left
        y.left = z
        z.right = T2
        z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
        y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
        return y

    def rotateRight(self, z):
        y = z.left
        T3 = y.right
        y.right = z
        z.left = T3
        z.height = 1 + max(self.getHeight(z.left), self.getHeight(z.right))
        y.height = 1 + max(self.getHeight(y.left), self.getHeight(y.right))
        return y

    def getHeight(self, root):
        if not root:
            return 0
        return root.height

    def getBalance(self, root):
        if not root:
            return 0
        return self.getHeight(root.left) - self.getHeight(root.right)

    def getMinValueNode(self, root):
        if root is None or root.left is None:
            return root
        return self.getMinValueNode(root.left)

    def preOrder(self, root):
        if not root:
            return
        print(root.key, end=' ')
        self.preOrder(root.left)
        self.preOrder(root.right)

# Example usage
avl_tree = AVLTree()
root = None
root = avl_tree.insert(root, 10)
root = avl_tree.insert(root, 20)
root = avl_tree.insert(root, 30)
root = avl_tree.insert(root, 40)
root = avl_tree.insert(root, 50)
root = avl_tree.insert(root, 25)
print("Preorder traversal of the AVL tree is:")
avl_tree.preOrder(root)
```
## Output
```markdown
Preorder traversal of the AVL tree is:
30 20 10 25 40 50
```
## Complexity Analysis
- **Insertion**: O(log n). Inserting a node involves traversing the height of the tree, which is logarithmic due to the balancing property.
- **Search**: O(log n). Searching for a node involves traversing the height of the tree.
- **Deletion**: O(log n). Deleting a node involves traversing and potentially rebalancing the tree, maintaining the logarithmic height.
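The `search` and `delete` methods shown above can be exercised on the same tree; a short usage sketch continuing the example (the expected preorder follows from the rebalancing rules):
```python
# Continuing the example usage above
node = avl_tree.search(root, 25)
print(node.key if node else "Not found")  # 25

root = avl_tree.delete(root, 10)
print("Preorder after deleting 10:")
avl_tree.preOrder(root)  # 30 20 25 40 50
```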


@ -0,0 +1,231 @@
# Binary Tree
A binary tree is a non-linear data structure in which each node can have at most two children, known as the left and the right child. It is a hierarchical data structure represented in the following way:
```
        A...................Level 0
       / \
      B   C.................Level 1
     / \   \
    D   E   G...............Level 2
```
## Basic Terminologies
- **Root node:** The topmost node in a tree is the root node. The root node does not have any parent. In the above example, **A** is the root node.
- **Parent node:** The predecessor of a node is called the parent of that node. **A** is the parent of **B** and **C**, **B** is the parent of **D** and **E** and **C** is the parent of **G**.
- **Child node:** The successor of a node is called the child of that node. **B** and **C** are children of **A**, **D** and **E** are children of **B** and **G** is the right child of **C**.
- **Leaf node:** Nodes without any children are called the leaf nodes. **D**, **E** and **G** are the leaf nodes.
- **Ancestor node:** Predecessor nodes on the path from the root to that node are called ancestor nodes. **A** and **B** are the ancestors of **E**.
- **Descendant node:** Successor nodes on the path from the root to that node are called descendant nodes. **B** and **E** are descendants of **A**.
- **Sibling node:** Nodes having the same parent are called sibling nodes. **B** and **C** are sibling nodes and so are **D** and **E**.
- **Level (Depth) of a node:** Number of edges in the path from the root to that node is the level of that node. The root node is always at level 0. The depth of root node is the depth of the tree.
- **Height of a node:** Number of edges in the path from that node to the deepest leaf is the height of that node. The height of the root is the height of a tree. Height of node **A** is 2, nodes **B** and **C** is 1 and nodes **D**, **E** and **G** is 0.
## Types Of Binary Trees
- **Full Binary Tree:** A binary tree where each node has 0 or 2 children is a full binary tree.
```
    A
   / \
  B   C
 / \
D   E
```
- **Complete Binary Tree:** A binary tree in which all levels are completely filled except the last level is a complete binary tree. Whenever new nodes are inserted, they are inserted from the left side.
```
       A
      / \
     /   \
    B     C
   / \   /
  D   E F
```
- **Perfect Binary Tree:** A binary tree in which all nodes are completely filled, i.e., each node has two children is called a perfect binary tree.
```
       A
      / \
     /   \
    B     C
   / \   / \
  D   E F   G
```
- **Skewed Binary Tree:** A binary tree in which each node has either 0 or 1 child is called a skewed binary tree. It is of two types - left skewed binary tree and right skewed binary tree.
```
A                              A
 \                            /
  B                          B
   \                        /
    C                      C

Right skewed binary tree   Left skewed binary tree
```
- **Balanced Binary Tree:** A binary tree in which the height difference between the left and right subtree is not more than one and the subtrees are also balanced is a balanced binary tree.
```
    A
   / \
  B   C
 / \
D   E
```
## Real Life Applications Of Binary Tree
- **File Systems:** File systems employ binary trees to organize the folders and files, facilitating efficient search and access of files.
- **Decision Trees:** Decision tree, a supervised learning algorithm, utilizes binary trees, with each node representing a decision and its edges showing the possible outcomes.
- **Routing Algorithms:** In routing algorithms, binary trees are used to efficiently transfer data packets from the source to destination through a network of nodes.
- **Searching and sorting Algorithms:** Searching algorithms like binary search and sorting algorithms like heapsort heavily rely on binary trees.
## Implementation of Binary Tree
```python
from collections import deque

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

class Binary_tree:
    @staticmethod
    def insert(root, data):
        # Level-order insertion: fill the first free child slot found
        if root is None:
            return Node(data)
        q = deque()
        q.append(root)
        while q:
            temp = q.popleft()
            if temp.left is None:
                temp.left = Node(data)
                break
            else:
                q.append(temp.left)
            if temp.right is None:
                temp.right = Node(data)
                break
            else:
                q.append(temp.right)
        return root

    @staticmethod
    def inorder(root):
        if not root:
            return
        Binary_tree.inorder(root.left)
        print(root.data, end=" ")
        Binary_tree.inorder(root.right)

    @staticmethod
    def preorder(root):
        if not root:
            return
        print(root.data, end=" ")
        Binary_tree.preorder(root.left)
        Binary_tree.preorder(root.right)

    @staticmethod
    def postorder(root):
        if not root:
            return
        Binary_tree.postorder(root.left)
        Binary_tree.postorder(root.right)
        print(root.data, end=" ")

    @staticmethod
    def levelorder(root):
        if not root:
            return
        q = deque()
        q.append(root)
        while q:
            temp = q.popleft()
            print(temp.data, end=" ")
            if temp.left is not None:
                q.append(temp.left)
            if temp.right is not None:
                q.append(temp.right)

    @staticmethod
    def delete(root, node):
        # Removes the given node object (compared by identity) from the tree
        q = deque()
        q.append(root)
        while q:
            temp = q.popleft()
            if temp is node:
                temp = None
                return
            if temp.right:
                if temp.right is node:
                    temp.right = None
                    return
                else:
                    q.append(temp.right)
            if temp.left:
                if temp.left is node:
                    temp.left = None
                    return
                else:
                    q.append(temp.left)

    @staticmethod
    def delete_value(root, value):
        # Replaces the node holding `value` with the deepest node's data,
        # then removes the deepest node
        if root is None:
            return None
        if root.left is None and root.right is None:
            if root.data == value:
                return None
            else:
                return root
        x = None
        q = deque()
        q.append(root)
        temp = None
        while q:
            temp = q.popleft()
            if temp.data == value:
                x = temp
            if temp.left:
                q.append(temp.left)
            if temp.right:
                q.append(temp.right)
        if x:
            x.data = temp.data
            Binary_tree.delete(root, temp)
        return root

b = Binary_tree()
root = None
root = b.insert(root, 10)
root = b.insert(root, 20)
root = b.insert(root, 30)
root = b.insert(root, 40)
root = b.insert(root, 50)
root = b.insert(root, 60)
print("Preorder traversal:", end=" ")
b.preorder(root)
print("\nInorder traversal:", end=" ")
b.inorder(root)
print("\nPostorder traversal:", end=" ")
b.postorder(root)
print("\nLevel order traversal:", end=" ")
b.levelorder(root)
root = b.delete_value(root, 20)
print("\nLevel order traversal after deletion:", end=" ")
b.levelorder(root)
```
#### OUTPUT
```
Preorder traversal: 10 20 40 50 30 60
Inorder traversal: 40 20 50 10 60 30
Postorder traversal: 40 50 20 60 30 10
Level order traversal: 10 20 30 40 50 60
Level order traversal after deletion: 10 60 30 40 50
```
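As a small addition to the snippet above (not part of the original code), the height and size of the same tree can be computed with short recursive helpers over the `Node` structure:
```python
def height(root):
    # Height counted in edges; an empty tree has height -1
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))

def count_nodes(root):
    if root is None:
        return 0
    return 1 + count_nodes(root.left) + count_nodes(root.right)

print("Height:", height(root))
print("Nodes:", count_nodes(root))
```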


@ -0,0 +1,90 @@
# Dijkstra's Algorithm
Dijkstra's algorithm is a graph algorithm that gives the shortest distance of each node from a given node in a weighted, undirected graph with non-negative edge weights. It operates by continually choosing the closest unvisited node and determining the distance to all its unvisited neighbouring nodes. The algorithm is similar to BFS on graphs, the difference being that it gives priority to nodes with shorter distances by using a priority queue (min-heap) instead of a FIFO queue. The data structures required are a distance list (to store the minimum distance of each node) and a priority queue or a set; we assume the adjacency list is provided.
## Working
- We will store the minimum distance of each node in the distance list, which has a length equal to the number of nodes in the graph. Thus, the minimum distance of the 2nd node will be stored in the 2nd index of the distance list. We initialize the list with the maximum number possible, say infinity.
- We now start the traversal from the starting node given and mark its distance as 0. We push this node to the priority queue along with its minimum distance, which is 0, so the structure pushed will be (0, node), a tuple.
- Now, with the help of the adjacency list, we will add the neighboring nodes to the priority queue with the distance equal to (edge weight + current node distance), and this should be less than the distance list value. We will also update the distance list in the process.
- When all the nodes are added, we will select the node with the shortest distance and repeat the process.
## Dry Run
We will now do a manual simulation using an example graph given. First, (0, a) is pushed to the priority queue (pq).
![Photo 1](images/Dijkstra's_algorithm_photo1.png)
- **Step 1:** The lowest element is popped from the pq, which is (0, a), and all its neighboring nodes are added to the pq while simultaneously checking the distance list. Thus (3, b), (7, c), (1, d) are added to the pq.
![Photo 2](images/Dijkstra's_algorithm_photo2.png)
- **Step 2:** Again, the lowest element is popped from the pq, which is (1, d). Of its two neighboring nodes, a and e, a is not re-added to the pq because dist[a] = 0 is already smaller than any new distance via d, while e is added with its tentative distance.
![Photo 3](images/Dijkstra's_algorithm_photo3.png)
- **Step 3:** Now the lowest element popped from the pq is (3, b). Of its two neighboring nodes, a and c, a is again not added. But the new distance to reach c is 5 (3 + 2), which is less than dist[c] = 7, so (5, c) is added to the pq.
![Photo 4](images/Dijkstra's_algorithm_photo4.png)
- **Step 4:** The next smallest element is (5, c). It has neighbors a and e. The new distance to reach a will be 5 + 7 = 12, which is more than dist[a], so it will not be considered. Similarly, the new distance for e is 5 + 3 = 8, which again will not be considered. So, no new tuple has been added to the pq.
![Photo 5](images/Dijkstra's_algorithm_photo5.png)
- **Step 5:** Similarly, both the remaining elements of the pq will be popped one by one without any new addition.
![Photo 6](images/Dijkstra's_algorithm_photo6.png)
![Photo 7](images/Dijkstra's_algorithm_photo7.png)
- The distance list we get at the end is our answer.
- `Output` `dist = [0, 3, 5, 1, 6]`
## Python Code
```python
import heapq

def dijkstra(graph, start):
    # Create a priority queue
    pq = []
    heapq.heappush(pq, (0, start))
    # Create a dictionary to store distances to each node
    dist = {node: float('inf') for node in graph}
    dist[start] = 0
    while pq:
        # Get the node with the smallest distance
        current_distance, current_node = heapq.heappop(pq)
        # If the current distance is greater than the recorded distance, skip it
        if current_distance > dist[current_node]:
            continue
        # Update the distances to the neighboring nodes
        for neighbor, weight in graph[current_node].items():
            distance = current_distance + weight
            # Only consider this new path if it's better
            if distance < dist[neighbor]:
                dist[neighbor] = distance
                heapq.heappush(pq, (distance, neighbor))
    return dist

# Example usage:
graph = {
    'A': {'B': 1, 'C': 4},
    'B': {'A': 1, 'C': 2, 'D': 5},
    'C': {'A': 4, 'B': 2, 'D': 1},
    'D': {'B': 5, 'C': 1}
}
start_node = 'A'
dist = dijkstra(graph, start_node)
print(dist)  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```
## Complexity Analysis
- **Time Complexity**: \(O((V + E) \log V)\)
- **Space Complexity**: \(O(V + E)\)
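The function above returns only the distances. A common extension, sketched here under the assumption that the `graph` dictionary from the previous example is in scope, is to record each node's predecessor while relaxing edges so the shortest path itself can be reconstructed:
```python
import heapq

def dijkstra_with_path(graph, start):
    pq = [(0, start)]
    dist = {node: float('inf') for node in graph}
    prev = {node: None for node in graph}  # predecessor on the shortest path
    dist[start] = 0
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (d + w, v))
    return dist, prev

def reconstruct_path(prev, target):
    path = []
    while target is not None:
        path.append(target)
        target = prev[target]
    return path[::-1]

dist, prev = dijkstra_with_path(graph, 'A')
print(reconstruct_path(prev, 'D'))  # ['A', 'B', 'C', 'D']
```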


@ -51,10 +51,6 @@ print(f"The {n}th Fibonacci number is: {fibonacci(n)}.")
- **Time Complexity**: O(n) for both approaches
- **Space Complexity**: O(n) for the top-down approach (due to memoization), O(1) for the bottom-up approach
</br>
<hr>
</br>
# 2. Longest Common Subsequence
The longest common subsequence (LCS) problem is to find the longest subsequence common to two sequences. A subsequence is a sequence that appears in the same relative order but not necessarily contiguous.
@ -84,13 +80,33 @@ Y = "GXTXAYB"
print("Length of Longest Common Subsequence:", longest_common_subsequence(X, Y, len(X), len(Y)))
```
## Longest Common Subsequence Code in Python (Bottom-Up Approach)
```python
def longestCommonSubsequence(X, Y, m, n):
    L = [[None] * (n + 1) for i in range(m + 1)]
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 or j == 0:
                L[i][j] = 0
            elif X[i-1] == Y[j-1]:
                L[i][j] = L[i-1][j-1] + 1
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])
    return L[m][n]

S1 = "AGGTAB"
S2 = "GXTXAYB"
m = len(S1)
n = len(S2)
print("Length of LCS is", longestCommonSubsequence(S1, S2, m, n))
```
## Complexity Analysis
- **Time Complexity**: O(m * n) for both approaches, where m and n are the lengths of the input sequences
- **Space Complexity**: O(m * n) for the memoization table
# 3. 0-1 Knapsack Problem
@ -123,10 +139,98 @@ n = len(weights)
print("Maximum value that can be obtained:", knapsack(weights, values, capacity, n))
```
## 0-1 Knapsack Problem Code in Python (Bottom-up Approach)
```python
def knapSack(capacity, weights, values, n):
    K = [[0 for x in range(capacity + 1)] for x in range(n + 1)]
    for i in range(n + 1):
        for w in range(capacity + 1):
            if i == 0 or w == 0:
                K[i][w] = 0
            elif weights[i-1] <= w:
                K[i][w] = max(values[i-1] + K[i-1][w - weights[i-1]],
                              K[i-1][w])
            else:
                K[i][w] = K[i-1][w]
    return K[n][capacity]

values = [60, 100, 120]
weights = [10, 20, 30]
capacity = 50
n = len(weights)
print(knapSack(capacity, weights, values, n))
```
## Complexity Analysis
- **Time Complexity**: O(n * W) for both approaches, where n is the number of items and W is the capacity of the knapsack
- **Space Complexity**: O(n * W) for the memoization table
</br>
<hr>
</br>
# 4. Longest Increasing Subsequence
In the Longest Increasing Subsequence (LIS) problem, the task is to find the longest subsequence that is strictly increasing, meaning each element in the subsequence is greater than the one before it. This subsequence must maintain the order of elements as they appear in the original sequence but does not need to be contiguous. The goal is to identify the subsequence with the maximum possible length.
**Algorithm Overview:**
- **Base cases:** If the sequence is empty, the LIS length is 0.
- **Memoization:** Store the results of previously computed subproblems to avoid redundant computations.
- **Recurrence relation:** Compute the LIS length by comparing elements of the array and, for each element, deciding whether to include it based on the previously chosen element.
## Longest Increasing Subsequence Code in Python (Top-Down Approach using Memoization)
```python
import sys

def f(idx, prev_idx, n, a, dp):
    if idx == n:
        return 0
    if dp[idx][prev_idx + 1] != -1:
        return dp[idx][prev_idx + 1]
    notTake = 0 + f(idx + 1, prev_idx, n, a, dp)
    take = -sys.maxsize - 1
    if prev_idx == -1 or a[idx] > a[prev_idx]:
        take = 1 + f(idx + 1, idx, n, a, dp)
    dp[idx][prev_idx + 1] = max(take, notTake)
    return dp[idx][prev_idx + 1]

def longestSubsequence(n, a):
    dp = [[-1 for i in range(n + 1)] for j in range(n + 1)]
    return f(0, -1, n, a, dp)

a = [3, 10, 2, 1, 20]
n = len(a)
print("Length of lis is", longestSubsequence(n, a))
```
## Longest Increasing Subsequence Code in Python (Bottom-Up Approach)
```python
def lis(arr):
    n = len(arr)
    lis = [1] * n
    for i in range(1, n):
        for j in range(0, i):
            if arr[i] > arr[j] and lis[i] < lis[j] + 1:
                lis[i] = lis[j] + 1
    maximum = 0
    for i in range(n):
        maximum = max(maximum, lis[i])
    return maximum

arr = [10, 22, 9, 33, 21, 50, 41, 60]
print("Length of lis is", lis(arr))
```
## Complexity Analysis
- **Time Complexity**: O(n * n) for both approaches, where n is the length of the array.
- **Space Complexity**: O(n * n) for the memoization table in Top-Down Approach, O(n) in Bottom-Up Approach.
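Beyond the two quadratic approaches above, the LIS length can also be computed in O(n log n) using patience sorting with the standard library's `bisect` module; a minimal sketch of this alternative:
```python
from bisect import bisect_left

def lis_nlogn(arr):
    # tails[k] is the smallest possible tail value of an increasing
    # subsequence of length k + 1 found so far
    tails = []
    for x in arr:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # extends the longest subsequence
        else:
            tails[i] = x      # improves an existing length
    return len(tails)

print("Length of lis is", lis_nlogn([10, 22, 9, 33, 21, 50, 41, 60]))  # 5
```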


@ -0,0 +1,212 @@
# Data Structures: Hash Tables, Hash Sets, and Hash Maps
## Table of Contents
- [Introduction](#introduction)
- [Hash Tables](#hash-tables)
- [Overview](#overview)
- [Operations](#operations)
- [Hash Sets](#hash-sets)
- [Overview](#overview-1)
- [Operations](#operations-1)
- [Hash Maps](#hash-maps)
- [Overview](#overview-2)
- [Operations](#operations-2)
- [Conclusion](#conclusion)
## Introduction
This document provides an overview of three fundamental data structures in computer science: hash tables, hash sets, and hash maps. These structures are widely used for efficient data storage and retrieval operations.
## Hash Tables
### Overview
A **hash table** is a data structure that stores key-value pairs. It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
### Operations
1. **Insertion**: Add a new key-value pair to the hash table.
2. **Deletion**: Remove a key-value pair from the hash table.
3. **Search**: Find the value associated with a given key.
4. **Update**: Modify the value associated with a given key.
**Example Code (Python):**
```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None

class HashTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.size = 0
        self.table = [None] * capacity

    def _hash(self, key):
        return hash(key) % self.capacity

    def insert(self, key, value):
        index = self._hash(key)
        if self.table[index] is None:
            self.table[index] = Node(key, value)
            self.size += 1
        else:
            current = self.table[index]
            while current:
                if current.key == key:
                    current.value = value
                    return
                current = current.next
            new_node = Node(key, value)
            new_node.next = self.table[index]
            self.table[index] = new_node
            self.size += 1

    def search(self, key):
        index = self._hash(key)
        current = self.table[index]
        while current:
            if current.key == key:
                return current.value
            current = current.next
        raise KeyError(key)

    def remove(self, key):
        index = self._hash(key)
        previous = None
        current = self.table[index]
        while current:
            if current.key == key:
                if previous:
                    previous.next = current.next
                else:
                    self.table[index] = current.next
                self.size -= 1
                return
            previous = current
            current = current.next
        raise KeyError(key)

    def __len__(self):
        return self.size

    def __contains__(self, key):
        try:
            self.search(key)
            return True
        except KeyError:
            return False

# Driver code
if __name__ == '__main__':
    ht = HashTable(5)
    ht.insert("apple", 3)
    ht.insert("banana", 2)
    ht.insert("cherry", 5)
    print("apple" in ht)        # True
    print("durian" in ht)       # False
    print(ht.search("banana"))  # 2
    ht.insert("banana", 4)
    print(ht.search("banana"))  # 4
    ht.remove("apple")
    print(len(ht))              # 2
```
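To support dict-style indexing on the custom table, the container magic methods can be wired to `insert`, `search`, and `remove`; a minimal sketch (this subclass is an addition, not part of the original class):
```python
class IndexableHashTable(HashTable):
    def __getitem__(self, key):
        return self.search(key)

    def __setitem__(self, key, value):
        self.insert(key, value)

    def __delitem__(self, key):
        self.remove(key)

ht = IndexableHashTable(5)
ht["key1"] = "value1"   # insert
print(ht["key1"])       # search -> value1
del ht["key1"]          # delete
```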
## Hash Sets
### Overview
A **hash set** is a collection of unique elements. It is implemented using a hash table where each bucket can store only one element.
### Operations
1. **Insertion**: Add a new element to the set.
2. **Deletion**: Remove an element from the set.
3. **Search**: Check if an element exists in the set.
4. **Union**: Combine two sets to form a new set with elements from both.
5. **Intersection**: Find common elements between two sets.
6. **Difference**: Find elements present in one set but not in the other.
**Example Code (Python):**
```python
# Create a hash set
hash_set = set()
# Insert elements
hash_set.add("element1")
hash_set.add("element2")
# Search for an element
exists = "element1" in hash_set
# Delete an element
hash_set.remove("element2")
# Union of sets
another_set = {"element3", "element4"}
union_set = hash_set.union(another_set)
# Intersection of sets
intersection_set = hash_set.intersection(another_set)
# Difference of sets
difference_set = hash_set.difference(another_set)
```
## Hash Maps
### Overview
A **hash map** is similar to a hash table but often provides additional functionalities and more user-friendly interfaces for developers. It is a collection of key-value pairs where each key is unique.
### Operations
1. **Insertion**: Add a new key-value pair to the hash map.
2. **Deletion**: Remove a key-value pair from the hash map.
3. **Search**: Retrieve the value associated with a given key.
4. **Update**: Change the value associated with a given key.
**Example Code (Python):**
```python
# Create a hash map
hash_map = {}
# Insert elements
hash_map["key1"] = "value1"
hash_map["key2"] = "value2"
# Search for an element
value = hash_map.get("key1")
# Delete an element
del hash_map["key2"]
# Update an element
hash_map["key1"] = "new_value1"
```
## Conclusion
Hash tables, hash sets, and hash maps are powerful data structures that provide efficient means of storing and retrieving data. Understanding these structures and their operations is crucial for developing optimized algorithms and applications.


@ -0,0 +1,169 @@
# Heaps
## Definition:
Heaps are a crucial data structure that support efficient priority queue operations. They come in two main types: min heaps and max heaps. Python's heapq module provides a robust implementation for min heaps, and with some minor adjustments, it can also be used to implement max heaps.
## Overview:
A heap is a specialized binary tree-based data structure that satisfies the heap property:
- **Min Heap:** The key at the root must be the minimum among all keys present in the Binary Heap. This property must be recursively true for all nodes in the Binary Tree.
- **Max Heap:** The key at the root must be the maximum among all keys present in the Binary Heap. This property must be recursively true for all nodes in the Binary Tree.
## Python heapq Module:
The heapq module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm.
- **Min Heap:** In a min heap, the smallest element is always at the root. Here's how to use heapq to create and manipulate a min heap:
```python
import heapq
# Create an empty heap
min_heap = []
# Adding elements to the heap
heapq.heappush(min_heap, 10)
heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, 3)
heapq.heappush(min_heap, 12)
print("Min Heap:", min_heap)
# Pop the smallest element
smallest = heapq.heappop(min_heap)
print("Smallest element:", smallest)
print("Min Heap after pop:", min_heap)
```
**Output:**
```
Min Heap: [3, 10, 5, 12]
Smallest element: 3
Min Heap after pop: [5, 10, 12]
```
- **Max Heap:** To create a max heap with `heapq`, we can store negated values, so the largest original value surfaces first.
```python
import heapq
# Create an empty heap
max_heap = []
# Adding elements to the heap by pushing negative values
heapq.heappush(max_heap, -10)
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -3)
heapq.heappush(max_heap, -12)
# Convert back to positive values for display
print("Max Heap:", [-x for x in max_heap])
# Pop the largest element
largest = -heapq.heappop(max_heap)
print("Largest element:", largest)
print("Max Heap after pop:", [-x for x in max_heap])
```
**Output:**
```
Max Heap: [12, 10, 3, 5]
Largest element: 12
Max Heap after pop: [10, 5, 3]
```
## Heap Operations:
1. **Push Operation:** Adds an element to the heap, maintaining the heap property.
```python
heapq.heappush(heap, item)
```
2. **Pop Operation:** Removes and returns the smallest element from the heap.
```python
smallest = heapq.heappop(heap)
```
3. **Heapify Operation:** Converts a list into a heap in-place.
```python
heapq.heapify(list)
```
4. **Peek Operation:** To get the smallest element without popping it (not directly available, but can be done by accessing the first element).
```python
smallest = heap[0]
```
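5. **n Largest / n Smallest:** The module also provides `heapq.nlargest` and `heapq.nsmallest` for retrieving the n largest or smallest elements of any iterable in one call:
```python
import heapq

data = [15, 77, 90, 1, 3]
print(heapq.nlargest(2, data))   # [90, 77]
print(heapq.nsmallest(2, data))  # [1, 3]
```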
## Example:
```python
# importing "heapq" to implement heap queue
import heapq
# initializing list
li = [15, 77, 90, 1, 3]
# using heapify to convert list into heap
heapq.heapify(li)
# printing created heap
print("The created heap is : ", end="")
print(list(li))
# using heappush() to push elements into heap
# pushes 4
heapq.heappush(li, 4)
# printing modified heap
print("The modified heap after push is : ", end="")
print(list(li))
# using heappop() to pop smallest element
print("The popped and smallest element is : ", end="")
print(heapq.heappop(li))
```
Output:
```
The created heap is : [1, 3, 90, 77, 15]
The modified heap after push is : [1, 3, 4, 77, 15, 90]
The popped and smallest element is : 1
```
## Advantages and Disadvantages of Heaps:
## Advantages:
**Efficient:** Heap queues, implemented in Python's heapq module, offer remarkable efficiency in managing priority queues and heaps. With logarithmic time complexity for key operations, they are widely favored in various applications for their performance.
**Space-efficient:** Leveraging an array-based representation, heap queues optimize memory usage compared to node-based structures like linked lists. This design minimizes overhead, enhancing efficiency in memory management.
**Ease of Use:** Python's heap queues boast a user-friendly API, simplifying fundamental operations such as insertion, deletion, and retrieval. This simplicity contributes to rapid development and code maintenance.
**Flexibility:** Beyond their primary use in priority queues and heaps, Python's heap queues lend themselves to diverse applications. They can be adapted to implement various data structures, including binary trees, showcasing their versatility and broad utility across different domains.
## Disadvantages:
**Limited functionality:** Heap queues are primarily designed for managing priority queues and heaps, and may not be suitable for more complex data structures and algorithms.
**No random access:** Heap queues do not support random access to elements, making it difficult to access elements in the middle of the heap or modify elements that are not at the top of the heap.
**No sorting:** Heap queues do not support sorting, so if you need to sort elements in a specific order, you will need to use a different data structure or algorithm.
**Not thread-safe:** Heap queues are not thread-safe, meaning that they may not be suitable for use in multi-threaded applications where data synchronization is critical.
## Real-Life Examples of Heaps:
1. **Priority Queues:**
Heaps are commonly used to implement priority queues, which are used in various algorithms like Dijkstra's shortest path algorithm and Prim's minimum spanning tree algorithm.
2. **Scheduling Algorithms:**
Heaps are used in job scheduling algorithms where tasks with the highest priority need to be processed first.
3. **Merge K Sorted Lists:**
Heaps can be used to efficiently merge multiple sorted lists into a single sorted list.
4. **Real-Time Event Simulation:**
Heaps are used in event-driven simulators to manage events scheduled to occur at future times.
5. **Median Finding Algorithm:**
Heaps can be used to maintain a dynamic set of numbers to find the median efficiently.

(Binary files not shown: 7 image files added, presumably the Dijkstra's algorithm step diagrams referenced above; 26 to 36 KiB each.)


@ -15,4 +15,10 @@
- [Trie](trie.md)
- [Two Pointer Technique](two-pointer-technique.md)
- [Hashing through Linear Probing](hashing-linear-probing.md)
- [Hashing through Chaining](hashing-chaining.md)
- [Heaps](heaps.md)
- [Hash Tables, Sets, Maps](hash-tables.md)
- [Binary Tree](binary-tree.md)
- [AVL Trees](avl-trees.md)
- [Splay Trees](splay-trees.md)
- [Dijkstra's Algorithm](dijkstra.md)


@ -0,0 +1,162 @@
# Splay Tree
In Data Structures and Algorithms, a **Splay Tree** is a self-adjusting binary search tree with the additional property that recently accessed elements are quick to access again. It performs basic operations such as insertion, search, and deletion in O(log n) amortized time. This is achieved by a process called **splaying**, where the accessed node is moved to the root through a series of tree rotations.
## Points to be Remembered
- **Splaying**: Moving the accessed node to the root using rotations.
- **Rotations**: Tree rotations (left and right) are used to balance the tree during splaying.
- **Self-adjusting**: The tree adjusts itself with each access, keeping frequently accessed nodes near the root.
## Real Life Examples of Splay Trees
- **Cache Implementation**: Frequently accessed data is kept near the top of the tree, making repeated accesses faster.
- **Networking**: Routing tables in network switches can use splay trees to prioritize frequently accessed routes.
## Applications of Splay Trees
Splay trees are used in various applications in Computer Science:
- **Cache Implementations**
- **Garbage Collection Algorithms**
- **Data Compression Algorithms (e.g., LZ78)**
Understanding these applications is essential for Software Development.
## Operations in Splay Tree
Key operations include:
- **INSERT**: Insert a new element into the splay tree.
- **SEARCH**: Find the position of an element in the splay tree.
- **DELETE**: Remove an element from the splay tree.
## Implementing Splay Tree in Python
```python
class SplayTreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

class SplayTree:
    def __init__(self):
        self.root = None

    def insert(self, key):
        self.root = self.splay_insert(self.root, key)

    def search(self, key):
        self.root = self.splay_search(self.root, key)
        return self.root

    def splay(self, root, key):
        if not root or root.key == key:
            return root
        if root.key > key:
            if not root.left:
                return root
            if root.left.key > key:
                root.left.left = self.splay(root.left.left, key)
                root = self.rotateRight(root)
            elif root.left.key < key:
                root.left.right = self.splay(root.left.right, key)
                if root.left.right:
                    root.left = self.rotateLeft(root.left)
            return root if not root.left else self.rotateRight(root)
        else:
            if not root.right:
                return root
            if root.right.key > key:
                root.right.left = self.splay(root.right.left, key)
                if root.right.left:
                    root.right = self.rotateRight(root.right)
            elif root.right.key < key:
                root.right.right = self.splay(root.right.right, key)
                root = self.rotateLeft(root)
            return root if not root.right else self.rotateLeft(root)

    def splay_insert(self, root, key):
        if not root:
            return SplayTreeNode(key)
        root = self.splay(root, key)
        if root.key == key:
            return root
        new_node = SplayTreeNode(key)
        if root.key > key:
            new_node.right = root
            new_node.left = root.left
            root.left = None
        else:
            new_node.left = root
            new_node.right = root.right
            root.right = None
        return new_node

    def splay_search(self, root, key):
        return self.splay(root, key)

    def rotateRight(self, node):
        temp = node.left
        node.left = temp.right
        temp.right = node
        return temp

    def rotateLeft(self, node):
        temp = node.right
        node.right = temp.left
        temp.left = node
        return temp

    def preOrder(self, root):
        if root:
            print(root.key, end=' ')
            self.preOrder(root.left)
            self.preOrder(root.right)

# Example usage:
splay_tree = SplayTree()
splay_tree.insert(50)
splay_tree.insert(30)
splay_tree.insert(20)
splay_tree.insert(40)
splay_tree.insert(70)
splay_tree.insert(60)
splay_tree.insert(80)
print("Preorder traversal of the Splay tree is:")
splay_tree.preOrder(splay_tree.root)
splay_tree.search(60)
print("\nSplay tree after search operation for key 60:")
splay_tree.preOrder(splay_tree.root)
```
## Output
```markdown
Preorder traversal of the Splay tree is:
50 30 20 40 70 60 80
Splay tree after search operation for key 60:
60 50 30 20 40 70 80
```
## Complexity Analysis
The worst-case time complexities of the main operations in a Splay Tree are as follows:
- **Insertion**: O(n). In the worst case, insertion may take linear time if the tree is highly unbalanced.
- **Search**: O(n). In the worst case, searching for a node may take linear time if the tree is highly unbalanced.
- **Deletion**: O(n). In the worst case, deleting a node may take linear time if the tree is highly unbalanced.
While these operations can take linear time in the worst case, the splay operation ensures that the tree remains balanced over a sequence of operations, leading to better average-case performance.
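The operations list above mentions DELETE, but the implementation does not include it. One standard approach, sketched here as an addition to the original code, is to splay the key to the root and then join the two subtrees:
```python
def splay_delete(tree, key):
    # Splay the key to the root; if it is absent, the tree is left splayed
    # at the closest key and nothing is removed
    root = tree.splay(tree.root, key)
    if root is None or root.key != key:
        tree.root = root
        return
    if root.left is None:
        tree.root = root.right
    else:
        # Splaying the left subtree with the deleted key (greater than all of
        # its keys) brings its maximum to the root, which has no right child
        left = tree.splay(root.left, key)
        left.right = root.right
        tree.root = left

splay_delete(splay_tree, 60)
print("Splay tree after deleting 60:")
splay_tree.preOrder(splay_tree.root)
```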

(Binary files not shown: 4 image files added, presumably the K-Means assets referenced below; 541 KiB, 12 KiB, 13 KiB, and 6.0 KiB.)


@ -3,6 +3,7 @@
- [Introduction to scikit-learn](sklearn-introduction.md)
- [Binomial Distribution](binomial-distribution.md)
- [Regression in Machine Learning](regression.md)
- [Polynomial Regression](polynomial-regression.md)
- [Confusion Matrix](confusion-matrix.md)
- [Decision Tree Learning](decision-tree.md)
- [Random Forest](random-forest.md)
@ -19,5 +20,8 @@
- [Hierarchical Clustering](hierarchical-clustering.md)
- [Grid Search](grid-search.md)
- [Transformers](transformers.md)
- [K-Means](kmeans.md)
- [K-nearest neighbor (KNN)](knn.md)
- [Naive Bayes](naive-bayes.md)
- [Neural network regression](neural-network-regression.md)
- [PyTorch Fundamentals](pytorch-fundamentals.md)


@ -0,0 +1,92 @@
# K-Means Clustering
Unsupervised Learning Algorithm for Grouping Similar Data.
## Introduction
K-means clustering is a fundamental unsupervised machine learning algorithm that excels at grouping similar data points together. It's a popular choice due to its simplicity and efficiency in uncovering hidden patterns within unlabeled datasets.
## Unsupervised Learning
Unlike supervised learning algorithms that rely on labeled data for training, unsupervised algorithms, like K-means, operate solely on input data (without predefined categories). Their objective is to discover inherent structures or groupings within the data.
## The K-Means Objective
Organize similar data points into clusters to unveil underlying patterns. The main objective is to minimize the total intra-cluster variance, i.e., the sum of squared distances between each point and its cluster centroid.
![image](assets/knm.png)
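In standard notation, this objective is commonly written as:
$$
J = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2
$$
where $C_i$ is the set of points assigned to cluster $i$ and $\mu_i$ is its centroid.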
## Clusters and Centroids
A cluster represents a collection of data points that share similar characteristics. K-means identifies a pre-determined number (k) of clusters within the dataset. Each cluster is represented by a centroid, which acts as its central point (imaginary or real).
## Minimizing In-Cluster Variation
The K-means algorithm strategically assigns each data point to a cluster such that the total variation within each cluster (measured by the sum of squared distances between points and their centroid) is minimized. In simpler terms, K-means strives to create clusters where data points are close to their respective centroids.
## The Meaning Behind "K-Means"
The "means" in K-means refers to the averaging process used to compute the centroid, essentially finding the center of each cluster.
## K-Means Algorithm in Action
![image](assets/km_.png)
The K-means algorithm follows an iterative approach to optimize cluster formation:
1. **Initial Centroid Placement:** The process begins with randomly selecting k centroids to serve as initial reference points for each cluster.
2. **Data Point Assignment:** Each data point is assigned to the closest centroid, effectively creating a preliminary clustering.
3. **Centroid Repositioning:** Once data points are assigned, the centroids are recalculated by averaging the positions of the points within their respective clusters. These new centroids represent the refined centers of the clusters.
4. **Iteration Until Convergence:** Steps 2 and 3 are repeated iteratively until a stopping criterion is met. This criterion can be either:
- **Centroid Stability:** No significant change occurs in the centroids' positions, indicating successful clustering.
- **Reaching Maximum Iterations:** A predefined number of iterations is completed.
## Code
Following is a simple implementation of K-Means.
```python
# Generate and Visualize Sample Data
# import the necessary Libraries
import numpy as np
import matplotlib.pyplot as plt
# Create data points for cluster 1 and cluster 2
X = -2 * np.random.rand(100, 2)
X1 = 1 + 2 * np.random.rand(50, 2)
# Combine data points from both clusters
X[50:100, :] = X1
# Plot data points and display the plot
plt.scatter(X[:, 0], X[:, 1], s=50, c='b')
plt.show()
# K-Means Model Creation and Training
from sklearn.cluster import KMeans
# Create KMeans object with 2 clusters
kmeans = KMeans(n_clusters=2)
kmeans.fit(X) # Train the model on the data
# Visualize Data Points with Centroids
centroids = kmeans.cluster_centers_ # Get centroids (cluster centers)
plt.scatter(X[:, 0], X[:, 1], s=50, c='b') # Plot data points again
plt.scatter(centroids[0, 0], centroids[0, 1], s=200, c='g', marker='s') # Plot centroid 1
plt.scatter(centroids[1, 0], centroids[1, 1], s=200, c='r', marker='s') # Plot centroid 2
plt.show() # Display the plot with centroids
# Predict Cluster Label for New Data Point
new_data = np.array([-3.0, -3.0])
new_data_reshaped = new_data.reshape(1, -1)
predicted_cluster = kmeans.predict(new_data_reshaped)
print("Predicted cluster for new data:", predicted_cluster)
```
### Output:
Before Implementing K-Means Clustering
![Before Implementing K-Means Clustering](assets/km_2.png)
After Implementing K-Means Clustering
![After Implementing K-Means Clustering](assets/km_3.png)
Predicted cluster for new data: `[0]`
## Conclusion
**K-Means** can be applied to data that has a smaller number of dimensions, is numeric, and is continuous or can be used to find groups that have not been explicitly labeled in the data. As an example, it can be used for Document Classification, Delivery Store Optimization, or Customer Segmentation.
## References
- [Survey of Machine Learning and Data Mining Techniques used in Multimedia System](https://www.researchgate.net/publication/333457161_Survey_of_Machine_Learning_and_Data_Mining_Techniques_used_in_Multimedia_System)
- [A Clustering Approach for Outliers Detection in a Big Point-of-Sales Database](https://www.researchgate.net/publication/339267868_A_Clustering_Approach_for_Outliers_Detection_in_a_Big_Point-of-Sales_Database)


@ -0,0 +1,369 @@
# Naive Bayes
## Introduction
The Naive Bayes model uses probabilities to predict an outcome. It is a supervised machine learning technique, i.e. it requires labelled data for training. It is used for classification and is based on Bayes' Theorem. The basic assumption of this model is independence among the features, i.e. each feature is unaffected by any other feature.
## Bayes' Theorem
Bayes' theorem is given by:
$$
P(a|b) = \frac{P(b|a)*P(a)}{P(b)}
$$
where:
- $P(a|b)$ is the posterior probability, i.e. probability of 'a' given that 'b' is true,
- $P(b|a)$ is the likelihood probability i.e. probability of 'b' given that 'a' is true,
- $P(a)$ and $P(b)$ are the probabilities of 'a' and 'b' respectively, independent of each other.
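As a quick illustration with assumed numbers: suppose 1% of emails are spam, the word "offer" appears in 20% of spam emails, and in 2% of non-spam emails. Then:
$$
P(spam|offer) = \frac{0.20 * 0.01}{0.20 * 0.01 + 0.02 * 0.99} \approx 0.092
$$
so even a strong spam indicator yields a modest posterior probability when spam itself is rare.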
## Applications
The Naive Bayes classifier has numerous applications, including:
1. Text classification.
2. Sentiment analysis.
3. Spam filtering.
4. Multiclass classification (e.g. weather prediction).
5. Recommendation Systems.
6. Healthcare sector.
7. Document categorization.
## Advantages
1. Easy to implement.
2. Useful even if the training dataset is limited (where a decision tree would not be recommended).
3. Supports multiclass classification directly, which some machine learning algorithms like SVM and logistic regression only handle through extensions (e.g. one-vs-rest).
4. Scalable, fast and efficient.
## Disadvantages
1. Assumes features to be independent, which may not be true in certain scenarios.
2. Zero probability error.
3. Sensitive to noise.
## Zero Probability Error
A zero probability error occurs when the number of occurrences of an event given another event is zero in the training data.
To handle the zero probability error, Laplace's correction is used, which adds a small constant to the counts.
**Example:**
Given the data below, find whether tennis can be played if (outlook = overcast, wind = weak).
**Data**
---
| SNo | Outlook (A) | Wind (B) | PlayTennis (R) |
|-----|--------------|------------|-------------------|
| 1 | Rain | Weak | No |
| 2 | Rain | Strong | No |
| 3 | Overcast | Weak | Yes |
| 4 | Rain | Weak | Yes |
| 5 | Overcast | Weak | Yes |
| 6 | Rain | Strong | No |
| 7 | Overcast | Strong | Yes |
| 8 | Rain | Weak | No |
| 9 | Overcast | Weak | Yes |
| 10 | Rain | Weak | Yes |
---
- **Calculate prior probabilities**
$$
P(Yes) = \frac{6}{10} = 0.6
$$
$$
P(No) = \frac{4}{10} = 0.4
$$
- **Calculate likelihoods**
1. **Outlook (A):**
---
| A\R | Yes | No |
|-----------|-------|-----|
| Rain | 2 | 4 |
| Overcast | 4 | 0 |
| Total | 6 | 4 |
---
- Rain:
$$
P(Rain|Yes) = \frac{2}{6}
$$
$$
P(Rain|No) = \frac{4}{4}
$$
- Overcast:
$$
P(Overcast|Yes) = \frac{4}{6}
$$
$$
P(Overcast|No) = \frac{0}{4}
$$
Here, we can see that
$$
P(Overcast|No) = 0
$$
This is a zero probability error!
Since the probability is 0, the Naive Bayes model fails to predict.
**Applying Laplace's correction:**
In Laplace's correction, we scale the counts to 1000 instances and then add 1 to each cell of the column containing the zero count.
- **Calculate prior probabilities**
$$
P(Yes) = \frac{600}{1002}
$$
$$
P(No) = \frac{402}{1002}
$$
- **Calculate likelihoods**
1. **Outlook (A):**
(Counts scaled to 1000 instances.)
We add 1 instance to each cell of the (PlayTennis|No) column (Laplace's correction):
---
| A\R | Yes | No |
|-----------|-------|---------------|
| Rain | 200 | (400+1)=401 |
| Overcast | 400 | (0+1)=1 |
| Total | 600 | 402 |
---
- **Rain:**
$$
P(Rain|Yes) = \frac{200}{600}
$$
$$
P(Rain|No) = \frac{401}{402}
$$
- **Overcast:**
$$
P(Overcast|Yes) = \frac{400}{600}
$$
$$
P(Overcast|No) = \frac{1}{402}
$$
2. **Wind (B):**
---
| B\R | Yes | No |
|-----------|---------|-------|
| Weak | 500 | 200 |
| Strong | 100 | 200 |
| Total | 600 | 400 |
---
- **Weak:**
$$
P(Weak|Yes) = \frac{500}{600}
$$
$$
P(Weak|No) = \frac{200}{400}
$$
- **Strong:**
$$
P(Strong|Yes) = \frac{100}{600}
$$
$$
P(Strong|No) = \frac{200}{400}
$$
- **Calculating the posterior probabilities:**
$$
P(Yes|Overcast, Weak) \propto P(Yes) * P(Overcast|Yes) * P(Weak|Yes)
$$
$$
= \frac{600}{1002} * \frac{400}{600} * \frac{500}{600}
$$
$$
= 0.3326
$$
$$
P(No|Overcast, Weak) \propto P(No) * P(Overcast|No) * P(Weak|No)
$$
$$
= \frac{402}{1002} * \frac{1}{402} * \frac{200}{400}
$$
$$
= 0.000499 = 0.0005
$$
Since
$$
P(Yes|Overcast, Weak) > P(No|Overcast, Weak)
$$
we can conclude that tennis can be played if outlook is overcast and wind is weak.
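As a quick sanity check, the arithmetic above can be reproduced in a few lines of Python (the probabilities are copied directly from the Laplace-corrected tables):
```python
# Laplace-corrected probabilities, copied from the tables above
p_yes, p_no = 600 / 1002, 402 / 1002
p_overcast_yes, p_overcast_no = 400 / 600, 1 / 402
p_weak_yes, p_weak_no = 500 / 600, 200 / 400

# Unnormalized posterior scores for (outlook = overcast, wind = weak)
score_yes = p_yes * p_overcast_yes * p_weak_yes
score_no = p_no * p_overcast_no * p_weak_no

print(f"Yes: {score_yes:.4f}  No: {score_no:.4f}")
# Yes: 0.3327  No: 0.0005  -> the model predicts "Yes"
```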
## Types of Naive Bayes Classifiers
### Gaussian Naive Bayes
It is used when the dataset has **continuous data**. It assumes that the data is normally distributed (also known as a Gaussian distribution).
A Gaussian distribution is characterized by a bell-shaped curve.
**Continuous data features:** Features which can take any real value within a certain range. These features have an infinite number of possible values; they are generally measured, not counted.
e.g. weight, height, temperature, etc.
**Code**
```python
# Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn import metrics
from sklearn.metrics import confusion_matrix

# Read data
df = pd.read_csv("data.csv")
X = df.iloc[:, 1:7]   # feature columns
y = df.iloc[:, 7]     # target column

# Split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Train the model on the training set
obj = GaussianNB()
obj.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = obj.predict(X_test)

# Compare y_test and y_pred
print("Gaussian Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```
### Multinomial Naive Bayes
It is appropriate when the features are categorical or count-based. It models the likelihood of each feature as a multinomial distribution.
Multinomial distribution is used to find probabilities of each category, given multiple categories (eg. Text classification).
**Code**
```python
# Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics
from sklearn.metrics import confusion_matrix

# Read data
df = pd.read_csv("data.csv")
X = df.iloc[:, 1:7]   # feature columns (MultinomialNB expects non-negative, count-like values)
y = df.iloc[:, 7]     # target column

# Split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Train the model on the training set
obj = MultinomialNB()
obj.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = obj.predict(X_test)

# Compare y_test and y_pred
print("Multinomial Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```
### Bernoulli Naive Bayes
It is specifically designed for binary features (eg. Yes or No). It models the likelihood of each feature as a Bernoulli distribution.
Bernoulli distribution is used when there are only two possible outcomes (eg. success or failure of an event).
**Code**
```python
# Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn import metrics
from sklearn.metrics import confusion_matrix

# Read data
df = pd.read_csv("data.csv")
X = df.iloc[:, 1:7]   # feature columns (BernoulliNB expects binary features; it can binarize via its `binarize` parameter)
y = df.iloc[:, 7]     # target column

# Split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Train the model on the training set
obj = BernoulliNB()
obj.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = obj.predict(X_test)

# Compare y_test and y_pred
print("Bernoulli Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```
## Evaluation
1. Confusion matrix.
2. Accuracy.
3. ROC curve.
## Conclusion
We can conclude that Naive Bayes may be limited in some cases by the assumption that the features are independent of each other, but it is still reliable in many scenarios. Naive Bayes is an efficient classifier and works well even on small datasets.


@ -0,0 +1,84 @@
# Neural Network Regression in Python using Scikit-learn
## Overview
Neural Network Regression is used to predict continuous values based on input features. Scikit-learn provides an easy-to-use interface for implementing neural network models, specifically through the `MLPRegressor` class, which stands for Multi-Layer Perceptron Regressor.
## When to Use Neural Network Regression
### Suitable Scenarios
1. **Complex Relationships**: Ideal when the relationship between features and the target variable is complex and non-linear.
2. **Sufficient Data**: Works well with large datasets that can support training deep learning models.
3. **Feature Extraction**: Useful in cases where the neural network's feature extraction capabilities can be leveraged, such as with image or text data.
### Unsuitable Scenarios
1. **Small Datasets**: Less effective with small datasets due to overfitting and inability to learn complex patterns.
2. **Low-latency Predictions**: Might not be suitable for real-time applications with strict latency requirements.
3. **Interpretability**: Not ideal when model interpretability is crucial, as neural networks are often seen as "black-box" models.
## Implementing Neural Network Regression in Python with Scikit-learn
### Step-by-Step Implementation
1. **Import Libraries**
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error
```
2. **Load and Prepare Data**
For illustration, let's use a synthetic dataset.
```python
# Generate synthetic data
np.random.seed(42)
X = np.random.rand(1000, 3)
y = X[:, 0] * 3 + X[:, 1] * -2 + X[:, 2] * 0.5 + np.random.randn(1000) * 0.1
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```
3. **Build and Train the Neural Network Model**
```python
# Create the MLPRegressor model
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), activation='relu', solver='adam', max_iter=500, random_state=42)
# Train the model
mlp.fit(X_train, y_train)
```
4. **Evaluate the Model**
```python
# Make predictions
y_pred = mlp.predict(X_test)
# Calculate the Mean Absolute Error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test Mean Absolute Error: {mae}")
```
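It can also be useful to check how training progressed. `MLPRegressor` records the training loss at each iteration in its `loss_curve_` attribute; the short sketch below (reusing the `mlp` model trained above) plots it:
```python
import matplotlib.pyplot as plt

# Plot the training loss recorded at each iteration of mlp.fit
plt.plot(mlp.loss_curve_)
plt.xlabel("Iteration")
plt.ylabel("Training loss")
plt.title("MLPRegressor training loss")
plt.show()
```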
### Explanation
- **Data Generation and Preparation**: Synthetic data is created, split into training and test sets, and standardized to improve the efficiency of the neural network training process.
- **Model Construction and Training**: An `MLPRegressor` is created with two hidden layers, each containing 64 neurons and ReLU activation functions. The model is trained using the Adam optimizer for a maximum of 500 iterations.
- **Evaluation**: The model's performance is evaluated on the test set using Mean Absolute Error (MAE) as the performance metric.
## Conclusion
Neural Network Regression with Scikit-learn's `MLPRegressor` is a powerful method for predicting continuous values in complex, non-linear scenarios. However, it's essential to ensure that you have enough data to train the model effectively and consider the computational resources required. Simpler models may be more appropriate for small datasets or when model interpretability is necessary. By following the steps outlined, you can build, train, and evaluate a neural network for regression tasks in Python using Scikit-learn.


@ -0,0 +1,102 @@
# Polynomial Regression
Polynomial Regression is a form of regression analysis in which the relationship between the independent variable $x$ and the dependent variable $y$ is modeled as an $nth$ degree polynomial. This guide provides an overview of polynomial regression, including its fundamental concepts, assumptions, and how to implement it using Python.
## Introduction
Polynomial Regression is used when the data shows a non-linear relationship between the independent variable $x$ and the dependent variable $y$. It extends the simple linear regression model by considering polynomial terms of the independent variable, allowing for a more flexible fit to the data.
## Concepts
### Polynomial Equation
The polynomial regression model is based on the following polynomial equation:
$$
y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \cdots + \beta_n x^n + \epsilon
$$
Where:
- $y$ is the dependent variable.
- $x$ is the independent variable.
- $\beta_0, \beta_1, \ldots, \beta_n$ are the coefficients of the polynomial.
- $\epsilon$ is the error term.
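For intuition, this is exactly the feature expansion that `PolynomialFeatures` performs in scikit-learn; a tiny sketch (values chosen arbitrarily):
```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[2.0], [3.0]])
poly = PolynomialFeatures(degree=3)
print(poly.fit_transform(x))
# [[ 1.  2.  4.  8.]
#  [ 1.  3.  9. 27.]]  -> columns correspond to 1, x, x^2, x^3
```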
### Degree of Polynomial
The degree of the polynomial (n) determines the flexibility of the model. A higher degree allows the model to fit more complex, non-linear relationships, but it also increases the risk of overfitting.
### Overfitting and Underfitting
- **Overfitting**: When the model fits the noise in the training data too closely, resulting in poor generalization to new data.
- **Underfitting**: When the model is too simple to capture the underlying pattern in the data.
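A quick way to see this trade-off is to compare validation error across polynomial degrees; a minimal sketch on synthetic data (the dataset and the chosen degrees are illustrative assumptions):
```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Noisy non-linear data
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=200)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 3, 15):
    poly = PolynomialFeatures(degree=degree)
    model = LinearRegression().fit(poly.fit_transform(X_train), y_train)
    val_mse = mean_squared_error(y_val, model.predict(poly.transform(X_val)))
    print(f"degree={degree:2d}  validation MSE={val_mse:.4f}")
# Typically degree 1 underfits, a moderate degree fits well,
# and a very high degree starts to overfit.
```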
## Assumptions
1. **Independence**: Observations are independent of each other.
2. **Homoscedasticity**: The variance of the residuals (errors) is constant across all levels of the independent variable.
3. **Normality**: The residuals of the model are normally distributed.
4. **No Multicollinearity**: The predictor variables are not highly correlated with each other.
## Implementation
### Using Scikit-learn
Scikit-learn is a popular machine learning library in Python that provides tools for polynomial regression.
### Code Example
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
# Load dataset
data = pd.read_csv('path/to/your/dataset.csv')
# Define features and target variable
X = data[['feature']]
y = data['target']
# Transform features to polynomial features
poly = PolynomialFeatures(degree=3)
X_poly = poly.fit_transform(X)
# Initialize and train polynomial regression model
model = LinearRegression()
model.fit(X_poly, y)
# Make predictions
y_pred = model.predict(X_poly)
# Evaluate the model
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print("Mean Squared Error:", mse)
print("R^2 Score:", r2)
# Visualize the results (sort by the feature so the fitted curve is drawn in order)
order = np.argsort(X['feature'].values)
plt.scatter(X, y, color='blue')
plt.plot(X.values[order], y_pred[order], color='red')
plt.xlabel('Feature')
plt.ylabel('Target')
plt.title('Polynomial Regression')
plt.show()
```
## Evaluation Metrics
- **Mean Squared Error (MSE)**: The average of the squared differences between actual and predicted values.
- **R-squared (R²) Score**: A statistical measure that represents the proportion of the variance for the dependent variable that is explained by the independent variables in the model.
## Conclusion
Polynomial Regression is a powerful tool for modeling non-linear relationships between variables. It is important to choose the degree of the polynomial carefully to balance between underfitting and overfitting. Understanding and properly evaluating the model using appropriate metrics ensures its effectiveness.
## References
- [Scikit-learn Documentation](https://scikit-learn.org/stable/modules/linear_model.html#polynomial-regression)
- [Wikipedia: Polynomial Regression](https://en.wikipedia.org/wiki/Polynomial_regression)


@ -11,3 +11,6 @@
- [Sorting NumPy Arrays](sorting-array.md)
- [NumPy Array Iteration](array-iteration.md)
- [Concatenation of Arrays](concatenation-of-arrays.md)
- [Splitting of Arrays](splitting-arrays.md)
- [Universal Functions (Ufunc)](universal-functions.md)
- [Statistical Functions on Arrays](statistical-functions.md)


@ -0,0 +1,135 @@
# Splitting Arrays
Splitting a NumPy array refers to dividing the array into smaller sub-arrays. This can be done in various ways: along specific rows or columns, or even based on conditions applied to the elements.
There are several ways to split a NumPy array in Python using different functions. Some of these methods include:
- Splitting a NumPy array using `numpy.split()`
- Splitting a NumPy array using `numpy.array_split()`
- Splitting a NumPy array using `numpy.vsplit()`
- Splitting a NumPy array using `numpy.hsplit()`
- Splitting a NumPy array using `numpy.dsplit()`
## NumPy split()
The `numpy.split()` function divides an array into equal parts along a specified axis.
**Code**
```python
import numpy as np
array = np.array([1,2,3,4,5,6])
#Splitting the array into 3 equal parts along axis=0
result = np.split(array,3)
print(result)
```
**Output**
```
[array([1, 2]), array([3, 4]), array([5, 6])]
```
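Note that `numpy.split()` raises a `ValueError` if the array cannot be divided into equal parts:
```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6])
# 6 elements cannot be divided into 4 equal parts
np.split(array, 4)  # ValueError: array split does not result in an equal division
```
This is exactly the case that `numpy.array_split()`, described next, handles gracefully.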
## NumPy array_split()
The `numpy.array_split()` function divides an array into equal or nearly equal sub-arrays. Unlike `numpy.split()`, it allows for uneven splitting, making it useful when the array cannot be evenly divided by the specified number of splits.
**Code**
```python
import numpy as np
array = np.array([1,2,3,4,5,6,7,8])
#Splitting the array into 3 unequal parts along axis=0
result = np.array_split(array,3)
print(result)
```
**Output**
```
[array([1, 2, 3]), array([4, 5, 6]), array([7, 8])]
```
## NumPy vsplit()
The `numpy.vsplit()` function performs vertical splitting (row-wise), dividing an array along the vertical axis (axis=0).
**Code**
```python
import numpy as np
array = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]])
#Vertically Splitting the array into 2 subarrays along axis=0
result = np.vsplit(array,2)
print(result)
```
**Output**
```
[array([[1, 2, 3],
[4, 5, 6]]), array([[ 7, 8, 9],
[10, 11, 12]])]
```
## NumPy hsplit()
The `numpy.hsplit()` function performs horizontal splitting (column-wise), dividing an array along the horizontal axis (axis=1).
**Code**
```python
import numpy as np
array = np.array([[1, 2, 3, 4],
[5, 7, 8, 9],
[11,12,13,14]])
#Horizontally Splitting the array into 4 subarrays along axis=1
result = np.hsplit(array,4)
print(result)
```
**Output**
```
[array([[ 1],
[ 5],
[11]]), array([[ 2],
[ 7],
[12]]), array([[ 3],
[ 8],
[13]]), array([[ 4],
[ 9],
[14]])]
```
## NumPy dsplit()
The `numpy.dsplit()` function splits arrays along the third axis (axis=2), which applies to arrays with three or more dimensions.
**Code**
```python
import numpy as np
#3D array
array = np.array([[[ 1, 2, 3, 4,],
[ 5, 6, 7, 8,],
[ 9, 10, 11, 12]],
[[13, 14, 15, 16,],
[17, 18, 19, 20,],
[21, 22, 23, 24]]])
#Splitting the array along axis=2
result = np.dsplit(array,2)
print(result)
```
**Output**
```
[array([[[ 1, 2],
[ 5, 6],
[ 9, 10]],
[[13, 14],
[17, 18],
[21, 22]]]), array([[[ 3, 4],
[ 7, 8],
[11, 12]],
[[15, 16],
[19, 20],
[23, 24]]])]
```


@ -0,0 +1,154 @@
# Statistical Operations on Arrays
Statistics involves collecting data, analyzing it, and drawing conclusions from the gathered information.
NumPy provides powerful statistical functions to perform efficient data analysis on arrays, including `minimum`, `maximum`, `mean`, `median`, `variance`, `standard deviation`, and more.
## Minimum
In NumPy, the minimum value of an array is the smallest element present.
The smallest element of an array is calculated using the `np.min()` function.
**Code**
```python
import numpy as np
array = np.array([100,20,300,400])
#Calculating the minimum
result = np.min(array)
print("Minimum :", result)
```
**Output**
```
Minimum : 20
```
## Maximum
In NumPy, the maximum value of an array is the largest element present.
The largest element of an array is calculated using the `np.max()` function.
**Code**
```python
import numpy as np
array = np.array([100,20,300,400])
#Calculating the maximum
result = np.max(array)
print("Maximum :", result)
```
**Output**
```
Maximum : 400
```
## Mean
The mean value of a NumPy array is the average of all its elements.
It is calculated by summing all the elements and then dividing by the total number of elements.
The mean of an array is calculated using the `np.mean()` function.
**Code**
```python
import numpy as np
array = np.array([10,20,30,40])
#Calculating the mean
result = np.mean(array)
print("Mean :", result)
```
**Output**
```
Mean : 25.0
```
## Median
The median value of a NumPy array is the middle value in a sorted array.
It separates the higher half of the data from the lower half.
The median of an array is calculated using the `np.median()` function.
It is important to note that:
- If the number of elements is `odd`, the median is the middle element.
- If the number of elements is `even`, the median is the average of the two middle elements.
**Code**
```python
import numpy as np
#The number of elements is odd
array = np.array([5,6,7,8,9])
#Calculating the median
result = np.median(array)
print("Median :", result)
```
**Output**
```
Median : 7.0
```
**Code**
```python
import numpy as np
#The number of elements is even
array = np.array([1,2,3,4,5,6])
#Calculating the median
result = np.median(array)
print("Median :", result)
```
**Output**
```
Median : 3.5
```
## Variance
Variance in a NumPy array measures the spread or dispersion of data points.
It is calculated as the average of the squared differences from the mean.
The variance of an array is calculated using the `np.var()` function.
**Code**
```python
import numpy as np
array = np.array([10,70,80,50,30])
#Calculating the variance
result = np.var(array)
print("Variance :", result)
```
**Output**
```
Variance : 656.0
```
## Standard Deviation
The standard deviation of a NumPy array measures the amount of variation or dispersion of the elements in the array.
It is calculated as the square root of the average of the squared differences from the mean, providing insight into how spread out the values are around the mean.
The standard deviation of an array is calculated using the `np.std()` function.
**Code**
```python
import numpy as np
array = np.array([25,30,40,55,75,100])
#Calculating the standard deviation
result = np.std(array)
print("Standard Deviation :", result)
```
**Output**
```
Standard Deviation : 26.365486699260625
```
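All of these functions also accept an `axis` argument, which makes it possible to compute statistics row-wise or column-wise on multidimensional arrays; a brief sketch:
```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(np.mean(matrix, axis=0))  # column means: [2.5 3.5 4.5]
print(np.mean(matrix, axis=1))  # row means:    [2. 5.]
```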


@ -0,0 +1,130 @@
# Universal functions (ufunc)
---
A `ufunc`, short for "`universal function`," is a fundamental concept in NumPy, a powerful library for numerical computing in Python. Universal functions are highly optimized, element-wise functions designed to perform operations on data stored in NumPy arrays.
## Uses of Ufuncs in NumPy
Universal functions (ufuncs) in NumPy provide a wide range of functionalities for efficient and powerful numerical computations. Below is a detailed explanation of their uses:
### 1. **Element-wise Operations**
Ufuncs perform operations on each element of the arrays independently.
```python
import numpy as np
A = np.array([1, 2, 3, 4])
B = np.array([5, 6, 7, 8])
# Element-wise addition
np.add(A, B) # Output: array([ 6, 8, 10, 12])
```
### 2. **Broadcasting**
Ufuncs support broadcasting, allowing operations on arrays with different shapes, making it possible to perform operations without explicitly reshaping arrays.
```python
C = np.array([1, 2, 3])
D = np.array([[1], [2], [3]])
# Broadcasting addition
np.add(C, D) # Output: array([[2, 3, 4], [3, 4, 5], [4, 5, 6]])
```
### 3. **Vectorization**
Ufuncs are vectorized, meaning they are implemented in low-level C code, allowing for fast execution and avoiding the overhead of Python loops.
```python
# Vectorized square root
np.sqrt(A) # Output: array([1., 1.41421356, 1.73205081, 2.])
```
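To make the benefit tangible, the sketch below times a plain Python loop against the equivalent ufunc (exact numbers are machine-dependent):
```python
import time
import numpy as np

big = np.arange(1_000_000, dtype=np.float64)

start = time.perf_counter()
loop_result = [x * 2.0 for x in big]    # plain Python loop over scalars
loop_time = time.perf_counter() - start

start = time.perf_counter()
ufunc_result = np.multiply(big, 2.0)    # vectorized ufunc
ufunc_time = time.perf_counter() - start

print(f"loop: {loop_time:.4f}s, ufunc: {ufunc_time:.4f}s")
# The ufunc is typically one to two orders of magnitude faster.
```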
### 4. **Type Flexibility**
Ufuncs handle various data types and perform automatic type casting as needed.
```python
E = np.array([1.0, 2.0, 3.0])
F = np.array([4, 5, 6])
# Addition with type casting
np.add(E, F) # Output: array([5., 7., 9.])
```
### 5. **Reduction Operations**
Ufuncs support reduction operations, such as summing all elements of an array or finding the product of all elements.
```python
# Summing all elements
np.add.reduce(A) # Output: 10
# Product of all elements
np.multiply.reduce(A) # Output: 24
```
### 6. **Accumulation Operations**
Ufuncs can perform accumulation operations, which keep a running tally of the computation.
```python
# Cumulative sum
np.add.accumulate(A) # Output: array([ 1, 3, 6, 10])
```
### 7. **Reduceat Operations**
Ufuncs can perform segmented reductions using the `reduceat` method, which applies the ufunc at specified intervals.
```python
G = np.array([0, 1, 2, 3, 4, 5, 6, 7])
indices = [0, 2, 5]
np.add.reduceat(G, indices) # Output: array([ 1, 9, 18])
```
### 8. **Outer Product**
Ufuncs can compute the outer product of two arrays, producing a matrix where each element is the result of applying the ufunc to each pair of elements from the input arrays.
```python
# Outer product
np.multiply.outer([1, 2, 3], [4, 5, 6])
# Output: array([[ 4, 5, 6],
# [ 8, 10, 12],
# [12, 15, 18]])
```
### 9. **Out Parameter**
Ufuncs can use the `out` parameter to store results in a pre-allocated array, saving memory and improving performance.
```python
result = np.empty_like(A)
np.multiply(A, B, out=result) # Output: array([ 5, 12, 21, 32])
```
## Create Your Own Ufunc
You can create custom ufuncs for specific needs using `np.frompyfunc` or `np.vectorize`, allowing Python functions to behave like ufuncs.
Here, we are using `frompyfunc()`, which takes three arguments:
1. function - the name of the function.
2. inputs - the number of input arrays.
3. outputs - the number of output arrays.
```python
def my_add(x, y):
return x + y
my_add_ufunc = np.frompyfunc(my_add, 2, 1)
my_add_ufunc(A, B) # Output: array([ 6, 8, 10, 12], dtype=object)
```
## Some Common Ufuncs
Here are some commonly used ufuncs in NumPy:
- **Arithmetic**: `np.add`, `np.subtract`, `np.multiply`, `np.divide`
- **Trigonometric**: `np.sin`, `np.cos`, `np.tan`
- **Exponential and Logarithmic**: `np.exp`, `np.log`, `np.log10`
- **Comparison**: `np.maximum`, `np.minimum`, `np.greater`, `np.less`
- **Logical**: `np.logical_and`, `np.logical_or`, `np.logical_not`
For more ufuncs, refer to [Universal functions (ufunc) — NumPy](https://numpy.org/doc/stable/reference/ufuncs.html).


@ -1,4 +1,5 @@
# List of sections
- [Installation of Scipy and its key uses](installation_features.md)
- [SciPy Graphs](scipy-graphs.md)


@ -0,0 +1,165 @@
# SciPy Graphs
Graphs are a fundamental data structure. SciPy provides the `scipy.sparse.csgraph` module for working with graphs.
## Adjacency Matrix
An adjacency matrix is a way of representing a graph using a square matrix. The element at the i-th row and j-th column indicates whether there is an edge from vertex i to vertex j.
```python
import numpy as np
from scipy.sparse import csr_matrix
adj_matrix = np.array([
[0, 1, 0, 0],
[1, 0, 1, 0],
[0, 1, 0, 1],
[0, 0, 1, 0]
])
sparse_matrix = csr_matrix(adj_matrix)
print(sparse_matrix)
```
In this example:
1. The graph has 4 nodes.
2. There is an edge between node 0 and node 1, node 1 and node 2, and node 2 and node 3.
3. The `csr_matrix` function converts the dense adjacency matrix into the compressed sparse row (CSR) format, which is efficient for storing large, sparse matrices.
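For larger graphs it is common to skip the dense matrix and build the CSR matrix directly from an edge list; the following sketch constructs the same graph that way:
```python
import numpy as np
from scipy.sparse import csr_matrix

# Undirected edges 0-1, 1-2, 2-3, entered in both directions
rows = np.array([0, 1, 1, 2, 2, 3])
cols = np.array([1, 0, 2, 1, 3, 2])
data = np.ones(len(rows), dtype=int)

sparse_matrix = csr_matrix((data, (rows, cols)), shape=(4, 4))
print(sparse_matrix.toarray())
```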
## Floyd Warshall
The Floyd-Warshall algorithm is a classic algorithm used to find the shortest paths between all pairs of nodes in a weighted graph.
```python
import numpy as np
from scipy.sparse.csgraph import floyd_warshall
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(floyd_warshall(newarr, return_predecessors=True))
```
#### Output
```
(array([[0., 1., 2.],
[1., 0., 3.],
[2., 3., 0.]]), array([[-9999, 0, 0],
[ 1, -9999, 0],
[ 2, 0, -9999]], dtype=int32))
```
## Dijkstra
Dijkstra's algorithm is used to find the shortest path from a source node to all other nodes in a graph with non-negative edge weights.
```python
import numpy as np
from scipy.sparse.csgraph import dijkstra
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(dijkstra(newarr, return_predecessors=True, indices=0))
```
#### Output
```
(array([ 0., 1., 2.]), array([-9999, 0, 0], dtype=int32))
```
## Bellman Ford
The Bellman-Ford algorithm is used to find the shortest path from a single source vertex to all other vertices in a weighted graph. It can handle graphs with negative weights, and it also detects negative weight cycles.
```python
import numpy as np
from scipy.sparse.csgraph import bellman_ford
from scipy.sparse import csr_matrix
arr = np.array([
[0, -1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(bellman_ford(newarr, return_predecessors=True, indices=0))
```
#### Output
```
(array([ 0., -1., 2.]), array([-9999, 0, 0], dtype=int32))
```
## Depth First Order
Depth-First Search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root and explores as far as possible along each branch before backtracking.
```python
import numpy as np
from scipy.sparse.csgraph import depth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 0, 1],
[1, 1, 1, 1],
[2, 1, 1, 0],
[0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(depth_first_order(newarr, 1))
```
#### Output
```
(array([1, 0, 3, 2], dtype=int32), array([ 1, -9999, 1, 0], dtype=int32))
```
## Breadth First Order
Breadth-First Search (BFS) is an algorithm for traversing or searching tree or graph data structures. It starts at the root and explores all nodes at the present depth level before moving on to nodes at the next depth level.
```python
import numpy as np
from scipy.sparse.csgraph import breadth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 0, 1],
[1, 1, 1, 1],
[2, 1, 1, 0],
[0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(breadth_first_order(newarr, 1))
```
#### Output
```
(array([1, 0, 2, 3], dtype=int32), array([ 1, -9999, 1, 1], dtype=int32))
```