Merge branch 'main' into heaps

pull/1117/head
Ashita Prasad 2024-06-22 16:11:20 +05:30 committed by GitHub
commit 95468029b1
No key found in the database for this signature
GPG key ID: B5690EEEBB952194
12 changed files with 1349 additions and 0 deletions

View file

@@ -18,3 +18,4 @@
- [Reduce](reduce-function.md)
- [List Comprehension](list-comprehension.md)
- [Eval Function](eval_function.md)
- [Magic Methods](magic-methods.md)

View file

@@ -0,0 +1,151 @@
# Magic Methods
Magic methods, also known as dunder (double underscore) methods, are special methods in Python that start and end with double underscores (`__`).
These methods allow you to define the behavior of objects for built-in operations and functions, enabling you to customize how your objects interact with the
language's syntax and built-in features. Magic methods make your custom classes integrate seamlessly with Python's built-in data types and operations.
**Commonly Used Magic Methods**
1. **Initialization and Representation**
- `__init__(self, ...)`: Called when an instance of the class is created. Used for initializing the object's attributes.
- `__repr__(self)`: Returns a string representation of the object, useful for debugging and logging.
- `__str__(self)`: Returns a human-readable string representation of the object.
**Example** :
```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person({self.name}, {self.age})"

    def __str__(self):
        return f"{self.name}, {self.age} years old"

p = Person("Alice", 30)
print(repr(p))
print(str(p))
```
**Output** :
```python
Person("Alice",30)
Alice, 30 years old
```
2. **Arithmetic Operations**
- `__add__(self, other)`: Defines behavior for the `+` operator.
- `__sub__(self, other)`: Defines behavior for the `-` operator.
- `__mul__(self, other)`: Defines behavior for the `*` operator.
- `__truediv__(self, other)`: Defines behavior for the `/` operator.
**Example** :
```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

v1 = Vector(2, 3)
v2 = Vector(1, 1)
v3 = v1 + v2
print(v3)
```
**Output** :
```python
Vector(3, 4)
```
3. **Comparison Operations**
- `__eq__(self, other)`: Defines behavior for the `==` operator.
- `__lt__(self, other)`: Defines behavior for the `<` operator.
- `__le__(self, other)`: Defines behavior for the `<=` operator.
**Example** :
```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        return self.age == other.age

    def __lt__(self, other):
        return self.age < other.age

p1 = Person("Alice", 30)
p2 = Person("Bob", 25)
print(p1 == p2)
print(p1 < p2)
```
**Output** :
```python
False
False
```
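If a class defines `__eq__` and one ordering method, the standard library's `functools.total_ordering` decorator can derive the remaining comparison methods (`__le__`, `__gt__`, `__ge__`). A minimal sketch:
```python
from functools import total_ordering

@total_ordering
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        return self.age == other.age

    def __lt__(self, other):
        return self.age < other.age

# __le__, __gt__ and __ge__ are derived from __eq__ and __lt__
print(Person("Alice", 30) <= Person("Bob", 25))  # False
print(Person("Alice", 30) > Person("Bob", 25))   # True
```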
4. **Container and Sequence Methods**
- `__len__(self)`: Defines behavior for the `len()` function.
- `__getitem__(self, key)`: Defines behavior for indexing (`self[key]`).
- `__setitem__(self, key, value)`: Defines behavior for item assignment (`self[key] = value`).
- `__delitem__(self, key)`: Defines behavior for item deletion (`del self[key]`).
**Example** :
```python
class CustomList:
    def __init__(self, *args):
        self.items = list(args)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]

    def __setitem__(self, index, value):
        self.items[index] = value

    def __delitem__(self, index):
        del self.items[index]

    def __repr__(self):
        return f"CustomList({self.items})"

cl = CustomList(1, 2, 3)
print(len(cl))
print(cl[1])
cl[1] = 5
print(cl)
del cl[1]
print(cl)
```
**Output** :
```python
3
2
CustomList([1, 5, 3])
CustomList([1, 3])
```
Magic methods provide powerful ways to customize the behavior of your objects and make them work seamlessly with Python's syntax and built-in functions.
Use them judiciously to enhance the functionality and readability of your classes.

View file

@@ -0,0 +1,231 @@
# Binary Tree
A binary tree is a non-linear data structure in which each node can have at most two children, known as the left and the right child. It is a hierarchical data structure represented in the following way:
```
        A  ................ Level 0
       / \
      B   C  .............. Level 1
     / \   \
    D   E   G  ............ Level 2
```
## Basic Terminologies
- **Root node:** The topmost node in a tree is the root node. The root node does not have any parent. In the above example, **A** is the root node.
- **Parent node:** The predecessor of a node is called the parent of that node. **A** is the parent of **B** and **C**, **B** is the parent of **D** and **E** and **C** is the parent of **G**.
- **Child node:** The successor of a node is called the child of that node. **B** and **C** are children of **A**, **D** and **E** are children of **B** and **G** is the right child of **C**.
- **Leaf node:** Nodes without any children are called the leaf nodes. **D**, **E** and **G** are the leaf nodes.
- **Ancestor node:** Predecessor nodes on the path from the root to that node are called ancestor nodes. **A** and **B** are the ancestors of **E**.
- **Descendant node:** Successor nodes on the path from the root to that node are called descendant nodes. **B** and **E** are descendants of **A**.
- **Sibling node:** Nodes having the same parent are called sibling nodes. **B** and **C** are sibling nodes and so are **D** and **E**.
- **Level (Depth) of a node:** The number of edges on the path from the root to that node is the level of that node. The root node is always at level 0. The depth of the deepest node is the depth of the tree.
- **Height of a node:** The number of edges on the longest path from that node down to a leaf is the height of that node. The height of the root is the height of the tree. The height of node **A** is 2, of nodes **B** and **C** is 1, and of nodes **D**, **E** and **G** is 0 (see the sketch below).
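Level and height can be computed directly from these definitions. Below is a minimal sketch, assuming a `Node` class with `left` and `right` attributes like the one implemented later in this section:
```python
def height(node):
    # Height: number of edges on the longest path from node down to a leaf.
    if node is None:
        return -1  # conventional height of an empty subtree
    return 1 + max(height(node.left), height(node.right))

def depth(root, target, level=0):
    # Depth (level): number of edges from the root down to the target node.
    if root is None:
        return -1  # target not found in this subtree
    if root is target:
        return level
    d = depth(root.left, target, level + 1)
    return d if d != -1 else depth(root.right, target, level + 1)
```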
## Types Of Binary Trees
- **Full Binary Tree:** A binary tree where each node has 0 or 2 children is a full binary tree.
```
      A
     / \
    B   C
   / \
  D   E
```
- **Complete Binary Tree:** A binary tree in which all levels are completely filled except the last level is a complete binary tree. Whenever new nodes are inserted, they are inserted from the left side.
```
       A
      / \
     /   \
    B     C
   / \   /
  D   E F
```
- **Perfect Binary Tree:** A binary tree in which every internal node has exactly two children and all leaf nodes are at the same level is called a perfect binary tree.
```
       A
      / \
     /   \
    B     C
   / \   / \
  D   E F   G
```
- **Skewed Binary Tree:** A binary tree in which each node has either 0 or 1 child is called a skewed binary tree. It is of two types - left skewed binary tree and right skewed binary tree.
```
  A                          A
   \                        /
    B                      B
     \                    /
      C                  C

Right skewed binary tree    Left skewed binary tree
```
- **Balanced Binary Tree:** A binary tree in which the height difference between the left and right subtree is not more than one and the subtrees are also balanced is a balanced binary tree.
```
      A
     / \
    B   C
   / \
  D   E
```
## Real Life Applications Of Binary Tree
- **File Systems:** File systems employ binary trees to organize the folders and files, facilitating efficient search and access of files.
- **Decision Trees:** Decision tree, a supervised learning algorithm, utilizes binary trees, with each node representing a decision and its edges showing the possible outcomes.
- **Routing Algorithms:** In routing algorithms, binary trees are used to efficiently transfer data packets from the source to destination through a network of nodes.
- **Searching and Sorting Algorithms:** Searching algorithms like binary search and sorting algorithms like heapsort rely heavily on binary trees.
## Implementation of Binary Tree
```python
from collections import deque

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

class BinaryTree:
    @staticmethod
    def insert(root, data):
        # Level-order insertion: the new node fills the first vacant child slot.
        if root is None:
            return Node(data)
        q = deque([root])
        while q:
            temp = q.popleft()
            if temp.left is None:
                temp.left = Node(data)
                break
            q.append(temp.left)
            if temp.right is None:
                temp.right = Node(data)
                break
            q.append(temp.right)
        return root

    @staticmethod
    def inorder(root):
        if not root:
            return
        BinaryTree.inorder(root.left)
        print(root.data, end=" ")
        BinaryTree.inorder(root.right)

    @staticmethod
    def preorder(root):
        if not root:
            return
        print(root.data, end=" ")
        BinaryTree.preorder(root.left)
        BinaryTree.preorder(root.right)

    @staticmethod
    def postorder(root):
        if not root:
            return
        BinaryTree.postorder(root.left)
        BinaryTree.postorder(root.right)
        print(root.data, end=" ")

    @staticmethod
    def levelorder(root):
        if not root:
            return
        q = deque([root])
        while q:
            temp = q.popleft()
            print(temp.data, end=" ")
            if temp.left is not None:
                q.append(temp.left)
            if temp.right is not None:
                q.append(temp.right)

    @staticmethod
    def delete_node(root, node):
        # Detach the given node (here, the deepest node) from the tree.
        q = deque([root])
        while q:
            temp = q.popleft()
            if temp is node:
                temp = None
                return
            if temp.right:
                if temp.right is node:
                    temp.right = None
                    return
                q.append(temp.right)
            if temp.left:
                if temp.left is node:
                    temp.left = None
                    return
                q.append(temp.left)

    @staticmethod
    def delete_value(root, value):
        # Replace the target node's data with the deepest node's data,
        # then remove the deepest node.
        if root is None:
            return None
        if root.left is None and root.right is None:
            return None if root.data == value else root
        target = None
        q = deque([root])
        temp = None
        while q:
            temp = q.popleft()
            if temp.data == value:
                target = temp
            if temp.left:
                q.append(temp.left)
            if temp.right:
                q.append(temp.right)
        if target:
            target.data = temp.data
            BinaryTree.delete_node(root, temp)
        return root

root = None
for value in (10, 20, 30, 40, 50, 60):
    root = BinaryTree.insert(root, value)
print("Preorder traversal:", end=" ")
BinaryTree.preorder(root)
print("\nInorder traversal:", end=" ")
BinaryTree.inorder(root)
print("\nPostorder traversal:", end=" ")
BinaryTree.postorder(root)
print("\nLevel order traversal:", end=" ")
BinaryTree.levelorder(root)
root = BinaryTree.delete_value(root, 20)
print("\nLevel order traversal after deletion:", end=" ")
BinaryTree.levelorder(root)
```
#### OUTPUT
```
Preorder traversal: 10 20 40 50 30 60
Inorder traversal: 40 20 50 10 60 30
Postorder traversal: 40 50 20 60 30 10
Level order traversal: 10 20 30 40 50 60
Level order traversal after deletion: 10 60 30 40 50
```

View file

@@ -0,0 +1,212 @@
# Data Structures: Hash Tables, Hash Sets, and Hash Maps
## Table of Contents
- [Introduction](#introduction)
- [Hash Tables](#hash-tables)
- [Overview](#overview)
- [Operations](#operations)
- [Hash Sets](#hash-sets)
- [Overview](#overview-1)
- [Operations](#operations-1)
- [Hash Maps](#hash-maps)
- [Overview](#overview-2)
- [Operations](#operations-2)
- [Conclusion](#conclusion)
## Introduction
This document provides an overview of three fundamental data structures in computer science: hash tables, hash sets, and hash maps. These structures are widely used for efficient data storage and retrieval operations.
## Hash Tables
### Overview
A **hash table** is a data structure that stores key-value pairs. It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
### Operations
1. **Insertion**: Add a new key-value pair to the hash table.
2. **Deletion**: Remove a key-value pair from the hash table.
3. **Search**: Find the value associated with a given key.
4. **Update**: Modify the value associated with a given key.
**Example Code (Python):**
```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None

class HashTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.size = 0
        self.table = [None] * capacity

    def _hash(self, key):
        return hash(key) % self.capacity

    def insert(self, key, value):
        index = self._hash(key)
        if self.table[index] is None:
            self.table[index] = Node(key, value)
            self.size += 1
        else:
            # Walk the chain: update in place if the key exists,
            # otherwise prepend a new node.
            current = self.table[index]
            while current:
                if current.key == key:
                    current.value = value
                    return
                current = current.next
            new_node = Node(key, value)
            new_node.next = self.table[index]
            self.table[index] = new_node
            self.size += 1

    def search(self, key):
        index = self._hash(key)
        current = self.table[index]
        while current:
            if current.key == key:
                return current.value
            current = current.next
        raise KeyError(key)

    def remove(self, key):
        index = self._hash(key)
        previous = None
        current = self.table[index]
        while current:
            if current.key == key:
                if previous:
                    previous.next = current.next
                else:
                    self.table[index] = current.next
                self.size -= 1
                return
            previous = current
            current = current.next
        raise KeyError(key)

    def __len__(self):
        return self.size

    def __contains__(self, key):
        try:
            self.search(key)
            return True
        except KeyError:
            return False

# Driver code
if __name__ == '__main__':
    ht = HashTable(5)
    ht.insert("apple", 3)
    ht.insert("banana", 2)
    ht.insert("cherry", 5)
    print("apple" in ht)        # True
    print("durian" in ht)       # False
    print(ht.search("banana"))  # 2
    ht.insert("banana", 4)
    print(ht.search("banana"))  # 4
    ht.remove("apple")
    print(len(ht))              # 2
```
## Hash Sets
### Overview
A **hash set** is a collection of unique elements. It is typically implemented on top of a hash table that stores only keys, with no associated values.
### Operations
1. **Insertion**: Add a new element to the set.
2. **Deletion**: Remove an element from the set.
3. **Search**: Check if an element exists in the set.
4. **Union**: Combine two sets to form a new set with elements from both.
5. **Intersection**: Find common elements between two sets.
6. **Difference**: Find elements present in one set but not in the other.
**Example Code (Python):**
```python
# Create a hash set
hash_set = set()
# Insert elements
hash_set.add("element1")
hash_set.add("element2")
# Search for an element
exists = "element1" in hash_set
# Delete an element
hash_set.remove("element2")
# Union of sets
another_set = {"element3", "element4"}
union_set = hash_set.union(another_set)
# Intersection of sets
intersection_set = hash_set.intersection(another_set)
# Difference of sets
difference_set = hash_set.difference(another_set)
```
## Hash Maps
### Overview
A **hash map** is similar to a hash table but often provides additional functionalities and more user-friendly interfaces for developers. It is a collection of key-value pairs where each key is unique.
### Operations
1. **Insertion**: Add a new key-value pair to the hash map.
2. **Deletion**: Remove a key-value pair from the hash map.
3. **Search**: Retrieve the value associated with a given key.
4. **Update**: Change the value associated with a given key.
**Example Code (Python):**
```python
# Create a hash map
hash_map = {}
# Insert elements
hash_map["key1"] = "value1"
hash_map["key2"] = "value2"
# Search for an element
value = hash_map.get("key1")
# Delete an element
del hash_map["key2"]
# Update an element
hash_map["key1"] = "new_value1"
```
## Conclusion
Hash tables, hash sets, and hash maps are powerful data structures that provide efficient means of storing and retrieving data. Understanding these structures and their operations is crucial for developing optimized algorithms and applications.

View file

@@ -17,6 +17,8 @@
- [Hashing through Linear Probing](hashing-linear-probing.md)
- [Hashing through Chaining](hashing-chaining.md)
- [Heaps](heaps.md)
- [Hash Tables, Sets, Maps](hash-tables.md)
- [Binary Tree](binary-tree.md)
- [AVL Trees](avl-trees.md)
- [Splay Trees](splay-trees.md)
- [Dijkstra's Algorithm](dijkstra.md)

View file

@@ -21,3 +21,5 @@
- [Transformers](transformers.md)
- [K-Means](kmeans.md)
- [K-nearest neighbor (KNN)](knn.md)
- [Naive Bayes](naive-bayes.md)
- [Neural network regression](neural-network-regression.md)

View file

@@ -0,0 +1,369 @@
# Naive Bayes
## Introduction
The Naive Bayes model uses probabilities to predict an outcome. It is a supervised machine learning technique, i.e. it requires labelled data for training. It is used for classification and is based on Bayes' Theorem. The basic assumption of this model is the independence of the features, i.e. each feature is unaffected by any other feature.
## Bayes' Theorem
Bayes' theorem is given by:
$$
P(a|b) = \frac{P(b|a)*P(a)}{P(b)}
$$
where:
- $P(a|b)$ is the posterior probability, i.e. probability of 'a' given that 'b' is true,
- $P(b|a)$ is the likelihood probability i.e. probability of 'b' given that 'a' is true,
- $P(a)$ and $P(b)$ are the probabilities of 'a' and 'b' respectively, independent of each other.
## Applications
The Naive Bayes classifier has numerous applications, including:
1. Text classification.
2. Sentiment analysis.
3. Spam filtering.
4. Multiclass classification (e.g. weather prediction).
5. Recommendation systems.
6. Healthcare.
7. Document categorization.
## Advantages
1. Easy to implement.
2. Useful even when the training dataset is limited (where a decision tree would not be recommended).
3. Supports multiclass classification which is not supported by some machine learning algorithms like SVM and logistic regression.
4. Scalable, fast and efficient.
## Disadvantages
1. Assumes features to be independent, which may not be true in certain scenarios.
2. Zero probability error.
3. Sensitive to noise.
## Zero Probability Error
A zero probability error occurs when the count of some event given another event is zero, which makes the corresponding likelihood zero.
To handle zero probability errors, Laplace's correction adds a small constant to every count.
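In its standard form, Laplace smoothing adds a constant $\alpha$ (usually 1) to every count:
$$
P(b|a) = \frac{count(b,a) + \alpha}{count(a) + \alpha k}
$$
where $k$ is the number of distinct values the feature 'b' can take. The worked example below applies the same idea to counts scaled up to 1000 instances.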
**Example:**
Given the data below, find whether tennis can be played if (outlook = overcast, wind = weak).
**Data**
---
| SNo | Outlook (A) | Wind (B) | PlayTennis (R) |
|-----|--------------|------------|-------------------|
| 1 | Rain | Weak | No |
| 2 | Rain | Strong | No |
| 3 | Overcast | Weak | Yes |
| 4 | Rain | Weak | Yes |
| 5 | Overcast | Weak | Yes |
| 6 | Rain | Strong | No |
| 7 | Overcast | Strong | Yes |
| 8 | Rain | Weak | No |
| 9 | Overcast | Weak | Yes |
| 10 | Rain | Weak | Yes |
---
- **Calculate prior probabilities**
$$
P(Yes) = \frac{6}{10} = 0.6
$$
$$
P(No) = \frac{4}{10} = 0.4
$$
- **Calculate likelihoods**
1. **Outlook (A):**
---
| A\R | Yes | No |
|-----------|-------|-----|
| Rain | 2 | 4 |
| Overcast | 4 | 0 |
| Total | 6 | 4 |
---
- Rain:
$$
P(Rain|Yes) = \frac{2}{6}
$$
$$
P(Rain|No) = \frac{4}{4}
$$
- Overcast:
$$
P(Overcast|Yes) = \frac{4}{6}
$$
$$
P(Overcast|No) = \frac{0}{4}
$$
Here, we can see that
$$
P(Overcast|No) = 0
$$
This is a zero probability error!
Since one likelihood is 0, the entire product for the "No" class becomes 0, and the Naive Bayes model cannot make a meaningful prediction.
**Applying Laplace's correction:**
Here, we scale the counts up to 1000 instances and then add 1 to each zero-count cell.
- **Calculate prior probabilities**
$$
P(Yes) = \frac{600}{1002}
$$
$$
P(No) = \frac{402}{1002}
$$
- **Calculate likelihoods**
1. **Outlook (A):**
(converted to 1000 instances)
We add 1 instance to each cell of the "No" column (Laplace's correction):
---
| A\R | Yes | No |
|-----------|-------|---------------|
| Rain | 200 | (400+1)=401 |
| Overcast | 400 | (0+1)=1 |
| Total | 600 | 402 |
---
- **Rain:**
$$
P(Rain|Yes) = \frac{200}{600}
$$
$$
P(Rain|No) = \frac{401}{402}
$$
- **Overcast:**
$$
P(Overcast|Yes) = \frac{400}{600}
$$
$$
P(Overcast|No) = \frac{1}{402}
$$
2. **Wind (B):**
---
| B\R | Yes | No |
|-----------|---------|-------|
| Weak | 500 | 200 |
| Strong | 100 | 200 |
| Total | 600 | 400 |
---
- **Weak:**
$$
P(Weak|Yes) = \frac{500}{600}
$$
$$
P(Weak|No) = \frac{200}{400}
$$
- **Strong:**
$$
P(Strong|Yes) = \frac{100}{600}
$$
$$
P(Strong|No) = \frac{200}{400}
$$
- **Calculating the posteriors:**
$$
P(Yes|Overcast,Weak) \propto P(Yes) * P(Overcast|Yes) * P(Weak|Yes)
$$
$$
= \frac{600}{1002} * \frac{400}{600} * \frac{500}{600}
$$
$$
\approx 0.3327
$$
$$
P(No|Overcast,Weak) \propto P(No) * P(Overcast|No) * P(Weak|No)
$$
$$
= \frac{402}{1002} * \frac{1}{402} * \frac{200}{400}
$$
$$
\approx 0.0005
$$
Since
$$
P(Yes|Overcast,Weak) > P(No|Overcast,Weak)
$$
we can conclude that tennis can be played if outlook is overcast and wind is weak.
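The arithmetic above is easy to verify with a short script. A minimal sketch, hard-coding the smoothed counts from the tables:
```python
# Smoothed counts taken from the 1000-instance tables above
p_yes, p_no = 600 / 1002, 402 / 1002
p_overcast_yes, p_overcast_no = 400 / 600, 1 / 402
p_weak_yes, p_weak_no = 500 / 600, 200 / 400

score_yes = p_yes * p_overcast_yes * p_weak_yes  # ~ 0.3327
score_no = p_no * p_overcast_no * p_weak_no      # ~ 0.0005
print("Play tennis?", "Yes" if score_yes > score_no else "No")  # Yes
```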
# Types of Naive Bayes classifier
## Gaussian Naive Bayes
It is used when the dataset has **continuous data**. It assumes that the data is normally distributed (also known as a Gaussian distribution).
A Gaussian distribution is characterized by a bell-shaped curve.
**Continuous data features:** Features which can take any real value within a certain range. These features have an infinite number of possible values and are generally measured, not counted,
e.g. weight, height, temperature, etc.
**Code**
```python
# import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn import metrics
from sklearn.metrics import confusion_matrix

# read data (columns 1-6 are features, column 7 is the label)
df = pd.read_csv("data.csv")
X = df.iloc[:, 1:7]
y = df.iloc[:, 7]

# split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# train the model on the training set
obj = GaussianNB()
obj.fit(X_train, y_train)

# make predictions on the testing set
y_pred = obj.predict(X_test)

# compare y_test and y_pred
print("Gaussian Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```
## Multinomial Naive Bayes
Appropriate when the features are categorical or countable. It models the likelihood of each feature as a multinomial distribution.
Multinomial distribution is used to find probabilities of each category, given multiple categories (eg. Text classification).
**Code**
```python
# import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics
from sklearn.metrics import confusion_matrix

# read data (columns 1-6 are features, column 7 is the label)
df = pd.read_csv("data.csv")
X = df.iloc[:, 1:7]
y = df.iloc[:, 7]

# split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# train the model on the training set
obj = MultinomialNB()
obj.fit(X_train, y_train)

# make predictions on the testing set
y_pred = obj.predict(X_test)

# compare y_test and y_pred
print("Multinomial Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```
## Bernoulli Naive Bayes
It is specifically designed for binary features (eg. Yes or No). It models the likelihood of each feature as a Bernoulli distribution.
Bernoulli distribution is used when there are only two possible outcomes (eg. success or failure of an event).
**Code**
```python
# import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn import metrics
from sklearn.metrics import confusion_matrix

# read data (columns 1-6 are features, column 7 is the label)
df = pd.read_csv("data.csv")
X = df.iloc[:, 1:7]
y = df.iloc[:, 7]

# split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# train the model on the training set
obj = BernoulliNB()
obj.fit(X_train, y_train)

# make predictions on the testing set
y_pred = obj.predict(X_test)

# compare y_test and y_pred
print("Bernoulli Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```
## Evaluation
1. Confusion matrix.
2. Accuracy.
3. ROC curve (sketched below).
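The scripts above already report accuracy and the confusion matrix. The ROC curve applies to binary targets; here is a minimal sketch, assuming the fitted classifier `obj` and the train/test split from the scripts above, with a binary target encoded as 0/1:
```python
from sklearn.metrics import roc_curve, auc

# Probability of the positive class for each test sample
scores = obj.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, scores)
print("Area under the ROC curve:", auc(fpr, tpr))
```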
## Conclusion
We can conclude that Naive Bayes can be limited by its assumption that the features are independent of one another, but it remains reliable in many cases. Naive Bayes is an efficient classifier and works even on small datasets.

View file

@@ -0,0 +1,84 @@
# Neural Network Regression in Python using Scikit-learn
## Overview
Neural Network Regression is used to predict continuous values based on input features. Scikit-learn provides an easy-to-use interface for implementing neural network models, specifically through the `MLPRegressor` class, which stands for Multi-Layer Perceptron Regressor.
## When to Use Neural Network Regression
### Suitable Scenarios
1. **Complex Relationships**: Ideal when the relationship between features and the target variable is complex and non-linear.
2. **Sufficient Data**: Works well with large datasets that can support training deep learning models.
3. **Feature Extraction**: Useful in cases where the neural network's feature extraction capabilities can be leveraged, such as with image or text data.
### Unsuitable Scenarios
1. **Small Datasets**: Less effective with small datasets due to overfitting and inability to learn complex patterns.
2. **Low-latency Predictions**: Might not be suitable for real-time applications with strict latency requirements.
3. **Interpretability**: Not ideal when model interpretability is crucial, as neural networks are often seen as "black-box" models.
## Implementing Neural Network Regression in Python with Scikit-learn
### Step-by-Step Implementation
1. **Import Libraries**
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error
```
2. **Load and Prepare Data**
For illustration, let's use a synthetic dataset.
```python
# Generate synthetic data
np.random.seed(42)
X = np.random.rand(1000, 3)
y = X[:, 0] * 3 + X[:, 1] * -2 + X[:, 2] * 0.5 + np.random.randn(1000) * 0.1
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```
3. **Build and Train the Neural Network Model**
```python
# Create the MLPRegressor model
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), activation='relu', solver='adam', max_iter=500, random_state=42)
# Train the model
mlp.fit(X_train, y_train)
```
4. **Evaluate the Model**
```python
# Make predictions
y_pred = mlp.predict(X_test)
# Calculate the Mean Absolute Error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test Mean Absolute Error: {mae}")
```
### Explanation
- **Data Generation and Preparation**: Synthetic data is created, split into training and test sets, and standardized to improve the efficiency of the neural network training process.
- **Model Construction and Training**: An `MLPRegressor` is created with two hidden layers, each containing 64 neurons and ReLU activation functions. The model is trained using the Adam optimizer for a maximum of 500 iterations.
- **Evaluation**: The model's performance is evaluated on the test set using Mean Absolute Error (MAE) as the performance metric.
## Conclusion
Neural Network Regression with Scikit-learn's `MLPRegressor` is a powerful method for predicting continuous values in complex, non-linear scenarios. However, it's essential to ensure that you have enough data to train the model effectively and consider the computational resources required. Simpler models may be more appropriate for small datasets or when model interpretability is necessary. By following the steps outlined, you can build, train, and evaluate a neural network for regression tasks in Python using Scikit-learn.

View file

@@ -11,3 +11,4 @@
- [Sorting NumPy Arrays](sorting-array.md)
- [NumPy Array Iteration](array-iteration.md)
- [Concatenation of Arrays](concatenation-of-arrays.md)
- [Universal Functions (Ufunc)](universal-functions.md)

View file

@@ -0,0 +1,130 @@
# Universal functions (ufunc)
---
A `ufunc`, short for "`universal function`," is a fundamental concept in NumPy, a powerful library for numerical computing in Python. Universal functions are highly optimized, element-wise functions designed to perform operations on data stored in NumPy arrays.
## Uses of Ufuncs in NumPy
Universal functions (ufuncs) in NumPy provide a wide range of functionalities for efficient and powerful numerical computations. Below is a detailed explanation of their uses:
### 1. **Element-wise Operations**
Ufuncs perform operations on each element of the arrays independently.
```python
import numpy as np
A = np.array([1, 2, 3, 4])
B = np.array([5, 6, 7, 8])
# Element-wise addition
np.add(A, B) # Output: array([ 6, 8, 10, 12])
```
### 2. **Broadcasting**
Ufuncs support broadcasting, allowing operations on arrays with different shapes, making it possible to perform operations without explicitly reshaping arrays.
```python
C = np.array([1, 2, 3])
D = np.array([[1], [2], [3]])
# Broadcasting addition
np.add(C, D) # Output: array([[2, 3, 4], [3, 4, 5], [4, 5, 6]])
```
### 3. **Vectorization**
Ufuncs are vectorized, meaning they are implemented in low-level C code, allowing for fast execution and avoiding the overhead of Python loops.
```python
# Vectorized square root
np.sqrt(A) # Output: array([1., 1.41421356, 1.73205081, 2.])
```
### 4. **Type Flexibility**
Ufuncs handle various data types and perform automatic type casting as needed.
```python
E = np.array([1.0, 2.0, 3.0])
F = np.array([4, 5, 6])
# Addition with type casting
np.add(E, F) # Output: array([5., 7., 9.])
```
### 5. **Reduction Operations**
Ufuncs support reduction operations, such as summing all elements of an array or finding the product of all elements.
```python
# Summing all elements
np.add.reduce(A) # Output: 10
# Product of all elements
np.multiply.reduce(A) # Output: 24
```
### 6. **Accumulation Operations**
Ufuncs can perform accumulation operations, which keep a running tally of the computation.
```python
# Cumulative sum
np.add.accumulate(A) # Output: array([ 1, 3, 6, 10])
```
### 7. **Reduceat Operations**
Ufuncs can perform segmented reductions using the `reduceat` method, which applies the ufunc at specified intervals.
```python
G = np.array([0, 1, 2, 3, 4, 5, 6, 7])
indices = [0, 2, 5]
np.add.reduceat(G, indices) # Output: array([ 1, 9, 18])
```
### 8. **Outer Product**
Ufuncs can compute the outer product of two arrays, producing a matrix where each element is the result of applying the ufunc to each pair of elements from the input arrays.
```python
# Outer product
np.multiply.outer([1, 2, 3], [4, 5, 6])
# Output: array([[ 4, 5, 6],
# [ 8, 10, 12],
# [12, 15, 18]])
```
### 9. **Out Parameter**
Ufuncs can use the `out` parameter to store results in a pre-allocated array, saving memory and improving performance.
```python
result = np.empty_like(A)
np.multiply(A, B, out=result) # Output: array([ 5, 12, 21, 32])
```
## Create Your Own Ufunc
You can create custom ufuncs for specific needs using `np.frompyfunc` or `np.vectorize`, which allow plain Python functions to behave like ufuncs.
Here, we use `frompyfunc()`, which takes three arguments:
1. function - the Python function to wrap.
2. inputs - the number of input arrays.
3. outputs - the number of output arrays.
```python
def my_add(x, y):
    return x + y

my_add_ufunc = np.frompyfunc(my_add, 2, 1)
my_add_ufunc(A, B)  # Output: array([6, 8, 10, 12], dtype=object)
```
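`np.vectorize` offers a similar wrapper; unlike `frompyfunc`, it lets you specify the output dtype so the result is not `object`-typed. A quick sketch, reusing `my_add`, `A` and `B` from above:
```python
my_add_vec = np.vectorize(my_add, otypes=[np.int64])
my_add_vec(A, B)  # Output: array([ 6,  8, 10, 12])
```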
## Some Common Ufuncs
Here are some commonly used ufuncs in NumPy:
- **Arithmetic**: `np.add`, `np.subtract`, `np.multiply`, `np.divide`
- **Trigonometric**: `np.sin`, `np.cos`, `np.tan`
- **Exponential and Logarithmic**: `np.exp`, `np.log`, `np.log10`
- **Comparison**: `np.maximum`, `np.minimum`, `np.greater`, `np.less`
- **Logical**: `np.logical_and`, `np.logical_or`, `np.logical_not`
For more ufuncs, refer to [Universal functions (ufunc) — NumPy](https://numpy.org/doc/stable/reference/ufuncs.html)

View file

@@ -1,4 +1,5 @@
# List of sections
- [Installation of Scipy and its key uses](installation_features.md)
- [SciPy Graphs](scipy-graphs.md)

View file

@@ -0,0 +1,165 @@
# SciPy Graphs
Graphs are a type of data structure; SciPy provides the `scipy.sparse.csgraph` module for working with them.
## Adjacency Matrix
An adjacency matrix is a way of representing a graph using a square matrix. The element at the i-th row and j-th column indicates whether there is an edge from vertex i to vertex j.
```python
import numpy as np
from scipy.sparse import csr_matrix
adj_matrix = np.array([
[0, 1, 0, 0],
[1, 0, 1, 0],
[0, 1, 0, 1],
[0, 0, 1, 0]
])
sparse_matrix = csr_matrix(adj_matrix)
print(sparse_matrix)
```
In this example:
1. The graph has 4 nodes.
2. There is an edge between node 0 and node 1, node 1 and node 2, and node 2 and node 3.
3. The `csr_matrix` function converts the dense adjacency matrix into compressed sparse row (CSR) format, which is efficient for storing large, sparse matrices.
## Floyd Warshall
The Floyd-Warshall algorithm is a classic algorithm used to find the shortest paths between all pairs of nodes in a weighted graph.
```python
import numpy as np
from scipy.sparse.csgraph import floyd_warshall
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(floyd_warshall(newarr, return_predecessors=True))
```
#### Output
```
(array([[0., 1., 2.],
[1., 0., 3.],
[2., 3., 0.]]), array([[-9999, 0, 0],
[ 1, -9999, 0],
[ 2, 0, -9999]], dtype=int32))
```
## Dijkstra
Dijkstra's algorithm is used to find the shortest path from a source node to all other nodes in a graph with non-negative edge weights.
```python
import numpy as np
from scipy.sparse.csgraph import dijkstra
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(dijkstra(newarr, return_predecessors=True, indices=0))
```
#### Output
```
(array([ 0., 1., 2.]), array([-9999, 0, 0], dtype=int32))
```
## Bellman Ford
The Bellman-Ford algorithm is used to find the shortest path from a single source vertex to all other vertices in a weighted graph. It can handle graphs with negative weights, and it also detects negative weight cycles.
```python
import numpy as np
from scipy.sparse.csgraph import bellman_ford
from scipy.sparse import csr_matrix
arr = np.array([
[0, -1, 2],
[1, 0, 0],
[2, 0, 0]
])
newarr = csr_matrix(arr)
print(bellman_ford(newarr, return_predecessors=True, indices=0))
```
#### Output
```
(array([ 0., -1., 2.]), array([-9999, 0, 0], dtype=int32))
```
## Depth First Order
Depth-First Search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root and explores as far as possible along each branch before backtracking.
```python
import numpy as np
from scipy.sparse.csgraph import depth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 0, 1],
[1, 1, 1, 1],
[2, 1, 1, 0],
[0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(depth_first_order(newarr, 1))
```
#### Output
```
(array([1, 0, 3, 2], dtype=int32), array([ 1, -9999, 1, 0], dtype=int32))
```
## Breadth First Order
Breadth-First Search (BFS) is an algorithm for traversing or searching tree or graph data structures. It starts at the root and explores all nodes at the present depth level before moving on to nodes at the next depth level.
```python
import numpy as np
from scipy.sparse.csgraph import breadth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[0, 1, 0, 1],
[1, 1, 1, 1],
[2, 1, 1, 0],
[0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(breadth_first_order(newarr, 1))
```
#### Output
```
(array([1, 0, 2, 3], dtype=int32), array([ 1, -9999, 1, 1], dtype=int32))
```