Merge branch 'main' into main

pull/583/head
Ankit Mahato 2024-05-27 08:42:22 +05:30 verified by GitHub
commit ce2f710835
No key matching this signature was found in the database
GPG key ID: B5690EEEBB952194
14 changed files with 1206 additions and 0 deletions

Binary file not shown. After: Width | Height | Size: 136 KiB

Binary file not shown. After: Width | Height | Size: 35 KiB


@ -0,0 +1,289 @@
# FastAPI
## Table of Contents
- [Introduction](#introduction)
- [Features](#features)
- [Installation](#installation)
- [Making First API](#making-first-api)
- [GET Method](#get-method)
- [Running Server and calling API](#running-server-and-calling-api)
- [Path Parameters](#path-parameters)
- [Query Parameters](#query-parameters)
- [POST Method](#post-method)
- [PUT Method](#put-method)
- [Additional Content](#additional-content)
- [Swagger UI](#swagger-ui)
## Introduction
FastAPI is a modern web framework for building APIs with Python.
It requires Python 3.7+.
## Features
1. **Speed ⚡:** FastAPI is built on top of Starlette, a lightweight ASGI framework. It's designed for high performance and handles thousands of requests per second.
2. **Easy to use 😃:** FastAPI is designed to be intuitive and easy to use, especially for developers familiar with Python. It uses standard Python type hints for request and response validation, making it easy to understand and write code.
3. **Automatic Interactive API Documentation generation 🤩:** FastAPI automatically generates interactive API documentation (Swagger UI or ReDoc) based on your code and type annotations. Swagger UI also allows you to test API endpoints.
4. **Asynchronous Support 🔁:** FastAPI fully supports asynchronous programming, allowing you to write asynchronous code with async/await syntax. This enables handling high-concurrency scenarios and improves overall performance.
Now, let's get hands-on with FastAPI.
## Installation
Make sure you have Python version 3.7 or greater.
Then open your command shell and run the following command:
```bash
pip install fastapi
```
After this, you need to install Uvicorn, the ASGI server on which we will run our API.
```bash
pip install uvicorn
```
## Making First API
After successful installation, let's build an API and see how to use it.
The first thing in an API is its root/index page, which is sent as the response when the API is called.
Follow the given steps to make your first FastAPI app 🫨
First, let's import FastAPI to get things started.
```python
from fastapi import FastAPI
app = FastAPI()
```
Now, we will write the ``GET`` method for the root of the API. As you have already seen, GET is the ``HTTP`` request method used to fetch data from a source; in web development, it is primarily used to *retrieve data* from a server.
The root of the app is ``"/"``. When the API is called, the response will be served at this URL: ``localhost:8000``
### GET method
Following is the code for the ``GET`` method on the root path.
When the API is called, the ``read_root()`` function is hit and a JSON response is returned, which will be shown in your web browser.
```python
@app.get("/")
def read_root():
    return {"Hello": "World"}
```
Tadaaa! You have made your first FastAPI app! Now let's run it!
### Running Server and calling API
Open your terminal and run the following command:
```bash
uvicorn myapi:app --reload
```
Here, ``myapi`` is the name of your Python file, and ``app`` is the name you gave your API in the assignment ``app = FastAPI()``.
After running this command, the Uvicorn server will be live and you can access your API.
Since we have only written the root ``GET`` method so far, only its corresponding response will be displayed.
On running this API, we get the response in JSON form:
```json
{
"Hello": "World"
}
```
## Path Parameters
Path parameters are a way to send variables to an API endpoint so that an operation may be performed on them.
This feature is particularly useful for defining routes that need to operate on resources identified by unique identifiers, such as user IDs, product IDs, or any other unique value.
### Example
Let's take an example to make this easier to understand.
Assume that we have some students 🧑‍🎓 in our class and we have saved their data as a dictionary in our API (in practical scenarios the data would live in a database that the API queries).
So we have a students dictionary that looks something like this:
```python
students = {
    1: {
        "name": "John",
        "age": 17,
        "class": "year 12"
    },
    2: {
        "name": "Jane",
        "age": 16,
        "class": "year 11"
    },
    3: {
        "name": "Alice",
        "age": 17,
        "class": "year 12"
    }
}
```
Here, the keys are ``student_id`` values.
Let's say the user wants the data of the student whose ID is 2. We will take the ID as a **path parameter** from the user and return the data for that ID.
Let's see how it's done!
```python
@app.get("/students/{student_id}")
def read_student(student_id: int):
return students[student_id]
```
Here is the explanatory breakdown of the method:
- ``/students`` is the URL of students endpoint in API.
- ``{student_id}`` is the path parameter, which is a dynamic variable the user will give to fetch the record of a particular student.
- ``def read_student(student_id: int)`` is the function signature, which receives the student_id from the path parameter. Its type is declared as ``int`` since our ID will be an integer.
**Note that the parameter is automatically type-checked. If it does not match the declared type, an error response ⛔ will be generated.**
- ``return students[student_id]`` will return the data of required student from dictionary.
When the user passes the URL ``http://127.0.0.1:8000/students/1`` the data of student with student_id=1 is fetched and displayed.
In this case, the following output will be displayed:
```json
{
"name": "John",
"age": 17,
"class": "year 12"
}
```
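The automatic type check mentioned above can be pictured with a plain-Python sketch (the function name here is hypothetical, not part of FastAPI): the raw path segment always arrives as a string, and FastAPI coerces it to the declared type, answering with a validation error when coercion fails.

```python
# Hypothetical sketch of the coercion FastAPI performs on a path parameter
# declared as `int`. Illustrative only, not FastAPI's actual code.
def convert_path_param(raw: str):
    try:
        return int(raw)  # "2" -> 2, coercion succeeds
    except ValueError:
        # FastAPI would respond with an HTTP 422 validation error here
        return {"detail": "value is not a valid integer"}

print(convert_path_param("2"))    # 2
print(convert_path_param("abc"))  # {'detail': 'value is not a valid integer'}
```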
## Query Parameters
Query parameters in FastAPI allow you to pass data to your API endpoints via the URL's query string. This is useful for filtering, searching, and other operations that do not fit well with the path parameters.
Query parameters are specified after the ``?`` symbol in the URL and are typically used for optional parameters.
### Example
Let's continue the students example to understand query parameters.
Assume that we want to search for students by name. In this case, we will send the name as a query parameter, which will be read by our method, and the matching result will be returned.
Let's see the method:
```python
@app.get("/get-by-name")
def read_student(name: str):
for student_id in students:
if students[student_id]["name"] == name:
return students[student_id]
return {"Error": "Student not found"}
```
Here is the explanatory breakdown of this process:
- ``/get-by-name`` is the URL of the endpoint. After this URL, the client appends the query parameter(s).
- ``http://127.0.0.1:8000/get-by-name?name=Jane`` In this URL, ``name=Jane`` is the query parameter, meaning the user wants the student whose name is Jane. When you hit this URL, the ``read_student(name: str)`` method is called and the corresponding response is returned.
In this case, the output will be:
```json
{
"name": "Jane",
"age": 16,
"class": "year 11"
}
```
If we pass a name that doesn't exist in the dictionary, the error response will be returned.
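The structure of such a URL can be inspected with Python's standard library, which is roughly what the server does when it reads the query string (a minimal sketch using `urllib.parse`):

```python
from urllib.parse import urlparse, parse_qs

url = "http://127.0.0.1:8000/get-by-name?name=Jane"
parsed = urlparse(url)
print(parsed.path)             # /get-by-name  (the endpoint)
print(parse_qs(parsed.query))  # {'name': ['Jane']}  (the query parameters)
```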
## POST Method
The ``POST`` method in FastAPI is used to **create resources** or submit data to an API endpoint. This method typically involves sending data in the request body, which the server processes to create or modify resources.
**⛔ With the ``GET`` method, the data sent is part of the URL, but with the ``POST`` method, the data sent is part of the request body.**
### Example
Continuing with the students example, let's assume we need to add a student. Following is the ``POST`` method to do this:
```python
@app.post("/create-student/{student_id}")
def create_student(student_id: int, student: dict):
    if student_id in students:
        return {"Error": "Student exists"}
    students[student_id] = student
    return students
```
Here is the explanation of process:
- ``/create-student/{student_id}`` shows that only student_id will be part of URL, rest of the data will be sent in request body.
- Data in the request body will be in JSON format and will be received in ``student: dict``
- Data sent in JSON format is given as:
```json
{
"name":"Seerat",
"age":22,
"class":"8 sem"
}
```
*Note:* I have used Swagger UI to send data in the request body to test my ``POST`` method, but you may use any other API testing tool like Postman.
- The new student will be added to the dictionary, and if the operation is successful, the updated dictionary will be returned as the response.
Following is the output of this ``POST`` method call:
```json
{
"1": {
"name": "John",
"age": 17,
"class": "year 12"
},
"2": {
"name": "Jane",
"age": 16,
"class": "year 11"
},
"3": {
"name": "Alice",
"age": 17,
"class": "year 12"
},
"4": {
"name": "Seerat",
"age": 22,
"class": "8 sem"
}
}
```
## PUT Method
The ``PUT`` method in FastAPI is used to **update** existing resources or create resources if they do not already exist. It is one of the standard HTTP methods and is idempotent, meaning that multiple identical requests should have the same effect as a single request.
### Example
Let's update the record of a student.
```python
@app.put("/update-student/{student_id}")
def update_student(student_id: int, student: dict):
    if student_id not in students:
        return {"Error": "Student does not exist"}
    students[student_id] = student
    return students
```
The ``PUT`` method is nearly the same as the ``POST`` method, but ``PUT`` is idempotent while ``POST`` is not.
The given method will update an existing student record; if the student doesn't exist, it sends an error response.
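Idempotence can be demonstrated with a small plain-Python sketch (a stand-in for the endpoint above, not actual FastAPI code): applying the same PUT-style update twice leaves the store in exactly the same state as applying it once.

```python
students = {1: {"name": "John", "age": 17, "class": "year 12"}}

def put_student(store, student_id, student):
    # PUT-style update: overwrite (or create) the record unconditionally
    store[student_id] = student
    return dict(store)

update = {"name": "Johnny", "age": 18, "class": "year 13"}
once = put_student(dict(students), 1, update)
twice = put_student(once, 1, update)
print(once == twice)  # True: identical requests, identical final state
```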
## Additional Content
### Swagger UI
Swagger UI automatically generates a UI for API testing. Just append ``/docs`` to the URL and Swagger UI will be launched.
The following screenshot shows the Swagger UI:
![App Screenshot](assets/image.png)
Here is how I tested the ``POST`` method in the UI:
![Screenshot](assets/image2.png)
That's all for FastAPI for now.... Happy Learning!


@ -1,3 +1,4 @@
# List of sections
- [API Methods](api-methods.md)
- [FastAPI](fast-api.md)


@ -8,3 +8,4 @@
- [Searching Algorithms](searching-algorithms.md)
- [Greedy Algorithms](greedy-algorithms.md)
- [Dynamic Programming](dynamic-programming.md)
- [Linked list](linked-list.md)


@ -0,0 +1,222 @@
# Linked List Data Structure
A linked list is a linear data structure that can be defined as a collection of objects called nodes that are randomly stored in memory.
A node contains two pieces of metadata: the data stored at that particular address and a pointer containing the address of the next node in memory.
The last element in a linked list has a null pointer.
## Why use linked list over array?
So far we have used the array data structure to organize a group of elements stored individually in memory.
However, arrays have advantages and disadvantages that should be known in order to decide which data structure to use in a program.
Limitations of arrays:
1. Before an array can be utilized in a program, its size must be established in advance.
2. Expanding an array's size is a lengthy process and is almost impossible to achieve during runtime.
3. Array elements must be stored in contiguous memory locations. To insert an element, all subsequent elements must be shifted.
So we introduce a new data structure to overcome these limitations.
A linked list is used because of:
1. Dynamic Memory Management: Linked lists allocate memory dynamically, meaning nodes can be located anywhere in memory and are connected through pointers, rather than being stored contiguously.
2. Adaptive Sizing: There is no need to predefine the size of a linked list. It can expand or contract during runtime, adapting to the program's requirements within the constraints of the available memory.
Let's code something.
The smallest unit: the Node
```python
class Node:
    def __init__(self, data):
        self.data = data  # Assigns the given data to the node
        self.next = None  # Initialize the next attribute to None
```
Now, we will see the types of linked list.
There are mainly four types of linked list:
1. Singly linked list
2. Doubly linked list
3. Circular linked list
4. Doubly circular linked list
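Of the four types listed, only the singly linked list is covered below. As a quick taste of the doubly linked variant, each node simply carries a second pointer to its predecessor (a minimal sketch, separate from the code that follows):

```python
class DoublyNode:
    def __init__(self, data):
        self.data = data
        self.prev = None  # extra pointer to the previous node
        self.next = None  # pointer to the next node, as in a singly linked list

# Link two nodes in both directions
a, b = DoublyNode(1), DoublyNode(2)
a.next, b.prev = b, a
print(b.prev.data)  # 1  (we can now also walk backwards)
```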
## 1. Singly linked list
Simply think of it as a chain of nodes in which each node remembers (contains) the address of its next node.
### Creating a linked list class
```python
class LinkedList:
    def __init__(self):
        self.head = None  # Initialize head as None
```
### Inserting a new node at the beginning of a linked list
```python
def insertAtBeginning(self, new_data):
    new_node = Node(new_data)  # Create a new node
    new_node.next = self.head  # Next of the new node becomes the current head
    self.head = new_node       # Head now points to the new node
```
### Inserting a new node at the end of a linked list
```python
def insertAtEnd(self, new_data):
    new_node = Node(new_data)  # Create a new node
    if self.head is None:
        self.head = new_node  # If the list is empty, make the new node the head
        return
    last = self.head
    while last.next:  # Otherwise, traverse the list to find the last node
        last = last.next
    last.next = new_node  # Make the new node the next node of the last node
```
### Inserting a new node at the middle of a linked list
```python
def insertAtPosition(self, data, position):
    new_node = Node(data)
    if position <= 0:  # Check whether the position is valid
        print("Position should be greater than 0")
        return
    if position == 1:
        new_node.next = self.head
        self.head = new_node
        return
    current_node = self.head
    current_position = 1
    while current_node and current_position < position - 1:  # Iterate to the node before the position
        current_node = current_node.next
        current_position += 1
    if not current_node:  # Check whether the position is out of bounds
        print("Position is out of bounds")
        return
    new_node.next = current_node.next  # Connect the intermediate nodes
    current_node.next = new_node
```
### Printing the Linked list
```python
def printList(self):
    temp = self.head  # Start from the head of the list
    while temp:
        print(temp.data, end=' ')  # Print the data in the current node
        temp = temp.next  # Move to the next node
    print()  # Ensure the output is followed by a new line
```
Let's complete the code and create a linked list by connecting all the pieces.
```python
if __name__ == '__main__':
    llist = LinkedList()

    # Insert values at the beginning
    llist.insertAtBeginning(4)   # <4>
    llist.insertAtBeginning(3)   # <3> 4
    llist.insertAtBeginning(2)   # <2> 3 4
    llist.insertAtBeginning(1)   # <1> 2 3 4

    # Insert values at the end
    llist.insertAtEnd(10)        # 1 2 3 4 <10>
    llist.insertAtEnd(7)         # 1 2 3 4 10 <7>

    # Insert at a given position
    llist.insertAtPosition(9, 4) # 1 2 3 <9> 4 10 7

    # Print the list
    llist.printList()
```
## Output:
1 2 3 9 4 10 7
### Deleting a node from the beginning of a linked list
Check whether the list is empty; otherwise, shift the head to the next node.
```python
def deleteFromBeginning(self):
    if self.head is None:
        return "The list is empty"  # If the list is empty, return this string
    self.head = self.head.next  # Otherwise, remove the head by making the next node the new head
```
### Deleting a node from the end of a linked list
```python
def deleteFromEnd(self):
    if self.head is None:
        return "The list is empty"
    if self.head.next is None:
        self.head = None  # If there's only one node, remove the head by making it None
        return
    temp = self.head
    while temp.next.next:  # Otherwise, go to the second-last node
        temp = temp.next
    temp.next = None  # Remove the last node by clearing the second-last node's next pointer
```
### Search in a linked list
```python
def search(self, value):
    current = self.head  # Start with the head of the list
    position = 0  # Counter to keep track of the position
    while current:  # Traverse the list
        if current.data == value:  # Compare the node's data to the search value
            return f"Value '{value}' found at position {position}"
        current = current.next
        position += 1
    return f"Value '{value}' not found in the list"
```
```python
if __name__ == '__main__':
    llist = LinkedList()

    # Insert values at the beginning
    llist.insertAtBeginning(4)    # <4>
    llist.insertAtBeginning(3)    # <3> 4
    llist.insertAtBeginning(2)    # <2> 3 4
    llist.insertAtBeginning(1)    # <1> 2 3 4

    # Insert values at the end
    llist.insertAtEnd(10)         # 1 2 3 4 <10>
    llist.insertAtEnd(7)          # 1 2 3 4 10 <7>

    # Insert at a given position
    llist.insertAtPosition(9, 4)  # 1 2 3 <9> 4 10 7
    llist.insertAtPosition(56, 4) # 1 2 3 <56> 9 4 10 7

    # Delete at the beginning
    llist.deleteFromBeginning()   # 2 3 56 9 4 10 7

    # Delete at the end
    llist.deleteFromEnd()         # 2 3 56 9 4 10

    # Print the list
    llist.printList()
```
## Output:
2 3 56 9 4 10
## Real Life uses of Linked List
Here are a few practical applications of linked lists in various fields:
1. **Music Player**: In a music player, songs are often linked to the previous and next tracks. This allows for seamless navigation between songs, enabling you to play tracks either from the beginning or the end of the playlist. This is akin to a doubly linked list where each song node points to both the previous and the next song, enhancing the flexibility of song selection.
2. **GPS Navigation Systems**: Linked lists can be highly effective for managing lists of locations and routes in GPS navigation systems. Each location or waypoint can be represented as a node, making it easy to add or remove destinations and to navigate smoothly from one location to another. This is similar to how you might plan a road trip, plotting stops along the way in a flexible, dynamic manner.
3. **Task Scheduling**: Operating systems utilize linked lists to manage task scheduling. Each process waiting to be executed is represented as a node in a linked list. This organization allows the system to efficiently keep track of which processes need to be run, enabling fair and systematic scheduling of tasks. Think of it like a to-do list where each task is a node, and the system executes tasks in a structured order.
4. **Speech Recognition**: Speech recognition software uses linked lists to represent possible phonetic pronunciations of words. Each potential pronunciation is a node, allowing the software to dynamically explore different pronunciation paths as it processes spoken input. This method helps in accurately recognizing and understanding speech by considering multiple possibilities in a flexible manner, much like evaluating various potential meanings in a conversation.
These examples illustrate how linked lists provide a flexible, dynamic data structure that can be adapted to a wide range of practical applications, making them a valuable tool in both software development and real-world problem-solving.


@ -0,0 +1,357 @@
---
# Optimizers in Machine Learning
Optimizers are algorithms or methods used to change the attributes of your neural network such as weights and learning rate in order to reduce the losses. Optimization algorithms help to minimize (or maximize) an objective function (also called a loss function) which is simply a mathematical function dependent on the model's internal learnable parameters which are used in computing the target values from the set of features.
## Types of Optimizers
### 1. Gradient Descent
**Explanation:**
Gradient Descent is the simplest and most commonly used optimization algorithm. It works by iteratively updating the model parameters in the opposite direction of the gradient of the objective function with respect to the parameters. The idea is to find the minimum of a function by taking steps proportional to the negative of the gradient of the function at the current point.
**Mathematical Formulation:**
The update rule for the parameter vector θ in gradient descent is represented by the equation:
- $$\theta_{\text{new}} = \theta_{\text{old}} - \alpha \cdot \nabla J(\theta)$$
Where:
- $\theta_{\text{old}}$ is the old parameter vector.
- $\theta_{\text{new}}$ is the updated parameter vector.
- $\alpha$ is the learning rate.
- $\nabla J(\theta)$ is the gradient of the objective function with respect to the parameters.
**Intuition:**
- At each iteration, we calculate the gradient of the cost function.
- The parameters are updated in the opposite direction of the gradient.
- The size of the step is controlled by the learning rate α.
**Advantages:**
- Simple to implement.
- Suitable for convex problems.
**Disadvantages:**
- Can be slow for large datasets.
- May get stuck in local minima for non-convex problems.
- Requires careful tuning of the learning rate.
**Python Implementation:**
```python
import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    m, n = X.shape
    theta = np.zeros(n)
    for epoch in range(epochs):
        gradient = np.dot(X.T, (np.dot(X, theta) - y)) / m
        theta -= lr * gradient
    return theta
```
### 2. Stochastic Gradient Descent (SGD)
**Explanation:**
SGD is a variation of gradient descent where we use only one training example to calculate the gradient and update the parameters. This introduces noise into the parameter updates, which can help to escape local minima but may cause the loss to fluctuate.
**Mathematical Formulation:**
- $$θ = θ - α \cdot \frac{∂J (θ; xᵢ, yᵢ)}{∂θ}$$
- xᵢ, yᵢ are a single training example and its target.
**Intuition:**
- At each iteration, a random training example is selected.
- The gradient is calculated and the parameters are updated for this single example.
- This process is repeated for a specified number of epochs.
**Advantages:**
- Faster updates compared to batch gradient descent.
- Can handle large datasets.
- Helps to escape local minima due to the noise in updates.
**Disadvantages:**
- Loss function may fluctuate.
- Requires more iterations to converge.
**Python Implementation:**
```python
def stochastic_gradient_descent(X, y, lr=0.01, epochs=1000):
    m, n = X.shape
    theta = np.zeros(n)
    for epoch in range(epochs):
        for i in range(m):
            rand_index = np.random.randint(0, m)
            xi = X[rand_index:rand_index+1]
            yi = y[rand_index:rand_index+1]
            gradient = np.dot(xi.T, (np.dot(xi, theta) - yi))
            theta -= lr * gradient
    return theta
```
### 3. Mini-Batch Gradient Descent
**Explanation:**
Mini-Batch Gradient Descent is a variation where instead of a single training example or the whole dataset, a mini-batch of examples is used to compute the gradient. This reduces the variance of the parameter updates, leading to more stable convergence.
**Mathematical Formulation:**
- $$θ = θ - α \cdot \frac{1}{k} \sum_{i=1}^{k} \frac{∂J (θ; xᵢ, yᵢ)}{∂θ}$$
Where:
- \( k \) is the batch size.
**Intuition:**
- At each iteration, a mini-batch of training examples is selected.
- The gradient is calculated for this mini-batch.
- The parameters are updated based on the average gradient of the mini-batch.
**Advantages:**
- More stable updates compared to SGD.
- Faster convergence than batch gradient descent.
- Efficient on large datasets.
**Disadvantages:**
- Requires tuning of batch size.
- Computationally more expensive than SGD per iteration.
**Python Implementation:**
```python
def mini_batch_gradient_descent(X, y, lr=0.01, epochs=1000, batch_size=32):
    m, n = X.shape
    theta = np.zeros(n)
    for epoch in range(epochs):
        indices = np.random.permutation(m)
        X_shuffled = X[indices]
        y_shuffled = y[indices]
        for i in range(0, m, batch_size):
            X_i = X_shuffled[i:i+batch_size]
            y_i = y_shuffled[i:i+batch_size]
            gradient = np.dot(X_i.T, (np.dot(X_i, theta) - y_i)) / batch_size
            theta -= lr * gradient
    return theta
```
### 4. Momentum
**Explanation:**
Momentum helps accelerate gradient vectors in the right directions, thus leading to faster converging. It accumulates a velocity vector in directions of persistent reduction in the objective function, which helps to smooth the path towards the minimum.
**Mathematical Formulation:**
- $$v_t = γ \cdot v_{t-1} + α \cdot ∇J(θ)$$
- $$θ = θ - v_t$$
where:
- \( v_t \) is the velocity.
- γ is the momentum term, typically set between 0.9 and 0.99.
**Intuition:**
- At each iteration, the gradient is calculated.
- The velocity is updated based on the current gradient and the previous velocity.
- The parameters are updated based on the velocity.
**Advantages:**
- Faster convergence.
- Reduces oscillations in the parameter updates.
**Disadvantages:**
- Requires tuning of the momentum term.
**Python Implementation:**
```python
def momentum_gradient_descent(X, y, lr=0.01, epochs=1000, gamma=0.9):
    m, n = X.shape
    theta = np.zeros(n)
    v = np.zeros(n)
    for epoch in range(epochs):
        gradient = np.dot(X.T, (np.dot(X, theta) - y)) / m
        v = gamma * v + lr * gradient
        theta -= v
    return theta
```
### 5. Nesterov Accelerated Gradient (NAG)
**Explanation:**
NAG is a variant of the gradient descent with momentum. It looks ahead by a step and calculates the gradient at that point, thus providing more accurate updates. This method helps to correct the overshooting problem seen in standard momentum.
**Mathematical Formulation:**
- $$v_t = γv_{t-1} + α \cdot ∇J(θ - γ \cdot v_{t-1})$$
- $$θ = θ - v_t$$
**Intuition:**
- At each iteration, the parameters are temporarily updated using the previous velocity.
- The gradient is calculated at this lookahead position.
- The velocity and parameters are then updated based on this gradient.
**Advantages:**
- More accurate updates compared to standard momentum.
- Faster convergence.
**Disadvantages:**
- Requires tuning of the momentum term.
**Python Implementation:**
```python
def nesterov_accelerated_gradient(X, y, lr=0.01, epochs=1000, gamma=0.9):
    m, n = X.shape
    theta = np.zeros(n)
    v = np.zeros(n)
    for epoch in range(epochs):
        lookahead_theta = theta - gamma * v
        gradient = np.dot(X.T, (np.dot(X, lookahead_theta) - y)) / m
        v = gamma * v + lr * gradient
        theta -= v
    return theta
```
### 6. AdaGrad
**Explanation:**
AdaGrad adapts the learning rate to the parameters, performing larger updates for infrequent and smaller updates for frequent parameters. It scales the learning rate inversely proportional to the square root of the sum of all historical squared values of the gradient.
**Mathematical Formulation:**
- $$G_t = G_{t-1} + (∂J(θ)/∂θ)^2$$
- $$θ = θ - \frac{α}{\sqrt{G_t + ε}} \cdot ∇J(θ)$$
Where:
- \(G_t\) is the sum of squares of the gradients up to time step \( t \).
- ε is a small constant to avoid division by zero.
**Intuition:**
- Accumulates the sum of the squares of the gradients for each parameter.
- Uses this accumulated sum to scale the learning rate.
- Parameters with large gradients in the past have smaller learning rates.
**Advantages:**
- Effective for sparse data.
- Automatically adjusts learning rate.
**Disadvantages:**
- Learning rate decreases continuously, which can lead to premature convergence.
**Python Implementation:**
```python
def adagrad(X, y, lr=0.01, epochs=1000, epsilon=1e-8):
    m, n = X.shape
    theta = np.zeros(n)
    G = np.zeros(n)
    for epoch in range(epochs):
        gradient = np.dot(X.T, (np.dot(X, theta) - y)) / m
        G += gradient**2
        adjusted_lr = lr / (np.sqrt(G) + epsilon)
        theta -= adjusted_lr * gradient
    return theta
```
### 7. RMSprop
**Explanation:**
RMSprop modifies AdaGrad to perform well in non-convex settings by using a moving average of squared gradients to scale the learning rate. It helps to keep the learning rate in check, especially in the presence of noisy gradients.
**Mathematical Formulation:**
- $$E[g^2]_t = \beta E[g^2]_{t-1} + (1 - \beta)\left(\frac{\partial J(\theta)}{\partial \theta}\right)^2$$
- $$θ = θ - \frac{α}{\sqrt{E[g^2]_t + ε}} \cdot ∇J(θ)$$
Where:
- \( E[g^2]_t \) is the exponentially decaying average of past squared gradients.
- β is the decay rate.
**Intuition:**
- Keeps a running average of the squared gradients.
- Uses this average to scale the learning rate.
- Parameters with large gradients have their learning rates reduced.
**Advantages:**
- Effective for non-convex problems.
- Reduces oscillations in parameter updates.
**Disadvantages:**
- Requires tuning of the decay rate.
**Python Implementation:**
```python
def rmsprop(X, y, lr=0.01, epochs=1000, beta=0.9, epsilon=1e-8):
    m, n = X.shape
    theta = np.zeros(n)
    E_g = np.zeros(n)
    for epoch in range(epochs):
        gradient = np.dot(X.T, (np.dot(X, theta) - y)) / m
        E_g = beta * E_g + (1 - beta) * gradient**2
        adjusted_lr = lr / (np.sqrt(E_g) + epsilon)
        theta -= adjusted_lr * gradient
    return theta
```
### 8. Adam
**Explanation:**
Adam (Adaptive Moment Estimation) combines the advantages of both RMSprop and AdaGrad by keeping an exponentially decaying average of past gradients and past squared gradients.
**Mathematical Formulation:**
- $$m_t = β_1m_{t-1} + (1 - β_1)(∂J(θ)/∂θ)$$
- $$v_t = β_2v_{t-1} + (1 - β_2)(∂J(θ)/∂θ)^2$$
- $$\hat{m}_t = \frac{m_t}{1 - β_1^t}$$
- $$\hat{v}_t = \frac{v_t}{1 - β_2^t}$$
- $$θ = θ - \frac{α\hat{m}_t}{\sqrt{\hat{v}_t} + ε}$$
Where:
- \( m_t \) is the first moment (mean) of the gradient.
- \( v_t \) is the second moment (uncentered variance) of the gradient.
- \( \beta_1, \beta_2 \) are the decay rates for the moment estimates.
**Intuition:**
- Keeps track of both the mean and the variance of the gradients.
- Uses these to adaptively scale the learning rate.
- Provides a balance between AdaGrad and RMSprop.
**Advantages:**
- Efficient for large datasets.
- Well-suited for non-convex optimization.
- Handles sparse gradients well.
**Disadvantages:**
- Requires careful tuning of hyperparameters.
- Can be computationally intensive.
**Python Implementation:**
```python
def adam(X, y, lr=0.01, epochs=1000, beta1=0.9, beta2=0.999, epsilon=1e-8):
    m, n = X.shape
    theta = np.zeros(n)
    m_t = np.zeros(n)
    v_t = np.zeros(n)
    for epoch in range(1, epochs+1):
        gradient = np.dot(X.T, (np.dot(X, theta) - y)) / m
        m_t = beta1 * m_t + (1 - beta1) * gradient
        v_t = beta2 * v_t + (1 - beta2) * gradient**2
        m_t_hat = m_t / (1 - beta1**epoch)
        v_t_hat = v_t / (1 - beta2**epoch)
        theta -= lr * m_t_hat / (np.sqrt(v_t_hat) + epsilon)
    return theta
```
These implementations are basic examples of how these optimizers can be implemented in Python using NumPy. In practice, libraries like TensorFlow and PyTorch provide highly optimized and more sophisticated implementations of these and other optimization algorithms.
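As a quick sanity check of the sketches above, the plain gradient descent version can be run on a tiny synthetic least-squares problem (hypothetical data; the true coefficients are intercept 2 and slope 3):

```python
import numpy as np

def gradient_descent(X, y, lr=0.01, epochs=1000):
    m, n = X.shape
    theta = np.zeros(n)
    for epoch in range(epochs):
        gradient = np.dot(X.T, (np.dot(X, theta) - y)) / m
        theta -= lr * gradient
    return theta

# Noise-free data y = 2 + 3x, so the optimizer should recover [2, 3]
x = np.linspace(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])  # add an intercept column
y = 2 + 3 * x

theta = gradient_descent(X, y, lr=0.5, epochs=5000)
print(np.round(theta, 3))  # approximately [2. 3.]
```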
---


@ -8,3 +8,4 @@
- [Artificial Neural Network from the Ground Up](ArtificialNeuralNetwork.md)
- [TensorFlow.md](tensorFlow.md)
- [PyTorch.md](pytorch.md)
- [Types of optimizers](Types_of_optimizers.md)


@ -0,0 +1,11 @@
Make,Colour,Odometer,Doors,Price
Toyota,White,150043,4,"$4,000"
Honda,Red,87899,4,"$5,000"
Toyota,Blue,,3,"$7,000"
BMW,Black,11179,5,"$22,000"
Nissan,White,213095,4,"$3,500"
Toyota,Green,,4,"$4,500"
Honda,,,4,"$7,500"
Honda,Blue,,4,
Toyota,White,60000,,
,White,31600,4,"$9,700"


@ -0,0 +1,11 @@
Make,Colour,Odometer (KM),Doors,Price
Toyota,White,150043,4,"$4,000.00"
Honda,Red,87899,4,"$5,000.00"
Toyota,Blue,32549,3,"$7,000.00"
BMW,Black,11179,5,"$22,000.00"
Nissan,White,213095,4,"$3,500.00"
Toyota,Green,99213,4,"$4,500.00"
Honda,Blue,45698,4,"$7,500.00"
Honda,Blue,54738,4,"$7,000.00"
Toyota,White,60000,4,"$6,250.00"
Nissan,White,31600,4,"$9,700.00"


@ -0,0 +1 @@
## This folder contains all the Datasets used in the content.


@ -0,0 +1,264 @@
# Handling Missing Values in Pandas
In real life, many datasets arrive with missing data, either because it exists but was not collected or because it never existed.
In Pandas, missing data is represented by two values:
* `None`: a Python keyword representing the absence of a value.
* `NaN`: acronym for `Not a Number`, a special floating-point value.
There are several useful functions for detecting, removing, and replacing null values in Pandas DataFrame:
1. `isnull()`
2. `notnull()`
3. `dropna()`
4. `fillna()`
5. `replace()`
## 1. Checking for missing values using `isnull()` and `notnull()`
Let's import pandas and our car-sales dataset, which has some missing values.
```python
import pandas as pd
car_sales_missing_df = pd.read_csv("Datasets/car-sales-missing-data.csv")
print(car_sales_missing_df)
```
```
     Make Colour  Odometer  Doors    Price
0  Toyota  White  150043.0    4.0   $4,000
1   Honda    Red   87899.0    4.0   $5,000
2  Toyota   Blue       NaN    3.0   $7,000
3     BMW  Black   11179.0    5.0  $22,000
4  Nissan  White  213095.0    4.0   $3,500
5  Toyota  Green       NaN    4.0   $4,500
6   Honda    NaN       NaN    4.0   $7,500
7   Honda   Blue       NaN    4.0      NaN
8  Toyota  White   60000.0    NaN      NaN
9     NaN  White   31600.0    4.0   $9,700
```
```python
## Using isnull()
print(car_sales_missing_df.isnull())
```
```
    Make  Colour  Odometer  Doors  Price
0  False   False     False  False  False
1  False   False     False  False  False
2  False   False      True  False  False
3  False   False     False  False  False
4  False   False     False  False  False
5  False   False      True  False  False
6  False    True      True  False  False
7  False   False      True  False   True
8  False   False     False   True   True
9   True   False     False  False  False
```
Note here:
* `True` marks a `NaN` value
* `False` marks a non-`NaN` value
To count the missing values in each column, use `isnull().sum()`.
```python
print(car_sales_missing_df.isnull().sum())
```
```
Make        1
Colour      1
Odometer    4
Doors       1
Price       2
dtype: int64
```
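If you just want a single grand total of missing cells, you can chain a second `.sum()`. Below is a minimal sketch using a small made-up frame (not the car-sales file) so the counts are easy to verify:

```python
import pandas as pd
import numpy as np

# A tiny stand-in frame with known gaps (hypothetical data, not the car-sales file)
df = pd.DataFrame({
    "Make": ["Toyota", None, "Honda"],
    "Odometer": [150043.0, np.nan, np.nan],
})

per_column = df.isnull().sum()           # missing count per column
total_missing = df.isnull().sum().sum()  # one grand total across the frame
print(per_column)
print(total_missing)  # 3
```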
You can also check for the presence of null values in a single column.
```python
print(car_sales_missing_df["Odometer"].isnull())
```
```
0    False
1    False
2     True
3    False
4    False
5     True
6     True
7     True
8    False
9    False
Name: Odometer, dtype: bool
```
```python
## using notnull()
print(car_sales_missing_df.notnull())
```
```
    Make  Colour  Odometer  Doors  Price
0   True    True      True   True   True
1   True    True      True   True   True
2   True    True     False   True   True
3   True    True      True   True   True
4   True    True      True   True   True
5   True    True     False   True   True
6   True   False     False   True   True
7   True    True     False   True  False
8   True    True      True  False  False
9  False    True      True   True   True
```
Note here:
* `True` means the value is not `NaN`
* `False` means the value is `NaN`
`isnull()` returns `True` where a value is missing, while `notnull()` is its complement and returns `True` where a value is present.
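This complement relationship is easy to verify: `notnull()` gives the element-wise negation of `isnull()`. A quick sketch with made-up values:

```python
import pandas as pd
import numpy as np

s = pd.Series([1.0, np.nan, 3.0])

# notnull() should equal the element-wise negation of isnull()
same = (s.notnull() == ~s.isnull()).all()
print(same)  # True
```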
## 2. Filling missing values using `fillna()` and `replace()`
```python
## Filling missing values with a single value using `fillna`
print(car_sales_missing_df.fillna(0))
```
```
     Make Colour  Odometer  Doors    Price
0  Toyota  White  150043.0    4.0   $4,000
1   Honda    Red   87899.0    4.0   $5,000
2  Toyota   Blue       0.0    3.0   $7,000
3     BMW  Black   11179.0    5.0  $22,000
4  Nissan  White  213095.0    4.0   $3,500
5  Toyota  Green       0.0    4.0   $4,500
6   Honda      0       0.0    4.0   $7,500
7   Honda   Blue       0.0    4.0        0
8  Toyota  White   60000.0    0.0        0
9       0  White   31600.0    4.0   $9,700
```
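Besides a single scalar, `fillna()` also accepts a dict mapping column names to fill values, so each column can get its own replacement. A small sketch with hypothetical values:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "Colour": [None, "Blue"],
    "Odometer": [np.nan, 45698.0],
})

# One fill value per column: a label for the text column,
# the column mean for the numeric one
filled = df.fillna({"Colour": "Unknown", "Odometer": df["Odometer"].mean()})
print(filled)
```

Filling numeric columns with a statistic such as the column mean often distorts the data less than a constant like 0.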
```python
## Filling missing values with the previous value using `ffill()`
print(car_sales_missing_df.ffill())
```
```
     Make Colour  Odometer  Doors    Price
0  Toyota  White  150043.0    4.0   $4,000
1   Honda    Red   87899.0    4.0   $5,000
2  Toyota   Blue   87899.0    3.0   $7,000
3     BMW  Black   11179.0    5.0  $22,000
4  Nissan  White  213095.0    4.0   $3,500
5  Toyota  Green  213095.0    4.0   $4,500
6   Honda  Green  213095.0    4.0   $7,500
7   Honda   Blue  213095.0    4.0   $7,500
8  Toyota  White   60000.0    4.0   $7,500
9  Toyota  White   31600.0    4.0   $9,700
```
```python
## Filling null values with the next value using `bfill()`
print(car_sales_missing_df.bfill())
```
```
     Make Colour  Odometer  Doors    Price
0  Toyota  White  150043.0    4.0   $4,000
1   Honda    Red   87899.0    4.0   $5,000
2  Toyota   Blue   11179.0    3.0   $7,000
3     BMW  Black   11179.0    5.0  $22,000
4  Nissan  White  213095.0    4.0   $3,500
5  Toyota  Green   60000.0    4.0   $4,500
6   Honda   Blue   60000.0    4.0   $7,500
7   Honda   Blue   60000.0    4.0   $9,700
8  Toyota  White   60000.0    4.0   $9,700
9     NaN  White   31600.0    4.0   $9,700
```
#### Filling null values using the `replace()` method
Now we will replace every `NaN` value in the DataFrame with -125.
For this we also need NumPy:
```python
import numpy as np
print(car_sales_missing_df.replace(to_replace = np.nan, value = -125))
```
```
     Make Colour  Odometer   Doors    Price
0  Toyota  White  150043.0     4.0   $4,000
1   Honda    Red   87899.0     4.0   $5,000
2  Toyota   Blue    -125.0     3.0   $7,000
3     BMW  Black   11179.0     5.0  $22,000
4  Nissan  White  213095.0     4.0   $3,500
5  Toyota  Green    -125.0     4.0   $4,500
6   Honda   -125    -125.0     4.0   $7,500
7   Honda   Blue    -125.0     4.0     -125
8  Toyota  White   60000.0  -125.0     -125
9    -125  White   31600.0     4.0   $9,700
```
## 3. Dropping missing values using `dropna()`
To drop null values from a DataFrame, we use the `dropna()` function, which can drop rows or columns with null values in several different ways.
#### Dropping rows with at least 1 null value.
```python
print(car_sales_missing_df.dropna(axis = 0))  ## Drop rows with at least one NaN (null) value
```
```
     Make Colour  Odometer  Doors    Price
0  Toyota  White  150043.0    4.0   $4,000
1   Honda    Red   87899.0    4.0   $5,000
3     BMW  Black   11179.0    5.0  $22,000
4  Nissan  White  213095.0    4.0   $3,500
```
#### Dropping rows if all values in that row are missing.
```python
print(car_sales_missing_df.dropna(how = 'all', axis = 0))  ## Drop a row only if every value in it is missing
```
```
     Make Colour  Odometer  Doors    Price
0  Toyota  White  150043.0    4.0   $4,000
1   Honda    Red   87899.0    4.0   $5,000
2  Toyota   Blue       NaN    3.0   $7,000
3     BMW  Black   11179.0    5.0  $22,000
4  Nissan  White  213095.0    4.0   $3,500
5  Toyota  Green       NaN    4.0   $4,500
6   Honda    NaN       NaN    4.0   $7,500
7   Honda   Blue       NaN    4.0      NaN
8  Toyota  White   60000.0    NaN      NaN
9     NaN  White   31600.0    4.0   $9,700
```
#### Dropping columns with at least 1 null value
```python
print(car_sales_missing_df.dropna(axis = 1))
```
```
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```
Here we drop every column that contains at least one missing value.
The dataset becomes empty after `dropna(axis = 1)` because every column has at least one null value, so all columns are removed, leaving an empty DataFrame.
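`dropna()` also takes a `thresh` parameter (keep only rows with at least that many non-null values) and a `subset` parameter (only consider the listed columns when deciding what to drop). A sketch with a small hypothetical frame:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "Make": ["Toyota", "Honda", None],
    "Odometer": [150043.0, np.nan, np.nan],
    "Price": ["$4,000", "$5,000", None],
})

# Keep rows that have at least 2 non-null values
at_least_two = df.dropna(thresh=2)

# Drop a row only when the Odometer column is missing
odo_known = df.dropna(subset=["Odometer"])

print(len(at_least_two))  # 2
print(len(odo_known))     # 1
```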


@ -0,0 +1,46 @@
# Importing and Exporting Data in Pandas
## Importing Data from a CSV
We can create `Series` and `DataFrame` objects directly in pandas, but we often need to import data stored as `.csv` (Comma Separated Values) files, spreadsheets, or similar tabular file formats.
`pandas` allows for easy importing of this data using functions such as `read_csv()` and `read_excel()` for Microsoft Excel files.
*Note: In case you want to get the information from a **Google Sheet** you can export it as a .csv file.*
The `read_csv()` function can be used to import a CSV file into a pandas DataFrame. The path can be a file system path or a URL where the CSV is available.
```python
import pandas as pd
car_sales_df = pd.read_csv("Datasets/car-sales.csv")
print(car_sales_df)
```
```
Make Colour Odometer (KM) Doors Price
0 Toyota White 150043 4 $4,000.00
1 Honda Red 87899 4 $5,000.00
2 Toyota Blue 32549 3 $7,000.00
3 BMW Black 11179 5 $22,000.00
4 Nissan White 213095 4 $3,500.00
5 Toyota Green 99213 4 $4,500.00
6 Honda Blue 45698 4 $7,500.00
7 Honda Blue 54738 4 $7,000.00
8 Toyota White 60000 4 $6,250.00
9 Nissan White 31600 4 $9,700.00
```
You can find the dataset used above in the `Datasets` folder.
*Note: If you want to import data from GitHub, you can't use the repository page link directly; first obtain the raw file URL by clicking the Raw button in the repo.*
## Exporting Data to a CSV
`pandas` allows you to export a `DataFrame` to `.csv` format using `.to_csv()`, or to an Excel spreadsheet using `.to_excel()`.
```python
car_sales_df.to_csv("exported_car_sales.csv")
```
Running this will save a file called ``exported_car_sales.csv`` to the current folder.
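By default `.to_csv()` also writes the row index, which shows up as an extra `Unnamed: 0` column when the file is read back. Passing `index=False` avoids this; a small sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"Make": ["Toyota", "Honda"], "Doors": [4, 4]})

# index=False leaves the row index out of the file
df.to_csv("exported_car_sales.csv", index=False)

reloaded = pd.read_csv("exported_car_sales.csv")
print(reloaded.columns.tolist())  # ['Make', 'Doors']
```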


@ -5,3 +5,5 @@
- [Pandas Descriptive Statistics](Descriptive_Statistics.md)
- [Group By Functions with Pandas](GroupBy_Functions_Pandas.md)
- [Excel using Pandas DataFrame](excel_with_pandas.md)
- [Importing and Exporting Data in Pandas](import-export.md)
- [Handling Missing Values in Pandas](handling-missing-values.md)