Merge branch 'main' into introduction-to-pie-charts-in-matplotlib

pull/712/head
Ankit Mahato 2024-05-31 07:29:20 +05:30 committed by GitHub
commit 5323d962ef
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
23 changed files with 1186 additions and 27 deletions

@ -1,5 +1,6 @@
# List of sections
- [Time & Space Complexity](time-space-complexity.md)
- [Queues in Python](Queues.md)
- [Graphs](graph.md)
- [Sorting Algorithms](sorting-algorithms.md)

@ -2,7 +2,7 @@
Recursion is a technique in which a function calls itself to solve smaller instances of the same problem until a specified condition is fulfilled. It is used for tasks that can be divided into smaller sub-tasks.
## How Recursion Works
To solve a problem using recursion we must define:
- Base condition: The condition under which recursion ends.
@ -17,43 +17,63 @@ When a recursive function is called, the following sequence of events occurs:
- Stack Management: Each recursive call is placed on the call stack. The stack keeps track of each function call, its argument, and the point to return to once the call completes.
- Unwinding the Stack: When the base case is eventually met, the function returns a value, and the stack starts unwinding, returning values to previous function calls until the initial call is resolved.
## Python Code: Factorial using Recursion
```python
def fact(n):
    # Base case: the factorial of 0 and of 1 is 1
    if n == 0 or n == 1:
        return 1
    # Recursive case: n! = n * (n - 1)!
    return n * fact(n - 1)


if __name__ == "__main__":
    n = int(input("Enter a positive number: "))
    print("Factorial of", n, "is", fact(n))
```
### Explanation
This Python script calculates the factorial of a given number using recursion.
- **Function `fact(n)`:**
  - The function takes an integer `n` as input and calculates its factorial.
  - It checks if `n` is 0 or 1. If so, it returns 1 (since the factorial of 0 and 1 is 1).
  - Otherwise, it returns `n * fact(n - 1)`, which means it recursively calls itself with `n - 1` until it reaches either 0 or 1.
- **Main Section:**
  - The main section prompts the user to enter a positive number.
  - It then calls the `fact` function with the input number and prints the result.
#### Example: Let n = 4
The recursion unfolds as follows:
1. When `fact(4)` is called, it computes `4 * fact(3)`.
2. Inside `fact(3)`, it computes `3 * fact(2)`.
3. Inside `fact(2)`, it computes `2 * fact(1)`.
4. `fact(1)` returns 1 (`if` statement executes), which is received by `fact(2)`, resulting in `2 * 1` i.e. `2`.
5. Back to `fact(3)`, it receives the value from `fact(2)`, giving `3 * 2` i.e. `6`.
6. `fact(4)` receives the value from `fact(3)`, resulting in `4 * 6` i.e. `24`.
7. Finally, `fact(4)` returns 24 to the main function.
#### So, the result is 24.
#### What is Stack Overflow in Recursion?
Stack overflow is an error that occurs when the call stack memory limit is exceeded. During recursion, every call that has not yet returned stays on the call stack, waiting for the calls it made to complete. Without a base case, the function would call itself indefinitely, leading to a stack overflow.
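In Python, running out of call stack normally surfaces as a `RecursionError` once the interpreter's recursion limit is reached. The sketch below illustrates this with a hypothetical function, `count_down_forever`, that deliberately has no base case; the function names are illustrative and not part of the original file.

```python
import sys


def count_down_forever(n):
    # Hypothetical example with no base case: every call makes another call.
    return count_down_forever(n - 1)


def overflows():
    # Returns True if the runaway recursion hit Python's recursion limit.
    try:
        count_down_forever(5)
    except RecursionError:
        return True
    return False


print("Recursion limit:", sys.getrecursionlimit())
print("Overflowed:", overflows())
```

Adding a base case such as `if n == 0: return 0` would let the calls stop and the stack unwind normally.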
## What is Backtracking
Backtracking is a recursive algorithmic technique used to solve problems by exploring all possible solutions and discarding those that do not meet the problem's constraints. It is particularly useful for problems involving combinations, permutations, and finding paths in a grid.
## How Backtracking Works
- Incremental Solution Building: Solutions are built one step at a time.
- Feasibility Check: At each step, a check is made to see if the current partial solution is valid.
- Backtracking: If a partial solution is found to be invalid, the algorithm backtracks by removing the last added part of the solution and trying the next possibility.
- Exploration of All Possibilities: The process continues recursively, exploring all possible paths, until a solution is found or all possibilities are exhausted.
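The four steps above can be sketched on a small problem: finding every combination of numbers that adds up to a target. This is only an illustrative sketch; the function `subsets_with_sum` and its helper `explore` are hypothetical names, and the input is assumed to contain positive integers.

```python
def subsets_with_sum(numbers, target):
    """Return every combination of `numbers` (assumed positive) that sums to `target`."""
    solutions = []
    partial = []  # the partial solution built so far

    def explore(start, remaining):
        if remaining == 0:
            # A complete, valid solution: record a copy of it.
            solutions.append(list(partial))
            return
        for i in range(start, len(numbers)):
            value = numbers[i]
            if value > remaining:
                # Feasibility check: this choice can never reach the target.
                continue
            partial.append(value)              # incremental solution building
            explore(i + 1, remaining - value)  # explore further choices
            partial.pop()                      # backtrack: undo the last choice

    explore(0, target)
    return solutions


print(subsets_with_sum([2, 3, 5, 7], 10))  # [[2, 3, 5], [3, 7]]
```

The `append` / recursive call / `pop` pattern is the core of most backtracking solutions: each choice is undone before the next one is tried.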
## Example: Word Search
Given a 2D grid of characters and a word, determine if the word exists in the grid. The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are horizontally or vertically neighboring. The same letter cell may not be used more than once.
Algorithm for Solving the Word Search Problem with Backtracking:
- Start at each cell: Attempt to find the word starting from each cell.
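A minimal sketch of one common way to implement this search is shown below. The function name `word_exists`, the helper `search`, and the `visited` set are assumptions made for illustration and are not taken from the file; the grid is assumed to be non-empty and rectangular.

```python
def word_exists(grid, word):
    """Return True if `word` can be traced through adjacent cells of `grid`."""
    rows, cols = len(grid), len(grid[0])
    visited = set()  # cells used by the current partial path

    def search(r, c, k):
        if k == len(word):
            return True  # every letter has been matched
        if not (0 <= r < rows and 0 <= c < cols):
            return False  # stepped outside the grid
        if (r, c) in visited or grid[r][c] != word[k]:
            return False  # cell already used, or wrong letter
        visited.add((r, c))
        found = (search(r + 1, c, k + 1) or search(r - 1, c, k + 1)
                 or search(r, c + 1, k + 1) or search(r, c - 1, k + 1))
        visited.remove((r, c))  # backtrack so other paths may use this cell
        return found

    # Start at each cell and try to match the word from there.
    return any(search(r, c, 0) for r in range(rows) for c in range(cols))


grid = [
    ["A", "B", "C", "E"],
    ["S", "F", "C", "S"],
    ["A", "D", "E", "E"],
]
print(word_exists(grid, "ABCCED"))  # True
print(word_exists(grid, "ABCB"))    # False: the "B" cell cannot be reused
```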

@ -0,0 +1,243 @@
# Time and Space Complexity
We can solve a problem using one or more algorithms. It's essential to learn how to compare the performance of different algorithms and select the best one for a specific task.
We therefore need a method for comparing solutions and judging which one is more efficient.
The method must be:
- Independent of the system, and its settings, on which the algorithm runs.
- Directly related to the size of the input.
- Able to distinguish between two algorithms clearly and precisely.
Two such methods used to analyze algorithms are `time complexity` and `space complexity`.
## What is Time Complexity?
Time complexity measures the _number of operations an algorithm performs in proportion to the size of the input_. It lets us examine how the performance of the algorithm scales with increasing input size. Note that **_time complexity does not refer to the time taken by the machine to execute a particular code_**, which also depends on the hardware, the language, and the system load.
## Order of Growth and Asymptotic Notations
The order of growth describes how an algorithm's running time or space usage increases as the input size increases. This growth is expressed in asymptotic notation, such as Big O notation, which focuses on the dominant term as the input size approaches infinity and ignores lower-order terms and machine-specific constants.
### Common Asymptotic Notation
1. `Big Oh (O)`: Describes an upper bound on an algorithm's growth rate. It is most often used to state the worst-case running time.
2. `Big Omega (Ω)`: Describes a lower bound on the growth rate. It is often associated with the best-case running time.
3. `Big Theta (Θ)`: Gives a tight bound on the running time by describing both the upper and lower bounds.
### 1. Big Oh (O) Notation
Big O notation describes how an algorithm behaves as the input size gets closer to infinity and provides an upper bound on the time or space complexity of the method. It helps developers and computer scientists to evaluate the effectiveness of various algorithms without regard to the software or hardware environment.
To denote asymptotic upper bound, we use O-notation. For a given function `g(n)`, we denote by `O(g(n))` (pronounced "big-oh of g of n") the set of functions:
$$
O(g(n)) = \{ f(n) : \exists \text{ positive constants } c \text{ and } n_0 \text{ such that } 0 \leq f(n) \leq c \cdot g(n) \text{ for all } n \geq n_0 \}
$$
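As a quick worked instance of this definition, take the illustrative function $f(n) = 3n + 2$ (chosen here only as an example). Picking $c = 4$ and $n_0 = 2$ satisfies the inequality, so $f(n)$ belongs to $O(n)$:

$$
0 \leq 3n + 2 \leq 3n + n = 4n = c \cdot n \quad \text{for all } n \geq n_0 = 2
$$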
Graphical representation of Big Oh:
![BigOh Notation Graph](images/Time-And-Space-Complexity-BigOh.png)
### 2. Big Omega (Ω) Notation
Big Omega (Ω) notation is used to describe the lower bound of an algorithm's running time. It provides a way to express the minimum time complexity that an algorithm will take to complete. In other words, Big Omega gives us a guarantee that the algorithm will take at least a certain amount of time to run, regardless of other factors.
To denote asymptotic lower bound, we use Omega-notation. For a given function `g(n)`, we denote by `Ω(g(n))` (pronounced "big-omega of g of n") the set of functions:
$$
\Omega(g(n)) = \{ f(n) : \exists \text{ positive constants } c \text{ and } n_0 \text{ such that } 0 \leq c \cdot g(n) \leq f(n) \text{ for all } n \geq n_0 \}
$$
Graphical representation of Big Omega:
![BigOmega Notation Graph](images/Time-And-Space-Complexity-BigOmega.png)
### 3. Big Theta (Θ) Notation
Big Theta (Θ) notation provides a way to describe the asymptotic tight bound of an algorithm's running time. It offers a precise measure of the time complexity by establishing both an upper and lower bound, indicating that the running time of an algorithm grows at the same rate as a given function, up to constant factors.
To denote asymptotic tight bound, we use Theta-notation. For a given function `g(n)`, we denote by `Θ(g(n))` (pronounced "big-theta of g of n") the set of functions:
$$
\Theta(g(n)) = \{ f(n) : \exists \text{ positive constants } c_1, c_2, \text{ and } n_0 \text{ such that } 0 \leq c_1 \cdot g(n) \leq f(n) \leq c_2 \cdot g(n) \text{ for all } n \geq n_0 \}
$$
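For example, the illustrative function $f(n) = 3n + 2$ (an assumption made only for this example) is in $\Theta(n)$, because the constants $c_1 = 3$, $c_2 = 4$, and $n_0 = 2$ sandwich it between two multiples of $n$:

$$
0 \leq 3 \cdot n \leq 3n + 2 \leq 4 \cdot n \quad \text{for all } n \geq 2
$$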
Graphical representation of Big Theta:
![Big Theta Notation Graph](images/Time-And-Space-Complexity-BigTheta.png)
## Best Case, Worst Case and Average Case
### 1. Best-Case Scenario:
The best-case scenario refers to the situation where an algorithm performs optimally, achieving the lowest possible time or space complexity. It represents the most favorable conditions under which an algorithm operates.
#### Characteristics:
- Represents the minimum time or space required by an algorithm to solve a problem.
- Occurs when the input data is structured in such a way that the algorithm can exploit its strengths fully.
- Often used to analyze the lower bound of an algorithm's performance.
#### Example:
Consider the `linear search algorithm` where we're searching for a `target element` in an array. The best-case scenario occurs when the target element is found `at the very beginning of the array`. In this case, the algorithm would only need to make one comparison, resulting in a time complexity of `O(1)`.
### 2. Worst-Case Scenario:
The worst-case scenario refers to the situation where an algorithm performs at its poorest, achieving the highest possible time or space complexity. It represents the most unfavorable conditions under which an algorithm operates.
#### Characteristics:
- Represents the maximum time or space required by an algorithm to solve a problem.
- Occurs when the input data is structured in such a way that the algorithm encounters the most challenging conditions.
- Often used to analyze the upper bound of an algorithm's performance.
#### Example:
Continuing with the `linear search algorithm`, the worst-case scenario occurs when the `target element` is either not present in the array or located `at the very end`. In this case, the algorithm would need to iterate through the entire array, resulting in a time complexity of `O(n)`, where `n` is the size of the array.
### 3. Average-Case Scenario:
The average-case scenario refers to the expected performance of an algorithm over all possible inputs, typically calculated as the average of the time or space cost weighted by how likely each input is.
#### Characteristics:
- Represents the typical performance of an algorithm across a range of input data.
- Takes into account the distribution of inputs and their likelihood of occurrence.
- Provides a more realistic measure of an algorithm's performance compared to the best-case or worst-case scenarios.
#### Example:
For the `linear search algorithm`, the average-case scenario considers the probability distribution of the target element's position within the array. If the `target element is equally likely to be found at any position in the array`, the algorithm would, on average, need to search halfway through the array (roughly `n/2` comparisons). Because constant factors are dropped in asymptotic notation, the average-case time complexity is still `O(n)`.
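The three scenarios can be checked empirically by counting comparisons. The sketch below uses a hypothetical helper, `comparisons_to_find`, and assumes the target is present and equally likely to be at any position; under that assumption the average works out to `(n + 1) / 2` comparisons, which grows linearly with `n`.

```python
def comparisons_to_find(arr, target):
    """Count the element comparisons a linear search makes before stopping."""
    count = 0
    for x in arr:
        count += 1
        if x == target:
            break
    return count


arr = list(range(1, 101))  # n = 100 distinct values

best = comparisons_to_find(arr, 1)     # target is the first element
worst = comparisons_to_find(arr, 100)  # target is the last element
# Average over every possible target position, each equally likely.
average = sum(comparisons_to_find(arr, t) for t in arr) / len(arr)

print(best, worst, average)  # 1 100 50.5
```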
## Space Complexity
Space complexity describes the amount of memory an algorithm uses while it runs, as a function of the input size. The actual memory consumed depends on the machine, so rather than using memory units such as MB or GB, we express space complexity using Big O notation.
#### Examples of Space Complexity
1. `Constant Space Complexity (O(1))`: Algorithms that operate on a fixed-size array or use a constant number of variables have O(1) space complexity.
2. `Linear Space Complexity (O(n))`: Algorithms that store each element of the input array in a separate variable or data structure have O(n) space complexity.
3. `Quadratic Space Complexity (O(n^2))`: Algorithms that create a two-dimensional array or matrix with dimensions based on the input size have O(n^2) space complexity.
#### Analyzing Space Complexity
To analyze space complexity:
- Identify the variables, data structures, and recursive calls used by the algorithm.
- Determine how the space requirements scale with the input size.
- Express the space complexity using Big O notation, considering the dominant terms that contribute most to the overall space usage.
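The following sketch applies these steps to four small functions. All function names are hypothetical and chosen for illustration; the comments state the extra space each one uses beyond its input.

```python
def total(values):
    # O(1) extra space: a single accumulator, whatever len(values) is.
    result = 0
    for v in values:
        result += v
    return result


def doubled(values):
    # O(n) extra space: a new list with one entry per input element.
    return [2 * v for v in values]


def pair_sums(values):
    # O(n^2) extra space: an n x n table of sums.
    return [[a + b for b in values] for a in values]


def depth_sum(n):
    # O(n) extra space from recursion: n stack frames are alive at the deepest point.
    if n == 0:
        return 0
    return n + depth_sum(n - 1)


data = [1, 2, 3]
print(total(data), doubled(data), pair_sums(data), depth_sum(4))
```

Note that `depth_sum` allocates no data structure at all; its space cost comes entirely from the recursive calls, which is why recursion has to be included in the analysis.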
## Examples to calculate time and space complexity
#### 1. Print all elements of given array
Assume each line takes one unit of time to run. Iterating over an array to print all of its elements then takes `O(n)` time, where n is the size of the array.
Code:
```python
arr = [1, 2, 3, 4]  # 1
for x in arr:       # 2
    print(x)        # 3
```
Here, the 1st statement executes only once, so it takes one unit of time to run. The for loop consisting of the 2nd and 3rd statements executes 4 times, once per element, so in general it executes n times.
Also, as the code doesn't use any additional space apart from the input `arr`, its space complexity is constant, `O(1)`.
#### 2. Linear Search
Linear search is a simple algorithm for finding an element in an array by sequentially checking each element until a match is found or the end of the array is reached. Here's an example of calculating the time and space complexity of linear search:
```python
def linear_search(arr, target):
    for x in arr:            # n iterations in worst case
        if x == target:      # 1
            return True      # 1
    return False             # If element not found

# Example usage
arr = [1, 3, 5, 7, 9]
target = 5
print(linear_search(arr, target))
```
**Time Complexity Analysis**
The for loop iterates through the entire array, which takes O(n) time in the worst case, where n is the size of the array.
Inside the loop, each operation takes constant time (O(1)).
Therefore, the time complexity of linear search is `O(n)`.
**Space Complexity Analysis**
The space complexity of linear search is `O(1)` since it only uses a constant amount of additional space for variables regardless of the input size.
#### 3. Binary Search
Binary search is an efficient algorithm for finding an element in a sorted array by repeatedly dividing the search interval in half. Here's an example of calculating the time and space complexity of binary search:
```python
def binary_search(arr, target):
    left = 0                          # 1
    right = len(arr) - 1              # 1
    while left <= right:              # log(n) iterations in worst case
        mid = (left + right) // 2     # 1
        if arr[mid] == target:        # 1
            return mid                # 1
        elif arr[mid] < target:       # 1
            left = mid + 1            # 1
        else:
            right = mid - 1           # 1
    return -1                         # If element not found

# Example usage
arr = [1, 3, 5, 7, 9]
target = 5
print(binary_search(arr, target))
```
**Time Complexity Analysis**
The initialization of left and right takes constant time (O(1)).
The while loop runs for log(n) iterations in the worst case, where n is the size of the array.
Inside the loop, each operation takes constant time (O(1)).
Therefore, the time complexity of binary search is `O(log n)`.
**Space Complexity Analysis**
The space complexity of binary search is `O(1)` since it only uses a constant amount of additional space for variables regardless of the input size.
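To see the logarithmic growth concretely, the sketch below counts loop iterations for a target larger than every element, which is one of the worst cases. The helper `binary_search_steps` is a hypothetical variant written for this illustration; it returns the iteration count instead of an index.

```python
import math


def binary_search_steps(arr, target):
    """Return how many loop iterations binary search performs on a sorted list."""
    left, right, steps = 0, len(arr) - 1, 0
    while left <= right:
        steps += 1
        mid = (left + right) // 2
        if arr[mid] == target:
            break
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return steps


for n in (10, 1_000, 1_000_000):
    arr = list(range(n))
    steps = binary_search_steps(arr, n)  # n is larger than every element
    print(n, steps, math.floor(math.log2(n)) + 1)
```

Growing the input from a thousand to a million elements only doubles the number of iterations, which is the practical meaning of `O(log n)`.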
#### 4. Fibonacci Sequence
Let's consider an example of a function that generates Fibonacci numbers up to a given index and stores them in a list. In this case, the space complexity will not be constant because the size of the list grows with the number of Fibonacci numbers requested.
```python
def fibonacci_sequence(n):
    fib_list = [0, 1]  # Initial Fibonacci sequence with first two numbers
    while len(fib_list) < n:  # O(n) iterations in worst case
        next_fib = fib_list[-1] + fib_list[-2]  # Calculating next Fibonacci number
        fib_list.append(next_fib)  # Appending next Fibonacci number to list
    return fib_list

# Example usage
n = 10
fib_sequence = fibonacci_sequence(n)
print(fib_sequence)
```
**Time Complexity Analysis**
The while loop iterates until the length of the Fibonacci sequence list reaches n, so it takes `O(n)` iterations in the `worst case`. Inside the loop, each operation takes constant time (O(1)). Therefore, the time complexity is `O(n)`.
**Space Complexity Analysis**
The space complexity of this function is not constant because it creates and stores a list of Fibonacci numbers.
As n grows, the size of the list also grows, so the space complexity is O(n), where n is the number of Fibonacci numbers generated.
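For contrast, if only a single Fibonacci number is needed rather than the whole list, the extra space can be kept constant. The function below, `nth_fibonacci`, is a hypothetical variant added for comparison and is not part of the original file.

```python
def nth_fibonacci(n):
    """Return the Fibonacci number at 0-based index n using O(1) extra space."""
    a, b = 0, 1  # only the two most recent values are kept
    for _ in range(n):
        a, b = b, a + b
    return a


print(nth_fibonacci(9))  # 34, the last value of the 10-element list above
```

The loop still runs `n` times, so the time complexity stays `O(n)`; only the space requirement changes.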

@ -9,3 +9,4 @@
- [TensorFlow.md](tensorFlow.md)
- [PyTorch.md](pytorch.md)
- [Types of optimizers](Types_of_optimizers.md)
- [Logistic Regression](logistic-regression.md)

@ -0,0 +1,115 @@
# Logistic Regression
Logistic Regression is a statistical method used for binary classification problems. It is a type of regression analysis where the dependent variable is categorical. This README provides an overview of logistic regression, including its fundamental concepts, assumptions, and how to implement it using Python.
## Table of Contents
1. [Introduction](#introduction)
2. [Concepts](#concepts)
3. [Assumptions](#assumptions)
4. [Implementation](#implementation)
   - [Using Scikit-learn](#using-scikit-learn)
   - [Code Example](#code-example)
5. [Evaluation Metrics](#evaluation-metrics)
6. [Conclusion](#conclusion)
7. [References](#references)
## Introduction
Logistic Regression is used to model the probability of a binary outcome based on one or more predictor variables (features). It is widely used in various fields such as medical research, social sciences, and machine learning for tasks such as spam detection, fraud detection, and predicting user behavior.
## Concepts
### Sigmoid Function
The logistic regression model uses the sigmoid function to map predicted values to probabilities. The sigmoid function is defined as:
$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$
Where $z$ is a linear combination of the input features.
### Odds and Log-Odds
- **Odds**: The odds represent the ratio of the probability of an event occurring to the probability of it not occurring.
$$\text{Odds} = \frac{P(Y=1)}{P(Y=0)}$$
- **Log-Odds**: The log-odds is the natural logarithm of the odds.
$$\text{Log-Odds} = \log \left( \frac{P(Y=1)}{P(Y=0)} \right)$$
Logistic regression models the log-odds as a linear combination of the input features.
### Model Equation
The logistic regression model equation is:
$$
\log \left( \frac{P(Y=1)}{P(Y=0)} \right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n
$$
Where:
- &beta;₀ is the intercept.
- &beta;<sub>i</sub> are the coefficients for the predictor variables X<sub>i</sub>.
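The relationship between the sigmoid function, the log-odds, and the model equation can be checked numerically. The sketch below uses only the standard library; the intercept, coefficients, and feature values are made-up numbers for illustration, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    # Maps any real number z to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    # Natural logarithm of the odds p / (1 - p).
    return math.log(p / (1.0 - p))


def predict_probability(features, intercept, coefficients):
    # z = beta_0 + beta_1 * x_1 + ... + beta_n * x_n
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return sigmoid(z)


# Hypothetical model: beta_0 = 0.5, beta_1 = 1.0, beta_2 = 0.25
p = predict_probability([2.0, -1.0], intercept=0.5, coefficients=[1.0, 0.25])

# z = 0.5 + 1.0 * 2.0 + 0.25 * (-1.0) = 2.25, and log_odds(p) recovers z.
print(p, log_odds(p))
```

Applying `log_odds` to the predicted probability returns the linear combination `z`, which is exactly what the model equation states.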
## Assumptions
1. **Linearity**: The log-odds of the response variable are a linear combination of the predictor variables.
2. **Independence**: Observations should be independent of each other.
3. **No Multicollinearity**: Predictor variables should not be highly correlated with each other.
4. **Large Sample Size**: Logistic regression requires a large sample size to provide reliable results.
## Implementation
### Using Scikit-learn
Scikit-learn is a popular machine learning library in Python that provides tools for logistic regression.
### Code Example
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
# Load dataset
data = pd.read_csv('path/to/your/dataset.csv')
# Define features and target variable
X = data[['feature1', 'feature2', 'feature3']]
y = data['target']
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Initialize and train logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", class_report)
```
## Evaluation Metrics
- **Accuracy**: The proportion of correctly classified instances among all instances.
- **Confusion Matrix**: A table showing the number of true positives, true negatives, false positives, and false negatives.
- **Precision, Recall, and F1-Score**: Metrics to evaluate the performance of the classification model.
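These metrics can all be derived from the four cells of a binary confusion matrix. The sketch below uses hypothetical counts and a hypothetical helper, `classification_metrics`; it assumes none of the denominators is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # correct predictions / all predictions
    precision = tp / (tp + fp)                  # how many predicted positives are real
    recall = tp / (tp + fn)                     # how many real positives were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return accuracy, precision, recall, f1


# Hypothetical counts: 40 true positives, 10 false positives,
# 5 false negatives, 45 true negatives.
accuracy, precision, recall, f1 = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```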
## Conclusion
Logistic regression is a fundamental classification technique that is easy to implement and interpret. It is a powerful tool for binary classification problems and provides a probabilistic framework for predicting binary outcomes.

@ -0,0 +1,120 @@
# NumPy Array Iteration
Iterating over arrays in NumPy is a common task when processing data. NumPy provides several ways to iterate over elements of an array efficiently.
Understanding these methods is crucial for performing operations on array elements effectively.
## 1. Basic Iteration
- Iterating using a basic `for` loop.
### Single-dimensional array
Iterating over a single-dimensional array is straightforward using a basic `for` loop.
```python
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
for i in arr:
    print(i)
```
#### Output
```python
1
2
3
4
5
```
### Multi-dimensional array
When iterating over a multi-dimensional array, each iteration returns a sub-array along the first axis.
```python
marr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
for arr in marr:
    print(arr)
```
#### Output
```python
[1 2 3]
[4 5 6]
[7 8 9]
```
## 2. Iterating with `nditer`
- `nditer` is a powerful iterator provided by NumPy for iterating over multi-dimensional arrays.
- In each iteration it yields a single element of the array.
```python
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
for i in np.nditer(arr):
    print(i)
```
#### Output
```python
1
2
3
4
5
6
```
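By default, the elements yielded by `nditer` are read-only. A short sketch of modifying an array while iterating, assuming the `op_flags=['readwrite']` option of `np.nditer`, might look like this:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# op_flags=['readwrite'] makes each yielded element writable.
# The with-block ensures changes are written back when iteration ends.
with np.nditer(arr, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = 2 * x  # assign through the element to update arr in place

print(arr)
```

For a simple case like this, the vectorized expression `arr * 2` is usually preferable; the iterator form is mainly useful when the per-element logic cannot be expressed as a single array operation.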
## 3. Iterating with `ndenumerate`
- `ndenumerate` allows you to iterate with both the index and the value of each element.
- It yields an `(index, value)` pair in each iteration.
```python
import numpy as np
arr = np.array([[1, 2], [3, 4]])
for index, value in np.ndenumerate(arr):
    print(index, value)
```
#### Output
```python
(0, 0) 1
(0, 1) 2
(1, 0) 3
(1, 1) 4
```
## 4. Iterating with `flat`
- The `flat` attribute returns a 1-D iterator over the array.
```python
import numpy as np
arr = np.array([[1, 2], [3, 4]])
for element in arr.flat:
    print(element)
```
#### Output
```python
1
2
3
4
```
Understanding the various ways to iterate over NumPy arrays can significantly enhance your data processing efficiency.
Whether you are working with single-dimensional or multi-dimensional arrays, NumPy provides versatile tools to iterate and manipulate array elements effectively.

@ -0,0 +1,223 @@
# Concatenation of Arrays
Concatenation of arrays in NumPy refers to combining multiple arrays into a single array, either along existing axes or by adding new axes. NumPy provides several functions for this purpose.
# Functions of Concatenation
## np.concatenate
Joins two or more arrays along an existing axis.
### Syntax
```python
numpy.concatenate((arr1, arr2, ...), axis)
```
Args:
- arr1, arr2, ...: Sequence of arrays to concatenate.
- axis: Axis along which the arrays will be joined. Default is 0.
### Example
#### Concatenate along axis 0
```python
import numpy as np
# creating 2 arrays
arr1 = np.array([[1, 2, 3], [7, 8, 9]])
arr2 = np.array([[4, 5, 6], [10, 11, 12]])
result_1 = np.concatenate((arr1, arr2), axis=0)
print(result_1)
```
#### Output
```
[[ 1  2  3]
 [ 7  8  9]
 [ 4  5  6]
 [10 11 12]]
```
#### Concatenate along axis 1
```python
result_2 = np.concatenate((arr1, arr2), axis=1)
print(result_2)
```
#### Output
```
[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
```
## np.vstack
Vertical stacking of arrays (row-wise).
### Syntax
```python
numpy.vstack(arrays)
```
Args:
- arrays: Sequence of arrays to stack.
### Example
```python
import numpy as np
# create arrays
arr1 = np.array([[1, 2, 3], [7, 8, 9]])
arr2 = np.array([[4, 5, 6], [10, 11, 12]])
result = np.vstack((arr1, arr2))
print(result)
```
#### Output
```
[[ 1  2  3]
 [ 7  8  9]
 [ 4  5  6]
 [10 11 12]]
```
## np.hstack
Stacks arrays horizontally (column-wise).
### Syntax
```python
numpy.hstack(arrays)
```
Args:
- arrays: Sequence of arrays to stack.
### Example
```python
import numpy as np
# create arrays
arr1 = np.array([[1, 2, 3], [7, 8, 9]])
arr2 = np.array([[4, 5, 6], [10, 11, 12]])
result = np.hstack((arr1, arr2))
print(result)
```
#### Output
```
[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
```
## np.dstack
Stacks arrays along the third axis (depth-wise).
### Syntax
```python
numpy.dstack(arrays)
```
Args:
- arrays: Sequence of arrays to stack.
### Example
```python
import numpy as np
# create arrays
arr1 = np.array([[1, 2, 3], [7, 8, 9]])
arr2 = np.array([[4, 5, 6], [10, 11, 12]])
result = np.dstack((arr1, arr2))
print(result)
```
#### Output
```
[[[ 1  4]
  [ 2  5]
  [ 3  6]]

 [[ 7 10]
  [ 8 11]
  [ 9 12]]]
```
## np.stack
Joins a sequence of arrays along a new axis.
### Syntax
```python
numpy.stack(arrays, axis)
```
Args:
- arrays: Sequence of arrays to stack. All arrays must have the same shape.
- axis: Position of the new axis in the result along which the arrays are stacked. Default is 0.
### Example
```python
import numpy as np
# create arrays
arr1 = np.array([[1, 2, 3], [7, 8, 9]])
arr2 = np.array([[4, 5, 6], [10, 11, 12]])
result = np.stack((arr1, arr2), axis=0)
print(result)
```
#### Output
```
[[[ 1  2  3]
  [ 7  8  9]]

 [[ 4  5  6]
  [10 11 12]]]
```
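Because `np.stack` inserts a new axis, the `axis` argument controls where that axis appears in the result's shape. The following sketch, using two 2x3 arrays as in the examples of this section, compares the shapes for each possible axis and checks that stacking along the last axis matches `np.dstack` for 2-D inputs.

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [7, 8, 9]])
arr2 = np.array([[4, 5, 6], [10, 11, 12]])

# Shape of the stacked result for each position of the new axis.
shapes = {axis: np.stack((arr1, arr2), axis=axis).shape for axis in (0, 1, 2)}
print(shapes)  # {0: (2, 2, 3), 1: (2, 2, 3), 2: (2, 3, 2)}

# For 2-D inputs, stacking along axis 2 gives the same array as np.dstack.
same_as_dstack = np.array_equal(np.stack((arr1, arr2), axis=2),
                                np.dstack((arr1, arr2)))
print(same_as_dstack)  # True
```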
# Concatenation with Mixed Dimensions
When concatenating arrays with different shapes, it's often necessary to reshape them to have compatible dimensions.
## Example
#### Concatenate along axis 0
```python
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([7, 8, 9])
result_0= np.concatenate((arr1, arr2[np.newaxis, :]), axis=0)
print(result_0)
```
#### Output
```
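[[1 2 3]
 [4 5 6]
 [7 8 9]]
```
#### Concatenate along axis 1
```python
# arr2 has 3 elements but arr1 has only 2 rows, so a 2-element array is used here
arr3 = np.array([7, 8])
result_1 = np.concatenate((arr1, arr3[:, np.newaxis]), axis=1)
print(result_1)
```
#### Output
```
[[1 2 3 7]
 [4 5 6 8]]
```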

@ -3,8 +3,11 @@
- [Installing NumPy](installing-numpy.md)
- [Introduction](introduction.md)
- [NumPy Data Types](datatypes.md)
- [Numpy Array Shape and Reshape](reshape-array.md)
- [Basic Mathematics](basic_math.md)
- [Operations on Arrays in NumPy](operations-on-arrays.md)
- [Loading Arrays from Files](loading_arrays_from_files.md)
- [Saving Numpy Arrays into Files](saving_numpy_arrays_to_files.md)
- [Sorting NumPy Arrays](sorting-array.md)
- [NumPy Array Iteration](array-iteration.md)
- [Concatenation of Arrays](concatenation-of-arrays.md)

@ -0,0 +1,57 @@
# Numpy Array Shape and Reshape
In NumPy, the primary data structure is the ndarray (N-dimensional array). An array can have one or more dimensions, and it organizes your data efficiently.
Let us create a 2D array
``` python
import numpy as np
numbers = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(numbers)
```
#### Output:
``` python
[[1 2 3 4]
 [5 6 7 8]]
```
## Changing Array Shape using `reshape()`
The `reshape()` function gives an array a new shape without changing its data.
It takes the new shape as its argument, for example the number of rows and columns for a 2D result; the total number of elements must stay the same. `reshape()` can add or remove dimensions. For instance, it can convert a 1D array into a 2D array or vice versa.
``` python
arr_1d = np.array([1, 2, 3, 4, 5, 6]) # 1D array
arr_2d = arr_1d.reshape(2, 3) # Reshaping with 2 rows and 3 cols
print(arr_2d)
```
#### Output:
``` python
[[1 2 3]
 [4 5 6]]
```
## Changing Array Shape using `resize()`
The `resize()` method modifies the shape of a NumPy array in place.
It takes the new shape as its argument, here a tuple of rows and columns.
``` python
import numpy as np
arr_1d = np.array([1, 2, 3, 4, 5, 6])
arr_1d.resize((2, 3)) # 2 rows and 3 cols
print(arr_1d)
```
#### Output:
``` python
[[1 2 3]
 [4 5 6]]
```
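The practical difference between the two is that `reshape()` returns a reshaped array and leaves the original's shape untouched, while the `resize()` method changes the array itself and returns `None`. The sketch below illustrates this; the variable names are chosen for the example, and `-1` is used to let NumPy infer one dimension.

```python
import numpy as np

arr = np.arange(1, 7)          # [1 2 3 4 5 6]

# reshape returns a reshaped array; -1 asks NumPy to infer that dimension (here 2).
reshaped = arr.reshape(3, -1)
print(reshaped.shape)          # (3, 2)
print(arr.shape)               # (6,) the original keeps its shape

# A shape whose size does not match the number of elements is rejected.
try:
    arr.reshape(4, 2)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)       # True

# resize works in place on the array and returns None.
other = np.array([1, 2, 3, 4, 5, 6])
returned = other.resize((2, 3))
print(other.shape, returned)   # (2, 3) None
```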

@ -0,0 +1,158 @@
# Working with Date & Time in Pandas
While working with data, it is common to come across data containing date and time. Pandas is a very handy tool for dealing with such data and provides a wide range of date and time data processing options.
- **Parsing dates and times**: Pandas provides a number of functions for parsing dates and times from strings, including `to_datetime()` and `parse_dates()`. These functions can handle a variety of date and time formats, Unix timestamps, and human-readable formats.
- **Manipulating dates and times**: Pandas provides a number of functions for manipulating dates and times, including `shift()`, `resample()`, and `to_timedelta()`. These functions can be used to add or subtract time periods, change the frequency of a time series, and calculate the difference between two dates or times.
- **Visualizing dates and times**: Pandas provides a number of functions for visualizing dates and times, including `plot()`, `hist()`, and `bar()`. These functions can be used to create line charts, histograms, and bar charts of date and time data.
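### `Timestamp` class
`pd.Timestamp` is the Pandas equivalent of Python's `datetime.datetime`: it represents a single point in time and exposes its components (year, month, day, hour and so on) as attributes. It is not itself a Unix timestamp, although one can be obtained from it with the `timestamp()` method.
Example for retrieving day, month and year from a given date: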
```python
import pandas as pd
ts = pd.Timestamp('2024-05-05')
y = ts.year
print('Year is:', y)
m = ts.month
print('Month is:', m)
d = ts.day
print('Day is:', d)
```
Output:
```python
Year is: 2024
Month is: 5
Day is: 5
```
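To relate a `Timestamp` to a Unix timestamp, a short sketch follows. The UTC timezone is an assumption made here so that the number of seconds is unambiguous.

```python
import pandas as pd

# A timezone-aware Timestamp (UTC chosen to keep the result unambiguous)
ts = pd.Timestamp('2024-05-05', tz='UTC')

# Seconds elapsed since 1970-01-01 00:00:00 UTC
unix_seconds = ts.timestamp()

# Converting the number back into a Timestamp
round_trip = pd.Timestamp(unix_seconds, unit='s', tz='UTC')

print(unix_seconds)
print(round_trip)
```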
Example for extracting time-related data from a given date:
```python
import pandas as pd
ts = pd.Timestamp('2024-10-24 12:00:00')
print('Hour is:', ts.hour)
print('Minute is:', ts.minute)
print('Weekday is:', ts.weekday())  # Monday is 0, Sunday is 6
print('Quarter is:', ts.quarter)
```
Output:
```python
Hour is: 12
Minute is: 0
Weekday is: 3
Quarter is: 4
```
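Because `weekday()` returns a number, it can help to print the name of the day as well. A brief sketch using the same date:

```python
import pandas as pd

ts = pd.Timestamp('2024-10-24 12:00:00')

print(ts.weekday())      # numbering starts at Monday = 0
print(ts.isoweekday())   # ISO numbering starts at Monday = 1
print(ts.day_name())     # name of the weekday
print(ts.month_name())   # name of the month
```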
### `Timestamp.now()`
Example for getting current date and time:
```python
import pandas as pd
ts = pd.Timestamp.now()
print('Current date and time is:', ts)
```
Output:
```python
Current date and time is: 2024-05-25 11:48:25.593213
```
### `date_range` function
Example for generating dates for five days, starting today:
```python
import pandas as pd
ts = pd.date_range(start = pd.Timestamp.now(), periods = 5)
for i in ts:
    print(i.date())
```
Output:
```python
2024-05-25
2024-05-26
2024-05-27
2024-05-28
2024-05-29
```
Example for generating dates for the five days ending today:
```python
import pandas as pd
ts = pd.date_range(end = pd.Timestamp.now(), periods = 5)
for i in ts:
    print(i.date())
```
Output:
```python
2024-05-21
2024-05-22
2024-05-23
2024-05-24
2024-05-25
```
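`date_range` also accepts a frequency and an explicit end date. The sketch below uses a fixed start date (an arbitrary choice, so the results are reproducible) instead of `Timestamp.now()`.

```python
import pandas as pd

start = pd.Timestamp('2024-05-25')   # a Saturday, fixed for reproducibility

# Five consecutive calendar days
daily = pd.date_range(start=start, periods=5)

# Three business days; the weekend at the start is skipped
business = pd.date_range(start=start, periods=3, freq='B')

# Every day between two dates, both ends included
between = pd.date_range(start='2024-05-25', end='2024-05-27')

# Timestamp.now() carries a time of day; normalize() resets it to midnight
today = pd.Timestamp.now().normalize()

print(daily)
print(business)
print(between)
print(today)
```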
### Built-in vs pandas date & time operations
In `pandas`, you can add a time delta to an entire column of dates in a single vectorized operation, whereas Python's built-in `datetime` requires an explicit loop or comprehension.
Example in Pandas:
```python
import pandas as pd
# 'min' means one-minute steps; the older alias 'T' is deprecated in recent pandas versions
dates = pd.DataFrame(pd.date_range('2023-01-01', periods=100000, freq='min'))
dates += pd.Timedelta(days=1)
print(dates)
```
Output:
```python
0
0 2023-01-02 00:00:00
1 2023-01-02 00:01:00
2 2023-01-02 00:02:00
3 2023-01-02 00:03:00
4 2023-01-02 00:04:00
... ...
99995 2023-03-12 10:35:00
99996 2023-03-12 10:36:00
99997 2023-03-12 10:37:00
99998 2023-03-12 10:38:00
99999 2023-03-12 10:39:00
```
Example using the built-in `datetime` library:
```python
from datetime import datetime, timedelta
dates = [datetime(2023, 1, 1) + timedelta(minutes=i) for i in range(100000)]
dates = [date + timedelta(days=1) for date in dates]
```
Why use pandas functions?
- Pandas stores datetime data in NumPy's `datetime64` dtype, which uses a fixed 8 bytes per value, so a column of dates is stored compactly in one contiguous array.
- Each Python `datetime` object takes up more memory, since it carries the overhead of a full Python object in addition to the date and time itself.
- Pandas offers a wide range of convenient functions and methods for date manipulation, extraction, and conversion, such as `pd.to_datetime()`, `date_range()`, `timedelta_range()`, and more. With the `datetime` library, many of these operations have to be implemented by hand, which leads to longer and usually slower code.
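The points above can be checked with a small sketch that runs both approaches on the same data and compares the results. The sample size of 1,000 is an arbitrary choice to keep the example quick, and the exact size reported for a `datetime` object depends on the Python build.

```python
from datetime import datetime, timedelta
import sys

import pandas as pd

n = 1_000

# pandas: one vectorized addition over the whole Series
pandas_dates = pd.Series(pd.date_range('2023-01-01', periods=n, freq='min'))
pandas_shifted = pandas_dates + pd.Timedelta(days=1)

# built-in datetime: a comprehension that visits every element
builtin_dates = [datetime(2023, 1, 1) + timedelta(minutes=i) for i in range(n)]
builtin_shifted = [d + timedelta(days=1) for d in builtin_dates]

# Both approaches yield the same points in time
same = [ts.to_pydatetime() for ts in pandas_shifted] == builtin_shifted
print("Same result:", same)

# Storage per value: fixed-width datetime64 versus a full Python object
print("Bytes per value in pandas:", pandas_shifted.dtype.itemsize)
print("Bytes per datetime object:", sys.getsizeof(builtin_shifted[0]))
```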

@ -5,5 +5,6 @@
- [Pandas Descriptive Statistics](Descriptive_Statistics.md)
- [Group By Functions with Pandas](GroupBy_Functions_Pandas.md)
- [Excel using Pandas DataFrame](excel_with_pandas.md)
- [Working with Date & Time in Pandas](datetime.md)
- [Importing and Exporting Data in Pandas](import-export.md)
- [Handling Missing Values in Pandas](handling-missing-values.md)

@ -0,0 +1,216 @@
# Bar Plots in Matplotlib
A bar plot or bar chart is a type of data visualisation that represents data as rectangular bars whose lengths or heights are proportional to the values they represent. Bar plots can be drawn both vertically and horizontally.
It is one of the most widely used types of data visualisation because it is easy to interpret.
Matplotlib provides an easy and intuitive way to create highly customized bar plots.
## Prerequisites
Before creating bar plots in matplotlib you must ensure that you have Python as well as Matplotlib installed on your system.
## Creating a simple Bar Plot with `bar()` method
A very basic bar plot can be created with the `bar()` function in `matplotlib.pyplot`.
```Python
import matplotlib.pyplot as plt
# Creating dataset
x = ["A", "B", "C", "D"]
y = [2, 7, 9, 11]
# Creating bar plot
plt.bar(x,y)
plt.show() # Shows the plot
```
When executed, this would show the following bar plot:
![Basic Bar Plot](images/basic_bar_plot.png)
The `bar()` function takes arguments that describe the layout of the bars.
Here, `plt.bar(x,y)` specifies that the values of `x` are the bar positions (categories) along the X-axis and the values of `y` are the bar heights along the Y-axis. You can customize the graph further, for example by adding labels to the axes or changing the color of the bars. These options are explored in the upcoming sections.
You can also pass `numpy` arrays, which is convenient when the data already comes from numerical computations or large datasets.
```Python
import matplotlib.pyplot as plt
import numpy as np
# Using numpy array
x = np.array(["A", "B", "C", "D"])
y = np.array([2, 7, 9, 11])
plt.bar(x,y)
plt.show()
```
Its output would be the same as above.
## Customizing Bar Plots
For creating customized bar plots, it is **recommended** to create the plots using `matplotlib.pyplot.subplots()`, which returns explicit `Figure` and `Axes` objects. Customizations are then applied through methods on those objects, which is easier to follow than relying on pyplot's implicit current figure.
### Adding title to the graph and labeling the axes
Let us create an imaginary graph of the number of cars sold in various years.
```Python
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
years = ['1999', '2000', '2001', '2002']
num_of_cars_sold = [300, 500, 700, 1000]
# Creating bar plot
ax.bar(years, num_of_cars_sold)
# Adding axis labels
ax.set_xlabel("Years")
ax.set_ylabel("Number of cars sold")
# Adding plot title
ax.set_title("Number of cars sold in various years")
plt.show()
```
![Title and axis labels](images/title_and_axis_labels.png)
Here, we have called `matplotlib.pyplot.subplots()`, which returns a `Figure` object `fig` as well as an `Axes` object `ax`, both of which are used for customizing the bar plot. `ax.set_xlabel`, `ax.set_ylabel` and `ax.set_title` are used for labeling the X-axis, labeling the Y-axis and adding a title to the graph, respectively.
### Adding bar colors and legends
Let us consider our previous example of the number of cars sold in various years and suppose that we want to color the bars by century and add a matching legend for better interpretation.
This can be achieved by creating two separate lists, `bar_colors` for bar colors and `bar_labels` for legend labels, and passing them to the `color` and `label` parameters of the `ax.bar` method respectively.
```Python
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
years = ['1998', '1999', '2000', '2001', '2002']
num_of_cars_sold = [200, 300, 500, 700, 1000]
bar_colors = ['tab:green', 'tab:green', 'tab:blue', 'tab:blue', 'tab:blue']
bar_labels = ['1900s', '_1900s', '2000s', '_2000s', '_2000s']
# Creating the customized bar plot
ax.bar(years, num_of_cars_sold, color=bar_colors, label=bar_labels)
# Adding axis labels
ax.set_xlabel("Years")
ax.set_ylabel("Number of cars sold")
# Adding plot title
ax.set_title("Number of cars sold in various years")
# Adding legend title
ax.legend(title='Centuries')
plt.show()
```
![Bar colors and Legends](images/bar_colors_and_legends.png)
Note that labels with a preceding underscore do not show up in the legend. A legend title can be added by passing the `title` argument to `ax.legend()`, as shown. You can also give all the bars the same color by passing a single color, such as a hex string like `'#1f77b4'`, to the `color` parameter.
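As a rough sketch of the single-color case, the snippet below colors every bar with one hex value and then highlights one bar afterwards. The hex values and the choice of which bar to highlight are arbitrary. Call `plt.show()` at the end to display the figure.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

fig, ax = plt.subplots()
years = ['1999', '2000', '2001', '2002']
num_of_cars_sold = [300, 500, 700, 1000]

# One hex string colors every bar; edgecolor draws an outline around each
bars = ax.bar(years, num_of_cars_sold, color='#1f77b4', edgecolor='black')

# The returned container can be indexed to restyle an individual bar
bars[-1].set_facecolor('#ff7f0e')   # highlight the last year

ax.set_xlabel("Years")
ax.set_ylabel("Number of cars sold")
```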
### Adding labels to bars
We may want to add labels to bars representing their absolute (or truncated) values for instant and accurate reading. This can be achieved by passing the `BarContainer` object returned by `ax.bar()`, which is a container holding all the bars and, optionally, the error bars, to the `ax.bar_label` method (available in Matplotlib 3.4 and later).
```Python
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
years = ['1998', '1999', '2000', '2001', '2002']
num_of_cars_sold = [200, 300, 500, 700, 1000]
bar_colors = ['tab:green', 'tab:green', 'tab:blue', 'tab:blue', 'tab:blue']
bar_labels = ['1900s', '_1900s', '2000s', '_2000s', '_2000s']
# BarContainer object
bar_container = ax.bar(years, num_of_cars_sold, color=bar_colors, label=bar_labels)
ax.set_xlabel("Years")
ax.set_ylabel("Number of cars sold")
ax.set_title("Number of cars sold in various years")
ax.legend(title='Centuries')
# Adding bar labels
ax.bar_label(bar_container)
plt.show()
```
![Bar Labels](images/bar_labels.png)
**Note:** There are various other methods of adding bar labels in matplotlib.
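Two such variations are sketched below, on two side-by-side axes so the labels do not overlap: a format string with padding, and custom label text placed at the center of each bar. The label wording is illustrative. Call `plt.show()` at the end to display the figure.

```python
import matplotlib.pyplot as plt

years = ['1998', '1999', '2000', '2001', '2002']
num_of_cars_sold = [200, 300, 500, 700, 1000]

fig, (ax_edge, ax_center) = plt.subplots(1, 2, figsize=(10, 4))

# Variation 1: format the value and leave a 3-point gap above each bar
edge_bars = ax_edge.bar(years, num_of_cars_sold)
edge_labels = ax_edge.bar_label(edge_bars, fmt='%d cars', padding=3)

# Variation 2: supply the label text yourself and center it inside the bar
center_bars = ax_center.bar(years, num_of_cars_sold)
custom_text = [f"{value:,}" for value in num_of_cars_sold]
center_labels = ax_center.bar_label(center_bars, labels=custom_text, label_type='center')
```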
## Horizontal Bar Plot
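We can create horizontal bar plots by using the `barh()` method of the `Axes` object (also available as `matplotlib.pyplot.barh()`). All the relevant customizations are applicable here as well. Note that the categories now run along the Y-axis and the values along the X-axis.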
```Python
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10,5)) # figsize is used to alter the size of figure
years = ['1998', '1999', '2000', '2001', '2002']
num_of_cars_sold = [200, 300, 500, 700, 1000]
bar_colors = ['tab:green', 'tab:green', 'tab:blue', 'tab:blue', 'tab:blue']
bar_labels = ['1900s', '_1900s', '2000s', '_2000s', '_2000s']
# Creating horizontal bar plot
bar_container = ax.barh(years, num_of_cars_sold, color=bar_colors, label=bar_labels)
# Adding axis labels
ax.set_xlabel("Number of cars sold")  # values run along the X-axis in a horizontal plot
ax.set_ylabel("Years")                # categories run along the Y-axis
# Adding Title
ax.set_title("Number of cars sold in various years")
ax.legend(title='Centuries')
# Adding bar labels
ax.bar_label(bar_container)
plt.show()
```
![Horizontal Bar Plot-1](images/horizontal_bar_plot_1.png)
We can also invert the Y-axis so that the first category appears at the top instead of the bottom.
```Python
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10,5)) # figsize is used to alter the size of figure
years = ['1998', '1999', '2000', '2001', '2002']
num_of_cars_sold = [200, 300, 500, 700, 1000]
bar_colors = ['tab:green', 'tab:green', 'tab:blue', 'tab:blue', 'tab:blue']
bar_labels = ['1900s', '_1900s', '2000s', '_2000s', '_2000s']
# Creating horizontal bar plot
bar_container = ax.barh(years, num_of_cars_sold, color=bar_colors, label=bar_labels)
# Adding axis labels
ax.set_xlabel("Number of cars sold")  # values run along the X-axis in a horizontal plot
ax.set_ylabel("Years")                # categories run along the Y-axis
# Adding Title
ax.set_title("Number of cars sold in various years")
ax.legend(title='Centuries')
# Adding bar labels
ax.bar_label(bar_container)
# Inverting Y-axis
ax.invert_yaxis()
plt.show()
```
![Horizontal Bar Plot-2](images/horizontal_bar_plot_2.png)
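To compare the two orientations directly, here is a minimal sketch that draws the same data vertically and horizontally in one figure. The figure size and title are arbitrary choices. Call `plt.show()` at the end to display the figure.

```python
import matplotlib.pyplot as plt

years = ['1998', '1999', '2000', '2001', '2002']
num_of_cars_sold = [200, 300, 500, 700, 1000]

# One row, two columns of axes in a single figure
fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(10, 4))

# Vertical bars: categories on the X-axis, values on the Y-axis
ax_v.bar(years, num_of_cars_sold)
ax_v.set_xlabel("Years")
ax_v.set_ylabel("Number of cars sold")

# Horizontal bars: the roles of the two axes are swapped
ax_h.barh(years, num_of_cars_sold)
ax_h.set_xlabel("Number of cars sold")
ax_h.set_ylabel("Years")
ax_h.invert_yaxis()   # first category at the top

fig.suptitle("Number of cars sold in various years")
fig.tight_layout()
```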

@ -1,4 +1,5 @@
# List of sections
- [Installing Matplotlib](matplotlib-installation.md)
- [Bar Plots in Matplotlib](matplotlib-bar-plots.md)
- [Pie Charts in Matplotlib](matplotlib-pie-charts.md)