Merge branch 'main' into new_branch

pull/1003/head
Ashita Prasad 2024-06-08 11:54:36 +05:30 committed by GitHub
commit 2439ffee89
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
24 changed files with 1699 additions and 1 deletion

View file

@ -0,0 +1,75 @@
# Understanding the `eval` Function in Python
## Introduction
The `eval` function in Python lets you evaluate a string as a Python expression at runtime. This is useful when the expression to evaluate is not known until the program is running.
## Syntax
```python
eval(expression, globals=None, locals=None)
```
### Parameters:
* `expression`: a string that is parsed and evaluated as a Python expression.
* `globals` (optional): a dictionary specifying the global names and values available to the expression.
* `locals` (optional): a dictionary specifying the local names and values available to the expression.
## Examples
Example 1:
```python
result = eval('2 + 3 * 4')
print(result) # Output: 14
```
Example 2:
```python
x = 10
expression = 'x * 2'
result = eval(expression, {'x': x})
print(result) # Output: 20
```
Example 3:
```python
x = 10

def multiply(a, b):
    return a * b

expression = 'multiply(x, 5) + 2'
result = eval(expression)
print("Result:", result)  # Output: Result: 52
```
Example 4:
```python
expression = input("Enter a Python expression: ")
result = eval(expression)
print("Result:", result)
# input = "3+2"
# Output: Result: 5
```
Example 5:
```python
import numpy as np

a = np.random.randint(1, 9)
b = np.random.randint(1, 9)
operations = ["*", "-", "+"]
op = np.random.choice(operations)
expression = str(a) + op + str(b)
correct_answer = eval(expression)
given_answer = int(input(str(a) + " " + op + " " + str(b) + " = "))
if given_answer == correct_answer:
    print("Correct")
else:
    print("Incorrect")
    print("correct answer is :", correct_answer)
# 2 * 1 = 8
# Incorrect
# correct answer is : 2
# or
# 3 * 2 = 6
# Correct
```
## Conclusion
The `eval` function is a powerful tool in Python that allows for dynamic evaluation of expressions. Because it executes whatever code it is given, it should never be used on untrusted input without restrictions.
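A common mitigation is to restrict the names `eval` can see by passing an explicit `globals` dictionary. The snippet below is a minimal sketch of that pattern; note that even with restricted globals, `eval` is not a real sandbox:
```python
# A minimal sketch of restricting what eval can see.
# Even with restricted globals, eval is not a sandbox and should not
# be used on untrusted input.
allowed = {"__builtins__": {}, "x": 10}
print(eval("x * 2", allowed))        # Output: 20
# eval("__import__('os')", allowed)  # would raise NameError: '__import__' is not defined
```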

View file

@ -2,6 +2,8 @@
- [OOPs](oops.md)
- [Decorators/\*args/**kwargs](decorator-kwargs-args.md)
- ['itertools' module](itertools.md)
- [Type Hinting](type-hinting.md)
- [Lambda Function](lambda-function.md)
- [Working with Dates & Times in Python](dates_and_times.md)
- [Regular Expressions in Python](regular_expressions.md)
@ -11,4 +13,6 @@
- [Exception Handling in Python](exception-handling.md)
- [Generators](generators.md)
- [Filter](filter-function.md)
- [Reduce](reduce-function.md)
- [List Comprehension](list-comprehension.md)
- [Eval Function](eval_function.md)

View file

@ -0,0 +1,144 @@
# The 'itertools' Module in Python
The itertools module in Python provides a collection of fast, memory-efficient tools that are useful for creating and working with iterators. These functions
allow you to iterate over data in various ways, often combining, filtering, or extending iterators to generate complex sequences efficiently.
## Benefits of itertools
1. Efficiency: Functions in itertools are designed to be memory-efficient, often generating elements on the fly and avoiding the need to store large intermediate results.
2. Conciseness: Using itertools can lead to more readable and concise code, reducing the need for complex loops and temporary variables.
3. Composability: Functions from itertools can be easily combined, allowing you to build complex iterator pipelines from simple building blocks.
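For example, the functions described below can be composed into a small pipeline:
```python
import itertools

# Take the first five even numbers greater than 10 by composing an
# infinite source (count) with a filter and a bounded slice (islice).
evens = (n for n in itertools.count(11) if n % 2 == 0)
print(list(itertools.islice(evens, 5)))
# Output: [12, 14, 16, 18, 20]
```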
## Useful Functions in itertools
Here are some of the most useful functions in the itertools module, along with examples of how to use them:
1. 'count': Generates an infinite sequence of numbers, starting from a specified value.
```python
import itertools

counter = itertools.count(start=10, step=2)
for _ in range(5):
    print(next(counter))
# Output: 10, 12, 14, 16, 18
```
2. 'cycle': Cycles through an iterable indefinitely.
```python
import itertools

cycler = itertools.cycle(['A', 'B', 'C'])
for _ in range(6):
    print(next(cycler))
# Output: A, B, C, A, B, C
```
3. 'repeat': Repeats an object a specified number of times or indefinitely.
```python
import itertools

repeater = itertools.repeat('Hello', 3)
for item in repeater:
    print(item)
# Output: Hello, Hello, Hello
```
4. 'chain': Combines multiple iterables into a single iterable.
```python
import itertools

combined = itertools.chain([1, 2, 3], ['a', 'b', 'c'])
for item in combined:
    print(item)
# Output: 1, 2, 3, a, b, c
```
5. 'islice': Slices an iterator, similar to slicing a list.
```python
import itertools

sliced = itertools.islice(range(10), 2, 8, 2)
for item in sliced:
    print(item)
# Output: 2, 4, 6
```
6. 'compress': Filters elements in an iterable based on a corresponding selector iterable.
```python
import itertools

data = ['A', 'B', 'C', 'D']
selectors = [1, 0, 1, 0]
result = itertools.compress(data, selectors)
for item in result:
    print(item)
# Output: A, C
```
7. 'permutations': Generates all possible permutations of an iterable.
```python
import itertools

perms = itertools.permutations('ABC', 2)
for item in perms:
    print(item)
# Output: ('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')
```
8. 'combinations': Generates all possible combinations of a specified length from an iterable.
```python
import itertools

combs = itertools.combinations('ABC', 2)
for item in combs:
    print(item)
# Output: ('A', 'B'), ('A', 'C'), ('B', 'C')
```
9. 'product': Computes the Cartesian product of input iterables.
```python
import itertools

prod = itertools.product('AB', '12')
for item in prod:
    print(item)
# Output: ('A', '1'), ('A', '2'), ('B', '1'), ('B', '2')
```
10. 'groupby': Groups elements of an iterable by a specified key function.
```python
import itertools

data = [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 25}, {'name': 'Charlie', 'age': 30}]
sorted_data = sorted(data, key=lambda x: x['age'])
grouped = itertools.groupby(sorted_data, key=lambda x: x['age'])
for key, group in grouped:
    print(key, list(group))
# Output:
# 25 [{'name': 'Alice', 'age': 25}, {'name': 'Bob', 'age': 25}]
# 30 [{'name': 'Charlie', 'age': 30}]
```
11. 'accumulate': Makes an iterator that returns accumulated sums, or accumulated results of other binary functions specified via the optional func argument.
```python
import itertools
import operator

data = [1, 2, 3, 4, 5]
acc = itertools.accumulate(data, operator.mul)
for item in acc:
    print(item)
# Output: 1, 2, 6, 24, 120
```
## Conclusion
The itertools module is a powerful toolkit for working with iterators in Python. Its functions enable efficient and concise handling of iterable data, allowing you to create complex data processing pipelines with minimal memory overhead.
By leveraging itertools, you can improve the readability and performance of your code, making it a valuable addition to your Python programming arsenal.

View file

@ -0,0 +1,73 @@
# List Comprehension
List comprehension in Python is a concise, expressive way to create lists. It lets you generate a new list from an existing iterable such as a list, tuple, or string in a single short expression.
This improves the readability of the code and reduces the need for explicit looping constructs.
## Syntax :
### Basic syntax
```python
new_list = [expression for item in iterable]
```
- **new_list**: This is the name given to the list that will be created using the list comprehension.
- **expression**: This is the expression that defines how each element of the new list will be generated or transformed.
- **item**: This variable represents each individual element from the iterable. It takes on the value of each element in the iterable during each iteration.
- **iterable**: This is the sequence-like object over which the iteration will take place. It provides the elements that will be processed by the expression.
This list comprehension syntax `[expression for item in iterable]` allows you to generate a new list by applying a specific expression to each element in an iterable.
### Syntax including condition
```python
new_list = [expression for item in iterable if condition]
```
- **new_list**: This is the name given to the list that will be created using the list comprehension.
- **expression**: This is the expression that defines how each element of the new list will be generated or transformed.
- **item**: This variable represents each individual element from the iterable. It takes on the value of each element in the iterable during each iteration.
- **iterable**: This is the sequence-like object over which the iteration will take place. It provides the elements that will be processed by the expression.
- **if condition**: This is an optional part of the syntax. It allows for conditional filtering of elements from the iterable. Only items that satisfy the condition
will be included in the new list.
## Examples:
1. Generating a list of squares of numbers from 1 to 5:
```python
squares = [x ** 2 for x in range(1, 6)]
print(squares)
```
- **Output** :
```python
[1, 4, 9, 16, 25]
```
2. Filtering even numbers from a list:
```python
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even = [x for x in nums if x % 2 == 0]
print(even)
```
- **Output** :
```python
[2, 4, 6, 8, 10]
```
3. Flattening a list of lists:
```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for sublist in matrix for x in sublist]
print(flat)
```
- **Output** :
```python
[1, 2, 3, 4, 5, 6, 7, 8, 9]
```
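4. Using a conditional expression in the expression part (an `if`/`else` that transforms each element rather than filtering it out; this complements the `if` filter shown above):
```python
labels = ["even" if x % 2 == 0 else "odd" for x in range(1, 6)]
print(labels)
```
- **Output** :
```python
['odd', 'even', 'odd', 'even', 'odd']
```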
List comprehension is a powerful feature in Python for creating lists based on existing iterables with a concise syntax.
By mastering list comprehension, developers can write cleaner, more expressive code and leverage Python's functional programming capabilities effectively.

View file

@ -0,0 +1,106 @@
# Introduction to Type Hinting in Python
Type hinting is a feature in Python that allows you to specify the expected data types of variables, function arguments, and return values. It was introduced
in Python 3.5 via PEP 484 and has since become a standard practice to improve code readability and facilitate static analysis tools.
**Benefits of Type Hinting**
1. Improved Readability: Type hints make it clear what type of data is expected, making the code easier to understand for others and your future self.
2. Error Detection: Static analysis tools like MyPy can use type hints to detect type errors before runtime, reducing bugs and improving code quality (see the short example after this list).
3. Better Tooling Support: Modern IDEs and editors can leverage type hints to provide better autocompletion, refactoring, and error checking features.
4. Documentation: Type hints serve as a form of documentation, indicating the intended usage of functions and classes.
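For instance, a checker such as mypy (assuming it is installed, e.g. via `pip install mypy`) can flag a badly typed call before the program is ever run:
```python
def double(n: int) -> int:
    return n * 2

double("hello")  # runs at runtime, but mypy reports something like:
# error: Argument 1 to "double" has incompatible type "str"; expected "int"
```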
**Syntax of Type Hinting** <br>
Type hints can be added to variables, function arguments, and return values using annotations.
1. Variable Annotations:
```python
age: int = 25
name: str = "Alice"
is_student: bool = True
```
2. Function Annotations:
```python
def greet(name: str) -> str:
    return f"Hello, {name}!"
```
3. Multiple Arguments and Return Types:
```python
def add(a: int, b: int) -> int:
    return a + b
```
4. Optional Types: Use the Optional type from the typing module for values that could be None.
```python
from typing import Optional

def get_user_name(user_id: int) -> Optional[str]:
    # Function logic here
    return None  # Example return value
```
5. Union Types: Use the Union type when a variable can be of multiple types.
```python
from typing import Union

def get_value(key: str) -> Union[int, str]:
    # Function logic here
    return "value"  # Example return value
```
6. List and Dictionary Types: Use the List and Dict types from the typing module for collections.
```python
from typing import List, Dict

def process_data(data: List[int]) -> Dict[str, int]:
    # Function logic here
    return {"sum": sum(data)}  # Example return value
```
7. Type Aliases: Create type aliases for complex types to make the code more readable.
```python
from typing import List, Tuple

Coordinates = List[Tuple[int, int]]

def draw_shape(points: Coordinates) -> None:
    # Function logic here
    pass
```
**Example of Type Hinting in a Class** <br>
Here is a more comprehensive example using type hints in a class:
```python
from typing import List

class Student:
    def __init__(self, name: str, age: int, grades: List[int]) -> None:
        self.name = name
        self.age = age
        self.grades = grades

    def average_grade(self) -> float:
        return sum(self.grades) / len(self.grades)

    def add_grade(self, grade: int) -> None:
        self.grades.append(grade)

# Example usage
student = Student("Alice", 20, [90, 85, 88])
print(student.average_grade())  # Output: 87.66666666666667
student.add_grade(92)
print(student.average_grade())  # Output: 88.75
```
### Conclusion
Type hinting in Python enhances code readability, facilitates error detection through static analysis, and improves tooling support. By adopting
type hinting, you can write clearer and more maintainable code, reducing the likelihood of bugs and making your codebase easier to navigate for yourself and others.

View file

@ -0,0 +1,153 @@
# Hashing with Chaining
In Data Structures and Algorithms, hashing is used to map data of arbitrary size to fixed-size values. A common approach to handle collisions in hashing is **chaining**. In chaining, each slot of the hash table contains a linked list, and all elements that hash to the same slot are stored in that list.
## Points to be Remembered
- **Hash Function**: A function that converts an input (or 'key') into an index in a hash table.
- **Collision**: When two keys hash to the same index.
- **Chaining**: A method to resolve collisions by maintaining a linked list for each hash table slot.
## Real Life Examples of Hashing with Chaining
- **Phone Directory**: Contacts are stored in a hash table where the contact's name is hashed to an index. If multiple names hash to the same index, they are stored in a linked list at that index.
- **Library Catalog**: Books are indexed by their titles. If multiple books have titles that hash to the same index, they are stored in a linked list at that index.
## Applications of Hashing
Hashing is widely used in Computer Science:
- **Database Indexing**
- **Caches** (like CPU caches, web caches)
- **Associative Arrays** (or dictionaries in Python)
- **Sets** (unordered collections of unique elements)
Understanding these applications is essential for Software Development.
## Operations in Hash Table with Chaining
Key operations include:
- **INSERT**: Insert a new element into the hash table.
- **SEARCH**: Find the position of an element in the hash table.
- **DELETE**: Remove an element from the hash table.
## Implementing Hash Table with Chaining in Python
```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None

class HashTable:
    def __init__(self, size):
        self.size = size
        self.table = [None] * size

    def hash_function(self, key):
        return key % self.size

    def insert(self, key, value):
        hash_index = self.hash_function(key)
        new_node = Node(key, value)
        if self.table[hash_index] is None:
            self.table[hash_index] = new_node
        else:
            current = self.table[hash_index]
            while current.next is not None:
                current = current.next
            current.next = new_node

    def search(self, key):
        hash_index = self.hash_function(key)
        current = self.table[hash_index]
        while current is not None:
            if current.key == key:
                return current.value
            current = current.next
        return None

    def delete(self, key):
        hash_index = self.hash_function(key)
        current = self.table[hash_index]
        prev = None
        while current is not None:
            if current.key == key:
                if prev is None:
                    self.table[hash_index] = current.next
                else:
                    prev.next = current.next
                return True
            prev = current
            current = current.next
        return False

    def display(self):
        for index, item in enumerate(self.table):
            print(f"Index {index}:", end=" ")
            current = item
            while current is not None:
                print(f"({current.key}, {current.value})", end=" -> ")
                current = current.next
            print("None")
# Example usage
hash_table = HashTable(10)
hash_table.insert(1, 'A')
hash_table.insert(11, 'B')
hash_table.insert(21, 'C')
print("Hash Table after Insert operations:")
hash_table.display()
print("Search operation for key 11:", hash_table.search(11))
hash_table.delete(11)
print("Hash Table after Delete operation:")
hash_table.display()
```
## Output
```markdown
Hash Table after Insert operations:
Index 0: None
Index 1: (1, 'A') -> (11, 'B') -> (21, 'C') -> None
Index 2: None
Index 3: None
Index 4: None
Index 5: None
Index 6: None
Index 7: None
Index 8: None
Index 9: None
Search operation for key 11: B
Hash Table after Delete operation:
Index 0: None
Index 1: (1, 'A') -> (21, 'C') -> None
Index 2: None
Index 3: None
Index 4: None
Index 5: None
Index 6: None
Index 7: None
Index 8: None
Index 9: None
```
## Complexity Analysis
- **Insertion**: Average case O(1), Worst case O(n) when many elements hash to the same slot.
- **Search**: Average case O(1), Worst case O(n) when many elements hash to the same slot.
- **Deletion**: Average case O(1), Worst case O(n) when many elements hash to the same slot.

View file

@ -0,0 +1,139 @@
# Hashing with Linear Probing
In Data Structures and Algorithms, hashing is used to map data of arbitrary size to fixed-size values. A common approach to handle collisions in hashing is **linear probing**. In linear probing, if a collision occurs (i.e., the hash value points to an already occupied slot), we linearly probe through the table to find the next available slot. This method ensures that every element can be inserted or found in the hash table.
## Points to be Remembered
- **Hash Function**: A function that converts an input (or 'key') into an index in a hash table.
- **Collision**: When two keys hash to the same index.
- **Linear Probing**: A method to resolve collisions by checking the next slot (i.e., index + 1) until an empty slot is found.
## Real Life Examples of Hashing with Linear Probing
- **Student Record System**: Each student record is stored in a table where the student's ID number is hashed to an index. If two students have the same hash index, linear probing finds the next available slot.
- **Library System**: Books are indexed by their ISBN numbers. If two books hash to the same slot, linear probing helps find another spot for the book in the catalog.
## Applications of Hashing
Hashing is widely used in Computer Science:
- **Database Indexing**
- **Caches** (like CPU caches, web caches)
- **Associative Arrays** (or dictionaries in Python)
- **Sets** (unordered collections of unique elements)
Understanding these applications is essential for Software Development.
## Operations in Hash Table with Linear Probing
Key operations include:
- **INSERT**: Insert a new element into the hash table.
- **SEARCH**: Find the position of an element in the hash table.
- **DELETE**: Remove an element from the hash table.
## Implementing Hash Table with Linear Probing in Python
```python
class HashTable:
    def __init__(self, size):
        self.size = size
        self.table = [None] * size

    def hash_function(self, key):
        return key % self.size

    def insert(self, key, value):
        hash_index = self.hash_function(key)
        if self.table[hash_index] is None:
            self.table[hash_index] = (key, value)
        else:
            while self.table[hash_index] is not None:
                hash_index = (hash_index + 1) % self.size
            self.table[hash_index] = (key, value)

    def search(self, key):
        hash_index = self.hash_function(key)
        while self.table[hash_index] is not None:
            if self.table[hash_index][0] == key:
                return self.table[hash_index][1]
            hash_index = (hash_index + 1) % self.size
        return None

    def delete(self, key):
        hash_index = self.hash_function(key)
        while self.table[hash_index] is not None:
            if self.table[hash_index][0] == key:
                self.table[hash_index] = None
                return True
            hash_index = (hash_index + 1) % self.size
        return False

    def display(self):
        for index, item in enumerate(self.table):
            print(f"Index {index}: {item}")
# Example usage
hash_table = HashTable(10)
hash_table.insert(1, 'A')
hash_table.insert(11, 'B')
hash_table.insert(21, 'C')
print("Hash Table after Insert operations:")
hash_table.display()
print("Search operation for key 11:", hash_table.search(11))
hash_table.delete(11)
print("Hash Table after Delete operation:")
hash_table.display()
```
## Output
```markdown
Hash Table after Insert operations:
Index 0: None
Index 1: (1, 'A')
Index 2: (11, 'B')
Index 3: (21, 'C')
Index 4: None
Index 5: None
Index 6: None
Index 7: None
Index 8: None
Index 9: None
Search operation for key 11: B
Hash Table after Delete operation:
Index 0: None
Index 1: (1, 'A')
Index 2: None
Index 3: (21, 'C')
Index 4: None
Index 5: None
Index 6: None
Index 7: None
Index 8: None
Index 9: None
```
## Complexity Analysis
- **Insertion**: Average case O(1), Worst case O(n) when many collisions occur.
- **Search**: Average case O(1), Worst case O(n) when many collisions occur.
- **Deletion**: Average case O(1), Worst case O(n) when many collisions occur.
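A caveat about the `delete` method above: resetting a slot to `None` breaks the probe chain for keys stored beyond it (after `delete(11)`, a subsequent `search(21)` stops at the emptied slot and returns `None`). A common fix, sketched below as one possible variant rather than part of the original code, is to mark deleted slots with a "tombstone" sentinel:
```python
class HashTableWithTombstones:
    _DELETED = object()  # sentinel marking a removed slot

    def __init__(self, size):
        self.size = size
        self.table = [None] * size

    def hash_function(self, key):
        return key % self.size

    def insert(self, key, value):
        hash_index = self.hash_function(key)
        # Reuse empty or tombstoned slots
        while self.table[hash_index] not in (None, self._DELETED):
            hash_index = (hash_index + 1) % self.size
        self.table[hash_index] = (key, value)

    def search(self, key):
        hash_index = self.hash_function(key)
        while self.table[hash_index] is not None:
            entry = self.table[hash_index]
            if entry is not self._DELETED and entry[0] == key:
                return entry[1]
            hash_index = (hash_index + 1) % self.size
        return None

    def delete(self, key):
        hash_index = self.hash_function(key)
        while self.table[hash_index] is not None:
            entry = self.table[hash_index]
            if entry is not self._DELETED and entry[0] == key:
                self.table[hash_index] = self._DELETED  # keep the probe chain intact
                return True
            hash_index = (hash_index + 1) % self.size
        return False

# With tombstones, keys stored past a deleted slot remain reachable:
ht = HashTableWithTombstones(10)
for k, v in [(1, 'A'), (11, 'B'), (21, 'C')]:
    ht.insert(k, v)
ht.delete(11)
print(ht.search(21))  # C
```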

View file

@ -13,3 +13,6 @@
- [Stacks in Python](stacks.md)
- [Sliding Window Technique](sliding-window.md)
- [Trie](trie.md)
- [Two Pointer Technique](two-pointer-technique.md)
- [Hashing through Linear Probing](hashing-linear-probing.md)
- [Hashing through Chaining](hashing-chaining.md)

View file

@ -0,0 +1,132 @@
# Two-Pointer Technique
---
- The two-pointer technique is a popular algorithmic strategy used to solve various problems efficiently. This technique involves using two pointers (or indices) to traverse through data structures such as arrays or linked lists.
- The pointers can move in different directions, allowing for efficient processing of elements to achieve the desired results.
## Common Use Cases
1. **Finding pairs in a sorted array that sum to a target**: One pointer starts at the beginning and the other at the end.
2. **Reversing a linked list**: One pointer starts at the head, and the other at the next node, progressing through the list.
3. **Removing duplicates from a sorted array**: One pointer keeps track of the unique elements, and the other traverses the array.
4. **Merging two sorted arrays**: Two pointers are used to iterate through the arrays and merge them.
## Example 1: Finding Pairs with a Given Sum
### Problem Statement
Given a sorted array of integers and a target sum, find all pairs in the array that sum up to the target.
### Approach
1. Initialize two pointers: one at the beginning (`left`) and one at the end (`right`) of the array.
2. Calculate the sum of the elements at the `left` and `right` pointers.
3. If the sum is equal to the target, record the pair and move both pointers inward.
4. If the sum is less than the target, move the `left` pointer to the right to increase the sum.
5. If the sum is greater than the target, move the `right` pointer to the left to decrease the sum.
6. Repeat the process until the `left` pointer is not less than the `right` pointer.
### Example Code
```python
def find_pairs_with_sum(arr, target):
    left = 0
    right = len(arr) - 1
    pairs = []
    while left < right:
        current_sum = arr[left] + arr[right]
        if current_sum == target:
            pairs.append((arr[left], arr[right]))
            left += 1
            right -= 1
        elif current_sum < target:
            left += 1
        else:
            right -= 1
    return pairs
# Example usage
arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
target = 10
result = find_pairs_with_sum(arr, target)
print("Pairs with sum", target, "are:", result)
```
## Example 2: Removing Duplicates from a Sorted Array
### Problem Statement
Given a sorted array, remove the duplicates in place such that each element appears only once and return the new length of the array.
### Approach
1. If the array is empty, return 0.
2. Initialize a slow pointer at the beginning of the array.
3. Use a fast pointer to traverse through the array.
4. Whenever the element at the fast pointer is different from the element at the slow pointer, increment the slow pointer and update the element at the slow pointer with the element at the fast pointer.
5. Continue this process until the fast pointer reaches the end of the array.
6. The slow pointer will indicate the position of the last unique element.
### Example Code
```python
def remove_duplicates(arr):
    if not arr:
        return 0
    slow = 0
    for fast in range(1, len(arr)):
        if arr[fast] != arr[slow]:
            slow += 1
            arr[slow] = arr[fast]
    return slow + 1
# Example usage
arr = [1, 1, 2, 2, 3, 4, 4, 5]
new_length = remove_duplicates(arr)
print("Array after removing duplicates:", arr[:new_length])
print("New length of array:", new_length)
```
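The same two-pointer pattern covers use case 4 above, merging two sorted arrays; a minimal sketch:
```python
def merge_sorted(a, b):
    i, j, merged = 0, 0, []
    # Advance whichever pointer currently points at the smaller element
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One of the arrays may still have elements left over
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged

# Example usage
print(merge_sorted([1, 3, 5, 7], [2, 4, 6, 8, 10]))
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 10]
```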
# Advantages of the Two-Pointer Technique
Here are some key benefits of using the two-pointer technique:
## 1. **Improved Time Complexity**
It often reduces the time complexity from O(n^2) to O(n), making it significantly faster for many problems.
### Example
- **Finding pairs with a given sum**: Efficiently finds pairs in O(n) time.
## 2. **Simplicity**
The implementation is straightforward, using basic operations like incrementing or decrementing pointers.
### Example
- **Removing duplicates from a sorted array**: Easy to implement and understand.
## 3. **In-Place Solutions**
Many problems can be solved in place, requiring no extra space beyond the input data.
### Example
- **Reversing a linked list**: Adjusts pointers within the existing nodes.
## 4. **Versatility**
Applicable to a wide range of problems, from arrays and strings to linked lists.
### Example
- **Merging two sorted arrays**: Efficiently merges using two pointers.
## 5. **Efficiency**
Minimizes redundant operations and enhances performance, especially with large data sets.
### Example
- **Partitioning problems**: Efficiently partitions elements with minimal operations.

Binary file not shown (image, 56 KiB).

View file

@ -0,0 +1,140 @@
# Ensemble Learning
Ensemble Learning is a powerful machine learning paradigm that combines multiple models to achieve better performance than any individual model. The idea is to leverage the strengths of different models to improve overall accuracy, robustness, and generalization.
## Introduction
Ensemble Learning is a technique that combines the predictions from multiple machine learning models to make more accurate and robust predictions than a single model. It leverages the diversity of different models to reduce errors and improve performance.
## Types of Ensemble Learning
### Bagging
Bagging, or Bootstrap Aggregating, involves training multiple versions of the same model on different subsets of the training data and averaging their predictions. The most common example of bagging is the `RandomForest` algorithm.
### Boosting
Boosting focuses on training models sequentially, where each new model corrects the errors made by the previous ones. This way, the ensemble learns from its mistakes, leading to improved performance. `AdaBoost` and `Gradient Boosting` are popular examples of boosting algorithms.
### Stacking
Stacking involves training multiple models (the base learners) and a meta-model that combines their predictions. The base learners are trained on the original dataset, while the meta-model is trained on the outputs of the base learners. This approach allows leveraging the strengths of different models.
## Advantages and Disadvantages
### Advantages
- **Improved Accuracy**: Combines the strengths of multiple models.
- **Robustness**: Reduces the risk of overfitting and model bias.
- **Versatility**: Can be applied to various machine learning tasks, including classification and regression.
### Disadvantages
- **Complexity**: More complex than individual models, making interpretation harder.
- **Computational Cost**: Requires more computational resources and training time.
- **Implementation**: Can be challenging to implement and tune effectively.
## Key Concepts
- **Diversity**: The models in the ensemble should be diverse to benefit from their different strengths.
- **Voting/Averaging**: For classification, majority voting is used to combine predictions. For regression, averaging is used.
- **Weighting**: In some ensembles, models are weighted based on their accuracy or other metrics.
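To make the voting idea concrete, scikit-learn's `VotingClassifier` combines several models by majority vote; the snippet below is a small illustrative sketch (the worked examples that follow use bagging, boosting, and stacking instead):
```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Hard voting: each model casts one vote and the majority class wins
voting_clf = VotingClassifier(
    estimators=[
        ('lr', LogisticRegression(max_iter=1000)),
        ('knn', KNeighborsClassifier(n_neighbors=5)),
        ('tree', DecisionTreeClassifier(random_state=42)),
    ],
    voting='hard',
)
voting_clf.fit(X_train, y_train)
print("Voting accuracy:", voting_clf.score(X_test, y_test))
```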
## Code Examples
### Bagging with Random Forest
Below is an example of using Random Forest for classification on the Iris dataset.
```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
# Load dataset
iris = load_iris()
X, y = iris.data, iris.target
# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Initialize Random Forest model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
# Train the model
clf.fit(X_train, y_train)
# Make predictions
y_pred = clf.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
print("Classification Report:\n", classification_report(y_test, y_pred))
```
### Boosting with AdaBoost
Below is an example of using AdaBoost for classification on the Iris dataset.
```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
# Initialize base model
base_model = DecisionTreeClassifier(max_depth=1)
# Initialize AdaBoost model
ada_clf = AdaBoostClassifier(estimator=base_model, n_estimators=50, random_state=42)  # use base_estimator= on scikit-learn < 1.2
# Train the model
ada_clf.fit(X_train, y_train)
# Make predictions
y_pred = ada_clf.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
print("Classification Report:\n", classification_report(y_test, y_pred))
```
### Stacking with Multiple Models
Below is an example of using stacking with multiple models for classification on the Iris dataset.
```python
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import StackingClassifier
# Define base models
base_models = [
('knn', KNeighborsClassifier(n_neighbors=5)),
('svc', SVC(kernel='linear', probability=True))
]
# Define meta-model
meta_model = LogisticRegression()
# Initialize Stacking model
stacking_clf = StackingClassifier(estimators=base_models, final_estimator=meta_model, cv=5)
# Train the model
stacking_clf.fit(X_train, y_train)
# Make predictions
y_pred = stacking_clf.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
print("Classification Report:\n", classification_report(y_test, y_pred))
```
## Conclusion
Ensemble Learning is a powerful technique that combines multiple models to improve overall performance. By leveraging the strengths of different models, it provides better accuracy, robustness, and generalization. However, it comes with increased complexity and computational cost. Understanding and implementing ensemble methods can significantly enhance machine learning solutions.

View file

@ -11,9 +11,12 @@
- [Introduction To Convolutional Neural Networks (CNNs)](intro-to-cnn.md)
- [TensorFlow.md](tensorflow.md)
- [PyTorch.md](pytorch.md)
- [Ensemble Learning](ensemble-learning.md)
- [Types of optimizers](types-of-optimizers.md)
- [Logistic Regression](logistic-regression.md)
- [Types_of_Cost_Functions](cost-functions.md)
- [Clustering](clustering.md)
- [Hierarchical Clustering](hierarchical-clustering.md)
- [Grid Search](grid-search.md)
- [Transformers](transformers.md)
- [K-nearest neighbor (KNN)](knn.md)

View file

@ -0,0 +1,122 @@
# K-Nearest Neighbors (KNN) Machine Learning Algorithm in Python
## Introduction
K-Nearest Neighbors (KNN) is a simple, yet powerful, supervised machine learning algorithm used for both classification and regression tasks. It assumes that similar things exist in close proximity. In other words, similar data points are near to each other.
## How KNN Works
KNN works by finding the distances between a query and all the examples in the data, selecting the specified number of examples (K) closest to the query, then voting for the most frequent label (in classification) or averaging the labels (in regression).
### Steps:
1. **Choose the number K of neighbors**
2. **Calculate the distance** between the query-instance and all the training samples
3. **Sort the distances** and determine the nearest neighbors based on the K-th minimum distance
4. **Gather the labels** of the nearest neighbors
5. **Vote for the most frequent label** (in case of classification) or **average the labels** (in case of regression)
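To make these steps concrete, here is a small from-scratch sketch using plain NumPy with Euclidean distance and majority voting (for illustration only; the example later on this page uses scikit-learn's implementation):
```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    # Step 2: distance from the query to every training sample (Euclidean)
    distances = np.linalg.norm(X_train - query, axis=1)
    # Step 3: indices of the K nearest neighbors
    nearest = np.argsort(distances)[:k]
    # Steps 4-5: gather their labels and vote for the most frequent one
    labels = y_train[nearest]
    return Counter(labels).most_common(1)[0][0]

# Example usage with a tiny 2D dataset
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # Output: 0
```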
## When to Use KNN
### Advantages:
- **Simple and easy to understand:** KNN is intuitive and easy to implement.
- **No training phase:** KNN is a lazy learner, meaning there is no explicit training phase.
- **Effective with a small dataset:** KNN performs well with a small number of input variables.
### Disadvantages:
- **Computationally expensive:** The algorithm becomes significantly slower as the number of examples and/or predictors/independent variables increase.
- **Sensitive to irrelevant features:** All features contribute to the distance equally.
- **Memory-intensive:** Storing all the training data can be costly.
### Use Cases:
- **Recommender Systems:** Suggest items based on similarity to user preferences.
- **Image Recognition:** Classify images by comparing new images to the training set.
- **Finance:** Predict credit risk or fraud detection based on historical data.
## KNN in Python
### Required Libraries
To implement KNN, we need the following Python libraries:
- `numpy`
- `pandas`
- `scikit-learn`
- `matplotlib` (for visualization)
### Installation
```bash
pip install numpy pandas scikit-learn matplotlib
```
### Example Code
Let's implement a simple KNN classifier using the Iris dataset.
#### Step 1: Import Libraries
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
```
#### Step 2: Load Dataset
```python
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target
```
#### Step 3: Split Dataset
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
```
#### Step 4: Train KNN Model
```python
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
```
#### Step 5: Make Predictions
```python
y_pred = knn.predict(X_test)
```
#### Step 6: Evaluate the Model
```python
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
```
### Visualization (Optional)
```python
# Plotting the decision boundary for visualization (using only 2 features)
h = .02  # step size in the mesh

# Create color maps
cmap_light = plt.cm.RdYlBu
cmap_bold = plt.cm.RdYlBu

# For simplicity, take only the first two features of the dataset and fit a
# separate classifier on them (the model above was trained on all four features)
X_plot = X[:, :2]
knn_2d = KNeighborsClassifier(n_neighbors=3)
knn_2d.fit(X_plot, y)

x_min, x_max = X_plot[:, 0].min() - 1, X_plot[:, 0].max() + 1
y_min, y_max = X_plot[:, 1].min() - 1, X_plot[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
Z = knn_2d.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.figure()
plt.pcolormesh(xx, yy, Z, cmap=cmap_light)
# Plot also the training points
plt.scatter(X_plot[:, 0], X_plot[:, 1], c=y, edgecolor='k', cmap=cmap_bold)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title("3-Class classification (k = 3)")
plt.show()
```
## Generalization and Considerations
- **Choosing K:** The choice of K is critical. Smaller values of K can lead to noisy models, while larger values make the algorithm computationally expensive and might oversimplify the model.
- **Feature Scaling:** Since KNN relies on distance calculations, features should be scaled (standardized or normalized) so that all features contribute equally to the distance computation (a short example follows this list).
- **Distance Metrics:** The choice of distance metric (Euclidean, Manhattan, etc.) can affect the performance of the algorithm.
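A short example of the feature-scaling point above, using scikit-learn's `StandardScaler` in a pipeline with the same train/test split as before (one reasonable approach, not the only one):
```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# Standardize features so that each one contributes equally to the distance
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
scaled_knn.fit(X_train, y_train)
print(f'Accuracy with scaling: {scaled_knn.score(X_test, y_test)}')
```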
In conclusion, KNN is a versatile and easy-to-implement algorithm suitable for various classification and regression tasks, particularly when working with small datasets and well-defined features. However, careful consideration should be given to the choice of K, feature scaling, and distance metrics to optimize its performance.

View file

@ -0,0 +1,443 @@
# Transformers
## Introduction
A transformer is a deep learning architecture developed at Google that is built around the multi-head, softmax-based attention mechanism.
Before transformers, attention mechanisms were added to gated recurrent neural networks such as LSTMs and gated recurrent units (GRUs), which process sequences token by token. This dependency on previous token computations prevented the attention mechanism from being parallelized.
Transformers are a revolutionary approach to natural language processing (NLP). Unlike older models, they excel at understanding long-range connections between words. This "attention" mechanism lets them grasp the context of a sentence, making them powerful for tasks like machine translation, text summarization, and question answering. Introduced in 2017, transformers are now the backbone of many large language models, including tools you might use every day. Their ability to handle complex relationships in language is fueling advancements in AI across various fields.
## Model Architecture
![Model Architecture](assets/transformer-architecture.png)
Source: [Attention Is All You Need](https://arxiv.org/pdf/1706.03762)
### Encoder
The encoder is composed of a stack of identical layers. Each layer has two sub-layers. The first is a multi-head self-attention mechanism, and the second is a simple, positionwise fully connected feed-forward network. Each encoder consists of two major components: a self-attention mechanism and a feed-forward neural network. The self-attention mechanism accepts input encodings from the previous encoder and weights their relevance to each other to generate output encodings. The feed-forward neural network further processes each output encoding individually. These output encodings are then passed to the next encoder as its input, as well as to the decoders.
### Decoder
The decoder is also composed of a stack of identical layers. In addition to the two sub-layers in each encoder layer, the decoder inserts a third sub-layer, which performs multi-head attention over the output of the encoder stack. The decoder functions in a similar fashion to the encoder, but an additional attention mechanism is inserted which instead draws relevant information from the encodings generated by the encoders. This mechanism can also be called the encoder-decoder attention.
### Attention
#### Scaled Dot-Product Attention
The input consists of queries and keys of dimension $d_k$ , and values of dimension $d_v$. We compute the dot products of the query with all keys, divide each by $\sqrt {d_k}$ , and apply a softmax function to obtain the weights on the values.
$$Attention(Q, K, V) = softmax(\dfrac{QK^T}{\sqrt{d_k}}) \times V$$
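As a concrete illustration of this formula, here is a small NumPy sketch (separate from the TensorFlow and PyTorch code later on this page):
```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # (seq_q, seq_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over the keys
    return weights @ V                                         # weighted sum of the values

Q = np.random.rand(4, 8)    # 4 query positions, d_k = 8
K = np.random.rand(6, 8)    # 6 key positions,   d_k = 8
V = np.random.rand(6, 16)   # 6 value vectors,   d_v = 16
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 16)
```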
#### Multi-Head Attention
Instead of performing a single attention function with $d_{model}$-dimensional keys, values and queries, it is beneficial to linearly project the queries, keys and values h times with different, learned linear projections to $d_k$ , $d_k$ and $d_v$ dimensions, respectively.
Multi-head attention allows the model to jointly attend to information from different representation
subspaces at different positions. With a single attention head, averaging inhibits this.
$$MultiHead(Q, K, V) = Concat(head_1, \ldots, head_h) \times W^O$$
where,
$$head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)$$
where the projections are parameter matrices.
#### Masked Attention
It may be necessary to cut out attention links between some word-pairs. For example, the decoder for token position
$t$ should not have access to token position $t+1$.
$$MaskedAttention(Q, K, V) = softmax(M + \dfrac{QK^T}{\sqrt{d_k}}) \times V$$
### Feed-Forward Network
Each of the layers in the encoder and decoder contains a fully connected feed-forward network, which is applied to each position separately and identically. This
consists of two linear transformations with a ReLU activation in between.
$$FFN(x) = max(0, xW_1 + b_1)W_2 + b_2$$
### Positional Encoding
A positional encoding is a fixed-size vector representation that encapsulates the relative positions of tokens within a target sequence: it provides the transformer model with information about where the words are in the input sequence.
The encoding uses sine and cosine functions of different frequencies:
$$PE_{(pos,2i)} = \sin\left({\dfrac{pos}{10000^{2i/d_{model}}}}\right)$$
$$PE_{(pos,2i+1)} = \cos\left({\dfrac{pos}{10000^{2i/d_{model}}}}\right)$$
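A minimal NumPy sketch of these sinusoidal encodings (for illustration; the TensorFlow code below uses a library positional-embedding layer instead):
```python
import numpy as np

def positional_encoding(max_len, d_model):
    positions = np.arange(max_len)[:, np.newaxis]        # (max_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]       # (1, d_model/2), assumes even d_model
    angles = positions / np.power(10000, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions use cosine
    return pe

print(positional_encoding(50, 128).shape)  # (50, 128)
```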
## Implementation
### Theory
Text is converted to numerical representations called tokens, and each token is converted into a vector via looking up from a word embedding table.
At each layer, each token is then contextualized within the scope of the context window with other tokens via a parallel multi-head attention mechanism
allowing the signal for key tokens to be amplified and less important tokens to be diminished.
The transformer uses an encoder-decoder architecture. The encoder extracts features from an input sentence, and the decoder uses those features to produce an output sentence. Some architectures use full encoders and decoders, autoregressive encoders and decoders, or a combination of both, depending on the usage and context of the input.
### Tensorflow
TensorFlow is a free and open-source software library for machine learning and artificial intelligence. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. It was developed by the Google Brain team for Google's internal use in research and production.
TensorFlow Models (the `tfm` package) provides transformer encoder and decoder blocks that can be configured to the user's specification. The full transformer is not provided as a standalone model to import and execute; the user has to assemble the model first. TensorFlow also has a tutorial on implementing the transformer from scratch for machine translation, which can be found [here](https://www.tensorflow.org/text/tutorials/transformer).
More information on [encoder](https://www.tensorflow.org/api_docs/python/tfm/nlp/layers/TransformerEncoderBlock) and [decoder](https://www.tensorflow.org/api_docs/python/tfm/nlp/layers/TransformerDecoderBlock) block mentioned in the code.
Imports:
```python
import tensorflow as tf
import tensorflow_models as tfm
```
Adding word embeddings and positional encoding:
```python
class PositionalEmbedding(tf.keras.layers.Layer):
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.d_model = d_model
        self.embedding = tf.keras.layers.Embedding(vocab_size, d_model, mask_zero=True)
        self.pos_encoding = tfm.nlp.layers.RelativePositionEmbedding(hidden_size=d_model)

    def compute_mask(self, *args, **kwargs):
        return self.embedding.compute_mask(*args, **kwargs)

    def call(self, x):
        length = tf.shape(x)[1]
        x = self.embedding(x)
        x = x + self.pos_encoding[tf.newaxis, :length, :]
        return x
```
Creating the encoder for the transformer:
```python
class Encoder(tf.keras.layers.Layer):
    def __init__(self, num_layers, d_model, num_heads,
                 dff, vocab_size, dropout_rate=0.1):
        super().__init__()
        self.d_model = d_model
        self.num_layers = num_layers
        self.pos_embedding = PositionalEmbedding(
            vocab_size=vocab_size, d_model=d_model)
        self.enc_layers = [
            tfm.nlp.layers.TransformerEncoderBlock(output_last_dim=d_model,
                                                   num_attention_heads=num_heads,
                                                   inner_dim=dff,
                                                   inner_activation="relu",
                                                   inner_dropout=dropout_rate)
            for _ in range(num_layers)]
        self.dropout = tf.keras.layers.Dropout(dropout_rate)

    def call(self, x):
        x = self.pos_embedding(x)  # PositionalEmbedding.call takes only the token ids
        x = self.dropout(x)
        for i in range(self.num_layers):
            x = self.enc_layers[i](x)
        return x
```
Creating the decoder for the transformer:
```python
class Decoder(tf.keras.layers.Layer):
    def __init__(self, num_layers, d_model, num_heads, dff, vocab_size,
                 dropout_rate=0.1):
        super(Decoder, self).__init__()
        self.d_model = d_model
        self.num_layers = num_layers
        self.pos_embedding = PositionalEmbedding(vocab_size=vocab_size,
                                                 d_model=d_model)
        self.dropout = tf.keras.layers.Dropout(dropout_rate)
        self.dec_layers = [
            tfm.nlp.layers.TransformerDecoderBlock(num_attention_heads=num_heads,
                                                   intermediate_size=dff,
                                                   intermediate_activation="relu",
                                                   dropout_rate=dropout_rate)
            for _ in range(num_layers)]

    def call(self, x, context):
        x = self.pos_embedding(x)
        x = self.dropout(x)
        for i in range(self.num_layers):
            x = self.dec_layers[i](x, context)
        return x
```
Combining the encoder and decoder to create the transformer:
```python
class Transformer(tf.keras.Model):
    def __init__(self, num_layers, d_model, num_heads, dff,
                 input_vocab_size, target_vocab_size, dropout_rate=0.1):
        super().__init__()
        self.encoder = Encoder(num_layers=num_layers, d_model=d_model,
                               num_heads=num_heads, dff=dff,
                               vocab_size=input_vocab_size,
                               dropout_rate=dropout_rate)
        self.decoder = Decoder(num_layers=num_layers, d_model=d_model,
                               num_heads=num_heads, dff=dff,
                               vocab_size=target_vocab_size,
                               dropout_rate=dropout_rate)
        self.final_layer = tf.keras.layers.Dense(target_vocab_size)

    def call(self, inputs):
        context, x = inputs
        context = self.encoder(context)
        x = self.decoder(x, context)
        logits = self.final_layer(x)
        return logits
```
Model initialization that can be used for training and inference (assuming the hyperparameters `num_layers`, `d_model`, `num_heads`, `dff`, and `dropout_rate` have been defined):
```python
transformer = Transformer(
    num_layers=num_layers,
    d_model=d_model,
    num_heads=num_heads,
    dff=dff,
    input_vocab_size=64,
    target_vocab_size=64,
    dropout_rate=dropout_rate
)
```
Sample:
```python
src = tf.random.uniform((64, 40))
tgt = tf.random.uniform((64, 50))
output = transformer((src, tgt))
```
O/P:
```
<tf.Tensor: shape=(64, 50, 64), dtype=float32, numpy=
array([[[ 0.78274703, -1.2312567 , 0.7272992 , ..., 2.1805947 ,
1.3511044 , -1.275499 ],
[ 0.82658154, -1.2863302 , 0.76494133, ..., 2.39311 ,
1.0973787 , -1.3414565 ],
[ 0.57013685, -1.3958443 , 1.0213287 , ..., 2.3791933 ,
0.58439416, -0.93464035],
...,
[ 0.82214123, -0.51090807, 0.25897795, ..., 2.1979148 ,
1.4126635 , -0.5771998 ],
[ 0.6371507 , -0.36584622, 0.40954843, ..., 2.0241373 ,
1.6503414 , -0.74359566],
[ 0.6739802 , -0.39973688, 0.3338765 , ..., 1.6819229 ,
1.7505672 , -1.0763712 ]],
[[ 0.78274703, -1.2312567 , 0.7272992 , ..., 2.1805947 ,
1.3511044 , -1.275499 ],
[ 0.82658154, -1.2863302 , 0.76494133, ..., 2.39311 ,
1.0973787 , -1.3414565 ],
[ 0.57013685, -1.3958443 , 1.0213287 , ..., 2.3791933 ,
0.58439416, -0.93464035],
...,
[ 0.82214123, -0.51090807, 0.25897795, ..., 2.1979148 ,
1.4126635 , -0.5771998 ],
[ 0.6371507 , -0.36584622, 0.40954843, ..., 2.0241373 ,
1.6503414 , -0.74359566],
[ 0.6739802 , -0.39973688, 0.3338765 , ..., 1.6819229 ,
1.7505672 , -1.0763712 ]],
[[ 0.78274703, -1.2312567 , 0.7272992 , ..., 2.1805947 ,
1.3511044 , -1.275499 ],
[ 0.82658154, -1.2863302 , 0.76494133, ..., 2.39311 ,
1.0973787 , -1.3414565 ],
[ 0.57013685, -1.3958443 , 1.0213287 , ..., 2.3791933 ,
0.58439416, -0.93464035],
...,
[ 0.82214123, -0.51090807, 0.25897795, ..., 2.1979148 ,
1.4126635 , -0.5771998 ],
[ 0.6371507 , -0.36584622, 0.40954843, ..., 2.0241373 ,
1.6503414 , -0.74359566],
[ 0.6739802 , -0.39973688, 0.3338765 , ..., 1.6819229 ,
1.7505672 , -1.0763712 ]],
...,
[[ 0.78274703, -1.2312567 , 0.7272992 , ..., 2.1805947 ,
1.3511044 , -1.275499 ],
[ 0.82658154, -1.2863302 , 0.76494133, ..., 2.39311 ,
1.0973787 , -1.3414565 ],
[ 0.57013685, -1.3958443 , 1.0213287 , ..., 2.3791933 ,
0.58439416, -0.93464035],
...,
[ 0.82214123, -0.51090807, 0.25897795, ..., 2.1979148 ,
1.4126635 , -0.5771998 ],
[ 0.6371507 , -0.36584622, 0.40954843, ..., 2.0241373 ,
1.6503414 , -0.74359566],
[ 0.6739802 , -0.39973688, 0.3338765 , ..., 1.6819229 ,
1.7505672 , -1.0763712 ]],
[[ 0.78274703, -1.2312567 , 0.7272992 , ..., 2.1805947 ,
1.3511044 , -1.275499 ],
[ 0.82658154, -1.2863302 , 0.76494133, ..., 2.39311 ,
1.0973787 , -1.3414565 ],
[ 0.57013685, -1.3958443 , 1.0213287 , ..., 2.3791933 ,
0.58439416, -0.93464035],
...,
[ 0.82214123, -0.51090807, 0.25897795, ..., 2.1979148 ,
1.4126635 , -0.5771998 ],
[ 0.6371507 , -0.36584622, 0.40954843, ..., 2.0241373 ,
1.6503414 , -0.74359566],
[ 0.6739802 , -0.39973688, 0.3338765 , ..., 1.6819229 ,
1.7505672 , -1.0763712 ]],
[[ 0.78274703, -1.2312567 , 0.7272992 , ..., 2.1805947 ,
1.3511044 , -1.275499 ],
[ 0.82658154, -1.2863302 , 0.76494133, ..., 2.39311 ,
1.0973787 , -1.3414565 ],
[ 0.57013685, -1.3958443 , 1.0213287 , ..., 2.3791933 ,
0.58439416, -0.93464035],
...,
[ 0.82214123, -0.51090807, 0.25897795, ..., 2.1979148 ,
1.4126635 , -0.5771998 ],
[ 0.6371507 , -0.36584622, 0.40954843, ..., 2.0241373 ,
1.6503414 , -0.74359566],
[ 0.6739802 , -0.39973688, 0.3338765 , ..., 1.6819229 ,
1.7505672 , -1.0763712 ]]], dtype=float32)>
```
```
>>> output.shape
TensorShape([64, 50, 64])
```
### PyTorch
PyTorch is a machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, originally developed by Meta AI and now part of the Linux Foundation umbrella.
Unlike Tensorflow, PyTorch provides the full implementation of the transformer model that can be executed on the go. More information can be found [here](https://pytorch.org/docs/stable/_modules/torch/nn/modules/transformer.html#Transformer). A full implementation of the model can be found [here](https://github.com/pytorch/examples/tree/master/word_language_model).
Imports:
```python
import torch
import torch.nn as nn
```
Initializing the model:
```python
transformer = nn.Transformer(nhead=16, num_encoder_layers=8)
```
Sample:
```python
src = torch.rand((10, 32, 512))
tgt = torch.rand((20, 32, 512))
output = transformer(src, tgt)
```
O/P:
```
tensor([[[ 0.2938, -0.4824, -0.7816, ..., 0.0742, 0.5162, 0.3632],
[-0.0786, -0.5241, 0.6384, ..., 0.3462, -0.0618, 0.9943],
[ 0.7827, 0.1067, -0.1637, ..., -1.7730, -0.3322, -0.0029],
...,
[-0.3202, 0.2341, -0.0896, ..., -0.9714, -0.1251, -0.0711],
[-0.1663, -0.5047, -0.0404, ..., -0.9339, 0.3963, 0.1018],
[ 1.2834, -0.4400, 0.0486, ..., -0.6876, -0.4752, 0.0180]],
[[ 0.9869, -0.7384, -1.0704, ..., -0.9417, 1.3279, -0.1665],
[ 0.3445, -0.2454, -0.3644, ..., -0.4856, -1.1004, -0.6819],
[ 0.7568, -0.3151, -0.5034, ..., -1.2081, -0.7119, 0.3775],
...,
[-0.0451, -0.7596, 0.0168, ..., -0.8267, -0.3272, 1.0457],
[ 0.3150, -0.6588, -0.1840, ..., 0.1822, -0.0653, 0.9053],
[ 0.8692, -0.3519, 0.3128, ..., -1.8446, -0.2325, -0.8662]],
[[ 0.9719, -0.3113, 0.4637, ..., -0.4422, 1.2348, 0.8274],
[ 0.3876, -0.9529, -0.7810, ..., -0.5843, -1.1439, -0.3366],
[-0.5774, 0.3789, -0.2819, ..., -1.4057, 0.4352, 0.1474],
...,
[ 0.6899, -0.1146, -0.3297, ..., -1.7059, -0.1750, 0.4203],
[ 0.3689, -0.5174, -0.1253, ..., 0.1417, 0.4159, 0.7560],
[ 0.5024, -0.7996, 0.1592, ..., -0.8344, -1.1125, 0.4736]],
...,
[[ 0.0704, -0.3971, -0.2768, ..., -1.9929, 0.8608, 1.2264],
[ 0.4013, -0.0962, -0.0965, ..., -0.4452, -0.8682, -0.4593],
[ 0.1656, 0.5224, -0.1723, ..., -1.5785, 0.3219, 1.1507],
...,
[-0.9443, 0.4653, 0.2936, ..., -0.9840, -0.0142, -0.1595],
[-0.6544, -0.3294, -0.0803, ..., 0.1623, -0.5061, 0.9824],
[-0.0978, -1.0023, -0.6915, ..., -0.2296, -0.0594, -0.4715]],
[[ 0.6531, -0.9285, -0.0331, ..., -1.1481, 0.7768, -0.7321],
[ 0.3325, -0.6683, -0.6083, ..., -0.4501, 0.2289, 0.3573],
[-0.6750, 0.4600, -0.8512, ..., -2.0097, -0.5159, 0.2773],
...,
[-1.4356, -1.0135, 0.0081, ..., -1.2985, -0.3715, -0.2678],
[ 0.0546, -0.2111, -0.0965, ..., -0.3822, -0.4612, 1.6217],
[ 0.7700, -0.5309, -0.1754, ..., -2.2807, -0.0320, -1.5551]],
[[ 0.2399, -0.9659, 0.1086, ..., -1.1756, 0.4063, 0.0615],
[-0.2202, -0.7972, -0.5024, ..., -0.9126, -1.5248, 0.2418],
[ 0.5215, 0.4540, 0.0036, ..., -0.2135, 0.2145, 0.6638],
...,
[-0.2190, -0.4967, 0.7149, ..., -0.3324, 0.3502, 1.0624],
[-0.0108, -0.9205, -0.1315, ..., -1.0153, 0.2989, 1.1415],
[ 1.1284, -0.6560, 0.6755, ..., -1.2157, 0.8580, -0.5022]]],
grad_fn=<NativeLayerNormBackward0>)
```
```
>>> output.shape
torch.Size([20, 32, 512])
```
### HuggingFace
Hugging Face, Inc. is a French-American company incorporated under the Delaware General Corporation Law and based in New York City that develops computation tools for building applications using machine learning.
It has a wide range of models that can be implemented in TensorFlow, PyTorch, and other development backends as well. The models come pretrained on a dataset and can be fine-tuned on a custom dataset for customized use, according to the user. Information on training a model and loading a pretrained model can be found [here](https://huggingface.co/docs/transformers/en/training).
In Hugging Face, `pipeline` is used to run inference with a trained model available on the Hub, which is very beginner friendly. The model is downloaded to the local system the first time the script runs, before inference; make sure the downloaded model does not exceed your available data plan.
Imports:
```python
from transformers import pipeline
```
Initialization:
The model used here is BART (large), trained on the MultiNLI dataset, which consists of sentence pairs annotated with textual entailment labels.
```python
classifier = pipeline(model="facebook/bart-large-mnli")
```
Sample:
The first argument is the sentence to be analyzed. The second argument, `candidate_labels`, is the list of labels the sentence could belong to. The output dictionary contains a `labels` list sorted by the parallel `scores` list, so the first label is the one the sentence most likely entails.
```python
output = classifier(
"I need to leave but later",
candidate_labels=["urgent", "not urgent", "sleep"],
)
```
O/P:
```
{'sequence': 'I need to leave but later',
'labels': ['not urgent', 'urgent', 'sleep'],
'scores': [0.8889380097389221, 0.10631518065929413, 0.00474683940410614]}
```
## Application
The transformer has had great success in natural language processing (NLP). Many large language models such as GPT-2, GPT-3, GPT-4, Claude, BERT, XLNet, RoBERTa and ChatGPT demonstrate the ability of transformers to perform a wide variety of such NLP-related tasks, and have the potential to find real-world applications.
These may include:
- Machine translation
- Document summarization
- Text generation
- Biological sequence analysis
- Computer code generation
## Bibliography
- [Attention Is All You Need](https://arxiv.org/pdf/1706.03762)
- [Tensorflow Tutorial](https://www.tensorflow.org/text/tutorials/transformer)
- [Tensorflow Models Docs](https://www.tensorflow.org/api_docs/python/tfm/nlp/layers)
- [Wikipedia](https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture))
- [HuggingFace](https://huggingface.co/docs/transformers/en/index)
- [PyTorch](https://pytorch.org/docs/stable/generated/torch.nn.Transformer.html)

Binary file not shown (image, 2.3 KiB).

Binary file not shown (image, 12 KiB).

Binary file not shown (image, 12 KiB).

Binary file not shown (image, 12 KiB).

Binary file not shown (image, 17 KiB).

Binary file not shown (image, 12 KiB).

Binary file not shown (image, 11 KiB).

View file

@ -5,5 +5,6 @@
- [Bar Plots in Matplotlib](matplotlib-bar-plots.md)
- [Pie Charts in Matplotlib](matplotlib-pie-charts.md)
- [Line Charts in Matplotlib](matplotlib-line-plots.md)
- [Scatter Plots in Matplotlib](matplotlib-scatter-plot.md)
- [Introduction to Seaborn and Installation](seaborn-intro.md)
- [Getting started with Seaborn](seaborn-basics.md)

View file

@ -0,0 +1,160 @@
# Scatter Plots in Matplotlib
* A scatter plot is a type of data visualization that uses dots to show values for two variables, with one variable on the x-axis and the other on the y-axis. It's useful for identifying relationships, trends, and correlations, as well as spotting clusters and outliers.
* The dots on the plot show how the variables are related. A scatter plot is made with the Matplotlib library's `scatter()` method.
## Syntax
**Here's the syntax of the `scatter()` method:**
```python
matplotlib.pyplot.scatter(x_axis_value, y_axis_value, s=None, c=None, vmin=None, vmax=None, marker=None, cmap=None, alpha=None, linewidths=None, edgecolors=None)
```
## Prerequisites
Scatter plots can be created in Python with Matplotlib's pyplot library. To build a Scatter plot, first import matplotlib. It is a standard convention to import Matplotlib's pyplot library as plt.
```python
import matplotlib.pyplot as plt
```
## Creating a simple Scatter Plot
With Pyplot, you can use the `scatter()` function to draw a scatter plot.
The `scatter()` function plots one dot for each observation. It needs two arrays of the same length, one for the values of the x-axis, and one for values on the y-axis:
```python
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()
```
When executed, this will show the following Scatter plot:
![Simple scatter plot](images/simple_scatter.png)
## Compare Plots
In a scatter plot, comparing plots involves examining multiple sets of points to identify differences or similarities in patterns, trends, or correlations between the data sets.
```python
import matplotlib.pyplot as plt
import numpy as np
#day one, the age and speed of 13 cars:
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
#day two, the age and speed of 15 cars:
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y)
plt.show()
```
When executed, this will show the following Compare Scatter plot:
![Compare Plots](images/scatter_compare.png)
## Colors in Scatter plot
You can set your own color for each scatter plot with the `color` or the `c` argument:
```python
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'hotpink')
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = '#88c999')
plt.show()
```
When executed, this will show the following Colors Scatter plot:
![Colors in Scatter plot](images/scatter_color.png)
## Color Each Dot
You can even set a specific color for each dot by using an array of colors as value for the `c` argument:
**Note:** You cannot use the `color` argument for this, only the `c` argument.
```python
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple","beige","brown","gray","cyan","magenta"])
plt.scatter(x, y, c=colors)
plt.show()
```
When executed, this will show the following Color Each Dot:
![Color Each Dot](images/scatter_coloreachdot.png)
## ColorMap
The Matplotlib module has a number of available colormaps.
A colormap is like a list of colors, where each color has a value that ranges from 0 to 100.
Here is an example of a colormap:
![ColorMap](images/img_colorbar.png)
This colormap is called 'viridis' and as you can see it ranges from 0, which is a purple color, up to 100, which is a yellow color.
## How to Use the ColorMap
You can specify the colormap with the keyword argument `cmap` with the value of the colormap, in this case `'viridis'` which is one of the built-in colormaps available in Matplotlib.
In addition you have to create an array with values (from 0 to 100), one value for each point in the scatter plot:
```python
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.show()
```
When executed, this will show the following Scatter ColorMap:
![Scatter ColorMap](images/scatter_colormap1.png)
You can include the colormap in the drawing by including the `plt.colorbar()` statement:
```python
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()
plt.show()
```
When executed, this will show the following Scatter ColorMap using `plt.colorbar()`:
![Scatter ColorMap1](images/scatter_colormap2.png)