Merge branch 'main' into main
|
@ -0,0 +1,87 @@
|
|||
# Generators
|
||||
|
||||
## Introduction
|
||||
|
||||
Generators in Python are a sophisticated feature that enables the creation of iterators without the need to construct a full list in memory. They allow you to generate values on-the-fly, which is particularly beneficial for working with large datasets or infinite sequences. We will explore generators in depth, covering their types, mathematical formulation, advantages, disadvantages, and implementation examples.
|
||||
|
||||
## Function Generators
|
||||
|
||||
Function generators (called generator functions in the Python documentation) are ordinary functions that contain the `yield` keyword. Calling one does not run its body; it returns a generator iterator, and each `next()` call or loop step runs the body up to the next `yield`.
|
||||
|
||||
### Mathematical Formulation
|
||||
|
||||
Function generators can be described loosely with set-builder notation. Unlike a mathematical set, a generator yields its values in a definite order and may repeat them. The general form is:
|
||||
|
||||
```
|
||||
{expression | variable in iterable, condition}
|
||||
```
|
||||
|
||||
Where:
|
||||
- `expression` is the expression to generate values.
|
||||
- `variable` is the variable used in the expression.
|
||||
- `iterable` is the sequence of values to iterate over.
|
||||
- `condition` is an optional condition that filters the values.
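
As a rough illustration of how the notation maps onto code, the sketch below writes the set `{x**2 | x in 0..9, x is even}` both as a generator function and as a generator expression. The name `even_squares` and the chosen range are assumptions made for this example only.

```python
# Set-builder form: { x**2 | x in range(limit), x is even }
def even_squares(limit):
    """Generator function: yield x**2 for each even x below limit."""
    for x in range(limit):      # variable in iterable
        if x % 2 == 0:          # condition
            yield x ** 2        # expression

# The same sequence written as a generator expression.
even_squares_expr = (x ** 2 for x in range(10) if x % 2 == 0)

print(list(even_squares(10)))   # [0, 4, 16, 36, 64]
print(list(even_squares_expr))  # [0, 4, 16, 36, 64]
```

Both forms produce the values one at a time; `list()` is used here only to display them.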
|
||||
|
||||
### Advantages of Function Generators
|
||||
|
||||
1. **Memory Efficiency**: Function generators produce values lazily, meaning they generate values only when needed, saving memory compared to constructing an entire sequence upfront.
|
||||
|
||||
2. **Lazy Evaluation**: Values are computed only as they are consumed, so no work is spent on values that are never requested and the first result is available without waiting for the whole sequence. This matters most with large datasets.
|
||||
|
||||
3. **Infinite Sequences**: Function generators can represent infinite sequences, such as the Fibonacci sequence, allowing you to work with data streams of arbitrary length without consuming excessive memory.
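
The memory difference can be observed directly. The following is a minimal sketch, assuming CPython; `sys.getsizeof` reports only the size of the container object itself, and the exact byte counts vary between Python versions, so only the comparison is printed.

```python
import sys

numbers_list = [n for n in range(1_000_000)]   # all values stored at once
numbers_gen = (n for n in range(1_000_000))    # values produced on demand

list_bytes = sys.getsizeof(numbers_list)
gen_bytes = sys.getsizeof(numbers_gen)

print(list_bytes > gen_bytes)  # True
print(sum(numbers_gen))        # 499999500000
```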
|
||||
|
||||
### Disadvantages of Function Generators
|
||||
|
||||
1. **Single Iteration**: Once a function generator is exhausted, it cannot be reused. If you need to iterate over the sequence again, you'll have to create a new generator.
|
||||
|
||||
2. **Limited Random Access**: Function generators do not support random access like lists. They only allow sequential access, which might be a limitation depending on the use case.
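
Both limitations are easy to demonstrate. In the sketch below, `count_up_to` is a hypothetical helper written for this example; `itertools.islice` is one standard-library way to skip ahead in place of indexing.

```python
from itertools import islice

def count_up_to(limit):
    """Hypothetical helper: yield 0, 1, ..., limit - 1."""
    for n in range(limit):
        yield n

gen = count_up_to(5)
first_pass = list(gen)    # consumes every value
second_pass = list(gen)   # the generator is already exhausted

print(first_pass)   # [0, 1, 2, 3, 4]
print(second_pass)  # []

# No indexing: gen[2] would raise TypeError. islice walks forward instead,
# using a fresh generator because the old one is spent.
third_item = next(islice(count_up_to(5), 2, None))
print(third_item)   # 2
```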
|
||||
|
||||
### Implementation Example
|
||||
|
||||
```python
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Usage
fib_gen = fibonacci()
for _ in range(10):
    print(next(fib_gen))
```
|
||||
|
||||
## Generator Expressions
|
||||
|
||||
Generator expressions are similar to list comprehensions but return a generator object instead of a list. They offer a concise way to create generators without the need for a separate function.
|
||||
|
||||
### Mathematical Formulation
|
||||
|
||||
Generator expressions can also be represented mathematically using set-builder notation. The general form is the same as for function generators.
|
||||
|
||||
### Advantages of Generator Expressions
|
||||
|
||||
1. **Memory Efficiency**: Generator expressions produce values lazily, similar to function generators, resulting in memory savings.
|
||||
|
||||
2. **Lazy Evaluation**: Values are generated on-the-fly as they are consumed, providing improved performance and reduced overhead.
|
||||
|
||||
### Disadvantages of Generator Expressions
|
||||
|
||||
1. **Single Iteration**: Like function generators, once a generator expression is exhausted, it cannot be reused.
|
||||
|
||||
2. **Limited Random Access**: Generator expressions, similar to function generators, do not support random access.
|
||||
|
||||
### Implementation Example
|
||||
|
||||
```python
# Generate squares of numbers from 0 to 9
square_gen = (x**2 for x in range(10))

# Usage
for num in square_gen:
    print(num)
```
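
Generator expressions are often passed straight to a function that consumes an iterable. A small sketch, using the same squares as above (the variable names are chosen for this example):

```python
# When a generator expression is the only argument, the extra
# parentheses can be omitted.
total = sum(x ** 2 for x in range(10))
print(total)  # 285

# any() stops pulling values as soon as one of them is true.
has_large_square = any(x ** 2 > 50 for x in range(10))
print(has_large_square)  # True
```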
|
||||
|
||||
## Conclusion
|
||||
|
||||
Generators offer a powerful mechanism for creating iterators efficiently in Python. By understanding the differences between function generators and generator expressions, along with their mathematical formulation, advantages, and disadvantages, you can leverage them effectively in various scenarios. Whether you're dealing with large datasets or need to work with infinite sequences, generators provide a memory-efficient solution with lazy evaluation capabilities, contributing to more elegant and scalable code.
|
|
@ -1,11 +1,13 @@
|
|||
# List of sections
|
||||
|
||||
- [OOPs](OOPs.md)
|
||||
- [OOPs](oops.md)
|
||||
- [Decorators/\*args/**kwargs](decorator-kwargs-args.md)
|
||||
- [Lambda Function](lambda-function.md)
|
||||
- [Working with Dates & Times in Python](dates_and_times.md)
|
||||
- [Regular Expressions in Python](regular_expressions.md)
|
||||
- [JSON module](json-module.md)
|
||||
- [Map Function](map-function.md)
|
||||
- [Protocols](protocols.md)
|
||||
- [Exception Handling in Python](exception-handling.md)
|
||||
- [Generators](generators.md)
|
||||
- [Closures](closures.md)
|
||||
|
|
|
@ -0,0 +1,243 @@
|
|||
# Protocols in Python
|
||||
Python can establish informal interfaces using protocols in order to improve code structure, reusability, and type checking. Protocols allow for gradual adoption and are more flexible than the interfaces of languages such as Java, which are strict contracts that specify the methods and attributes a class must implement.
|
||||
|
||||
>Before going deeper into this topic, let's cover a prerequisite: the `typing` module.
|
||||
|
||||
## Typing Module
|
||||
`typing` is a module in Python's standard library. It:
|
||||
1. Provides classes, functions, and type aliases.
|
||||
2. Allows adding type annotations to our code.
|
||||
3. Enhances code readability.
|
||||
4. Helps in catching errors early.
|
||||
|
||||
### Type Hints in Python:
|
||||
Type hints allow you to specify the expected data types of variables, function parameters, and return values. This can improve code readability and help with debugging.
|
||||
|
||||
Here is a simple function that adds two numbers:
|
||||
```python
def add(a, b):
    return a + b

print(add(10, 20))
```
|
||||
>Output: 30
|
||||
|
||||
While this works fine, adding type hints makes the code more understandable and serves as documentation:
|
||||
|
||||
```python
def add(a: int, b: int) -> int:
    return a + b

print(add(1, 10))
```
|
||||
>Output: 11
|
||||
|
||||
In this version, `a` and `b` are expected to be integers, and the function is expected to return an integer. This makes the function's purpose and usage clearer.
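
It may help to see that these hints are not enforced when the program runs. The sketch below reuses the `add` function and passes strings on purpose; a static checker such as mypy would report the call, but the interpreter runs it without complaint.

```python
def add(a: int, b: int) -> int:
    return a + b

# The annotations say int, yet strings are accepted at run time,
# because Python stores the hints without checking them.
result = add("type ", "hints")
print(result)  # type hints

# The hints remain available for tools to inspect.
print(add.__annotations__)
```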
|
||||
|
||||
#### Another example
|
||||
|
||||
The function below takes an iterable (a list, tuple, dict, set, frozenset, string, and so on) and prints its contents on a single line along with its type.
|
||||
|
||||
```python
from typing import Iterable, List, Set, Tuple

def print_all(items: Iterable) -> None:
    """Print the type of `items`, then every element, on one line."""
    print(type(items), end=' ')
    for item in items:
        print(item, end=' ')
    print()

l: List[int] = [1, 2, 3, 4, 5]
s: Set[int] = {1, 2, 3, 4, 5}
t: Tuple[int, ...] = (1, 2, 3, 4, 5)  # a tuple of any length holding ints

for iter_obj in [l, s, t]:
    print_all(iter_obj)
```
|
||||
Output:
|
||||
> <class 'list'> 1 2 3 4 5
|
||||
> <class 'set'> 1 2 3 4 5
|
||||
> <class 'tuple'> 1 2 3 4 5
|
||||
|
||||
Now let's try calling `print_all` with a non-iterable object, an `int`, as the argument.
|
||||
|
||||
```python
|
||||
a = 10
|
||||
print_all(a) # This will raise an error
|
||||
```
|
||||
Output:
|
||||
>TypeError: 'int' object is not iterable
|
||||
|
||||
This error occurs because `a` is an `int`, and the `int` class does not define the `__iter__` method that iteration relies on. In other words, `int` does not conform to the `Iterable` protocol. Note that the `Iterable` annotation is not checked at run time; the error is raised by the `for` loop itself.
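
One way to test for this before looping is an `isinstance` check against `collections.abc.Iterable`. This is a minimal sketch; `describe` is a hypothetical helper introduced only for illustration.

```python
from collections.abc import Iterable

def describe(value):
    """Hypothetical helper: report whether `value` can be looped over."""
    if isinstance(value, Iterable):
        return f"{type(value).__name__} is iterable"
    return f"{type(value).__name__} is not iterable"

print(describe([1, 2, 3]))  # list is iterable
print(describe("abc"))      # str is iterable
print(describe(10))         # int is not iterable
```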
|
||||
|
||||
**Benefits of Type Hints**
|
||||
Using type hints helps in several ways:
|
||||
|
||||
1. **Error Detection**: Tools like mypy can catch type-related problems during development, decreasing runtime errors.
|
||||
2. **Code Readability**: Type hints serve as documentation, making it easy to comprehend what data types are anticipated and returned.
|
||||
3. **Improved Maintenance**: With unambiguous type expectations, maintaining and updating code becomes easier, especially in huge codebases.
|
||||
|
||||
Now that we have covered type hints and the `typing` module, let's dive into protocols.
|
||||
|
||||
## Understanding Protocols
|
||||
|
||||
In Python, protocols define interfaces similar to Java interfaces. They let you specify methods and attributes that an object must implement without requiring inheritance from a base class. Protocols are part of the `typing` module (Python 3.8 and later) and let static type checkers such as mypy verify that your classes have the expected structure, enhancing type safety and code clarity.
|
||||
|
||||
### What is a Protocol?
|
||||
|
||||
A protocol specifies one or more method signatures that a class must implement to be considered as conforming to the protocol.
|
||||
This concept is called "structural subtyping", sometimes described as static "duck typing": if an object implements the required methods and attributes, a type checker treats it as an instance of the protocol.
|
||||
|
||||
Let's write our own protocol:
|
||||
|
||||
```python
from typing import Protocol

# Define a Printable protocol
class Printable(Protocol):
    def print(self) -> None:
        """Print the object"""
        pass

# Book class implements the Printable protocol
class Book:
    def __init__(self, title: str):
        self.title = title

    def print(self) -> None:
        print(f"Book Title: {self.title}")

# print_object function takes a Printable object and calls its print method
def print_object(obj: Printable) -> None:
    obj.print()

book = Book("Python Programming")
print_object(book)
```
|
||||
Output:
|
||||
> Book Title: Python Programming
|
||||
|
||||
In this example:
|
||||
|
||||
1. **Printable Protocol:** Defines an interface with a single method print.
|
||||
2. **Book Class:** Implements the Printable protocol by providing a print method.
|
||||
3. **print_object Function:** Accepts any object that conforms to the Printable protocol and calls its print method.
|
||||
|
||||
We got our output because the `Book` class conforms to the `Printable` protocol.

Similarly, when you pass an object to `print_object` that does not conform to the `Printable` protocol, an error occurs at run time, because the object does not implement the required `print` method. A static type checker such as mypy would flag the call before the code runs.
|
||||
Let's see an example:
|
||||
```python
class Team:
    def huddle(self) -> None:
        print("Team Huddle")

c = Team()
print_object(c)  # This will raise an error
```
|
||||
Output:
|
||||
>AttributeError: 'Team' object has no attribute 'print'
|
||||
|
||||
In this case:
|
||||
- The `Team` class has a `huddle` method but does not have a `print` method.
|
||||
- When `print_object` tries to call the `print` method on a `Team` instance, it raises an `AttributeError`.
|
||||
|
||||
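> This is an important aspect of using protocols: a static type checker reports objects that lack the necessary methods before the code executes, leading to more predictable and reliable code. At run time, Python itself does not check the annotation; the `AttributeError` above comes from the failed method lookup.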
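
Conformance can also be checked at run time if the protocol is decorated with `typing.runtime_checkable`. The sketch below redefines small versions of `Printable`, `Book`, and `Team` so that it runs on its own. Note that such an `isinstance` check only looks for the presence of the method, not its signature.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Printable(Protocol):
    def print(self) -> None:
        ...

class Book:
    def __init__(self, title: str):
        self.title = title

    def print(self) -> None:
        print(f"Book Title: {self.title}")

class Team:
    def huddle(self) -> None:
        print("Team Huddle")

# isinstance() works because Printable is marked runtime_checkable.
print(isinstance(Book("Python Programming"), Printable))  # True
print(isinstance(Team(), Printable))                      # False
```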
|
||||
|
||||
**Ensuring Protocol Conformance**
|
||||
To avoid such errors, you need to ensure that any object passed to `print_object` implements the `Printable` protocol. Here's how you can modify the `Team` class to conform to the protocol:
|
||||
```python
class Team:
    def __init__(self, name: str):
        self.name = name

    def huddle(self) -> None:
        print("Team Huddle")

    def print(self) -> None:
        print(f"Team Name: {self.name}")

c = Team("Dream Team")
print_object(c)
```
|
||||
Output:
|
||||
>Team Name: Dream Team
|
||||
|
||||
The `Team` class now implements the `print` method and therefore conforms to the `Printable` protocol, so the call no longer raises an error.
|
||||
|
||||
### Protocols and Inheritance:
|
||||
Protocols can also be used in combination with inheritance to create more complex interfaces.
|
||||
We can do that by following these steps:
|
||||
**Step 1 - Base protocol**: Define a base protocol that specifies a common set of methods and attributes.
|
||||
**Step 2 - Derived Protocols**: Create derived protocols that extend the base protocol with additional requirements.
|
||||
**Step 3 - Polymorphism**: Objects can then conform to multiple protocols, allowing for Polymorphic behavior.
|
||||
|
||||
Let's see an example on this as well:
|
||||
|
||||
```python
from typing import Protocol

# Base protocol 1
class Printable(Protocol):
    def print(self) -> None:
        """Print the object"""
        pass

# Base protocol 2
class Serializable(Protocol):
    def serialize(self) -> str:
        """Return a string representation of the object"""
        ...

# Derived protocol: Protocol must be listed explicitly, otherwise this
# would be an ordinary class rather than a protocol
class PrintableAndSerializable(Printable, Serializable, Protocol):
    pass

# Class with implementations of both print and serialize
class Book_serialize:
    def __init__(self, title: str):
        self.title = title

    def print(self) -> None:
        print(f"Book Title: {self.title}")

    # Returns a str so that the signature matches Serializable.serialize
    def serialize(self) -> str:
        return f"serialize: {self.title}"

# Function accepts any object that implements PrintableAndSerializable
def test(obj: PrintableAndSerializable) -> None:
    obj.print()
    print(obj.serialize())

book = Book_serialize("lean-in")
test(book)
```
|
||||
Output:
|
||||
> Book Title: lean-in
|
||||
serialize: lean-in
|
||||
|
||||
In this example:
|
||||
|
||||
**Printable Protocol:** Specifies a `print` method.
|
||||
**Serializable Protocol:** Specifies a `serialize` method.
|
||||
**PrintableAndSerializable Protocol:** Combines both `Printable` and `Serializable`.
|
||||
**Book Class**: Implements both `print` and `serialize` methods, conforming to `PrintableAndSerializable`.
|
||||
**test Function:** Accepts any object that implements the `PrintableAndSerializable` protocol.
|
||||
|
||||
If you try to pass an object that does not conform to the `PrintableAndSerializable` protocol to the `test` function, it will raise an error. Let's see an example:
|
||||
|
||||
```python
class Team:
    def huddle(self) -> None:
        print("Team Huddle")

c = Team()
test(c)  # This will raise an error
```
|
||||
Output:
|
||||
> AttributeError: 'Team' object has no attribute 'print'
|
||||
|
||||
In this case:
|
||||
The `Team` class has a `huddle` method but does not implement `print` or `serialize` methods.
|
||||
When `test` calls `print` on a `Team` instance, it raises an `AttributeError` before `serialize` is ever reached.
|
||||
|
||||
**In Conclusion:**
|
||||
>Python protocols offer a versatile and powerful means of defining interfaces, encouraging the decoupling of code, improving readability, and facilitating static type checking. They are particularly handy for scenarios involving file-like objects, bespoke containers, and any case where you wish to enforce certain behaviors without requiring inheritance from a specific base class. Ensuring that classes conform to protocols reduces runtime problems and makes your code more robust and maintainable.
|
|
@ -1,4 +1,4 @@
|
|||
# List of sections
|
||||
|
||||
- [API Methods](api-methods.md)
|
||||
- [FastAPI](fast-api.md)
|
||||
- [FastAPI](fast-api.md)
|
||||
|
|
|
@ -10,5 +10,6 @@
|
|||
- [Greedy Algorithms](greedy-algorithms.md)
|
||||
- [Dynamic Programming](dynamic-programming.md)
|
||||
- [Linked list](linked-list.md)
|
||||
- [Stacks in Python](stacks.md)
|
||||
- [Sliding Window Technique](sliding-window.md)
|
||||
- [Trie](trie.md)
|
||||
|
|
|
@ -0,0 +1,116 @@
|
|||
# Stacks in Python
|
||||
|
||||
In Data Structures and Algorithms, a stack is a linear data structure that follows the Last In, First Out (LIFO) rule. It is built on two fundamental operations: **PUSH**, which inserts an element on top of the stack, and **POP**, which removes the topmost element. This concept is similar to a stack of plates in a cafeteria. Stacks are commonly used for handling function calls, expression evaluation, and parsing. They are also efficient at managing memory and tracking program state.
|
||||
|
||||
## Points to be Remembered
|
||||
|
||||
- A stack is a collection of data items that can be accessed at only one end, called **TOP**.
|
||||
- Items can be inserted and deleted in a stack only at the TOP.
|
||||
- The last item inserted in a stack is the first one to be deleted.
|
||||
- Therefore, a stack is called a **Last-In-First-Out (LIFO)** data structure.
|
||||
|
||||
## Real Life Examples of Stacks
|
||||
|
||||
- **PILE OF BOOKS** - Suppose a set of books are placed one over the other in a pile. When you remove books from the pile, the topmost book will be removed first. Similarly, when you have to add a book to the pile, the book will be placed at the top of the pile.
|
||||
|
||||
- **PILE OF PLATES** - The first plate begins the pile. The second plate is placed on the top of the first plate and the third plate is placed on the top of the second plate, and so on. In general, if you want to add a plate to the pile, you can keep it on the top of the pile. Similarly, if you want to remove a plate, you can remove the plate from the top of the pile.
|
||||
|
||||
- **BANGLES IN A HAND** - When a person wears bangles, the last bangle worn is the first one to be removed.
|
||||
|
||||
## Applications of Stacks
|
||||
|
||||
Stacks are widely used in Computer Science:
|
||||
|
||||
- Function call management
|
||||
- Maintaining the UNDO list for the application
|
||||
- Web browser *history management*
|
||||
- Evaluating expressions
|
||||
- Checking the nesting of parentheses in an expression
|
||||
- Backtracking algorithms (Recursion)
|
||||
|
||||
Understanding these applications is essential for Software Development.
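
To make one of these applications concrete, here is a minimal sketch of checking the nesting of brackets with a stack. The function name `is_balanced` and the set of supported brackets are assumptions made for this example.

```python
def is_balanced(expression):
    """Return True if every bracket in `expression` is closed in order."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for char in expression:
        if char in "([{":
            stack.append(char)                       # PUSH an opening bracket
        elif char in pairs:
            # POP the most recent opener and compare it with this closer
            if not stack or stack.pop() != pairs[char]:
                return False
    return not stack                                 # leftovers are unclosed

print(is_balanced("(a + b) * [c - d]"))  # True
print(is_balanced("(a + b]"))            # False
print(is_balanced("((a + b)"))           # False
```

The last opener pushed is the first one that must be closed, which is exactly the LIFO behaviour a stack provides.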
|
||||
|
||||
## Operations on a Stack
|
||||
|
||||
Key operations on a stack include:
|
||||
|
||||
- **PUSH** - It is the process of inserting a new element on the top of a stack.
- **OVERFLOW** - The situation that arises when an item is pushed onto a stack that is already full.
- **POP** - It is the process of deleting an element from the top of a stack.
- **UNDERFLOW** - The situation that arises when an item is popped from an empty stack.
- **PEEK** - It is the process of getting the most recent value of the stack *(i.e. the value at the top of the stack)* without removing it.
- **isEMPTY** - A function that returns true if the stack is empty and false otherwise.
- **SHOW** - Displaying the stack items.
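
The operations above, including the overflow and underflow conditions, can be sketched as a small class. This is only an illustration: `BoundedStack` and its `capacity` parameter are hypothetical, and raising exceptions is one possible way to signal the two error conditions.

```python
class BoundedStack:
    """Hypothetical fixed-capacity stack, for illustration only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def is_empty(self):
        return len(self.items) == 0

    def push(self, item):
        if len(self.items) >= self.capacity:   # OVERFLOW
            raise OverflowError("stack is full")
        self.items.append(item)

    def pop(self):
        if self.is_empty():                    # UNDERFLOW
            raise IndexError("pop from empty stack")
        return self.items.pop()

    def peek(self):
        if self.is_empty():
            raise IndexError("peek at empty stack")
        return self.items[-1]

stack = BoundedStack(capacity=2)
stack.push(5)
stack.push(10)
print(stack.peek())          # 10
try:
    stack.push(15)           # third push exceeds the capacity of 2
except OverflowError as exc:
    print("Overflow:", exc)  # Overflow: stack is full
print(stack.pop())           # 10
print(stack.pop())           # 5
```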
|
||||
|
||||
## Implementing Stacks in Python
|
||||
|
||||
```python
def isEmpty(S):
    if len(S) == 0:
        return True
    else:
        return False

def Push(S, item):
    S.append(item)

def Pop(S):
    if isEmpty(S):
        return "Underflow"
    else:
        val = S.pop()
        return val

def Peek(S):
    if isEmpty(S):
        return "Underflow"
    else:
        top = len(S) - 1
        return S[top]

def Show(S):
    if isEmpty(S):
        print("Sorry, No items in Stack")
    else:
        print("(Top)", end=' ')
        t = len(S) - 1
        while t >= 0:
            print(S[t], "<", end=' ')
            t -= 1
        print()

stack = []  # initially stack is empty

Push(stack, 5)
Push(stack, 10)
Push(stack, 15)

print("Stack after Push operations:")
Show(stack)
print("Peek operation:", Peek(stack))
print("Pop operation:", Pop(stack))
print("Stack after Pop operation:")
Show(stack)
```
|
||||
|
||||
## Output
|
||||
|
||||
```markdown
Stack after Push operations:
(Top) 15 < 10 < 5 <
Peek operation: 15
Pop operation: 15
Stack after Pop operation:
(Top) 10 < 5 <
```
|
||||
|
||||
## Complexity Analysis
|
||||
|
||||
- **Worst case**: `O(n)`. Only the Show operation reaches this, because it visits every one of the `n` items in the stack.
- **Best case**: `O(1)`. isEmpty, Push, Pop and Peek touch only the top of the stack, so each runs in constant time (for a Python list, `append` is amortized `O(1)`).
- **Average case**: depends on the mix of operations: `O(1)` per call for isEmpty, Push, Pop and Peek, and `O(n)` for each call to Show.
|
|
@ -0,0 +1,235 @@
|
|||
|
||||
# Cost Functions in Machine Learning
|
||||
|
||||
Cost functions, also known as loss functions, play a crucial role in training machine learning models. They measure how well the model performs on the training data by quantifying the difference between predicted and actual values. Different types of cost functions are used depending on the problem domain and the nature of the data.
|
||||
|
||||
## Types of Cost Functions
|
||||
|
||||
### 1. Mean Squared Error (MSE)
|
||||
|
||||
**Explanation:**
|
||||
MSE is one of the most commonly used cost functions, particularly in regression problems. It calculates the average squared difference between the predicted and actual values.
|
||||
|
||||
**Mathematical Formulation:**
|
||||
The MSE is defined as:
|
||||
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
|
||||
Where:
|
||||
- `n` is the number of samples.
|
||||
- $y_i$ is the actual value.
|
||||
- $\hat{y}_i$ is the predicted value.
|
||||
|
||||
**Advantages:**
|
||||
- Penalizes large errors heavily due to squaring, which is useful when large mistakes are especially costly.
|
||||
- Differentiable and convex, facilitating optimization.
|
||||
|
||||
**Disadvantages:**
|
||||
- Sensitive to outliers, as the squared term amplifies their impact.
|
||||
|
||||
**Python Implementation:**
|
||||
```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    # Convert to arrays so that plain Python lists are accepted too
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)
```
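
A small worked example may make the formula concrete. The four values below are made up for illustration; each squared difference is computed first and then averaged.

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

squared_errors = (y_true - y_pred) ** 2
print(squared_errors.tolist())  # [0.25, 0.25, 0.0, 1.0]
print(squared_errors.mean())    # 0.375
```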
|
||||
|
||||
### 2. Mean Absolute Error (MAE)
|
||||
|
||||
**Explanation:**
|
||||
MAE is another commonly used cost function for regression tasks. It measures the average absolute difference between predicted and actual values.
|
||||
|
||||
**Mathematical Formulation:**
|
||||
The MAE is defined as:
|
||||
$$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$
|
||||
Where:
|
||||
- `n` is the number of samples.
|
||||
- $y_i$ is the actual value.
|
||||
- $\hat{y}_i$ is the predicted value.
|
||||
|
||||
**Advantages:**
|
||||
- Less sensitive to outliers compared to MSE.
|
||||
- Provides a linear error term, which can be easier to interpret.
|
||||
|
||||
|
||||
**Disadvantages:**
|
||||
- Not differentiable at zero, which can complicate optimization.
|
||||
|
||||
**Python Implementation:**
|
||||
```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    # Convert to arrays so that plain Python lists are accepted too
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))
```
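
The difference in outlier sensitivity between MAE and MSE can be seen on a toy example. The data below are invented for illustration, and the helpers `mse` and `mae` are defined locally so that the snippet runs on its own.

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
clean_pred = np.array([1.5, 2.5, 3.5, 4.5])     # every error is 0.5
outlier_pred = np.array([1.5, 2.5, 3.5, 14.0])  # last error is 10

print(mae(y_true, clean_pred), mse(y_true, clean_pred))      # 0.5 0.25
print(mae(y_true, outlier_pred), mse(y_true, outlier_pred))  # 2.875 25.1875
```

A single bad prediction multiplies the MAE by about 6 but the MSE by about 100.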
|
||||
|
||||
### 3. Cross-Entropy Loss (Binary)
|
||||
|
||||
**Explanation:**
|
||||
Cross-entropy loss is commonly used in binary classification problems. It measures the dissimilarity between the true and predicted probability distributions.
|
||||
|
||||
**Mathematical Formulation:**
|
||||
|
||||
For binary classification, the cross-entropy loss is defined as:
|
||||
|
||||
$$\text{Cross-Entropy} = -\frac{1}{n} \sum_{i=1}^{n} [y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i)]$$
|
||||
|
||||
Where:
|
||||
- `n` is the number of samples.
|
||||
- $y_i$ is the actual class label (0 or 1).
|
||||
- $\hat{y}_i$ is the predicted probability of the positive class.
|
||||
|
||||
|
||||
**Advantages:**
|
||||
- Penalizes confident wrong predictions heavily.
|
||||
- Suitable for probabilistic outputs.
|
||||
|
||||
**Disadvantages:**
|
||||
- Sensitive to class imbalance.
|
||||
|
||||
**Python Implementation:**
|
||||
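```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions away from 0 and 1 so that log() never receives 0;
    # eps is a small safety margin added here, not part of the formula
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
```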
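
To see how strongly confident mistakes are penalized, the sketch below evaluates the per-sample loss for three made-up predictions of a positive example.

```python
import numpy as np

y_true = np.array([1.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.5, 0.01])  # confident right, unsure, confident wrong

per_sample = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(np.round(per_sample, 3).tolist())  # [0.105, 0.693, 4.605]
```

The confidently wrong prediction costs more than forty times as much as the confidently right one.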
|
||||
|
||||
### 4. Cross-Entropy Loss (Multiclass)
|
||||
|
||||
**Explanation:**
|
||||
For multiclass classification problems, the cross-entropy loss is adapted to handle multiple classes.
|
||||
|
||||
**Mathematical Formulation:**
|
||||
|
||||
The multiclass cross-entropy loss is defined as:
|
||||
|
||||
$$\text{Cross-Entropy} = -\frac{1}{n} \sum_{i=1}^{n} \sum_{c=1}^{C} y_{i,c} \log(\hat{y}_{i,c})$$
|
||||
|
||||
Where:
|
||||
- `n` is the number of samples.
|
||||
- `C` is the number of classes.
|
||||
- $y_{i,c}$ is the indicator function for the true class of sample `i`.
|
||||
- $\hat{y}_{i,c}$ is the predicted probability of sample `i` belonging to class `c`.
|
||||
|
||||
**Advantages:**
|
||||
- Handles multiple classes effectively.
|
||||
- Encourages the model to assign high probabilities to the correct classes.
|
||||
|
||||
**Disadvantages:**
|
||||
- Requires one-hot encoding for class labels, which can increase computational complexity.
|
||||
|
||||
**Python Implementation:**
|
||||
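```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true: one-hot labels, shape (n, C); y_pred: probabilities, shape (n, C)
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log() never receives 0; eps is a safety margin added here
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
```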
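
As an illustration of the one-hot requirement, the sketch below builds one-hot rows from integer labels with `np.eye` and evaluates the loss on three made-up predictions. It also shows that indexing the true-class probability directly gives the same value without one-hot encoding.

```python
import numpy as np

labels = np.array([0, 2, 1])          # integer class labels
num_classes = 3
y_true = np.eye(num_classes)[labels]  # one-hot rows, shape (3, 3)

y_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
])

loss = -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
print(round(float(loss), 4))  # 0.5202

# Equivalent: pick the predicted probability of the true class per sample.
picked = y_pred[np.arange(len(labels)), labels]
print(np.isclose(loss, -np.mean(np.log(picked))))  # True
```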
|
||||
|
||||
### 5. Hinge Loss (SVM)
|
||||
|
||||
**Explanation:**
|
||||
Hinge loss is commonly used in support vector machines (SVMs) for binary classification tasks. It penalizes misclassifications by a linear margin.
|
||||
|
||||
**Mathematical Formulation:**
|
||||
|
||||
For binary classification, the hinge loss is defined as:
|
||||
|
||||
$$\text{Hinge Loss} = \frac{1}{n} \sum_{i=1}^{n} \max(0, 1 - y_i \cdot \hat{y}_i)$$
|
||||
|
||||
Where:
|
||||
- `n` is the number of samples.
|
||||
- $y_i$ is the actual class label (-1 or 1).
|
||||
- $\hat{y}_i$ is the predicted score for sample $i$.
|
||||
|
||||
**Advantages:**
|
||||
- Encourages margin maximization in SVMs.
|
||||
- Grows only linearly with the size of a margin violation, so it is less affected by outliers than squared losses.
|
||||
|
||||
**Disadvantages:**
|
||||
- Not differentiable at the margin, which can complicate optimization.
|
||||
|
||||
**Python Implementation:**
|
||||
```python
import numpy as np

def hinge_loss(y_true, y_pred):
    # y_true holds labels in {-1, 1}; y_pred holds raw scores
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    loss = np.maximum(0, 1 - y_true * y_pred)
    return np.mean(loss)
```
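
A short worked example with invented scores shows which samples contribute to the hinge loss: only those that are misclassified or that fall inside the margin.

```python
import numpy as np

y_true = np.array([1, -1, 1, -1])
scores = np.array([2.0, -0.5, -1.0, 0.25])

per_sample = np.maximum(0, 1 - y_true * scores)
# sample 1: correct and outside the margin -> 0
# sample 2: correct but inside the margin  -> 0.5
# samples 3 and 4: wrong side              -> 2.0 and 1.25
print(per_sample.tolist())  # [0.0, 0.5, 2.0, 1.25]
print(per_sample.mean())    # 0.9375
```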
|
||||
|
||||
### 6. Huber Loss
|
||||
|
||||
**Explanation:**
|
||||
Huber loss is a combination of MSE and MAE, providing a compromise between the two. It is less sensitive to outliers than MSE and provides a smooth transition to MAE for large errors.
|
||||
|
||||
**Mathematical Formulation:**
|
||||
|
||||
The Huber loss is defined as:
|
||||
|
||||
|
||||
$$\text{Huber Loss} = \frac{1}{n} \sum_{i=1}^{n} \left\{
\begin{array}{ll}
\frac{1}{2} (y_i - \hat{y}_i)^2 & \text{if } |y_i - \hat{y}_i| \leq \delta \\
\delta(|y_i - \hat{y}_i| - \frac{1}{2} \delta) & \text{otherwise}
\end{array}
\right.$$
|
||||
|
||||
Where:
|
||||
- `n` is the number of samples.
|
||||
- $\delta$ is a threshold parameter.
|
||||
|
||||
**Advantages:**
|
||||
- Differentiable everywhere, including at the threshold $\delta$.
|
||||
- Less sensitive to outliers than MSE.
|
||||
|
||||
**Disadvantages:**
|
||||
- Requires tuning of the threshold parameter.
|
||||
|
||||
**Python Implementation:**
|
||||
```python
import numpy as np

def huber_loss(y_true, y_pred, delta):
    error = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Quadratic branch for small errors, linear branch for large ones
    loss = np.where(np.abs(error) <= delta, 0.5 * error ** 2, delta * (np.abs(error) - 0.5 * delta))
    return np.mean(loss)
```
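
The two branches of the Huber loss can be checked by hand on a few errors. The sketch below assumes $\delta = 1$ and three made-up error values, one below, one at, and one above the threshold.

```python
import numpy as np

delta = 1.0
errors = np.array([0.5, 1.0, 3.0])

quadratic = 0.5 * errors ** 2                    # used when |error| <= delta
linear = delta * (np.abs(errors) - 0.5 * delta)  # used otherwise
huber = np.where(np.abs(errors) <= delta, quadratic, linear)

print(huber.tolist())              # [0.125, 0.5, 2.5]
print(quadratic[1] == linear[1])   # True
```

The second line confirms that the two branches agree at `|error| == delta`, so the loss has no jump at the threshold.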
|
||||
|
||||
### 7. Log-Cosh Loss
|
||||
|
||||
**Explanation:**
|
||||
Log-Cosh loss is a smooth approximation of the MAE and is less sensitive to outliers than MSE. It provides a smooth transition from quadratic for small errors to linear for large errors.
|
||||
|
||||
**Mathematical Formulation:**
|
||||
|
||||
The Log-Cosh loss is defined as:
|
||||
|
||||
$$\text{Log-Cosh Loss} = \frac{1}{n} \sum_{i=1}^{n} \log(\cosh(y_i - \hat{y}_i))$$
|
||||
|
||||
Where:
|
||||
- `n` is the number of samples.
|
||||
|
||||
**Advantages:**
|
||||
- Smooth and differentiable everywhere.
|
||||
- Less sensitive to outliers.
|
||||
|
||||
**Disadvantages:**
|
||||
- Computationally more expensive than simple losses like MSE.
|
||||
|
||||
**Python Implementation:**
|
||||
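```python
import numpy as np

def logcosh_loss(y_true, y_pred):
    error = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # log(cosh(x)) = log((e^x + e^-x) / 2); np.logaddexp computes this
    # without the overflow that np.cosh hits for large |x|
    loss = np.logaddexp(error, -error) - np.log(2.0)
    return np.mean(loss)
```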
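
The quadratic-to-linear behaviour described above can be verified numerically. For small errors, $\log(\cosh(x)) \approx x^2/2$, and for large errors, $\log(\cosh(x)) \approx |x| - \log 2$. The sketch below checks both approximations on two arbitrarily chosen values.

```python
import numpy as np

small, large = 0.01, 20.0

# Small error: behaves like x**2 / 2 (the MSE-like regime)
print(np.isclose(np.log(np.cosh(small)), small ** 2 / 2, rtol=1e-3, atol=0))       # True

# Large error: behaves like |x| - log(2) (the MAE-like regime)
print(np.isclose(np.log(np.cosh(large)), large - np.log(2.0), rtol=1e-6, atol=0))  # True
```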
|
||||
|
||||
These implementations provide various options for cost functions suitable for different machine learning tasks. Each function has its advantages and disadvantages, making them suitable for different scenarios and problem domains.
|
|
@ -254,4 +254,4 @@ The final decision tree classifies instances based on the following rules:
|
|||
- If Outlook is Rain and Wind is Weak, PlayTennis is Yes
|
||||
- If Outlook is Rain and Wind is Strong, PlayTennis is No
|
||||
|
||||
> Note that the calculated entropies and information gains may vary slightly depending on the specific implementation and rounding methods used.
|
||||
> Note that the calculated entropies and information gains may vary slightly depending on the specific implementation and rounding methods used.
|
|
@ -1,16 +1,18 @@
|
|||
# List of sections
|
||||
|
||||
- [Binomial Distribution](binomial_distribution.md)
|
||||
- [Regression in Machine Learning](Regression.md)
|
||||
- [Introduction to scikit-learn](sklearn-introduction.md)
|
||||
- [Binomial Distribution](binomial-distribution.md)
|
||||
- [Regression in Machine Learning](regression.md)
|
||||
- [Confusion Matrix](confusion-matrix.md)
|
||||
- [Decision Tree Learning](Decision-Tree.md)
|
||||
- [Decision Tree Learning](decision-tree.md)
|
||||
- [Random Forest](random-forest.md)
|
||||
- [Support Vector Machine Algorithm](support-vector-machine.md)
|
||||
- [Artificial Neural Network from the Ground Up](ArtificialNeuralNetwork.md)
|
||||
- [Artificial Neural Network from the Ground Up](ann.md)
|
||||
- [Introduction To Convolutional Neural Networks (CNNs)](intro-to-cnn.md)
|
||||
- [TensorFlow.md](tensorFlow.md)
|
||||
- [TensorFlow.md](tensorflow.md)
|
||||
- [PyTorch.md](pytorch.md)
|
||||
- [Types of optimizers](Types_of_optimizers.md)
|
||||
- [Types of optimizers](types-of-optimizers.md)
|
||||
- [Logistic Regression](logistic-regression.md)
|
||||
- [Types_of_Cost_Functions](cost-functions.md)
|
||||
- [Clustering](clustering.md)
|
||||
- [Grid Search](grid-search.md)
|
||||
|
|
|
@ -0,0 +1,144 @@
|
|||
# scikit-learn (sklearn) Python Library
|
||||
|
||||
## Overview
|
||||
|
||||
scikit-learn, also known as sklearn, is a popular open-source Python library that provides simple and efficient tools for data mining and data analysis. It is built on NumPy, SciPy, and matplotlib. The library is designed to interoperate with the Python numerical and scientific libraries.
|
||||
|
||||
## Key Features
|
||||
|
||||
- **Classification**: Identifying which category an object belongs to. Example algorithms include SVM, nearest neighbors, random forest.
|
||||
- **Regression**: Predicting a continuous-valued attribute associated with an object. Example algorithms include support vector regression (SVR), ridge regression, Lasso.
|
||||
- **Clustering**: Automatic grouping of similar objects into sets. Example algorithms include k-means, spectral clustering, mean-shift.
|
||||
- **Dimensionality Reduction**: Reducing the number of random variables to consider. Example algorithms include PCA, feature selection, non-negative matrix factorization.
|
||||
- **Model Selection**: Comparing, validating, and choosing parameters and models. Example methods include grid search, cross-validation, metrics.
|
||||
- **Preprocessing**: Feature extraction and normalization.
|
||||
|
||||
## When to Use scikit-learn
|
||||
|
||||
- **Use scikit-learn if**:
|
||||
- You are working on machine learning tasks such as classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.
|
||||
- You need an easy-to-use, well-documented library.
|
||||
- You require tools that are compatible with NumPy and SciPy.
|
||||
|
||||
- **Do not use scikit-learn if**:
|
||||
- You need to perform deep learning tasks. In such cases, consider using TensorFlow or PyTorch.
|
||||
- You need out-of-the-box support for large-scale data. scikit-learn is designed to work with in-memory data, so for very large datasets, you might want to consider libraries like Dask-ML.
|
||||
|
||||
## Installation
|
||||
|
||||
You can install scikit-learn using pip:
|
||||
|
||||
```bash
|
||||
pip install scikit-learn
|
||||
```
|
||||
|
||||
Or via conda:
|
||||
|
||||
```bash
|
||||
conda install scikit-learn
|
||||
```
|
||||
|
||||
## Basic Usage with Code Snippets
|
||||
|
||||
### Importing the Library
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
from sklearn.model_selection import train_test_split
|
||||
from sklearn.preprocessing import StandardScaler
|
||||
from sklearn.linear_model import LogisticRegression
|
||||
from sklearn.metrics import accuracy_score
|
||||
```
|
||||
|
||||
### Loading Data
|
||||
|
||||
For illustration, let's create a simple synthetic dataset:
|
||||
|
||||
```python
|
||||
from sklearn.datasets import make_classification
|
||||
|
||||
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
|
||||
```
|
||||
|
||||
### Splitting Data
|
||||
|
||||
Split the dataset into training and testing sets:
|
||||
|
||||
```python
|
||||
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
|
||||
```
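As a quick check of what the split produces, the sketch below prints the shapes of the resulting arrays. The `stratify=y` argument is an addition that the call above does not use; it is assumed here only to show how class proportions can be kept similar in both subsets.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Same synthetic dataset as above: 1000 samples, 20 features, 2 classes
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)

# test_size=0.3 holds out 30% of the rows for testing;
# stratify=y (an optional extra) keeps the class ratio similar in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (700, 20) (300, 20)
```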
|
||||
|
||||
### Preprocessing
|
||||
|
||||
Standardizing the features:
|
||||
|
||||
```python
|
||||
scaler = StandardScaler()
|
||||
X_train = scaler.fit_transform(X_train)
|
||||
X_test = scaler.transform(X_test)
|
||||
```
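Note that the scaler is fitted on the training data only and then reused on the test data, so nothing about the test set influences preprocessing. One way to keep that order automatic is to chain the scaler and the model. The sketch below assumes scikit-learn's `make_pipeline` helper, which this walkthrough does not otherwise use.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# fit() scales the training data and trains the model in one call;
# score() applies the already-fitted scaler to the test data before predicting
pipeline = make_pipeline(StandardScaler(), LogisticRegression())
pipeline.fit(X_train, y_train)

test_accuracy = pipeline.score(X_test, y_test)
print(f"Accuracy: {test_accuracy * 100:.2f}%")
```

The pipeline object can then be used wherever a single estimator is expected.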
|
||||
|
||||
### Training a Model
|
||||
|
||||
Train a Logistic Regression model:
|
||||
|
||||
```python
|
||||
model = LogisticRegression()
|
||||
model.fit(X_train, y_train)
|
||||
```
|
||||
|
||||
### Making Predictions
|
||||
|
||||
Make predictions on the test set:
|
||||
|
||||
```python
|
||||
y_pred = model.predict(X_test)
|
||||
```
|
||||
|
||||
### Evaluating the Model
|
||||
|
||||
Evaluate the accuracy of the model:
|
||||
|
||||
```python
|
||||
accuracy = accuracy_score(y_test, y_pred)
|
||||
print(f"Accuracy: {accuracy * 100:.2f}%")
|
||||
```
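A single train/test split gives one accuracy figure. As a complement, here is a rough sketch of k-fold cross-validation, one of the model selection tools listed under Key Features. The choice of `cv=5` is an assumption made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)

# cv=5 splits the data into 5 folds; each fold is used once for testing
# while the model is trained on the remaining 4
model = LogisticRegression()
scores = cross_val_score(model, X, y, cv=5)

print(f"Mean accuracy: {scores.mean() * 100:.2f}% (std {scores.std() * 100:.2f})")
```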
|
||||
|
||||
### Putting it All Together
|
||||
|
||||
Here is a complete example from data loading to model evaluation:
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
from sklearn.datasets import make_classification
|
||||
from sklearn.model_selection import train_test_split
|
||||
from sklearn.preprocessing import StandardScaler
|
||||
from sklearn.linear_model import LogisticRegression
|
||||
from sklearn.metrics import accuracy_score
|
||||
|
||||
# Load data
|
||||
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
|
||||
|
||||
# Split data
|
||||
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
|
||||
|
||||
# Preprocess data
|
||||
scaler = StandardScaler()
|
||||
X_train = scaler.fit_transform(X_train)
|
||||
X_test = scaler.transform(X_test)
|
||||
|
||||
# Train model
|
||||
model = LogisticRegression()
|
||||
model.fit(X_train, y_train)
|
||||
|
||||
# Make predictions
|
||||
y_pred = model.predict(X_test)
|
||||
|
||||
# Evaluate model
|
||||
accuracy = accuracy_score(y_test, y_pred)
|
||||
print(f"Accuracy: {accuracy * 100:.2f}%")
|
||||
```
|
||||
|
||||
## Conclusion
|
||||
|
||||
scikit-learn is a powerful and versatile library that can be used for a wide range of machine learning tasks. It is particularly well-suited for beginners due to its easy-to-use interface and extensive documentation. Whether you are working on a simple classification task or a more complex clustering problem, scikit-learn provides the tools you need to build and evaluate your models effectively.
|
|
@ -61,4 +61,4 @@ TensorFlow is a great choice if you:
|
|||
## Example Use Cases
|
||||
|
||||
- Building and deploying complex neural networks for image recognition, natural language processing, or recommendation systems.
|
||||
- Developing models that need to be run on mobile or embedded devices.
|
||||
- Developing models that need to be run on mobile or embedded devices.
|
|
@ -1,6 +1,7 @@
|
|||
# List of sections
|
||||
|
||||
- [Pandas Introduction and Dataframes in Pandas](introduction.md)
|
||||
- [Viewing data in pandas](viewing-data.md)
|
||||
- [Pandas Series Vs NumPy ndarray](pandas-series-vs-numpy-ndarray.md)
|
||||
- [Pandas Descriptive Statistics](descriptive-statistics.md)
|
||||
- [Group By Functions with Pandas](groupby-functions.md)
|
||||
|
|
|
@ -0,0 +1,67 @@
|
|||
# Viewing rows of the frame
|
||||
|
||||
## `head()` method
|
||||
|
||||
The pandas library provides a convenient method called `head()` for viewing the first few rows of a DataFrame:
|
||||
- The `head()` function returns the first n rows of a DataFrame or Series.
|
||||
- By default, it displays the first 5 rows, but you can specify a different number of rows using the n parameter.
|
||||
|
||||
### Syntax
|
||||
|
||||
```python
|
||||
dataframe.head(n)
|
||||
```
|
||||
|
||||
`n` is optional. It sets the number of rows to return; the default value is `5`.
|
||||
|
||||
### Example
|
||||
|
||||
```python
|
||||
import pandas as pd
|
||||
df = pd.DataFrame({'animal': ['alligator', 'bee', 'falcon', 'lion', 'tiger', 'rabbit', 'dog', 'fox', 'monkey', 'elephant']})
|
||||
df.head(n=5)
|
||||
```
|
||||
|
||||
#### Output
|
||||
|
||||
```
|
||||
animal
|
||||
0 alligator
|
||||
1 bee
|
||||
2 falcon
|
||||
3 lion
|
||||
4 tiger
|
||||
```
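The `n` argument can be smaller than the default, and pandas also accepts a negative `n`, in which case `head()` returns every row except the last `|n|`. A short sketch with a hand-made `animal` column:

```python
import pandas as pd

df = pd.DataFrame({'animal': ['alligator', 'bee', 'falcon', 'lion', 'tiger', 'rabbit']})

# Positive n: the first n rows
first_two = df.head(2)

# Negative n: everything except the last |n| rows
all_but_last_two = df.head(-2)

print(first_two)
print(all_but_last_two)
```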
|
||||
|
||||
## `tail()` method
|
||||
|
||||
The `tail()` method in pandas displays the last five rows of a DataFrame by default. It takes a single parameter, `n`, which sets how many rows to display.
|
||||
- The `tail()` function returns the last n rows of a DataFrame or Series.
|
||||
- By default, it displays the last 5 rows, but you can specify a different number of rows using the n parameter.
|
||||
|
||||
### Syntax
|
||||
|
||||
```python
|
||||
dataframe.tail(n)
|
||||
```
|
||||
|
||||
`n` is optional. It sets the number of rows to return; the default value is `5`.
|
||||
|
||||
### Example
|
||||
|
||||
```python
|
||||
import pandas as pd
|
||||
df = pd.DataFrame({'fruits': ['mango', 'orange', 'apple', 'lemon', 'banana', 'water melon', 'papaya', 'grapes', 'cherry', 'coconut']})
|
||||
df.tail(n=5)
|
||||
```
|
||||
|
||||
#### Output
|
||||
|
||||
```
|
||||
fruits
|
||||
5 water melon
|
||||
6 papaya
|
||||
7 grapes
|
||||
8 cherry
|
||||
9 coconut
|
||||
```
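`tail()` keeps the original index labels, which is why the output above starts at 5 rather than 0. The sketch below illustrates this on a Series and shows `reset_index(drop=True)` as one possible way to renumber the result. The variable names are only illustrative.

```python
import pandas as pd

fruits = pd.Series(['mango', 'orange', 'apple', 'lemon', 'banana'], name='fruits')

# The last two rows keep their original positions as index labels
last_two = fruits.tail(2)
print(last_two.index.tolist())  # [3, 4]

# drop=True discards the old labels and numbers the rows from 0 again
renumbered = last_two.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1]
```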
|
|
@ -4,3 +4,6 @@
|
|||
- [Introducing Matplotlib](matplotlib-introduction.md)
|
||||
- [Bar Plots in Matplotlib](matplotlib-bar-plots.md)
|
||||
- [Pie Charts in Matplotlib](matplotlib-pie-charts.md)
|
||||
- [Line Charts in Matplotlib](matplotlib-line-plots.md)
|
||||
- [Introduction to Seaborn and Installation](seaborn-intro.md)
|
||||
- [Getting started with Seaborn](seaborn-basics.md)
|
||||
|
|
|
@ -0,0 +1,278 @@
|
|||
# Line Chart in Matplotlib
|
||||
|
||||
A line chart is a simple way to visualize data where we connect individual data points. It helps us to see trends and patterns over time or across categories.
|
||||
|
||||
This type of chart is particularly useful for:
|
||||
- Comparing Data: Comparing multiple datasets on the same axes.
|
||||
- Highlighting Changes: Illustrating changes and patterns in data.
|
||||
- Visualizing Trends: Showing trends over time or other continuous variables.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Line plots can be created in Python with Matplotlib's `pyplot` module. To build a line plot, first import it. It is a standard convention to import `matplotlib.pyplot` as `plt`.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
```
|
||||
|
||||
## Creating a simple Line Plot
|
||||
|
||||
First, import Matplotlib and NumPy, both of which are useful for charting.
|
||||
|
||||
You can use the `plot(x,y)` method to create a line chart.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
x = np.linspace(-1, 1, 50)
|
||||
print(x)
|
||||
y = 2*x + 1
|
||||
|
||||
plt.plot(x, y)
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following line plot:
|
||||
|
||||

|
||||
|
||||
|
||||
## Curved line
|
||||
|
||||
The `plot()` method also works for other types of line charts. The line does not need to be straight; `y` can hold any numeric values.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
x = np.linspace(-1, 1, 50)
|
||||
y = 2**x + 1
|
||||
|
||||
plt.plot(x, y)
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following Curved line plot:
|
||||
|
||||

|
||||
|
||||
|
||||
## Line with Labels
|
||||
|
||||
To know what you are looking at, you need metadata. Labels are a type of metadata: they show what the chart is about. The chart has an `x label`, a `y label` and a `title`.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
x = np.linspace(-1, 1, 50)
|
||||
y1 = 2*x + 1
|
||||
y2 = 2**x + 1
|
||||
|
||||
plt.figure()
|
||||
plt.plot(x, y1)
|
||||
|
||||
plt.xlabel("I am x")
|
||||
plt.ylabel("I am y")
|
||||
plt.title("With Labels")
|
||||
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following line with labels plot:
|
||||
|
||||

|
||||
|
||||
## Multiple lines
|
||||
|
||||
More than one line can be in the plot. To add another line, just call the `plot(x,y)` function again. In the example below, two different sets of `y` values (`y1` and `y2`) are plotted onto the chart.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
x = np.linspace(-1, 1, 50)
|
||||
y1 = 2*x + 1
|
||||
y2 = 2**x + 1
|
||||
|
||||
plt.figure(num = 3, figsize=(8, 5))
|
||||
plt.plot(x, y2)
|
||||
plt.plot(x, y1,
|
||||
color='red',
|
||||
linewidth=1.0,
|
||||
linestyle='--'
|
||||
)
|
||||
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following Multiple lines plot:
|
||||
|
||||

|
||||
|
||||
|
||||
## Dotted line
|
||||
|
||||
A line can also be drawn as individual dots, as in the image below. Instead of calling `plot(x,y)`, call the `scatter(x,y)` method. The `scatter(x,y)` method can also be used to plot randomly generated points onto the chart.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
n = 1024
|
||||
X = np.random.normal(0, 1, n)
|
||||
Y = np.random.normal(0, 1, n)
|
||||
T = np.arctan2(X, Y)
|
||||
|
||||
plt.scatter(np.arange(5), np.arange(5))
|
||||
|
||||
plt.xticks(())
|
||||
plt.yticks(())
|
||||
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following Dotted line plot:
|
||||
|
||||

|
||||
|
||||
## Line ticks
|
||||
|
||||
You can change the ticks on the plot. Set them on the `x-axis`, `y-axis` or even change their color. The line can be thicker and have an alpha value.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
x = np.linspace(-1, 1, 50)
|
||||
y = 2*x - 1
|
||||
|
||||
plt.figure(figsize=(12, 8))
|
||||
plt.plot(x, y, color='r', linewidth=10.0, alpha=0.5)
|
||||
|
||||
ax = plt.gca()
|
||||
|
||||
ax.spines['right'].set_color('none')
|
||||
ax.spines['top'].set_color('none')
|
||||
|
||||
ax.xaxis.set_ticks_position('bottom')
|
||||
ax.yaxis.set_ticks_position('left')
|
||||
|
||||
ax.spines['bottom'].set_position(('data', 0))
|
||||
ax.spines['left'].set_position(('data', 0))
|
||||
|
||||
for label in ax.get_xticklabels() + ax.get_yticklabels():
|
||||
    label.set_fontsize(12)
|
||||
    label.set_bbox(dict(facecolor='y', edgecolor='None', alpha=0.7))
|
||||
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following line ticks plot:
|
||||
|
||||

|
||||
|
||||
## Line with asymptote
|
||||
|
||||
An asymptote can be added to the plot. To do that, use `plt.annotate()`. There’s also a dotted line in the plot below. You can play around with the code to see how it works.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
x = np.linspace(-1, 1, 50)
|
||||
y1 = 2*x + 1
|
||||
y2 = 2**x + 1
|
||||
|
||||
plt.figure(figsize=(12, 8))
|
||||
plt.plot(x, y2)
|
||||
plt.plot(x, y1, color='red', linewidth=1.0, linestyle='--')
|
||||
|
||||
ax = plt.gca()
|
||||
|
||||
ax.spines['right'].set_color('none')
|
||||
ax.spines['top'].set_color('none')
|
||||
|
||||
ax.xaxis.set_ticks_position('bottom')
|
||||
ax.yaxis.set_ticks_position('left')
|
||||
|
||||
ax.spines['bottom'].set_position(('data', 0))
|
||||
ax.spines['left'].set_position(('data', 0))
|
||||
|
||||
|
||||
x0 = 1
|
||||
y0 = 2*x0 + 1
|
||||
|
||||
plt.scatter(x0, y0, s = 66, color = 'b')
|
||||
plt.plot([x0, x0], [y0, 0], 'k-.', lw= 2.5)
|
||||
|
||||
plt.annotate(r'$2x+1=%s$' %
|
||||
y0,
|
||||
xy=(x0, y0),
|
||||
xycoords='data',
|
||||
|
||||
xytext=(+30, -30),
|
||||
textcoords='offset points',
|
||||
fontsize=16,
|
||||
arrowprops=dict(arrowstyle='->',connectionstyle='arc3,rad=.2')
|
||||
)
|
||||
|
||||
plt.text(0, 3,
|
||||
r'$This\ is\ a\ good\ idea.\ \mu\ \sigma_i\ \alpha_t$',
|
||||
fontdict={'size':16,'color':'r'})
|
||||
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following Line with asymptote plot:
|
||||
|
||||

|
||||
|
||||
## Line with text scale
|
||||
|
||||
It doesn’t have to be a numeric scale. The scale can also contain text labels, as in the example below. In `plt.yticks()` we just pass a list of text values alongside the tick positions. These values are then shown against the `y axis`.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
|
||||
x = np.linspace(-1, 1, 50)
|
||||
y1 = 2*x + 1
|
||||
y2 = 2**x + 1
|
||||
|
||||
plt.figure(num = 3, figsize=(8, 5))
|
||||
plt.plot(x, y2)
|
||||
|
||||
plt.plot(x, y1,
|
||||
color='red',
|
||||
linewidth=1.0,
|
||||
linestyle='--'
|
||||
)
|
||||
|
||||
plt.xlim((-1, 2))
|
||||
plt.ylim((1, 3))
|
||||
|
||||
new_ticks = np.linspace(-1, 2, 5)
|
||||
plt.xticks(new_ticks)
|
||||
plt.yticks([-2, -1.8, -1, 1.22, 3],
|
||||
[r'$really\ bad$', r'$bad$', r'$normal$', r'$good$', r'$really\ good$'])
|
||||
|
||||
ax = plt.gca()
|
||||
ax.spines['right'].set_color('none')
|
||||
ax.spines['top'].set_color('none')
|
||||
|
||||
ax.xaxis.set_ticks_position('bottom')
|
||||
ax.yaxis.set_ticks_position('left')
|
||||
|
||||
ax.spines['bottom'].set_position(('data', 0))
|
||||
ax.spines['left'].set_position(('data', 0))
|
||||
|
||||
plt.show()
|
||||
```
|
||||
|
||||
When executed, this will show the following Line with text scale plot:
|
||||
|
||||

|
||||
|
||||
|
|
@ -0,0 +1,39 @@
|
|||
Seaborn helps you explore and understand your data. Its plotting functions operate on dataframes and arrays containing whole datasets and internally perform the necessary semantic mapping and statistical aggregation to produce informative plots. Its dataset-oriented, declarative API lets you focus on what the different elements of your plots mean, rather than on the details of how to draw them.
|
||||
|
||||
Here’s an example of what seaborn can do:
|
||||
```Python
|
||||
# Import seaborn
|
||||
import seaborn as sns
|
||||
|
||||
# Apply the default theme
|
||||
sns.set_theme()
|
||||
|
||||
# Load an example dataset
|
||||
tips = sns.load_dataset("tips")
|
||||
|
||||
# Create a visualization
|
||||
sns.relplot(
|
||||
data=tips,
|
||||
x="total_bill", y="tip", col="time",
|
||||
hue="smoker", style="smoker", size="size",
|
||||
)
|
||||
```
|
||||
Below is the output for the above code snippet:
|
||||
|
||||

|
||||
|
||||
```Python
|
||||
# Load an example dataset
|
||||
tips = sns.load_dataset("tips")
|
||||
```
|
||||
Most code in the docs will use the `load_dataset()` function to get quick access to an example dataset. There’s nothing special about these datasets: they are just pandas data frames, and we could have loaded them with `pandas.read_csv()` or built them by hand. Many users specify data using pandas data frames, but Seaborn is very flexible about the data structures that it accepts.
|
||||
|
||||
```Python
|
||||
# Create a visualization
|
||||
sns.relplot(
|
||||
data=tips,
|
||||
x="total_bill", y="tip", col="time",
|
||||
hue="smoker", style="smoker", size="size",
|
||||
)
|
||||
```
|
||||
This plot shows the relationship between five variables in the tips dataset using a single call to the seaborn function `relplot()`. Notice how only the names of the variables and their roles in the plot are provided. Unlike when using matplotlib directly, it wasn’t necessary to specify attributes of the plot elements in terms of the color values or marker codes. Behind the scenes, seaborn handled the translation from values in the dataframe to arguments that Matplotlib understands. This declarative approach lets you stay focused on the questions that you want to answer, rather than on the details of how to control matplotlib.
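Because the example datasets are ordinary pandas data frames, the same call works on a frame built by hand. The sketch below uses a small made-up table (the values are invented for illustration and are not taken from the `tips` dataset) and a reduced set of arguments:

```python
import pandas as pd
import seaborn as sns

sns.set_theme()

# Hand-made data reusing some column names from the tips dataset
bills = pd.DataFrame({
    "total_bill": [12.5, 23.1, 8.9, 31.4, 17.0, 26.8],
    "tip": [2.0, 3.5, 1.5, 5.0, 2.5, 4.0],
    "time": ["Lunch", "Dinner", "Lunch", "Dinner", "Lunch", "Dinner"],
    "smoker": ["No", "Yes", "No", "No", "Yes", "Yes"],
})

# relplot returns a FacetGrid; col="time" creates one panel per distinct value
grid = sns.relplot(data=bills, x="total_bill", y="tip", col="time", hue="smoker")
print(grid.axes.shape)  # (1, 2)
```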
|
|
@ -0,0 +1,41 @@
|
|||
Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
|
||||
|
||||
## Seaborn Installation
|
||||
Before installing Seaborn, ensure you have Python installed on your system. You can download and install Python from the [official Python website](https://www.python.org/).
|
||||
|
||||
Below are the steps to install and setup Seaborn:
|
||||
|
||||
1. Open your terminal or command prompt and run the following command to install Seaborn using `pip`:
|
||||
|
||||
```bash
|
||||
pip install seaborn
|
||||
```
|
||||
|
||||
2. The basic invocation of `pip` will install seaborn and, if necessary, its mandatory dependencies. It is possible to include optional dependencies that give access to a few advanced features:
|
||||
```bash
|
||||
pip install seaborn[stats]
|
||||
```
|
||||
|
||||
3. The library is also included as part of the Anaconda distribution, and it can be installed with `conda`:
|
||||
```bash
|
||||
conda install seaborn
|
||||
```
|
||||
|
||||
4. As the main Anaconda repository can be slow to add new releases, you may prefer using the conda-forge channel:
|
||||
```bash
|
||||
conda install seaborn -c conda-forge
|
||||
```
|
||||
|
||||
## Dependencies
|
||||
### Supported Python versions
|
||||
- Python 3.8+
|
||||
|
||||
### Mandatory Dependencies
|
||||
- [numpy](https://numpy.org/)
|
||||
- [pandas](https://pandas.pydata.org/)
|
||||
- [matplotlib](https://matplotlib.org/)
|
||||
|
||||
### Optional Dependencies
|
||||
- [statsmodels](https://www.statsmodels.org/stable/index.html) for advanced regression plots
|
||||
- [scipy](https://scipy.org/) for clustering matrices and some advanced options
|
||||
- [fastcluster](https://pypi.org/project/fastcluster/) for faster clustering of large matrices
|