Merge branch 'animator:main' into main
|
@ -24,3 +24,4 @@
|
|||
- [K-nearest neighbor (KNN)](knn.md)
|
||||
- [Naive Bayes](naive-bayes.md)
|
||||
- [Neural network regression](neural-network-regression.md)
|
||||
- [PyTorch Fundamentals](pytorch-fundamentals.md)
|
||||
|
|
|
@ -0,0 +1,469 @@
|
|||
# PyTorch Fundamentals
|
||||
|
||||
|
||||
```python
|
||||
# Import pytorch in our codespace
|
||||
import torch
|
||||
print(torch.__version__)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
2.3.0+cu121
|
||||
```
|
||||
|
||||
|
||||
Here `2.3.0` is the PyTorch version and `cu121` indicates that this build targets CUDA 12.1.
|
||||
|
||||
You have already seen how to create a tensor in PyTorch. This notebook starts with a quick revision of tensor creation and then shows the operations that can be applied to a tensor.
|
||||
|
||||
### 1. Creating tensors
|
||||
|
||||
Scalar tensor (a zero-dimensional tensor)
|
||||
|
||||
```python
|
||||
scalar = torch.tensor(7)
|
||||
print(scalar)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor(7)
|
||||
```
|
||||
|
||||
Check the dimension of the above tensor
|
||||
|
||||
```python
|
||||
print(scalar.ndim)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
0
|
||||
```
|
||||
|
||||
To retrieve the number from the tensor we use `item()`
|
||||
|
||||
```python
|
||||
print(scalar.item())
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
7
|
||||
```
|
||||
|
||||
Vector (a one-dimensional tensor that can contain many numbers)
|
||||
|
||||
```python
|
||||
vector = torch.tensor([1,2])
|
||||
print(vector)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([1, 2])
|
||||
```
|
||||
|
||||
Check the dimensions
|
||||
|
||||
```python
|
||||
print(vector.ndim)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
1
|
||||
```
|
||||
|
||||
Check the shape of the vector
|
||||
|
||||
```python
|
||||
print(vector.shape)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
torch.Size([2])
|
||||
```
|
||||
|
||||
|
||||
The above returns `torch.Size([2])`, which means our vector has a shape of `[2]`. This is because of the two elements we placed inside the square brackets (`[1, 2]`).
|
||||
|
||||
Note:
|
||||
I'll let you in on a trick.
|
||||
|
||||
You can tell the number of dimensions a PyTorch tensor has by counting the opening square brackets (`[`) at the start of its printed form. You only need to count one side.
|
||||
|
||||
|
||||
```python
|
||||
# Let's create a matrix
|
||||
MATRIX = torch.tensor([[1,2],
|
||||
[4,5]])
|
||||
print(MATRIX)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[1, 2],
|
||||
[4, 5]])
|
||||
```
|
||||
|
||||
There are two opening brackets, so it must have 2 dimensions. Let's check.
|
||||
|
||||
|
||||
```python
|
||||
print(MATRIX.ndim)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
2
|
||||
```
|
||||
|
||||
|
||||
```python
|
||||
# Shape
|
||||
print(MATRIX.shape)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
torch.Size([2, 2])
|
||||
```
|
||||
|
||||
It means MATRIX has 2 rows and 2 columns.
|
||||
|
||||
Let's create a TENSOR
|
||||
|
||||
```python
|
||||
TENSOR = torch.tensor([[[1,2,3],
|
||||
[4,5,6],
|
||||
[7,8,9]]])
|
||||
print(TENSOR)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[[1, 2, 3],
|
||||
[4, 5, 6],
|
||||
[7, 8, 9]]])
|
||||
```
|
||||
|
||||
Let's check the dimensions
|
||||
```python
|
||||
print(TENSOR.ndim)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
3
|
||||
```
|
||||
|
||||
shape
|
||||
```python
|
||||
print(TENSOR.shape)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
torch.Size([1, 3, 3])
|
||||
```
|
||||
|
||||
The dimensions go outer to inner.
|
||||
|
||||
That means the outermost dimension holds 1 block, and that block is a 3 by 3 matrix.
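To see what "outer to inner" means in practice, here is a small sketch that indexes into the same `TENSOR` one bracket level at a time. The tensor is redefined so the snippet runs on its own; the names `block` and `row` are only illustrative.

```python
import torch

TENSOR = torch.tensor([[[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]])

# Index 0 on the outermost dimension selects the single 3x3 block
block = TENSOR[0]
print(block.shape)              # torch.Size([3, 3])

# Index 1 on the next dimension selects the second row of that block
row = TENSOR[0][1]
print(row)                      # tensor([4, 5, 6])

# Index 2 on the innermost dimension selects a single element
print(TENSOR[0][1][2].item())   # 6
```

Each extra index removes one dimension, starting from the outermost one.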
|
||||
|
||||
##### Let's summarise
|
||||
|
||||
* scalar -> a single number with 0 dimensions.
* vector -> many numbers arranged along 1 dimension.
* matrix -> an array of numbers with 2 dimensions.
* tensor -> an array of numbers with n dimensions.
|
||||
|
||||
### Random Tensors
|
||||
|
||||
We can create them using `torch.rand()` and passing in the `size` parameter.
|
||||
|
||||
Creating a random tensor of size (3,4)
|
||||
```python
|
||||
rand_tensor = torch.rand(size = (3,4))
|
||||
print(rand_tensor)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[0.7462, 0.4950, 0.7851, 0.8277],
|
||||
[0.6112, 0.5159, 0.1728, 0.6847],
|
||||
[0.4472, 0.1612, 0.6481, 0.3236]])
|
||||
```
|
||||
|
||||
Check the dimensions
|
||||
|
||||
```python
|
||||
print(rand_tensor.ndim)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
2
|
||||
```
|
||||
|
||||
Shape
|
||||
```python
|
||||
print(rand_tensor.shape)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
torch.Size([3, 4])
|
||||
```
|
||||
|
||||
Datatype
|
||||
```python
|
||||
print(rand_tensor.dtype)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
torch.float32
|
||||
```
|
||||
|
||||
### Zeros and ones
|
||||
|
||||
Here we will create tensors of a given shape filled entirely with zeros or entirely with ones.
|
||||
|
||||
|
||||
```python
|
||||
# Create a tensor of all zeros
|
||||
zeros = torch.zeros(size = (3,4))
|
||||
print(zeros)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[0., 0., 0., 0.],
|
||||
[0., 0., 0., 0.],
|
||||
[0., 0., 0., 0.]])
|
||||
```
|
||||
|
||||
Create a tensor of ones
|
||||
```python
|
||||
ones = torch.ones(size = (3,4))
|
||||
print(ones)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[1., 1., 1., 1.],
|
||||
[1., 1., 1., 1.],
|
||||
[1., 1., 1., 1.]])
|
||||
```
|
||||
|
||||
### Create a tensor with a range of numbers
|
||||
|
||||
You can use `torch.arange(start, end, step)` to do so.
|
||||
|
||||
Where:
|
||||
|
||||
* start = start of range (e.g. 0)
|
||||
* end = end of range, not included in the result (e.g. 10)
|
||||
* step = the difference between consecutive values (e.g. 1)
|
||||
|
||||
> Note: In Python, you can use `range()` to create a range. In PyTorch, however, `torch.range()` is deprecated and may show an error in the future, so you should use `torch.arange()` instead.
|
||||
|
||||
|
||||
```python
|
||||
zero_to_ten = torch.arange(start = 0,
|
||||
end = 10,
|
||||
step = 1)
|
||||
print(zero_to_ten)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
|
||||
```
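As a further illustration of the `start`, `end` and `step` parameters, the sketch below uses a step other than 1 and a fractional step. The variable names `evens` and `halves` are made up for this example.

```python
import torch

# end is exclusive, so 10 itself is not part of the result
evens = torch.arange(start=0, end=10, step=2)
print(evens)          # tensor([0, 2, 4, 6, 8])

# A floating-point step produces a floating-point tensor
halves = torch.arange(0, 2, 0.5)
print(halves)         # tensor([0.0000, 0.5000, 1.0000, 1.5000])
print(halves.dtype)   # torch.float32
```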
|
||||
|
||||
# 2. Manipulating tensors (tensor operations)
|
||||
|
||||
The basic operations are:
|
||||
|
||||
* Addition
|
||||
* Subtraction
|
||||
* Multiplication (element-wise)
|
||||
* Division
|
||||
* Matrix multiplication
|
||||
|
||||
### 1. Addition
|
||||
|
||||
|
||||
```python
|
||||
tensor = torch.tensor([1,2,3])
|
||||
print(tensor+10)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([11, 12, 13])
|
||||
```
|
||||
|
||||
We have added 10 to each element of the tensor.
|
||||
|
||||
|
||||
```python
|
||||
tensor1 = torch.tensor([4,5,6])
|
||||
print(tensor+tensor1)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([5, 7, 9])
|
||||
```
|
||||
|
||||
We have added two tensors. Remember that addition takes place element-wise.
|
||||
|
||||
### 2. Subtraction
|
||||
|
||||
|
||||
```python
|
||||
print(tensor-8)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([-7, -6, -5])
|
||||
```
|
||||
|
||||
We've subtracted 8 from the above tensor.
|
||||
|
||||
|
||||
```python
|
||||
print(tensor-tensor1)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([-3, -3, -3])
|
||||
```
|
||||
|
||||
### 3. Multiplication
|
||||
|
||||
|
||||
```python
|
||||
# Multiply the tensor with 10 (element wise)
|
||||
print(tensor*10)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([10, 20, 30])
|
||||
```
|
||||
|
||||
Each element of tensor gets multiplied by 10.
|
||||
|
||||
Note:
|
||||
|
||||
PyTorch also has a bunch of built-in functions like `torch.mul()` (short for multiplication) and `torch.add()` to perform basic operations.
|
||||
|
||||
|
||||
```python
|
||||
# let's see them
|
||||
print(torch.add(tensor,10))
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([11, 12, 13])
|
||||
```
|
||||
|
||||
|
||||
```python
|
||||
print(torch.mul(tensor,10))
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([10, 20, 30])
|
||||
```
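Division appears in the list of operations but is not demonstrated, so here is a minimal sketch of it, together with the function forms `torch.div()` and `torch.sub()` and an element-wise product of two tensors. It reuses the same `tensor` values as the earlier examples.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Element-wise division; the result is floating point even for integer input
print(tensor / 2)             # tensor([0.5000, 1.0000, 1.5000])
print(torch.div(tensor, 2))   # same result as tensor / 2

# Function form of subtraction
print(torch.sub(tensor, 8))   # tensor([-7, -6, -5])

# Element-wise multiplication of two tensors with the same shape
print(tensor * tensor)        # tensor([1, 4, 9])
```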
|
||||
|
||||
### Matrix multiplication (is all you need)
|
||||
One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.
|
||||
|
||||
PyTorch implements matrix multiplication in the `torch.matmul()` function.
|
||||
|
||||
The main two rules for matrix multiplication to remember are:
|
||||
|
||||
The inner dimensions must match:
|
||||
* (3, 2) @ (3, 2) won't work
|
||||
* (2, 3) @ (3, 2) will work
|
||||
* (3, 2) @ (2, 3) will work
|
||||
The resulting matrix has the shape of the outer dimensions:
|
||||
* (2, 3) @ (3, 2) -> (2, 2)
|
||||
* (3, 2) @ (2, 3) -> (3, 3)
|
||||
|
||||
|
||||
Note: `@` in Python is the operator for matrix multiplication.
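The two rules above can be checked directly on random tensors. This is a small sketch with the illustrative names `a` and `b`; only the shapes matter here, not the values.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match (3 and 3); result takes the outer dimensions
print(torch.matmul(a, b).shape)   # torch.Size([2, 2])
print(torch.matmul(b, a).shape)   # torch.Size([3, 3])

# (3, 2) @ (3, 2): inner dimensions are 2 and 3, so this fails
try:
    torch.matmul(b, b)
except RuntimeError as err:
    print("Shape mismatch:", err)
```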
|
||||
|
||||
|
||||
```python
|
||||
# let's perform the matrix multiplication
|
||||
tensor1 = torch.tensor([[[1,2,3],
|
||||
[4,5,6],
|
||||
[7,8,9]]])
|
||||
tensor2 = torch.tensor([[[1,1,1],
|
||||
[2,2,2],
|
||||
[3,3,3]]])
|
||||
|
||||
print(tensor1, tensor2, sep="\n")
|
||||
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[[1, 2, 3],
|
||||
[4, 5, 6],
|
||||
[7, 8, 9]]])
|
||||
tensor([[[1, 1, 1],
|
||||
[2, 2, 2],
|
||||
[3, 3, 3]]])
|
||||
```
|
||||
|
||||
Let's check the shape
|
||||
```python
|
||||
print(tensor1.shape, tensor2.shape, sep="\n")
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
torch.Size([1, 3, 3])
|
||||
torch.Size([1, 3, 3])
|
||||
```
|
||||
|
||||
Matrix multiplication
|
||||
```python
|
||||
print(torch.matmul(tensor1, tensor2))
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[[14, 14, 14],
|
||||
[32, 32, 32],
|
||||
[50, 50, 50]]])
|
||||
```
|
||||
|
||||
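You can also use the `@` operator for matrix multiplication. It gives the same result as `torch.matmul()`, although the explicit function call is often preferred for readability.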
|
||||
```python
|
||||
print(tensor1 @ tensor2)
|
||||
```
|
||||
|
||||
#### Output
|
||||
```
|
||||
tensor([[[14, 14, 14],
|
||||
[32, 32, 32],
|
||||
[50, 50, 50]]])
|
||||
```
|
||||
|
||||
Note:
|
||||
|
||||
If the shapes do not line up, you can transpose one of the tensors (for example with `tensor.T`) so that the inner dimensions match, and then perform the matrix multiplication.
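A minimal sketch of the transpose trick, using two hypothetical tensors `tensor_a` and `tensor_b` that both have shape (3, 2):

```python
import torch

tensor_a = torch.tensor([[1., 2.],
                         [3., 4.],
                         [5., 6.]])     # shape (3, 2)
tensor_b = torch.tensor([[7., 10.],
                         [8., 11.],
                         [9., 12.]])    # shape (3, 2)

# tensor_a @ tensor_b would fail because the inner dimensions are 2 and 3.
# Transposing tensor_b gives shape (2, 3), so the inner dimensions match.
result = torch.matmul(tensor_a, tensor_b.T)
print(tensor_b.T.shape)   # torch.Size([2, 3])
print(result.shape)       # torch.Size([3, 3])
print(result)
```

The result has shape (3, 3), the outer dimensions of (3, 2) @ (2, 3).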
|
|
@ -7,4 +7,6 @@
|
|||
- [Line Charts in Matplotlib](matplotlib-line-plots.md)
|
||||
- [Scatter Plots in Matplotlib](matplotlib-scatter-plot.md)
|
||||
- [Introduction to Seaborn and Installation](seaborn-intro.md)
|
||||
- [Seaborn Plotting Functions](seaborn-plotting.md)
|
||||
- [Getting started with Seaborn](seaborn-basics.md)
|
||||
- [Bar Plots in Plotly](plotly-bar-plots.md)
|
||||
|
|
|
@ -0,0 +1,348 @@
|
|||
# Bar Plots in Plotly
|
||||
|
||||
A bar plot (or bar chart) is a type of data visualisation that represents data as rectangular bars whose lengths or heights are proportional to the values they represent. Bar plots can be drawn either vertically or horizontally.
|
||||
|
||||
It is one of the most widely used types of data visualisation because it is easy to interpret and visually appealing.
|
||||
|
||||
Plotly is a very powerful library for creating modern visualizations and it provides a very easy and intuitive method to create highly customized bar plots.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before creating bar plots in Plotly you must ensure that you have Python, Plotly and Pandas installed on your system.
|
||||
|
||||
## Introduction
|
||||
|
||||
There are various ways to create bar plots in `plotly`. One of the most prominent and easiest is `plotly.express`. Plotly Express is the easy-to-use, high-level interface to Plotly, which operates on a variety of types of data and produces easy-to-style figures. Alternatively, you can use `plotly.graph_objects`, the lower-level interface, to create plots.
|
||||
|
||||
Here, we'll be using `plotly.express` to create the bar plots. Also we'll be converting our datasets into pandas DataFrames which makes it extremely convenient to create plots.
|
||||
|
||||
Also, note that when you run the code from a plain Python file, the plot opens in your **browser** rather than in a pop-up window as in matplotlib. If you do not want that, it is **recommended to create the plots in a notebook (like Jupyter)**. For this, install the additional library `nbformat`. The output then appears in the notebook itself and can also be exported to formats such as PNG or JPG.
|
||||
|
||||
## Creating a simple bar plot using `plotly.express.bar`
|
||||
|
||||
With `plotly.express.bar`, each row of the DataFrame is represented as a rectangular mark.
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold')
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
Here, we first create the dataset and convert it into a pandas DataFrame using a dictionary whose keys become the DataFrame columns. Next, we plot the bar chart using `px.bar`. The `x` and `y` parameters each take the name of a column in the DataFrame.
|
||||
|
||||
**Note:** The code above produces an **interactive plot**. If you want a static image, you can download it from the plot itself.
|
||||
|
||||
## Customizing Bar Plots
|
||||
|
||||
### Adding title to the graph
|
||||
|
||||
Let us create an imaginary graph of the number of cars sold in various years. Simply pass the title of your graph as the `title` parameter of `px.bar`.
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold',
|
||||
title='Number of cars sold in various years')
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
### Adding bar colors and legends
|
||||
|
||||
To give different bars different colors, pass the `color` parameter either the name of the x-axis column or the name of a custom column that groups the bars.
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold',
|
||||
title='Number of cars sold in various years',
|
||||
color='Years')
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
Now, let us consider our previous example of number of cars sold in various years and suppose that we want to add different colors to the bars from different centuries and respective legends for better interpretation.
|
||||
|
||||
The easiest way to achieve this is to add a new column to the dataframe and then pass it to the `color` parameter.
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
# Creating the relevant colors dataset
|
||||
colors = ['1900s','1900s','2000s','2000s','2000s']
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold, "Century":colors}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold',
|
||||
title='Number of cars sold in various years',
|
||||
color='Century')
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
### Adding labels to bars
|
||||
|
||||
We may want to add labels to bars representing their absolute (or truncated) values for instant and accurate reading. This can be achieved by setting `text_auto` parameter to `True`. If you want custom text then you can pass a column name to the `text` parameter.
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
colors = ['1900s','1900s','2000s','2000s','2000s']
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold, "Century":colors}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold',
|
||||
title='Number of cars sold in various years',
|
||||
color='Century',
|
||||
text_auto=True)
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
colors = ['1900s','1900s','2000s','2000s','2000s']
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold, "Century":colors}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold',
|
||||
title='Number of cars sold in various years',
|
||||
color='Century',
|
||||
text='Century')
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
You can also change the features of text (or any other element of your plot) using `fig.update_traces`.
|
||||
|
||||
Here, we are changing the position of text to position it outside the bars.
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
colors = ['1900s','1900s','2000s','2000s','2000s']
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold, "Century":colors}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold',
|
||||
title='Number of cars sold in various years',
|
||||
color='Century',
|
||||
text_auto=True)
|
||||
|
||||
# Updating bar text properties
|
||||
fig.update_traces(textposition="outside", cliponaxis=False)
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
### Rounded Bars
|
||||
|
||||
You can create rounded bars by passing a radius value to `barcornerradius` in `fig.update_layout`.
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
colors = ['1900s','1900s','2000s','2000s','2000s']
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold, "Century":colors}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Years', y='Number of Cars sold',
|
||||
title='Number of cars sold in various years',
|
||||
color='Century',
|
||||
text_auto=True)
|
||||
|
||||
# Updating bar text properties
|
||||
fig.update_traces(textposition="outside", cliponaxis=False)
|
||||
|
||||
# Updating figure layout
|
||||
fig.update_layout({
|
||||
'plot_bgcolor': 'rgba(255, 255, 255, 1)',
|
||||
'paper_bgcolor': 'rgba(255, 255, 255, 1)',
|
||||
'barcornerradius': 15
|
||||
})
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||
|
||||

|
||||
|
||||
## Horizontal Bar Plot
|
||||
|
||||
To create a horizontal bar plot, you just have to interchange your `x` and `y` DataFrame columns. Plotly takes care of the rest!
|
||||
|
||||
```Python
|
||||
import plotly.express as px
|
||||
import pandas as pd
|
||||
|
||||
# Creating dataset
|
||||
years = ['1998', '1999', '2000', '2001', '2002']
|
||||
num_of_cars_sold = [200, 300, 500, 700, 1000]
|
||||
colors = ['1900s','1900s','2000s','2000s','2000s']
|
||||
|
||||
# Converting dataset to pandas DataFrame
|
||||
dataset = {"Years":years, "Number of Cars sold":num_of_cars_sold, "Century":colors}
|
||||
df = pd.DataFrame(dataset)
|
||||
|
||||
# Creating bar plot
|
||||
fig = px.bar(df, x='Number of Cars sold', y='Years',
|
||||
title='Number of cars sold in various years',
|
||||
color='Century',
|
||||
text_auto=True)
|
||||
|
||||
# Updating bar text properties
|
||||
fig.update_traces(textposition="outside", cliponaxis=False)
|
||||
|
||||
# Updating figure layout
|
||||
fig.update_layout({
|
||||
'barcornerradius': 30
|
||||
})
|
||||
|
||||
# Showing plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
## Plotting Long Format and Wide Format Data
|
||||
|
||||
Long-form data has one row per observation, and one column per variable. This is suitable for storing and displaying multivariate data i.e. with dimension greater than 2.
|
||||
|
||||
```Python
|
||||
# Plotting long format data
|
||||
|
||||
import plotly.express as px
|
||||
|
||||
# Long format dataset
|
||||
long_df = px.data.medals_long()
|
||||
|
||||
# Creating Bar Plot
|
||||
fig = px.bar(long_df, x="nation", y="count", color="medal", title="Long-Form Input")
|
||||
|
||||
# Showing Plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
```Python
|
||||
print(long_df)
|
||||
|
||||
# Output:
#         nation   medal  count
# 0  South Korea    gold     24
# 1        China    gold     10
# 2       Canada    gold      9
# 3  South Korea  silver     13
# 4        China  silver     15
# 5       Canada  silver     12
# 6  South Korea  bronze     11
# 7        China  bronze      8
# 8       Canada  bronze     12
|
||||
```
|
||||
|
||||
Wide-form data has one row per value of the first variable, and one column per value of the second variable. This is suitable for storing and displaying 2-dimensional data.
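The two layouts hold the same information, and pandas can convert between them. The sketch below types the medal counts from this guide in by hand (instead of calling `px.data`) and uses `DataFrame.melt` and `DataFrame.pivot`; the name `back_to_wide` is made up for the example.

```python
import pandas as pd

# Medal counts from this guide, entered by hand in wide form
wide_df = pd.DataFrame({
    "nation": ["South Korea", "China", "Canada"],
    "gold": [24, 10, 9],
    "silver": [13, 15, 12],
    "bronze": [11, 8, 12],
})

# Wide -> long: one row per (nation, medal) pair
long_df = wide_df.melt(id_vars="nation", var_name="medal", value_name="count")
print(long_df)

# Long -> wide again: one row per nation, one column per medal
back_to_wide = long_df.pivot(index="nation", columns="medal", values="count")
print(back_to_wide)
```

Either form can be passed to `px.bar`; which one is more convenient usually depends on how the data arrives.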
|
||||
|
||||
```Python
|
||||
# Plotting wide format data
|
||||
import plotly.express as px
|
||||
|
||||
# Wide format dataset
|
||||
wide_df = px.data.medals_wide()
|
||||
|
||||
# Creating Bar Plot
|
||||
fig = px.bar(wide_df, x="nation", y=["gold", "silver", "bronze"], title="Wide-Form Input")
|
||||
|
||||
# Showing Plot
|
||||
fig.show()
|
||||
```
|
||||

|
||||
|
||||
```Python
|
||||
print(wide_df)
|
||||
|
||||
# Output:
#         nation  gold  silver  bronze
# 0  South Korea    24      13      11
# 1        China    10      15       8
# 2       Canada     9      12      12
```
|
|
@ -0,0 +1,259 @@
|
|||
# Seaborn
|
||||
|
||||
Seaborn is a powerful and easy-to-use data visualization library in Python built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. This guide covers the main plotting functions provided by Seaborn, with examples to illustrate their usage.
|
||||
Seaborn simplifies the process of creating complex visualizations with a few lines of code, and it integrates closely with pandas data structures, making it an excellent choice for data analysis and exploration.
|
||||
|
||||
## Setting up Seaborn
|
||||
|
||||
Make sure the seaborn library is installed on your system. If it is not, install it with the command
|
||||
`pip install seaborn`
|
||||
|
||||
After installing you are all set to experiment with plotting functions.
|
||||
|
||||
```python
|
||||
#import necessary libraries
|
||||
|
||||
import seaborn as sns
|
||||
import matplotlib.pyplot as plt
|
||||
import pandas as pd
|
||||
```
|
||||
|
||||
Seaborn includes several built-in datasets that you can use for practice. You can list all available datasets with the command below.
|
||||
```python
|
||||
sns.get_dataset_names()
|
||||
```
|
||||
|
||||
Here we are using the `tips` dataset.
|
||||
|
||||
```python
|
||||
# loading an example dataset
|
||||
tips=sns.load_dataset('tips')
|
||||
```
|
||||
|
||||
Before delving into plotting, make yourself comfortable with the dataset. To do that, use the pandas library to understand what information the dataset contains and preprocess the data. If you get stuck, feel free to refer to the pandas documentation.
|
||||
|
||||
## Relational Plots
|
||||
|
||||
Relational plots are used to visualize the relationship between two or more variables.
|
||||
|
||||
### Scatter Plot
|
||||
A scatter plot displays data points based on two numerical variables. Seaborn's `scatterplot` function allows you to create scatter plots with ease.
|
||||
|
||||
```python
|
||||
# scatter plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.scatterplot(data=tips,x='total_bill',y='tip',hue='day',style='time')
|
||||
plt.title('Scatter Plot of Total Bill vs Tip')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### Line Plot
|
||||
A line plot connects data points with line segments, which makes it useful for time series data. Seaborn's `lineplot` function sorts the data by the x variable and, when an x value occurs more than once, plots the mean with a confidence band by default.
|
||||
|
||||
```python
|
||||
# lineplot using seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.lineplot(data=tips,x='size',y='total_bill',hue='day')
|
||||
plt.title('Line Plot of Total Bill by Size and Day')
|
||||
plt.show()
|
||||
```
|
||||
|
||||

|
||||
|
||||
## Distribution Plots
|
||||
|
||||
Distribution plots visualize the distribution of a single numerical variable.
|
||||
|
||||
### HistPlot
|
||||
A histplot displays the distribution of a numerical variable by dividing the data into bins.
|
||||
|
||||
```python
|
||||
# Histplot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.histplot(data=tips, x='total_bill', kde=True)
|
||||
plt.title('Histplot of Total Bill')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### KDE Plot
|
||||
A Kernel Density Estimate (KDE) plot represents the distribution of a variable as a smooth curve.
|
||||
|
||||
```python
|
||||
# KDE Plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.kdeplot(data=tips, x='total_bill', hue='sex', fill=True)
|
||||
plt.title('KDE Plot of Total Bill by Sex')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### ECDF Plot
|
||||
An Empirical Cumulative Distribution Function (ECDF) plot shows, for each value, the proportion of data points at or below it.
|
||||
|
||||
```python
|
||||
# ECDF Plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.ecdfplot(data=tips, x='total_bill', hue='sex')
|
||||
plt.title('ECDF Plot of Total Bill by Sex')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### Rug Plot
|
||||
A rug plot in Seaborn is a simple way to show the distribution of a variable by drawing small vertical lines (or "rugs") at each data point along the x-axis.
|
||||
|
||||
```python
|
||||
# Rug Plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(3,3))
|
||||
sns.rugplot(x='total_bill', data=tips)
|
||||
plt.title('Rug Plot of Total Bill Amounts')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
## Categorical Plots
|
||||
Categorical plots are used to visualize data where one or more variables are categorical.
|
||||
|
||||
### Bar Plot
|
||||
|
||||
A bar plot shows the relationship between a categorical variable and a numerical variable.
|
||||
```python
|
||||
# Bar Plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.barplot(data=tips,x='day',y='total_bill',hue='sex')
|
||||
plt.title('Bar Plot of Total Bill by Day and Sex')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### Point Plot
|
||||
A point plot in Seaborn shows an estimate of a numerical variable (the mean by default) for each level of a categorical variable. Each estimate is drawn as a point with an error bar, and points belonging to the same `hue` group are joined by lines.
|
||||
|
||||
```python
|
||||
# Point Plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.pointplot(x='day',y='total_bill',hue='sex',data=tips)
|
||||
plt.title('Average Total Bill by Day and Sex')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### Box Plot
|
||||
A box plot displays the distribution of a numerical variable across different categories.
|
||||
|
||||
```python
|
||||
# Box Plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.boxplot(data=tips, x='day', y='total_bill', hue='sex')
|
||||
plt.title('Box Plot of Total Bill by Day and Sex')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### Violin Plot
|
||||
A violin plot combines aspects of a box plot and a KDE plot to show the distribution of data.
|
||||
|
||||
```python
|
||||
# Violin Plot using Seaborn
|
||||
|
||||
plt.figure(figsize=(5,5))
|
||||
sns.violinplot(data=tips,x='day',y='total_bill',hue='sex',split=True)
|
||||
plt.title('Violin Plot of Total Bill by Day and Sex')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
## Matrix Plots
|
||||
Matrix plots are useful for visualizing data in a matrix format.
|
||||
|
||||
### Heatmap
|
||||
A heatmap displays data in a matrix where values are represented by color.
|
||||
|
||||
```python
|
||||
# Heatmap using Seaborn
|
||||
|
||||
plt.figure(figsize=(10,8))
|
||||
flights = sns.load_dataset('flights')
|
||||
flights_pivot = flights.pivot(index='month', columns='year', values='passengers')
|
||||
sns.heatmap(flights_pivot, annot=True, fmt='d', cmap='YlGnBu')
|
||||
plt.title('Heatmap of Flight Passengers')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
## Pair Plot
|
||||
A pair plot shows the pairwise relationships between multiple variables in a dataset.
|
||||
|
||||
```python
|
||||
#Pairplot using Seaborn
|
||||
|
||||
g = sns.pairplot(tips, hue='sex')
# pairplot creates its own figure, so the title is set on that figure
g.figure.suptitle('Pair Plot of Tips Dataset', y=1.02)
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
## FacetGrid
|
||||
FacetGrid allows you to create a grid of plots based on the values of one or more categorical variables.
|
||||
|
||||
```python
|
||||
#Facetgrid using Seaborn
|
||||
|
||||
# FacetGrid creates its own figure, so no plt.figure() call is needed
g = sns.FacetGrid(tips, col='sex', row='time', margin_titles=True)
|
||||
g.map(sns.scatterplot, 'total_bill', 'tip')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
## Customizing Seaborn Plots
|
||||
Seaborn plots can be customized to improve their appearance and convey more information.
|
||||
|
||||
### Changing the Aesthetic Style
|
||||
Seaborn comes with several built-in themes.
|
||||
|
||||
```python
|
||||
sns.set_style('whitegrid')
|
||||
sns.scatterplot(data=tips, x='total_bill', y='tip')
|
||||
plt.title('Scatter Plot with Whitegrid Style')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### Customizing Colors
|
||||
You can use color palettes to customize the colors in your plots.
|
||||
|
||||
```python
|
||||
sns.set_palette('pastel')
|
||||
sns.barplot(data=tips, x='day', y='total_bill', hue='sex')
|
||||
plt.title('Bar Plot with Pastel Palette')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
### Adding Titles and Labels
|
||||
Titles and labels can be added to make plots more informative.
|
||||
|
||||
```python
|
||||
plot = sns.scatterplot(data=tips, x='total_bill', y='tip')
|
||||
plot.set_title('Scatter Plot of Total Bill vs Tip')
|
||||
plot.set_xlabel('Total Bill ($)')
|
||||
plot.set_ylabel('Tip ($)')
|
||||
plt.show()
|
||||
```
|
||||

|
||||
|
||||
Seaborn is a versatile library that simplifies the creation of complex visualizations. By using Seaborn's plotting functions, you can create a wide range of statistical graphics with minimal effort. Whether you're working with relational data, categorical data, or distributions, Seaborn provides the tools you need to visualize your data effectively.
|