mirror of https://github.com/animator/learn-python
Merge pull request #627 from HimakarC/main
Added Working with datetime using Pandas section under Pandas Module
pull/674/head
commit
038752c885
@@ -0,0 +1,158 @@
# Working with Date & Time in Pandas

When working with data, it is common to come across dates and times. Pandas is well suited to such data and provides a wide range of date and time processing options.

- **Parsing dates and times**: Pandas provides several ways to parse dates and times from strings, including the `to_datetime()` function and the `parse_dates` parameter of `read_csv()`. These can handle a variety of inputs, including ISO-formatted strings, Unix timestamps, and human-readable formats.

- **Manipulating dates and times**: Pandas provides a number of functions for manipulating dates and times, including `shift()`, `resample()`, and `to_timedelta()`. These can be used to move values along a time index, change the frequency of a time series, and build durations that are added to or subtracted from dates.

- **Visualizing dates and times**: Pandas objects offer plotting methods such as `plot()`, `hist()`, and `plot.bar()`. These can be used to create line charts, histograms, and bar charts of date and time data.

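The parsing and manipulation functions listed above can be sketched as follows. This is a minimal illustration, not a complete tour: the CSV text, the column names `date` and `value`, and the numbers in the series are made up for the example.

```python
import io

import pandas as pd

# Parse ISO-formatted strings into a DatetimeIndex.
parsed = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"])

# Parse a Unix timestamp given in seconds since 1970-01-01 UTC.
from_unix = pd.to_datetime(1704067200, unit="s")

# Parse a date column while reading CSV data (hypothetical inline data).
csv_text = "date,value\n2024-01-01,10\n2024-01-02,20\n"
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# A small daily time series with illustrative values.
series = pd.Series([10, 20, 30], index=parsed)

# shift() moves the values one step forward along the index.
shifted = series.shift(1)

# resample() changes the frequency; here values are summed per 2-day bin.
two_day_totals = series.resample("2D").sum()

# to_timedelta() builds a duration that can be added to dates.
one_week_later = parsed + pd.to_timedelta(7, unit="D")

print(from_unix)
print(df.dtypes)
print(shifted)
print(two_day_totals)
print(one_week_later)
```
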
### `Timestamp` class

`Timestamp` is pandas' equivalent of Python's `datetime.datetime`: it represents a single point in time and exposes its components as attributes. Its `timestamp()` method converts it to a Unix timestamp, a numerical representation of a datetime (seconds since 1970-01-01 00:00:00 UTC).

Example of retrieving the day, month and year from a given date:

```python
import pandas as pd

# Build a Timestamp from an ISO-formatted date string.
ts = pd.Timestamp('2024-05-05')
y = ts.year
print('Year is:', y)
m = ts.month
print('Month is:', m)
d = ts.day
print('Day is:', d)
```

Output:

```python
Year is: 2024
Month is: 5
Day is: 5
```

Example of extracting time-related fields from a given timestamp:

```python
import pandas as pd

ts = pd.Timestamp('2024-10-24 12:00:00')
print('Hour is:', ts.hour)
print('Minute is:', ts.minute)
# weekday() counts from Monday = 0 to Sunday = 6.
print('Weekday is:', ts.weekday())
print('Quarter is:', ts.quarter)
```

Output:

```python
Hour is: 12
Minute is: 0
Weekday is: 3
Quarter is: 4
```

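The Unix timestamp mentioned at the start of this section can be obtained with the `timestamp()` method. The sketch below assumes a timezone-naive `Timestamp`, which pandas interprets as UTC for this conversion; the variable names are illustrative.

```python
import pandas as pd

ts = pd.Timestamp('2024-05-05')

# timestamp() returns seconds since 1970-01-01 00:00:00 UTC as a float.
unix_seconds = ts.timestamp()
print(unix_seconds)

# Going the other way: build a Timestamp from Unix seconds.
restored = pd.Timestamp(unix_seconds, unit='s')
print(restored == ts)
```
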
### `Timestamp.now()`

Example of getting the current date and time:

```python
import pandas as pd

# now() returns the current local date and time.
ts = pd.Timestamp.now()
print('Current date and time is:', ts)
```

Output (the value depends on when the code is run):

```python
Current date and time is: 2024-05-25 11:48:25.593213
```

### `date_range` function

Example of generating five consecutive dates starting today:

```python
import pandas as pd

# The default frequency is one calendar day.
ts = pd.date_range(start=pd.Timestamp.now(), periods=5)
for i in ts:
    print(i.date())
```

Output:

```python
2024-05-25
2024-05-26
2024-05-27
2024-05-28
2024-05-29
```

Example of generating five consecutive dates ending today:

```python
import pandas as pd

# With end= instead of start=, the range counts backwards from today.
ts = pd.date_range(end=pd.Timestamp.now(), periods=5)
for i in ts:
    print(i.date())
```

Output:

```python
2024-05-21
2024-05-22
2024-05-23
2024-05-24
2024-05-25
```

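`date_range` also accepts a `freq` argument to change the step between dates. The sketch below uses a fixed start date (chosen only so the result is reproducible) and two common frequency aliases: `'B'` for business days and `'2D'` for every second day.

```python
import pandas as pd

# A fixed start date keeps the result reproducible; 2024-05-25 is a Saturday.
start = pd.Timestamp('2024-05-25')

# Five business days (Monday to Friday); the range begins on the next Monday.
business_days = pd.date_range(start=start, periods=5, freq='B')

# Three dates spaced two calendar days apart.
every_other_day = pd.date_range(start=start, periods=3, freq='2D')

for day in business_days:
    print(day.date())
for day in every_other_day:
    print(day.date())
```
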
### Built-in vs pandas date & time operations

In `pandas`, you can add a time delta to an entire column of dates in a single vectorized operation, whereas Python's built-in `datetime` requires an explicit loop or comprehension over the elements.

Example in Pandas:

```python
import pandas as pd

# 100000 timestamps one minute apart ('min' replaces the deprecated 'T' alias).
dates = pd.DataFrame(pd.date_range('2023-01-01', periods=100000, freq='min'))
# A single operation shifts every row by one day.
dates += pd.Timedelta(days=1)
print(dates)
```

Output:

```python
                        0
0     2023-01-02 00:00:00
1     2023-01-02 00:01:00
2     2023-01-02 00:02:00
3     2023-01-02 00:03:00
4     2023-01-02 00:04:00
...                   ...
99995 2023-03-12 10:35:00
99996 2023-03-12 10:36:00
99997 2023-03-12 10:37:00
99998 2023-03-12 10:38:00
99999 2023-03-12 10:39:00
```

Example using the built-in `datetime` library:

```python
from datetime import datetime, timedelta

# Build the same 100000 timestamps one element at a time.
dates = [datetime(2023, 1, 1) + timedelta(minutes=i) for i in range(100000)]
# Shifting by one day requires visiting every element again.
dates = [date + timedelta(days=1) for date in dates]
```

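As a quick sanity check, the two approaches above can be compared directly. This sketch uses a smaller size (`n = 1000`, an arbitrary choice to keep it fast) and a `Series` instead of a `DataFrame`; it only checks that the results agree and does not measure speed.

```python
from datetime import datetime, timedelta

import pandas as pd

n = 1000  # smaller than the 100000 used above, to keep the check quick

# Vectorized: one operation shifts the whole Series.
pandas_dates = pd.Series(pd.date_range('2023-01-01', periods=n, freq='min'))
pandas_shifted = pandas_dates + pd.Timedelta(days=1)

# Built-in: every element is shifted individually.
python_dates = [datetime(2023, 1, 1) + timedelta(minutes=i) for i in range(n)]
python_shifted = [date + timedelta(days=1) for date in python_dates]

# Compare element by element; Timestamp and datetime values compare equal.
same = pandas_shifted.tolist() == python_shifted
print(same)
```
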
Why use pandas functions?

- Pandas stores datetime data using NumPy's `datetime64` dtype, which takes a fixed 8 bytes per value, so dates are stored compactly and efficiently.
- Each `datetime` object in Python takes up extra memory, since it carries not only the date and time but also the overhead associated with every Python object.
- Pandas offers a wide range of convenient functions and methods for date manipulation, extraction, and conversion, such as `pd.to_datetime()`, `date_range()`, `timedelta_range()`, and more. The `datetime` library requires manual implementation for many of these operations, leading to longer and less efficient code.

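The memory difference described above can be estimated roughly as follows. This is a sketch under stated assumptions: `n = 1000` is arbitrary, and `sys.getsizeof` reports sizes that are specific to the Python interpreter and platform, so only the pandas figure is exact.

```python
import sys
from datetime import datetime, timedelta

import pandas as pd

n = 1000
pandas_dates = pd.Series(pd.date_range('2023-01-01', periods=n, freq='min'))
python_dates = [datetime(2023, 1, 1) + timedelta(minutes=i) for i in range(n)]

# datetime64 values are stored as 64-bit integers: 8 bytes each.
pandas_bytes = pandas_dates.to_numpy().nbytes

# Each datetime is a full Python object, and the list holds a pointer per item.
python_bytes = sys.getsizeof(python_dates) + sum(
    sys.getsizeof(d) for d in python_dates
)

print('pandas values:', pandas_bytes, 'bytes')
print('list of datetime objects:', python_bytes, 'bytes')
```
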
@@ -5,5 +5,6 @@
- [Pandas Descriptive Statistics](Descriptive_Statistics.md)
- [Group By Functions with Pandas](GroupBy_Functions_Pandas.md)
- [Excel using Pandas DataFrame](excel_with_pandas.md)
- [Working with Date & Time in Pandas](datetime.md)
- [Importing and Exporting Data in Pandas](import-export.md)
- [Handling Missing Values in Pandas](handling-missing-values.md)