Pandas DateTime
Pandas is a robust, open-source Python library for manipulating and analysing data. It provides data structures and functions that allow for efficient data processing.
Time series data comes up often when working with data, and pandas offers a wide range of options for processing dates and times. Many data science tasks involve time series, time zones, and date arithmetic, and pandas simplifies all of these.
Features of Pandas DateTime:
- Parsing dates and times: Pandas parses dates and times from strings with to_datetime() and with the parse_dates argument of read_csv(). These handle a variety of date and time formats, including Unix timestamps and human-readable formats.
- Manipulating dates and times: Pandas provides a number of functions for manipulating dates and times, including shift(), resample(), and to_timedelta(). These can be used to add or subtract time periods, change the frequency of a time series, and work with the difference between two dates or times.
- Visualizing dates and times: Pandas provides plotting methods such as plot(), hist(), and plot.bar(), which rely on Matplotlib. These can be used to create line charts, histograms, and bar charts of date and time data.
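The parsing and manipulation functions listed above can be sketched as follows. This is a minimal illustration, not a complete tour: the `sales` series and its values are made up for the example, and plotting is left out so that the snippet runs without Matplotlib.

```python
import pandas as pd

# Parse strings in a known format into datetime values.
raw = pd.Series(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
dates = pd.to_datetime(raw, format="%Y-%m-%d")

# Parse a Unix timestamp given in seconds.
epoch = pd.to_datetime(1700000000, unit="s")

# A small daily series indexed by the parsed dates (hypothetical values).
sales = pd.Series([10, 20, 30, 40], index=pd.DatetimeIndex(dates))

# shift(): move the values one row down while the index stays in place.
previous_day = sales.shift(1)

# resample(): change the frequency, here summing into 2-day bins.
two_day_totals = sales.resample("2D").sum()

# to_timedelta(): build a duration and add it to every date.
one_week_later = dates + pd.to_timedelta(7, unit="D")

print(epoch)
print(previous_day)
print(two_day_totals)
print(one_week_later)
```

The same parsing can happen at load time by passing the parse_dates argument to read_csv(), which is omitted here because it needs a file on disk.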
Installation of libraries
pip install pandas
- Note: There is no need to install a separate library for date and time operations; the pandas module itself has built-in functions for them.
Example for retrieving day, month and year from given date:
import pandas as pd
ts = pd.Timestamp('2024-05-05')
y = ts.year
print('Year is: ', y)
m = ts.month
print('Month is: ', m)
d = ts.day
print('Day is: ', d)
Output:
Year is: 2024
Month is: 5
Day is: 5
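- Note: pd.Timestamp is the pandas counterpart of Python's datetime object and represents a single point in time. A Unix timestamp, by contrast, is a numerical representation of a datetime (seconds since 1970-01-01 UTC); the timestamp() method of a Timestamp returns it.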
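For reference, here is a small sketch of how a pd.Timestamp relates to a Unix timestamp. It assumes a timezone-aware value in UTC so that the conversion is unambiguous; the variable names are illustrative only.

```python
import pandas as pd

# A timezone-aware Timestamp, so the conversion below is unambiguous.
ts = pd.Timestamp("2024-05-05", tz="UTC")

# timestamp() returns seconds since 1970-01-01 00:00:00 UTC as a float.
unix_seconds = ts.timestamp()

# to_datetime() with unit="s" converts the number back into a Timestamp.
round_trip = pd.to_datetime(unix_seconds, unit="s", utc=True)

print(unix_seconds)
print(round_trip == ts)
```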
Example for extracting time related data from given date:
import pandas as pd
ts = pd.Timestamp('2024-10-24 12:00:00')
print('Hour is: ', ts.hour)
print('Minute is: ', ts.minute)
print('Weekday is: ', ts.weekday())
print('Quarter is: ', ts.quarter)
Output:
Hour is: 12
Minute is: 0
Weekday is: 3
Quarter is: 4
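The same attributes are available for a whole column through the .dt accessor of a Series. The sketch below uses two sample timestamps chosen for illustration; weekday counts from Monday = 0 to Sunday = 6.

```python
import pandas as pd

# Two sample timestamps held in a Series (illustrative values).
stamps = pd.Series(pd.to_datetime(["2024-10-24 12:00:00", "2024-01-01 08:30:00"]))

# Extract several components for every row at once.
parts = pd.DataFrame({
    "hour": stamps.dt.hour,
    "minute": stamps.dt.minute,
    "weekday": stamps.dt.weekday,      # Monday = 0 ... Sunday = 6
    "quarter": stamps.dt.quarter,
    "day_name": stamps.dt.day_name(),  # weekday as an English name
})
print(parts)
```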
Example for getting current date and time:
import pandas as pd
ts = pd.Timestamp.now()
print('Current date and time is: ', ts)
Output:
Current date and time is: 2024-05-25 11:48:25.593213
Example for generating dates for the next five days:
import pandas as pd
ts = pd.date_range(start = pd.Timestamp.now(), periods = 5)
for i in ts:
    print(i.date())
Output:
2024-05-25
2024-05-26
2024-05-27
2024-05-28
2024-05-29
Example for generating dates for the previous five days:
import pandas as pd
ts = pd.date_range(end = pd.Timestamp.now(), periods = 5)
for i in ts:
    print(i.date())
Output:
2024-05-21
2024-05-22
2024-05-23
2024-05-24
2024-05-25
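Because pd.Timestamp.now() includes the current time of day, every generated value carries that time as well, which is why the examples call .date() before printing. As a hedged sketch, the normalize and freq parameters of date_range() give more control; a fixed start date is assumed here so that the result is reproducible.

```python
import pandas as pd

# A fixed starting point with a time of day (illustrative value).
start = pd.Timestamp("2024-05-25 11:48:25")

# normalize=True resets the start to midnight before generating the range.
daily = pd.date_range(start=start, periods=5, normalize=True)

# freq="B" generates business days only (Monday to Friday).
business = pd.date_range(start="2024-05-25", periods=5, freq="B")

print(daily)
print(business)
```

Since 2024-05-25 falls on a Saturday, the business-day range starts on the following Monday.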
Pandas DateTime is more efficient than Python's built-in datetime library in several respects:
- In pandas, you can add a time delta to an entire column of dates in a single operation, whereas Python's datetime requires a loop.
Example using Pandas DateTime:
import pandas as pd
# 'min' is the minute frequency; the older alias 'T' is deprecated in recent pandas versions
dates = pd.DataFrame(pd.date_range('2023-01-01', periods=100000, freq='min'))
dates += pd.Timedelta(days=1)
print(dates)
Output:
0
0 2023-01-02 00:00:00
1 2023-01-02 00:01:00
2 2023-01-02 00:02:00
3 2023-01-02 00:03:00
4 2023-01-02 00:04:00
... ...
99995 2023-03-12 10:35:00
99996 2023-03-12 10:36:00
99997 2023-03-12 10:37:00
99998 2023-03-12 10:38:00
99999 2023-03-12 10:39:00
Example using Built-In datetime library:
from datetime import datetime, timedelta
dates = [datetime(2023, 1, 1) + timedelta(minutes=i) for i in range(100000)]
dates = [date + timedelta(days=1) for date in dates]
Output:
This snippet prints nothing. The resulting list holds 100,000 datetime objects, which is too large to display here and would take noticeably longer to print.
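To compare the two approaches directly, the sketch below times both and checks that they produce the same dates. The timings depend on the machine and the pandas version, so they are printed rather than assumed; the variable names are illustrative.

```python
import time
from datetime import datetime, timedelta

import pandas as pd

n = 100_000

# Vectorized: one operation on the whole DatetimeIndex.
index = pd.date_range("2023-01-01", periods=n, freq="min")
t0 = time.perf_counter()
shifted_index = index + pd.Timedelta(days=1)
pandas_seconds = time.perf_counter() - t0

# Pure Python: an explicit loop over datetime objects.
py_dates = [datetime(2023, 1, 1) + timedelta(minutes=i) for i in range(n)]
t0 = time.perf_counter()
shifted_list = [d + timedelta(days=1) for d in py_dates]
python_seconds = time.perf_counter() - t0

# Both approaches should yield the same sequence of dates.
same = list(shifted_index.to_pydatetime()) == shifted_list
print(f"pandas: {pandas_seconds:.4f}s, datetime loop: {python_seconds:.4f}s, equal: {same}")
```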
- Pandas uses NumPy's datetime64 dtype, which stores each value in a fixed 8 bytes, so datetime data is held compactly and efficiently.
- Each datetime object in Python takes up extra memory, since it holds not only the date and time but also the metadata and overhead associated with Python objects.
- Pandas offers a wide range of convenient functions and methods for date manipulation, extraction, and conversion, such as pd.to_datetime(), date_range(), timedelta_range(), and more.
- The datetime library requires manual implementation for many of these operations, leading to longer and less efficient code.
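The storage and convenience points above can be checked with a short sketch. The size reported for a Python datetime object is specific to the interpreter, so it is printed rather than stated; the variable names are illustrative.

```python
import sys
from datetime import datetime

import pandas as pd

# pd.to_datetime(): convert a list of strings into a datetime64-backed index.
index = pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"])

# Each stored value is a fixed-width 64-bit integer.
bytes_per_value = index.values.dtype.itemsize

# A single Python datetime object also carries object overhead
# (the exact size depends on the interpreter).
bytes_per_object = sys.getsizeof(datetime(2023, 1, 1))

# timedelta_range(): evenly spaced durations, here 0, 1 and 2 days.
offsets = pd.timedelta_range(start="0 days", periods=3, freq="D")

# Adding the two indexes pairs them up element by element.
shifted = index + offsets

print(bytes_per_value, bytes_per_object)
print(shifted)
```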