Date Functionality and its Importance in Pandas
Overview
Pandas was originally developed to assist with financial modeling, so it includes a wide range of tools for working with dates and times. Python itself provides several representations of dates, times, deltas, and timespans, but we need something more convenient when a dataset contains a large number of timestamps. In this article, we will talk about date functionality in Pandas and why it is important.
Scope
In this article, we will cover Date Functionality in Pandas.
- First, we look at the classes in Python's datetime module and why they matter.
- Next, we go through the date and time features that Pandas supports.
- Lastly, we learn how to create ranges of dates and change their frequency.
Introduction
Apart from text and numbers, dates are also a very important type of data in datasets. The Pandas library has a dedicated set of tools that help us perform all the necessary tasks on date-time data. In financial data analysis, date functionality plays a very important role.
The datetime Module in Python
In Python, dates and times are not built-in data types. Instead, the datetime module from the standard library is imported to work with both. It contains a number of classes for dealing with dates, times, and time intervals. There are six main classes in the datetime module:
- date: Its attributes are year, month, and day. It assumes the current Gregorian calendar extended indefinitely in both directions.
- time: Its attributes are hour, minute, second, microsecond, and tzinfo.
- datetime: A combination of date and time with the attributes year, month, day, hour, minute, second, microsecond, and tzinfo.
- timedelta: A duration expressing the difference between two date or datetime instances, accurate to the microsecond.
- tzinfo: An abstract base class for time zone information objects.
- timezone: A class that implements the tzinfo abstract base class as a fixed offset from UTC.
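The classes listed above can be tried out with a short sketch like the following. The specific dates and the UTC+05:30 offset are arbitrary values chosen only for illustration.

```python
from datetime import date, time, datetime, timedelta, timezone

# date: a calendar date only
d = date(2023, 1, 15)

# time: a time of day, independent of any particular date
t = time(9, 30, 0)

# datetime: combine both into a single timestamp
dt = datetime.combine(d, t)

# timedelta: a duration, here added to a datetime
later = dt + timedelta(days=1, hours=2)

# timezone: a fixed-offset tzinfo implementation (UTC+05:30 as an example)
ist = timezone(timedelta(hours=5, minutes=30))
aware = dt.replace(tzinfo=ist)

print(dt)                 # 2023-01-15 09:30:00
print(later)              # 2023-01-16 11:30:00
print(later - dt)         # 1 day, 2:00:00
print(aware.utcoffset())  # 5:30:00
```

Subtracting two datetime objects gives back a timedelta, which is why the third line prints a duration rather than a date.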
Importance of Date Functionality
Dates and times carry a lot of information. They have a big impact on machine-learning models in domains such as finance and healthcare. Beyond these domains too, much of the data around you stores valuable information in the form of dates and times.
The Date Functionality
While working on time-series data, we will come across date data, and that is exactly where date functionality comes into play. Two major use cases of date functionality are generating sequences of dates and converting a date series to a different frequency. Pandas supports multiple features for manipulating time-series data, such as:
- Parsing time series information from various sources and formats
- Generating sequences of fixed-frequency dates and time spans
- Manipulating and converting date times with timezone information
- Resampling or converting a time series to a particular frequency
- Performing date and time arithmetic with absolute or relative time increments
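As a rough illustration of the features listed above, here is a minimal sketch that touches each of them once. The dates, the time zone names, and the variable names are illustrative choices, not taken from the original examples.

```python
import pandas as pd

# Parsing: convert date strings into a DatetimeIndex
idx = pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"])

# Time zones: attach a zone, then convert to another one
idx_utc = idx.tz_localize("UTC")
idx_kolkata = idx_utc.tz_convert("Asia/Kolkata")

# Arithmetic with an absolute increment: shift every timestamp by 7 days
shifted = idx + pd.Timedelta(days=7)

# Arithmetic with a relative increment: one calendar month later
# (31 January has no counterpart in February, so the result is clipped)
next_month = pd.Timestamp("2023-01-31") + pd.DateOffset(months=1)

# Generating and resampling: 48 hourly values summed into daily totals
hourly = pd.Series(1, index=pd.date_range("2023-01-01", periods=48, freq="h"))
daily = hourly.resample("D").sum()

print(idx_kolkata[0])
print(shifted[0])
print(next_month)
print(daily)
```

The difference between the two kinds of arithmetic is that pd.Timedelta always adds a fixed amount of time, while pd.DateOffset follows the calendar, so "one month" can mean 28, 29, 30, or 31 days.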
Code Example 1:
Output:
Code Example 2:
Output:
Code Example 3:
Output:
Code Example 4:
Output:
Code Example 5:
Output:
Creating a Range of Dates
The date_range() function creates a fixed-frequency sequence of dates from the given parameter values, such as start, end, periods, and freq.
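A minimal sketch of date_range() follows, with arbitrary example dates. It builds the same week twice: once from a start date and a count, and once from a start date and an end date.

```python
import pandas as pd

# Seven consecutive days starting on 1 January 2023
days = pd.date_range(start="2023-01-01", periods=7, freq="D")

# The same range given by start and end; daily frequency is the default
first_week = pd.date_range(start="2023-01-01", end="2023-01-07")

print(days)
print(list(days) == list(first_week))  # True
```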
Code Example 6:
Output:
Changing the Date Frequency
Changing the frequency converts a range of dates from one frequency to another and produces output accordingly. For example, when we create a range of dates on a daily basis, the frequency is set to days ('D'). To get the output range in terms of months, we change the frequency to month end ('M', spelled 'ME' from pandas 2.2 onward) and get the desired output. We look into a similar example in the code given below.
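One way to sketch this is to build the same span twice with different frequencies. To avoid depending on the 'M' or 'ME' spelling, this example passes a pd.offsets.MonthEnd() object as the frequency, which is an assumption about how to keep the code portable across pandas versions rather than the form used in the original example.

```python
import pandas as pd

start, end = "2023-01-01", "2023-04-30"

# Daily frequency: one entry per calendar day
daily = pd.date_range(start=start, end=end, freq="D")

# Month-end frequency: one entry per month, on its last day
month_ends = pd.date_range(start=start, end=end, freq=pd.offsets.MonthEnd())

print(len(daily))   # 120
print(month_ends)
```

The four month-end dates in this span are 31 January, 28 February, 31 March, and 30 April 2023.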
Code Example 7:
Output:
bdate_range() Function
The function bdate_range() stands for business date range. It behaves very similarly to date_range(), but its default frequency is business days, so unlike date_range() it excludes Saturdays and Sundays.
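The difference can be seen by running both functions over the same span. The dates below are arbitrary; 1 January 2023 is a Sunday, so the span covers three weekend days.

```python
import pandas as pd

start, end = "2023-01-01", "2023-01-10"

# All calendar days in the span
all_days = pd.date_range(start=start, end=end)

# Business days only: Saturdays and Sundays are skipped
business_days = pd.bdate_range(start=start, end=end)

print(len(all_days))       # 10
print(len(business_days))  # 7
print(business_days)
```

Note that bdate_range() only skips weekends by default; public holidays are still included unless a custom business-day frequency is supplied.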
Code Example 8:
Output:
Conclusion
In this article, we looked into the various date functionalities in Pandas and why they are important when working with data. Let's take a quick recap of all that we have studied.
- We studied the importance of the datetime module and its six main classes: date, time, datetime, timedelta, tzinfo, and timezone.
- We then studied the functionalities supported by Pandas, such as resampling dates on the basis of frequency, converting from one timezone to another, and generating sequences of dates.
- In the end, we learned how to create a range of dates, how to work with different frequencies, and how exactly bdate_range() differs from the date_range() function.
Dealing with time series data is always fun. All you have to do is tweak a few parameters and observe the change in output. The better you understand these functions, the easier your work becomes, because date-time data shows up in almost every dataset. So play around and understand your dataset well. Till then, keep experimenting, keep learning.