Introduction to Series
Overview
Pandas objects can be regarded as upgraded forms of NumPy structured arrays, with rows and columns identified by labels rather than plain integer indices. Pandas provides many helpful features and methods on top of its fundamental data structures, but practically everything that follows requires an understanding of those structures. So, before we go any deeper, let's go through the most basic Pandas data structure: the Series.
Scope of the Article
In this tutorial,
- We will dig into the details of the Pandas Series object.
- We will begin with a quick overview of the Series object, including its syntax and parameters.
- Using worked examples, we will look at several ways to build, read, index, and select data from Series objects.
- We will also look at binary and conversion operations that can be performed on Series objects.
- Finally, we will explore the attributes and methods of the Series object, with examples.
What is a Series?
So, what exactly is a Series? Like a list in Python, a Series represents one-dimensional data. In addition, a Series object carries some extra information, such as an index and a name. It can store many data types, such as integers, strings, floats, and other Python objects. Using the Series() constructor, we can easily turn a list, tuple, or dictionary into a Series. The row labels of a Series are referred to as its index. A Pandas Series can only have one column.
Understanding the Series is critical not just because it is one of the fundamental data structures but also because it is the foundation of the DataFrame, a two-dimensional labeled data structure with rows and columns, similar to a spreadsheet. Let's look at an example of data to better grasp this:
Assume we wish to represent this data in Python:
Code:
Output:
The index is shown in the leftmost column, and the values appear next to their labels, which makes the data easy to read.
Now that you understand what we mean by a Series, let us move on to the syntax of the Pandas Series() constructor, which is used to create Series objects.
Syntax & Parameter
The syntax for the Series() constructor, which is used to create Pandas Series objects, is as follows:
Syntax:
Parameter List
It requires the following set of parameters:
Sr. No. | Parameter Name | Parameter Description |
---|---|---|
1 | data | The data to be stored in the Series. |
2 | index | The row labels. Index values must be hashable, and the index must have the same length as data. If no index is specified, it defaults to 0, 1, 2, ..., n-1. |
3 | dtype | Specifies the data type of the values stored in the Series. |
4 | name | Assigns a name to the Series. |
5 | copy | Accepts a Boolean value, False by default. Determines whether the input data is copied. |
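To make the parameter list concrete, here is a minimal sketch that passes all five parameters by keyword. The values, labels, and the name "temperature" are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample values; each keyword is one of the parameters listed above.
temperatures = pd.Series(
    data=[21.5, 23.0, 19.8],        # values to store
    index=["mon", "tue", "wed"],    # row labels, same length as data
    dtype="float64",                # data type of the stored values
    name="temperature",             # name of the Series
    copy=False,                     # do not force a copy of the input
)
print(temperatures)
```

In everyday use, most of these keywords are left out and only data (and sometimes index) is supplied.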
Creating a Pandas Series
It's time to put our theory into practice now that we've covered the Series() constructor. Let's start with something simple: an empty Pandas Series object.
Create an Empty Pandas Series Object
An empty Series is the most basic Series that can be created: it holds no values. To understand it better, look at the code below.
Code:
Output:
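A minimal sketch of creating an empty Series follows. Passing an explicit dtype is an assumption made here to keep the behavior the same across pandas versions; the variable name is illustrative.

```python
import pandas as pd

# An empty Series: no data is passed, only an explicit dtype.
empty_series = pd.Series(dtype="float64")

print(empty_series)
print(len(empty_series))  # number of elements
```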
Let us now change gears a little.
Create a Pandas Series Object from NumPy ndarray Object
If the data is a NumPy ndarray object, the index must have the same length as the data. If no index is specified, the index will have the values 0, 1, 2, ..., n-1, where n is the array's length. Consider the following example:
Code:
Output:
In the code example above, we imported the NumPy library, constructed a data array, and passed it to the Series() constructor to produce a Pandas Series object.
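The steps described above can be sketched as follows. The array contents and the custom labels are hypothetical; the sketch shows both the default index and an explicit index of matching length.

```python
import numpy as np
import pandas as pd

# Illustrative data array.
data = np.array(["a", "b", "c", "d"])

# No index given: labels default to 0, 1, 2, ..., n-1.
default_index = pd.Series(data)

# Explicit index: must have the same length as the data.
custom_index = pd.Series(data, index=[100, 101, 102, 103])

print(default_index)
print(custom_index)
```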
Create a Pandas Series Object from a Dictionary
The Pandas Series() constructor can accept a dictionary as data. If no index is supplied, the dictionary keys are used as the index, in the order in which they appear in the dictionary. If an index is supplied, the values corresponding to the labels in the index are extracted; a label with no matching key gets the value NaN. Consider the following code example to better understand it.
Code:
Output:
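Here is a minimal sketch of both cases, using a hypothetical dictionary of prices. The label "durian" is deliberately absent from the dictionary to show what happens to an unmatched label.

```python
import pandas as pd

# Hypothetical dictionary used as input data.
prices = {"apple": 1.2, "banana": 0.5, "cherry": 3.0}

# No index: the keys become the index, in dictionary order.
from_keys = pd.Series(prices)

# Explicit index: values are looked up by label; "durian" is missing, so it becomes NaN.
selected = pd.Series(prices, index=["cherry", "apple", "durian"])

print(from_keys)
print(selected)
```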
Create a Pandas Series Object from a Scalar
When the data is a scalar value, the index parameter should be supplied. The value is repeated to match the length of the index. Consider the following example:
Code:
Output:
As you can see in the code example above, the value 17 was repeated to construct a Pandas Series object whose length equals the length of the index argument we provided.
These are just a few ways of creating Pandas Series objects. Now that we've seen how to build Series objects, let's examine how to access their data.
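A minimal sketch of the scalar case, reusing the value 17 mentioned above. The index labels are hypothetical.

```python
import pandas as pd

# The scalar 17 is repeated once for every label in the index.
repeated = pd.Series(17, index=["a", "b", "c", "d"])

print(repeated)
```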
Accessing Data from a Pandas Series by Position and Label
There are primarily two methods for retrieving an element from the pandas series object: using position or label. Let us go through them one by one in further detail. But first, let's examine how we may retrieve an element based on its position.
Using Position to Access an Element from the Pandas Series
The position number is used to retrieve an element of a Pandas Series object. We may use the index operator **"[]"** to retrieve an element, and the position must be an integer. To retrieve several entries at once, we use the slice operation, which returns a new Series containing the selected elements. Consider the following illustration:
Code:
Output:
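The following is a minimal sketch of positional access, assuming a Series with the default integer index so that label 0 and position 0 coincide. The values are hypothetical. Recent pandas releases steer positional lookups toward .iloc when the labels are not integers, so the last line uses .iloc for comparison.

```python
import pandas as pd

# Default index 0..4, so s[0] refers to the first element.
s = pd.Series([10, 20, 30, 40, 50])

first = s[0]             # single element
first_three = s[:3]      # integer slice: first three elements
last_two = s.iloc[-2:]   # explicit positional indexer

print(first)
print(first_three)
print(last_two)
```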
Using Label to Access an Element
If the Series has predefined index labels, we may use them to access its elements. You may picture the Series as a fixed-size dictionary in which you can retrieve and modify values by index label. Consider the following example to better understand it:
Code:
Output:
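A minimal sketch of label-based access follows, using hypothetical subject names as labels. It reads a value, modifies a value, and selects several labels at once.

```python
import pandas as pd

# Hypothetical labeled data.
s = pd.Series([90, 85, 70], index=["math", "physics", "art"])

math_score = s["math"]        # read by label, like a dictionary lookup
s["art"] = 75                 # modify by label
subset = s[["math", "art"]]   # a list of labels returns a new Series

print(math_score)
print(subset)
```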
Indexing and Selecting Data in a Pandas Series
In Pandas, indexing simply means picking certain data from a Series object. For example, indexing might involve picking all of the data or only a portion of it. Subset selection is another name for indexing. Let's review the various methods for selecting specific data from a Pandas Series object one by one.
Using Indexing Operator
The indexing operator refers to the square brackets after a Series object. The ".loc" and ".iloc" indexers (discussed later in this text) also use the indexing operator. Consider the following example:
Code:
Output:
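As an illustration, the sketch below uses the square brackets in three ways: with a single label, with a label slice, and with a Boolean condition. The data is hypothetical, and the Boolean mask is an extra usage included here as an assumption about what a reader may find useful.

```python
import pandas as pd

s = pd.Series([5, 12, 7, 20], index=["a", "b", "c", "d"])

single = s["b"]            # one label
label_slice = s["a":"c"]   # label slices include the end label
large = s[s > 6]           # Boolean mask keeps the rows where the condition holds

print(single)
print(label_slice)
print(large)
```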
Using the .loc[ ] Indexer
The ".loc[]" indexer makes obtaining data values from a Pandas Series object simple. It retrieves values according to the index labels passed inside the brackets; in other words, it retrieves data by using the explicit index. It can also be used to select subsets of the data. For example:
Code:
Output:
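A minimal sketch of label-based selection with .loc, using hypothetical labels. Note that a label slice with .loc includes both end points.

```python
import pandas as pd

s = pd.Series([100, 200, 300, 400], index=["w", "x", "y", "z"])

one = s.loc["x"]             # a single label
span = s.loc["x":"z"]        # label slice, end label included
picked = s.loc[["w", "z"]]   # a list of labels

print(one)
print(span)
print(picked)
```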
Using the .iloc[ ] Indexer
The "Series.iloc" indexer offers purely integer-location based indexing for selection by position. It lets us retrieve data based on where it sits in the Series, so we need to know the positions of the data we want. The "Series.iloc" indexer is similar to "Series.loc", except that it only accepts integer positions. Consider the following code example:
Code:
Output:
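The same hypothetical Series can be used to sketch positional selection with .iloc. Unlike .loc, an integer slice here excludes the end position, as in ordinary Python slicing.

```python
import pandas as pd

s = pd.Series([100, 200, 300, 400], index=["w", "x", "y", "z"])

first = s.iloc[0]          # first position
middle = s.iloc[1:3]       # positions 1 and 2 (end excluded)
picked = s.iloc[[0, -1]]   # first and last positions

print(first)
print(middle)
print(picked)
```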
Binary Operation on Pandas Series Object
A binary operation is a rule that combines two elements to produce another element. We can execute binary operations on Pandas Series, such as addition, subtraction, and so on. To do so, we can use methods such as Series.add(), Series.sub(), and so on.
In this code example, we will use the "Series.add()" function to add two pandas Series objects.
Code:
Output:
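A minimal sketch of adding two Series with Series.add(), using hypothetical values. Elements are matched by index label, and the + operator gives the same result.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["x", "y", "z"])

total = a.add(b)   # element-wise addition, aligned on the index
same = a + b       # operator form of the same operation

print(total)
```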
Let's explore how we may subtract two series objects using the "Series.sub()" function.
Code:
Output:
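Here is a minimal sketch of Series.sub() with hypothetical values. The second Series is deliberately shorter to show index alignment: a label present in only one Series yields NaN unless fill_value is supplied.

```python
import pandas as pd

a = pd.Series([10, 20, 30], index=["x", "y", "z"])
b = pd.Series([1, 2], index=["x", "y"])

diff = a.sub(b)                  # "z" has no partner in b, so the result is NaN
filled = a.sub(b, fill_value=0)  # treat the missing entry as 0 instead

print(diff)
print(filled)
```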
These are just a few examples of binary operations that we may perform on Series objects; you can always visit the official pandas documentation to learn more about the other binary functions available.
Conversion Operation on Pandas Series Object
As the name implies, in this part, we will look at different conversion operations such as altering the datatype of a series, turning a series into a list, and so on. To execute conversion operations, we commonly use functions such as "Series.astype()", "Series.tolist()", and so on.
Code:
Output:
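A minimal sketch of the two conversion methods named above, using a hypothetical integer Series. Series.astype() returns a new Series with the requested type, and Series.tolist() returns a plain Python list.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

as_float = s.astype("float64")  # change the data type to float
as_str = s.astype(str)          # convert every value to a string
as_list = s.tolist()            # convert the Series to a Python list

print(as_float)
print(as_str)
print(as_list)
```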
You may have noticed that we successfully modified the data type of our pandas series object and even converted a series object to a list. So let's look at the various pandas series attributes now.
Pandas Series Attributes
Python objects contain Attributes and Methods. Attributes allow us to find facts without modifying or deleting anything. A Pandas Series is one example of a Python object. Let's take a peek at some of its most regularly employed attributes. Let's start by making a Pandas Series to experiment with.
Code:
Output:
Retrieving Index Array and Data Array of a Pandas Series Object
The values attribute is used to get the data array: it returns an array containing all of the Series' values.
Code:
Output:
To obtain the index array, we may use the index attribute. For a Series with the default index, the index attribute returns a RangeIndex object.
Code:
Output:
The RangeIndex object has three arguments: start, stop, and step. We can see that it begins at 0 and stops at 8, where the stop value itself is excluded. The final part is the step, which tells us that the index increments by one in our case.
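The two attributes can be sketched as follows. The eight string values are hypothetical, chosen only so that the default index runs from 0 up to (but not including) 8, as in the description above.

```python
import pandas as pd

# Eight hypothetical values with the default index.
s = pd.Series(["a", "b", "c", "d", "e", "f", "g", "h"])

print(s.values)  # the data array
print(s.index)   # RangeIndex(start=0, stop=8, step=1)
```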
Retrieving Types (dtype) and Size of Type (itemsize)
Using the dtype attribute, we can determine the data type of our Pandas Series object. For example:
Code:
Output:
In the preceding code example, we obtained the output dtype('O'), where 'O' is an abbreviation for object.
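A minimal sketch of reading the dtype attribute follows. The names are hypothetical, and dtype=object is passed explicitly as an assumption so that the string Series reports the object type regardless of the pandas version.

```python
import pandas as pd

# Strings stored with the generic object dtype.
names = pd.Series(["Ada", "Grace", "Linus"], dtype=object)

# Integers get an integer dtype.
numbers = pd.Series([1, 2, 3])

print(names.dtype)
print(numbers.dtype)
```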
Let's look at how to find the number of bytes needed to hold each element of the underlying data. The itemsize attribute was used to obtain the size, in bytes, of a single element of a Pandas Series. However, Series.itemsize was deprecated and has been removed in recent versions of pandas (1.0 and later). For the most common dtypes, such as int64 and float64, the value is 8; other dtypes differ, for example 4 for int32.
If you are on an older version of pandas that still provides it, you can use the syntax below to see how it works:
Code:
Output:
As you can see in the above example, the itemsize attribute returns 8, indicating that the underlying data for the supplied Pandas Series object is 8 bytes in size.
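In pandas versions where Series.itemsize is no longer available, one way to get the same number is through the dtype of the Series. The following is a minimal sketch under that assumption, with hypothetical values.

```python
import pandas as pd

s = pd.Series([1, 2, 3], dtype="int64")
small = s.astype("int32")

# Size in bytes of a single element, read from the dtype.
print(s.dtype.itemsize)      # 8 for int64
print(small.dtype.itemsize)  # 4 for int32
```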
Retrieving Shape
We will use the shape attribute to get the shape of our Pandas Series object. The shape attribute returns a tuple describing the dimensions of the data; because a Series is one-dimensional, the tuple has a single entry, the number of rows.
Code:
Output:
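A minimal sketch of the shape attribute, assuming a hypothetical Series of eight elements.

```python
import pandas as pd

s = pd.Series(range(8))

# A one-element tuple: the number of rows.
print(s.shape)  # (8,)
```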
Retrieving Dimension, Size, and Number of bytes
We will use the ndim, nbytes, and size attributes to retrieve the number of dimensions, the number of bytes, and the size, respectively. The nbytes attribute returns the total number of bytes occupied by the underlying data of the Series. The size attribute gives the number of elements in the Series. The ndim attribute stands for the number of dimensions; a Pandas Series object is always one-dimensional. Let's look at a code sample to better understand them.
Code:
Output:
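The three attributes can be sketched together as follows, assuming a hypothetical Series of four int64 values, so that the total is 4 elements times 8 bytes each.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], dtype="int64")

print(s.ndim)    # number of dimensions: 1
print(s.size)    # number of elements: 4
print(s.nbytes)  # total bytes: 4 * 8 = 32
```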
Checking Emptiness and Presence of NaNs
The empty attribute returns a single True/False value indicating whether the Series has no elements at all. To check for missing values, use the isna() method, which returns a same-sized Boolean object indicating whether each value is missing, or in other words, whether each value is NaN; the hasnans attribute tells us whether the Series contains any NaN. What exactly do we mean by NaN here? Pandas uses NaN when a Series contains numeric values but an entry has no number to represent it. NaN stands for Not a Number and is skipped by default in aggregations such as sum() and mean(). For instance,
Code:
Output:
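Here is a minimal sketch that separates the two checks: emptiness and the presence of NaN values. The data is hypothetical, and isna() and hasnans are used for the missing-value checks.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])
nothing = pd.Series(dtype="float64")

print(s.empty)        # False: the Series has elements
print(nothing.empty)  # True: no elements at all

mask = s.isna()       # same-sized Boolean Series, True where the value is NaN
print(mask)
print(s.hasnans)      # True: at least one NaN is present

total = s.sum()       # NaN is skipped by default
print(total)
```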
Pandas Series Methods
Methods are actions/ operations performed on the pandas series object. For example, they may modify a pandas series, subtract value from it, or perform some computation using the series object's values. Let's take a look at some of the Pandas series' methods.
Displaying Rows from a Pandas Series Object
The head() and tail() methods retrieve the first and last n rows, respectively. If we don't specify a number for n, it defaults to 5. They are handy for quickly inspecting data, such as when we have a large amount of it.
Code:
Output:
And
Code:
Output:
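A minimal sketch of both methods on a hypothetical Series of ten integers, once with the default n and once with an explicit n.

```python
import pandas as pd

s = pd.Series(range(10))

top = s.head()       # first 5 rows (default n)
bottom = s.tail(3)   # last 3 rows

print(top)
print(bottom)
```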
Count of Pandas Series Object's Values
The unique() and nunique() functions return the distinct values and the count of distinct values. These methods are useful when we want to check the many groups into which our data may have been divided. As an example,
Code:
Output:
We have the value_counts() function in addition to the unique() and nunique() functions. The value_counts() function outputs the count of times each unique value appears in a Series. It is beneficial to understand the pandas series object's value distribution. Consider the following example:
Code:
Output:
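The three methods can be sketched together on a hypothetical Series of color names.

```python
import pandas as pd

s = pd.Series(["red", "blue", "red", "green", "red", "blue"])

distinct = s.unique()        # distinct values, in order of first appearance
n_distinct = s.nunique()     # how many distinct values there are
counts = s.value_counts()    # how often each distinct value occurs

print(distinct)
print(n_distinct)
print(counts)
```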
Performing Statistical Calculation on Pandas Series Object’s Values
On a Series object's values, we may execute statistical operations such as mean(), sum(), product(), max(), min(), median(), and so on. For example:
Code:
Output:
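A minimal sketch of the statistical methods listed above, using four hypothetical values.

```python
import pandas as pd

s = pd.Series([2, 4, 6, 8])

print(s.mean())     # 5.0
print(s.sum())      # 20
print(s.product())  # 384
print(s.max())      # 8
print(s.min())      # 2
print(s.median())   # 5.0
```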
If we need several statistical operations at once, we may pass them to the agg() function as a list.
Code:
Output:
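A minimal sketch of agg() with a list of operation names, on the same kind of hypothetical data. The result is a new Series indexed by the names of the operations.

```python
import pandas as pd

s = pd.Series([2, 4, 6, 8])

# Several statistics computed in one call.
summary = s.agg(["mean", "sum", "max", "min"])

print(summary)
```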
When it comes to Pandas Series methods, this is only the tip of the iceberg; you can visit the official pandas documentation to learn about all of the methods available on a Series object.
Kudos! You now have a strong grip on numerous Pandas Series concepts and can experiment with Series objects with ease.
Conclusion
This article taught us:
- Pandas Series are more adaptable than NumPy arrays. A Pandas Series has an explicitly defined index, whereas a NumPy array has an implicitly defined integer index.
- Pandas Series is a fundamental data structure in Pandas and the foundation of a DataFrame.
- We can perform CRUD operations on Pandas Series Objects.
- The Pandas Series provides us with diverse attributes and methods to work with.
- We can simply execute statistical calculations on the values of Pandas Series objects to gain a powerful understanding of our data.