What is Groupby.count in Pandas?
The pandas groupby() function splits a DataFrame into groups of rows that share the same key values, so that aggregate operations such as size or count can be run on each group.
The groupby() and count() functions of pandas can be used together to group a DataFrame by one or more columns and compute a count for each group combination. In simple words, the pandas groupby count function counts the values in each group while ignoring NaN (missing) values in the DataFrame; size(), by contrast, counts every row in the group.
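To make the NaN behaviour concrete, here is a minimal sketch. The team and score columns and their values are made up for illustration and are not part of this article's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each team has one missing score.
df = pd.DataFrame({
    "team": ["A", "A", "B", "B", "B"],
    "score": [10.0, np.nan, 7.0, 8.0, np.nan],
})

# count() skips NaN, so it reports non-missing values per group.
counts = df.groupby("team")["score"].count()

# size() counts every row in the group, NaN included.
sizes = df.groupby("team").size()

print(counts)
print(sizes)
```

Here count() reports 1 for team A and 2 for team B because the missing scores are skipped, while size() reports 2 and 3 because it counts every row.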
Syntax
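The syntax for the groupby() function is: DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False)
The syntax for the pandas groupby count function, i.e. using the groupby and count functions together, is: DataFrame.groupby(by).count()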
Parameters
There are required and optional parameters for the groupby function. Let's take a quick look at each parameter.
Required Parameter:
- by: mapping, function, str, or iterable. It specifies how to group the DataFrame. (Its default in the signature is None, so it can be omitted only when level is given instead.)
Optional Parameters:
- axis: int, default 0. It specifies the axis along which the DataFrame is split: axis=0 splits along the rows and axis=1 splits along the columns.
- level: int, level name, or sequence of these. If the axis is a MultiIndex (hierarchical), group by one or more specific levels.
- as_index: bool, default True. Return an object with the group labels as the index of the aggregated output. Applicable only to DataFrame input. as_index=False gives "SQL-style" grouped output, with the group labels kept as ordinary columns.
- sort: bool, default True. Sort the group keys. Turning this off can improve performance. It does not affect the order of observations within each group; groupby preserves the order of rows inside each group.
- group_keys: bool, default True. When calling apply, add the group keys to the index so that the pieces can be identified.
- squeeze: bool, default False. Reduce the dimensionality of the return type if possible; otherwise return a consistent type. This parameter was deprecated in pandas 1.1 and removed in pandas 2.0.
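The effect of as_index and sort is easiest to see side by side. The sketch below uses a made-up city/sales table (both column names and all values are assumptions for illustration only).

```python
import pandas as pd

# Hypothetical data; "Pune" appears twice and is not first alphabetically.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Agra"],
    "sales": [5, 3, 8, 2],
})

# Defaults: group labels become the index and are sorted.
default = df.groupby("city").count()

# as_index=False keeps "city" as an ordinary column (SQL-style output).
flat = df.groupby("city", as_index=False).count()

# sort=False keeps groups in order of first appearance.
unsorted = df.groupby("city", sort=False).count()

print(default)
print(flat)
print(unsorted)
```

With the defaults, the index reads Agra, Delhi, Pune; with sort=False it reads Pune, Delhi, Agra, which is the order in which the cities first appear.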
Return Value
The pandas groupby() function returns a GroupBy object that holds information about the groups of similar data in the DataFrame. Calling .count() on that object returns a DataFrame or Series containing the counts for each group.
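As a quick check of the return types, the following sketch inspects the object at each step. The dept and emp columns are hypothetical and chosen only to keep the example short.

```python
import pandas as pd

# Hypothetical data: two employees in IT, one in HR.
df = pd.DataFrame({"dept": ["HR", "IT", "IT"], "emp": ["a", "b", "c"]})

# groupby() alone does no counting; it returns a GroupBy object.
grouped = df.groupby("dept")
print(type(grouped).__name__)

# Calling count() on the GroupBy object produces a regular DataFrame.
result = grouped.count()
print(type(result).__name__)
print(result)
```

Nothing is aggregated until a method such as count() is called on the GroupBy object.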
Examples
Let's see some examples of using the groupby() and .count() functions together.
Basic Pandas groupby Usage
Let's create a dataset with pandas on which we will try out the various methods.
Code - 1 :
Output:
Explanation:
In the above example, pandas is imported as pd, and the DataFrame df is created from the dictionary data using pd.DataFrame().
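The original listing is not reproduced here, so the following is only a sketch of what such a dataset could look like. The column names come from the explanations in this article, but every value is hypothetical, chosen so that the repeated values described later (for example, a salary of 56000 at indexes 2 and 8) hold.

```python
import pandas as pd

# Hypothetical records; only the column names come from the article.
data = {
    "First_Name": ["Rahul", "Priya", "Neha", "Vikram", "Akash", "Sonia",
                   "Rohan", "Meera", "Arjun", "Kavya", "Akash"],
    "Last_Name": ["Jamwal", "Mudgal", "Sharma", "Jamwal", "Mudgal", "Verma",
                  "Gupta", "Singh", "Kapoor", "Nair", "Reddy"],
    "Age": [24, 22, 26, 24, 25, 27, 28, 29, 30, 31, 23],
    "Salary": [45000, 38000, 56000, 52000, 61000, 45000,
               48000, 45000, 56000, 70000, 64000],
}

# Build the DataFrame from the dictionary; keys become column names.
df = pd.DataFrame(data)
print(df)
```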
Our dataset is now available as the pandas DataFrame df. Now let's apply the groupby() function to it.
Code - 2 :
Output:
Explanation:
In the above code example, the .groupby() function is applied to the Salary column so that rows with the same salary fall into the same group.
Code - 3 :
Output:
Explanation:
In the above code example, the data is grouped by Salary using the groupby() function and the result is stored in Grouped_Salary. The groups are then printed using .groups, which shows the row indexes that share each Salary value. For example, indexes 2 and 8 have the same salary, so they are in the same group; indexes 0, 5, and 7 form another group because they share a salary, and so on.
The .groups attribute returns a dictionary that maps each group key to the index labels of the rows belonging to that group.
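A minimal sketch of inspecting .groups follows. The salary values are hypothetical, arranged so that indexes 2 and 8 match and indexes 0, 5, and 7 match, as in the explanation above; the variable name Grouped_Salary is taken from the text.

```python
import pandas as pd

# Hypothetical salaries: 56000 at indexes 2 and 8; 45000 at 0, 5 and 7.
df = pd.DataFrame({
    "Salary": [45000, 38000, 56000, 52000, 61000, 45000,
               48000, 45000, 56000, 70000, 64000],
})

Grouped_Salary = df.groupby("Salary")

# .groups is an attribute (no parentheses) holding a dict-like mapping
# from each salary to the index labels of the rows in that group.
groups = Grouped_Salary.groups

for salary, row_labels in groups.items():
    print(salary, list(row_labels))
```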
Code - 4 :
Output:
Explanation:
In this example, the data is grouped by the First_Name column using the groupby() function and stored in Grouped_data, and the groups are printed using the .groups attribute, as explained in Code - 3. Indexes 4 and 10 have the same first name, so they are in one group.
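The same idea can be sketched for first names. The names below are hypothetical, with Akash placed at indexes 4 and 10 to match the explanation; get_group() is shown as one way to pull out the rows of a single group.

```python
import pandas as pd

# Hypothetical first names; "Akash" sits at indexes 4 and 10.
df = pd.DataFrame({
    "First_Name": ["Rahul", "Priya", "Neha", "Vikram", "Akash", "Sonia",
                   "Rohan", "Meera", "Arjun", "Kavya", "Akash"],
})

Grouped_data = df.groupby("First_Name")

# Index labels of the rows whose First_Name is "Akash".
print(Grouped_data.groups["Akash"])

# The rows themselves, returned as a DataFrame.
akash_rows = Grouped_data.get_group("Akash")
print(akash_rows)
```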
How Do groupby Columns Get the Count for Each Group from DataFrame?
We can group a DataFrame by a column and find the count for each group by chaining the two calls as groupby().count().
Let's see how to do this with the help of the following examples.
Code - 5:
Output:
Explanation:
In the above code example, groupby().count() is applied to the Salary column of the DataFrame. Rows with the same salary are placed in one group; for example, the salary 56000 forms one group with a count of 2, i.e. 56000 appears twice in the Salary column, and so on. In this way, the pandas groupby count() method is applied to the DataFrame.
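One way this could be written is sketched below. The exact call used in the original listing is not known, so selecting the Salary column after grouping is an assumption, and the salary values are hypothetical (56000 is included twice to match the explanation).

```python
import pandas as pd

# Hypothetical salaries; 56000 appears twice and 45000 three times.
df = pd.DataFrame({
    "Salary": [45000, 38000, 56000, 52000, 61000, 45000,
               48000, 45000, 56000, 70000, 64000],
})

# Group by Salary, then count the non-missing Salary values per group.
salary_counts = df.groupby("Salary")["Salary"].count()
print(salary_counts)
```

The result is a Series indexed by salary, so salary_counts.loc[56000] gives 2.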
Code - 6:
Output:
Explanation:
Here, identical values in the Age column are grouped using the .groupby() function, and .count() calculates how many times each value appears. In this example, 24 appears twice, 22 appears only once, and so on.
We can also use the pandas groupby count function on multiple columns. Let's try to understand how to do this with the help of the examples given below.
Code - 7:
Output:
Explanation:
In the above code example, the pandas groupby count function is applied to more than one column of the DataFrame. The data is grouped on both columns: for example, the name Akash appears twice in the First_Name column, so its corresponding salaries are grouped under it, and each salary value has a count of 1. In this way, multiple columns are grouped.
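Grouping on two columns can be sketched by passing a list to groupby(). The rows below are hypothetical and shortened to five records; Akash is given two different salaries so that each name and salary pair has a count of 1, as described above.

```python
import pandas as pd

# Hypothetical rows; "Akash" appears twice with different salaries.
df = pd.DataFrame({
    "First_Name": ["Akash", "Akash", "Rahul", "Sonia", "Neha"],
    "Salary": [61000, 64000, 45000, 45000, 56000],
    "Age": [25, 23, 24, 27, 26],
})

# A list of column names groups on every distinct (name, salary) pair.
pair_counts = df.groupby(["First_Name", "Salary"]).count()
print(pair_counts)

# All salary groups that fall under one first name.
print(pair_counts.loc["Akash"])
```

The result has a two-level index (First_Name, then Salary), and the remaining column, Age, holds the count for each pair.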
How to Use count() by Column Name?
We can get the count for the grouped data by applying the pandas groupby count function to a named column of the dataset.
The examples below use the pandas groupby count function on a column name.
Code - 8:
Output:
Explanation:
In the above example, the items of the Last_Name column are grouped using the groupby() function, and .count() is used to count identical items. Here, Jamwal appears 2 times, Mudgal appears 2 times, and so on.
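A sketch of counting by column name is shown below. The surnames and ages are hypothetical (Jamwal and Mudgal each appear twice to match the explanation), and indexing the grouped object with a column name is one common way to restrict the count to that column.

```python
import pandas as pd

# Hypothetical surnames; "Jamwal" and "Mudgal" each appear twice.
df = pd.DataFrame({
    "Last_Name": ["Jamwal", "Mudgal", "Sharma", "Jamwal", "Mudgal", "Verma"],
    "Age": [24, 22, 26, 24, 25, 27],
})

# Select a column by name after grouping to count only that column.
last_name_counts = df.groupby("Last_Name")["Last_Name"].count()
print(last_name_counts)

# Any other column can be named instead, e.g. Age values per surname.
age_counts = df.groupby("Last_Name")["Age"].count()
print(age_counts)
```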
Code - 9:
Output:
Explanation:
Here, the items of the Salary column are grouped, and identical values are counted.
Groupby count() Function to Give the “count” of Values in Each Column for Each Group
Now, let's see, with the help of examples given below, how to get the count of each value in each column for each group using the pandas groupby count function.
Code - 10:
Output:
Explanation:
In the above example, the rows are grouped by the Salary column using the groupby() function, and the count() function counts the values in each remaining column for each group. For the salary value 56000, for instance, there are 2 values in the First_Name column, 2 in the Last_Name column, and 2 in the Age column. In this way, the pandas groupby count function is applied to the whole dataset.
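Calling count() on the grouped DataFrame without selecting a column gives one count per remaining column. The sketch below assumes a hypothetical dataset with the four columns named in this article; all values are made up, with 56000 placed at two rows.

```python
import pandas as pd

# Hypothetical records; the salary 56000 occurs at indexes 2 and 8.
df = pd.DataFrame({
    "First_Name": ["Rahul", "Priya", "Neha", "Vikram", "Akash", "Sonia",
                   "Rohan", "Meera", "Arjun", "Kavya", "Akash"],
    "Last_Name": ["Jamwal", "Mudgal", "Sharma", "Jamwal", "Mudgal", "Verma",
                  "Gupta", "Singh", "Kapoor", "Nair", "Reddy"],
    "Age": [24, 22, 26, 24, 25, 27, 28, 29, 30, 31, 23],
    "Salary": [45000, 38000, 56000, 52000, 61000, 45000,
               48000, 45000, 56000, 70000, 64000],
})

# No column selected: count() reports non-missing values in every
# column other than the grouping column, for each salary group.
per_column = df.groupby("Salary").count()
print(per_column)

# Counts for one group.
print(per_column.loc[56000])
```

Because this hypothetical data has no missing values, every column shows the same count within a group; the columns would differ only where NaN values are present.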
Code - 11:
Output:
Explanation:
As explained in Code - 10, the rows are grouped by the First_Name column, and the .count() function then gives the count of values in each column for each group.
Conclusion
- The .groupby() function of pandas is used to group similar data and helps to perform operations on the grouped data.
- The pandas groupby count function counts how many non-missing (non-NaN) values each group contains.
- The .groupby() and .count() functions can be used together, as groupby().count(), to group the data and then count the items in each group.
- The column name can also be passed to the pandas groupby count() method.
- The pandas groupby count() function can also be used on multiple columns.