Read Excel File in Python
Learn via video course
Abstract
Excel files are spreadsheet documents in which data is stored in form of tables. These files can be read in python by using the xlrd module. The retreived data can also be filtered according to user's choice.
Introduction
Let's assume you need to train a Machine Learning model on an excel dataset of 10000 students. Ever wondered how would you transfer so much data to Python. Well the answer is reading Excel file in Python.
Python files can be read in python by using the xlrd module. Records stored in an excel file can be read and processed in a python program using the xlrd module.
Excel Documents
Excel documents are spreadsheets. Excel documents are stored with the .xlsx or .xls extension. Excel files have data stored in the form of tables having rows and columns. Excel is a versatile tool. It has provisions for mathematical formulae, graphs and many other things. Excel is used in almost each and every industry and is thus an important software.
Below is a sample excel file.
:::
How to Read Data From Excel File in Python
Data from an excel file can be read in Python using the xlrd module of python. xlrd provides functions to filter sheets from an excel file and not read the whole workbook.
Data from a specific cell of an excel file can be read in python by using cell_value() which takes row and column number of that sepcific cell as its input. Let's see an example:
- import xlrd
- workbook = xlrd.open_workbook('file_name.xlsx')
- worksheet = workbook.sheet_by_index(sheet_index)
- worksheet.cell_value(row,column)
In this sample above, first we have imported the xlrd module required for working with excel files in python. Then we have created a reference workbook pointing to our excel file, using the xlrd.open_workbook() function. file_name.xlsx is the name of the excel file we wish to open. Then we have created a variable worksheet referencing to the sheet at index sheet_index, where sheet_index specifies the index of the sheet we wish to open. Indices in excel start from 1.
Finally the worksheet.cell_value() is used to get the value of the cell a the row, column position in the excel file. Here row specifies the row index and column specifies the column index.
The output of this code would the value at the specified cell in the excel file. This is how an excel file can be read in python.
Example
Suppose we have to read the excel file shown in the above image. We first opened is as a workbook using the xlrd.open() command.
Then we have the first sheet using the sheet_by_index() command. Then we have read the values at cells (1,2) and (2,2) and have got the data present at those cells as output.
Source Code
Output
Conclusion
- In this example we have read about the need to read excel file in python.
- That is followed by a description of need and importance of excel files.
- Then we have seen the sample syntax to read an excel file in python.
- That is followed by a real life working example of reading an excel file in python.