In this tutorial, we will see how we can read Excel file in pandas using examples.

Read Excel file in Pandas as Data Frame

read_excel() method of pandas will read the data from excel files having xls, xlsx, xlsm, xlsb, odf, ods and odt file extensions as a pandas data-frame and also provide some arguments to give some flexibility according to the requirement.

Pandas read_excel

The official documentation provides the syntax below, We will learn the most commonly used arguments of this method in the following sections with examples.

pandas.read_excel(*args, **kwargs)

1. Reading an Excel file:

In this example, we will try to read an Excel file with the file name and sheet name defined below and then customize the reading using different arguments along with the file path.

File name - Fruit.xlsx
Sheet name - sweet or sour
FruitSweetnessSorenessFruit Type
Lemon19Sour
Grapefruit28Sour
Orange37Sour
Raspberry28Sour
Cherry64Sweet
Banana91Sweet
Grapes82Sweet
Watermelon91Sweet
Avacado11None
Strawberry55Sour
  • file-path – This is the path to the file in string format.
  • sheet_name – Name of the sheet to open, or its 0-indexed integer number.
  • header – integer list of rows to be used as the columns. If multiple rows are passed then we will get a multi-column index data.

See the code below where we will use these arguments to read the file.

# importing pandas module
import pandas as pd
# defining File path
fpath = "F:/onlinetutorialspoint/Fruit.xlsx"
# method to be used to read the excel file
data2 = pd.read_excel(fpath,sheet_name='sweet or sour',header=[0])
print(data2)

Output:

        Fruit  Sweetness  Soreness Fruit Type
0       Lemon          1         9       Sour
1  Grapefruit          2         8       Sour
2      Orange          3         7       Sour
3   Raspberry          2         8       Sour
4      Cherry          6         4      Sweet
5      Banana          9         1      Sweet
6      Grapes          8         2      Sweet
7  Watermelon          9         1      Sweet
8     Avacado          1         1       None
9  Strawberry          5         5       Sour

2. Reading custom no. of rows and columns:

2.1. Reading selective column and index

  • usecols – List of column names from data to be read.
  • index_col – This defines the names of row labels, it can be a column from the data or the list of integer or string, None by default.

Let’s see an example code to read the data by changing a column to index in data.

import pandas as pd
# File path
fpath = "F:/onlinetutorialspoint/Fruit.xlsx"
# method to be used to read the data
data2 = pd.read_excel(fpath,usecols=['Fruit','Sweetness','Soreness'],index_col='Fruit',nrows = 5)
print(data2)

Output:

            Sweetness  Soreness
Fruit
Lemon               1         9
Grapefruit          2         8
Orange              3         7
Raspberry           2         8
Cherry              6         4
Banana              9         1
Grapes              8         2
Watermelon          9         1
Avacado             1         1
Strawberry          5         5

2.2. Reading selective rows

  • skiprows – list of rows number / No. or rows to be skipped from the top. It is 0-indexed.
  • skipfooter – No. or rows to be skipped from the bottom.
  • nrows – The number of rows to be read from the file.

Let’s see an example code to read only 4 rows from data without reading the top 2 rows.

import pandas as pd
fpath = "F:/onlinetutorialspoint/Fruit.xlsx"
data3 = pd.read_excel(fpath,header=None,nrows=4,skiprows=2)
print(data3)

Output:

            0  1  2      3
0  Grapefruit  2  8   Sour
1      Orange  3  7   Sour
2   Raspberry  2  8   Sour
3      Cherry  6  4  Sweet

These are the most commonly used arguments that are used when reading an Excel file in pandas.

References

Happy Learning 🙂

About the Author:

Aspiring Data Scientist who loves Python Programming, Software Development and wants to Solve Real-world Problems.

Leave A Comment