Indexing-Selection-Filtering in Pandas¶
In [1]:
#import the libraries
import pandas as pd
import numpy as np
1. Series Indexing¶
In [2]:
a=pd.Series([14,68,46,24,83],index=["a","b","c","d","e"])
a
Out[2]:
a    14
b    68
c    46
d    24
e    83
dtype: int64
Selecting the value at index label d.¶
In [3]:
a['d']
Out[3]:
24
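You can also select a value by integer position; position 2 holds the third element (label c). The explicit .iloc indexer is used here because plain a[2] on a non-integer index is deprecated in recent pandas versions.¶
In [4]:
a.iloc[2]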
Out[4]:
46
Selecting multiple rows by position using slicing (the end position is excluded).¶
In [5]:
a[0:4]
Out[5]:
a    14
b    68
c    46
d    24
dtype: int64
Selecting specific rows by label.¶
In [6]:
a[["a","c"]]
Out[6]:
a    14
c    46
dtype: int64
Selecting specific rows by integer position.¶
In [7]:
a.iloc[[0,3]]
Out[7]:
a    14
d    24
dtype: int64
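The difference between label-based and position-based selection on a Series can be made explicit with the .loc and .iloc indexers. The following is a minimal sketch that rebuilds the same Series under the assumed name s; the other variable names are illustrative only.

```python
import pandas as pd

# Same values and labels as the Series used in the cells above
s = pd.Series([14, 68, 46, 24, 83], index=["a", "b", "c", "d", "e"])

by_label = s.loc["d"]           # label-based: the value stored under label "d"
by_position = s.iloc[2]         # position-based: the third element
label_slice = s.loc["a":"c"]    # label slices include the end label
position_slice = s.iloc[0:3]    # position slices exclude the end position

print(by_label, by_position)
print(label_slice.equals(position_slice))
```

Both slices return the rows a, b and c, even though the label slice ends at "c" and the position slice ends at 3.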
2. Filtering in Series¶
Select the values less than 40.¶
In [8]:
a[a<40]
Out[8]:
a    14
d    24
dtype: int64
Assigning a value to a label slice (label slices include the end label, so a, b and c are all updated).¶
In [9]:
a['a':'c']=77
a
Out[9]:
a    77
b    77
c    77
d    24
e    83
dtype: int64
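Filtering works by passing a boolean mask to the square brackets, and masks can be combined or inverted. The sketch below assumes a fresh copy of the original Series (named s here for illustration) and shows a combined condition, an inverted mask, and assignment through a mask.

```python
import pandas as pd

s = pd.Series([14, 68, 46, 24, 83], index=["a", "b", "c", "d", "e"])

mask = s < 40                       # boolean Series aligned on the index
between = s[(s > 20) & (s < 70)]    # each comparison needs its own parentheses
not_small = s[~mask]                # ~ inverts the mask

s_copy = s.copy()                   # work on a copy so s stays unchanged
s_copy[s_copy < 40] = 0             # assign to every element the mask selects

print(between)
print(not_small)
print(s_copy)
```

The element-wise operators are & (and), | (or) and ~ (not); the Python keywords and, or and not do not work on a Series.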
3. DataFrame Indexing¶
In [10]:
data = [['Robin',26,45.34],['Karan',25,78.5],['Priya',23,87.67],['Varun',22,56],['Keisha',23,97]]
df=pd.DataFrame(data,columns=['Name','Age','Marks'])
df
Out[10]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
3 | Varun | 22 | 56.00 |
4 | Keisha | 23 | 97.00 |
Selecting the column named Name¶
In [11]:
df['Name']
Out[11]:
0     Robin
1     Karan
2     Priya
3     Varun
4    Keisha
Name: Name, dtype: object
Selecting multiple columns¶
In [12]:
df[['Name','Marks']]
Out[12]:
| | Name | Marks |
---|---|---|
0 | Robin | 45.34 |
1 | Karan | 78.50 |
2 | Priya | 87.67 |
3 | Varun | 56.00 |
4 | Keisha | 97.00 |
Slicing the rows (the first two rows, by position).¶
In [13]:
df[:2]
Out[13]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
Filtering in DataFrame¶
In [14]:
df[df["Marks"]>60]
Out[14]:
| | Name | Age | Marks |
---|---|---|---|
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
4 | Keisha | 23 | 97.00 |
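Row filters on a DataFrame can combine several conditions in the same way. The following sketch rebuilds the same df and shows a combined condition, a membership test with isin, and a filter that also picks one column through .loc; the result variable names are illustrative.

```python
import pandas as pd

data = [['Robin', 26, 45.34], ['Karan', 25, 78.5], ['Priya', 23, 87.67],
        ['Varun', 22, 56], ['Keisha', 23, 97]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Marks'])

# Rows where both conditions hold; each comparison needs its own parentheses
high_and_young = df[(df['Marks'] > 60) & (df['Age'] < 25)]

# Rows whose Name is in a given list
selected_names = df[df['Name'].isin(['Robin', 'Varun'])]

# Filter rows and select a single column in one step
names_only = df.loc[df['Marks'] > 60, 'Name']

print(high_and_young)
print(selected_names)
print(names_only)
```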
Selecting Rows and Columns with the loc and iloc Indexers¶
- .loc[] Label based
- .iloc[] Integer position based
1. Using the iloc[] Indexer¶
- df.iloc[row position, column position]
In [15]:
df
Out[15]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
3 | Varun | 22 | 56.00 |
4 | Keisha | 23 | 97.00 |
Select the row at position 2 (the third row)¶
In [16]:
df.iloc[2]
Out[16]:
Name     Priya
Age         23
Marks    87.67
Name: 2, dtype: object
Selecting specific columns (Name, Marks) of multiple rows (1 and 3)¶
In [17]:
df.iloc[[1,3],[0,2]]
Out[17]:
| | Name | Marks |
---|---|---|
1 | Karan | 78.5 |
3 | Varun | 56.0 |
Selecting all rows and all columns¶
In [18]:
df
Out[18]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
3 | Varun | 22 | 56.00 |
4 | Keisha | 23 | 97.00 |
or¶
In [19]:
df.iloc[:,:]
Out[19]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
3 | Varun | 22 | 56.00 |
4 | Keisha | 23 | 97.00 |
Selecting all rows and the columns Name through Age (positions 0 and 1; the end position 2 is excluded)¶
In [20]:
df.iloc[:,:2]
Out[20]:
| | Name | Age |
---|---|---|
0 | Robin | 26 |
1 | Karan | 25 |
2 | Priya | 23 |
3 | Varun | 22 |
4 | Keisha | 23 |
Selecting all rows and the columns Name and Marks (positions 0 and 2)¶
In [21]:
df.iloc[:,[0,2]]
Out[21]:
| | Name | Marks |
---|---|---|
0 | Robin | 45.34 |
1 | Karan | 78.50 |
2 | Priya | 87.67 |
3 | Varun | 56.00 |
4 | Keisha | 97.00 |
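Because iloc works purely on positions, it follows the usual Python slicing rules, including negative positions and steps. A short sketch on the same df follows; the result variable names are illustrative.

```python
import pandas as pd

data = [['Robin', 26, 45.34], ['Karan', 25, 78.5], ['Priya', 23, 87.67],
        ['Varun', 22, 56], ['Keisha', 23, 97]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Marks'])

last_row = df.iloc[-1]        # negative positions count from the end
every_other = df.iloc[::2]    # a step of 2 keeps rows 0, 2 and 4
block = df.iloc[1:3, 0:2]     # rows 1-2 and columns 0-1; end positions are excluded

print(last_row['Name'])
print(list(every_other.index))
print(block)
```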
2. Using the loc[] Indexer¶
In [22]:
df
Out[22]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
3 | Varun | 22 | 56.00 |
4 | Keisha | 23 | 97.00 |
Select the row with index label 2¶
In [23]:
df.loc[2]
Out[23]:
Name     Priya
Age         23
Marks    87.67
Name: 2, dtype: object
Selecting specific columns (Name, Marks) of multiple rows (labels 1 and 3)¶
In [24]:
df.loc[[1,3],['Name','Marks']]
Out[24]:
| | Name | Marks |
---|---|---|
1 | Karan | 78.5 |
3 | Varun | 56.0 |
Selecting all rows and all columns¶
In [25]:
df
Out[25]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
3 | Varun | 22 | 56.00 |
4 | Keisha | 23 | 97.00 |
or¶
In [26]:
df.loc[:,:]
Out[26]:
| | Name | Age | Marks |
---|---|---|---|
0 | Robin | 26 | 45.34 |
1 | Karan | 25 | 78.50 |
2 | Priya | 23 | 87.67 |
3 | Varun | 22 | 56.00 |
4 | Keisha | 23 | 97.00 |
Selecting all rows and the columns Name through Age (a label slice includes the end label)¶
In [27]:
df.loc[:,:'Age']
Out[27]:
| | Name | Age |
---|---|---|
0 | Robin | 26 |
1 | Karan | 25 |
2 | Priya | 23 |
3 | Varun | 22 |
4 | Keisha | 23 |
Selecting all rows and the columns Name and Marks¶
In [28]:
df.loc[:,['Name','Marks']]
Out[28]:
| | Name | Marks |
---|---|---|
0 | Robin | 45.34 |
1 | Karan | 78.50 |
2 | Priya | 87.67 |
3 | Varun | 56.00 |
4 | Keisha | 97.00 |
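With the default 0..4 index, loc and iloc return the same rows, which hides the difference between them. The sketch below assumes the Name column is moved into the index (via set_index, under the illustrative name by_name) so that labels and positions no longer coincide.

```python
import pandas as pd

data = [['Robin', 26, 45.34], ['Karan', 25, 78.5], ['Priya', 23, 87.67],
        ['Varun', 22, 56], ['Keisha', 23, 97]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Marks'])

by_name = df.set_index('Name')    # row labels are now names instead of 0..4

row_by_label = by_name.loc['Priya']    # look the row up by its label
row_by_position = by_name.iloc[2]      # the same row, found by its position

# Label slice: both 'Karan' and 'Varun' are included
marks_by_label = by_name.loc['Karan':'Varun', 'Marks']
# Position slice: rows 1, 2 and 3; position 4 is excluded
marks_by_position = by_name.iloc[1:4, 1]

print(row_by_label.equals(row_by_position))
print(marks_by_label.equals(marks_by_position))
```

Here by_name.loc[2] would raise a KeyError, because 2 is no longer a row label.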
3. Methods to Select a Single Column as a Series or DataFrame¶
Method 1¶
- Selecting a single column as a Series
In [29]:
df['Name']
Out[29]:
0     Robin
1     Karan
2     Priya
3     Varun
4    Keisha
Name: Name, dtype: object
Method 2¶
- Selecting a single column as a DataFrame
In [30]:
df[['Name']]
Out[30]:
| | Name |
---|---|
0 | Robin |
1 | Karan |
2 | Priya |
3 | Varun |
4 | Keisha |
Method 3¶
- Selecting a single column as a Series (attribute access)
In [31]:
df.Name
Out[31]:
0     Robin
1     Karan
2     Priya
3     Varun
4    Keisha
Name: Name, dtype: object
Method 4¶
- Selecting a single column as a Series
In [32]:
df.iloc[:,0]
Out[32]:
0     Robin
1     Karan
2     Priya
3     Varun
4    Keisha
Name: Name, dtype: object
Method 5¶
- Selecting a single column as a DataFrame
In [33]:
df.iloc[:,0:1]
Out[33]:
| | Name |
---|---|
0 | Robin |
1 | Karan |
2 | Priya |
3 | Varun |
4 | Keisha |
Method 6¶
- Selecting a single column as a Series
In [34]:
df.loc[:,'Name']
Out[34]:
0     Robin
1     Karan
2     Priya
3     Varun
4    Keisha
Name: Name, dtype: object
Method 7¶
- Selecting a single column as a DataFrame
In [35]:
df.loc[:,'Name':'Name']
Out[35]:
| | Name |
---|---|
0 | Robin |
1 | Karan |
2 | Priya |
3 | Varun |
4 | Keisha |
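Whether a selection returns a Series or a DataFrame can be confirmed by checking the type and shape of the result. The following is a small sketch on the same df; the variable names are illustrative.

```python
import pandas as pd

data = [['Robin', 26, 45.34], ['Karan', 25, 78.5], ['Priya', 23, 87.67],
        ['Varun', 22, 56], ['Keisha', 23, 97]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Marks'])

as_series = df['Name']      # single label: one-dimensional Series
as_frame = df[['Name']]     # list with one label: two-dimensional DataFrame

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)

# A Series can be turned into a one-column DataFrame with to_frame()
round_trip = as_series.to_frame()
print(round_trip.equals(as_frame))
```

Attribute access such as df.Name only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method, so the bracket forms are the safer default.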