In [1]:
import numpy as np
import pandas as pd
In [2]:
data=pd.read_csv('Melbourne.csv')
OrgData=data
OrgData
Out[2]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | … | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Abbotsford | 68 Studley St | 2 | h | NaN | SS | Jellis | 03-09-2016 | 2.5 | 3067.0 | … | 1.0 | 1.0 | 126.0 | NaN | NaN | Yarra | -37.80140 | 144.99580 | Northern Metropolitan | 4019.0 |
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | … | 1.0 | 1.0 | 202.0 | NaN | NaN | Yarra | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | … | 1.0 | 0.0 | 156.0 | 79.0 | 1900.0 | Yarra | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
3 | Abbotsford | 18/659 Victoria St | 3 | u | NaN | VB | Rounds | 04-02-2016 | 2.5 | 3067.0 | … | 2.0 | 1.0 | 0.0 | NaN | NaN | Yarra | -37.81140 | 145.01160 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | … | 2.0 | 0.0 | 134.0 | 150.0 | 1900.0 | Yarra | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23542 | Wyndham Vale | 25 Clitheroe Dr | 3 | u | NaN | PN | Harcourts | 26-08-2017 | 27.2 | 3024.0 | … | 1.0 | 0.0 | 552.0 | 119.0 | 1990.0 | NaN | -37.90032 | 144.61839 | Western Metropolitan | 5262.0 |
23543 | Wyndham Vale | 19 Dalrymple Bvd | 4 | h | NaN | S | hockingstuart | 26-08-2017 | 27.2 | 3024.0 | … | NaN | NaN | NaN | NaN | NaN | NaN | -37.87882 | 144.60184 | Western Metropolitan | 5262.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | … | 3.0 | 2.0 | NaN | NaN | NaN | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | … | 1.0 | 1.0 | 362.0 | 112.0 | 1920.0 | NaN | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | … | 2.0 | 2.0 | NaN | 139.0 | 1950.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
23547 rows × 21 columns
In [3]:
data.shape
Out[3]:
(23547, 21)
In [4]:
data.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 23547 entries, 0 to 23546 Data columns (total 21 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Suburb 23547 non-null object 1 Address 23547 non-null object 2 Rooms 23547 non-null int64 3 Type 23547 non-null object 4 Price 18396 non-null float64 5 Method 23547 non-null object 6 SellerG 23547 non-null object 7 Date 23547 non-null object 8 Distance 23546 non-null float64 9 Postcode 23546 non-null float64 10 Bedroom2 19066 non-null float64 11 Bathroom 19063 non-null float64 12 Car 18921 non-null float64 13 Landsize 17410 non-null float64 14 BuildingArea 10018 non-null float64 15 YearBuilt 11540 non-null float64 16 CouncilArea 15656 non-null object 17 Lattitude 19243 non-null float64 18 Longtitude 19243 non-null float64 19 Regionname 23546 non-null object 20 Propertycount 23546 non-null float64 dtypes: float64(12), int64(1), object(8) memory usage: 3.8+ MB
Summing up the missing values (column-wise)¶
In [5]:
data.isnull().sum()
Out[5]:
Suburb 0 Address 0 Rooms 0 Type 0 Price 5151 Method 0 SellerG 0 Date 0 Distance 1 Postcode 1 Bedroom2 4481 Bathroom 4484 Car 4626 Landsize 6137 BuildingArea 13529 YearBuilt 12007 CouncilArea 7891 Lattitude 4304 Longtitude 4304 Regionname 1 Propertycount 1 dtype: int64
In [6]:
data.isnull().sum(axis=0)
Out[6]:
Suburb 0 Address 0 Rooms 0 Type 0 Price 5151 Method 0 SellerG 0 Date 0 Distance 1 Postcode 1 Bedroom2 4481 Bathroom 4484 Car 4626 Landsize 6137 BuildingArea 13529 YearBuilt 12007 CouncilArea 7891 Lattitude 4304 Longtitude 4304 Regionname 1 Propertycount 1 dtype: int64
Summing up the missing values (row-wise)¶
In [7]:
data.isnull().sum(axis=1)
Out[7]:
0 3 1 2 2 0 3 3 4 0 .. 23542 2 23543 8 23544 4 23545 1 23546 2 Length: 23547, dtype: int64
Columns having at least one missing value¶
In [8]:
data.isnull().sum() > 0
Out[8]:
Suburb False Address False Rooms False Type False Price True Method False SellerG False Date False Distance True Postcode True Bedroom2 True Bathroom True Car True Landsize True BuildingArea True YearBuilt True CouncilArea True Lattitude True Longtitude True Regionname True Propertycount True dtype: bool
In [9]:
d=data.isnull().any()
d.index
Out[9]:
Index(['Suburb', 'Address', 'Rooms', 'Type', 'Price', 'Method', 'SellerG', 'Date', 'Distance', 'Postcode', 'Bedroom2', 'Bathroom', 'Car', 'Landsize', 'BuildingArea', 'YearBuilt', 'CouncilArea', 'Lattitude', 'Longtitude', 'Regionname', 'Propertycount'], dtype='object')
In [10]:
d.index[d.values]
Out[10]:
Index(['Price', 'Distance', 'Postcode', 'Bedroom2', 'Bathroom', 'Car', 'Landsize', 'BuildingArea', 'YearBuilt', 'CouncilArea', 'Lattitude', 'Longtitude', 'Regionname', 'Propertycount'], dtype='object')
By default any() operates on columns¶
In [11]:
data.isnull().any(axis=0)
Out[11]:
Suburb False Address False Rooms False Type False Price True Method False SellerG False Date False Distance True Postcode True Bedroom2 True Bathroom True Car True Landsize True BuildingArea True YearBuilt True CouncilArea True Lattitude True Longtitude True Regionname True Propertycount True dtype: bool
Find the missing value row wise if any found return True/False¶
In [12]:
d=data.isnull().any(axis=1)
i=d.index[d.values]
data.iloc[i,:]
Out[12]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | … | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Abbotsford | 68 Studley St | 2 | h | NaN | SS | Jellis | 03-09-2016 | 2.5 | 3067.0 | … | 1.0 | 1.0 | 126.0 | NaN | NaN | Yarra | -37.80140 | 144.99580 | Northern Metropolitan | 4019.0 |
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | … | 1.0 | 1.0 | 202.0 | NaN | NaN | Yarra | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
3 | Abbotsford | 18/659 Victoria St | 3 | u | NaN | VB | Rounds | 04-02-2016 | 2.5 | 3067.0 | … | 2.0 | 1.0 | 0.0 | NaN | NaN | Yarra | -37.81140 | 145.01160 | Northern Metropolitan | 4019.0 |
5 | Abbotsford | 40 Federation La | 3 | h | 850000.0 | PI | Biggin | 04-03-2017 | 2.5 | 3067.0 | … | 2.0 | 1.0 | 94.0 | NaN | NaN | Yarra | -37.79690 | 144.99690 | Northern Metropolitan | 4019.0 |
7 | Abbotsford | 16 Maugie St | 4 | h | NaN | SN | Nelson | 06-08-2016 | 2.5 | 3067.0 | … | 2.0 | 2.0 | 400.0 | 220.0 | 2006.0 | Yarra | -37.79650 | 144.99650 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23542 | Wyndham Vale | 25 Clitheroe Dr | 3 | u | NaN | PN | Harcourts | 26-08-2017 | 27.2 | 3024.0 | … | 1.0 | 0.0 | 552.0 | 119.0 | 1990.0 | NaN | -37.90032 | 144.61839 | Western Metropolitan | 5262.0 |
23543 | Wyndham Vale | 19 Dalrymple Bvd | 4 | h | NaN | S | hockingstuart | 26-08-2017 | 27.2 | 3024.0 | … | NaN | NaN | NaN | NaN | NaN | NaN | -37.87882 | 144.60184 | Western Metropolitan | 5262.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | … | 3.0 | 2.0 | NaN | NaN | NaN | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | … | 1.0 | 1.0 | 362.0 | 112.0 | 1920.0 | NaN | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | … | 2.0 | 2.0 | NaN | 139.0 | 1950.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
17351 rows × 21 columns
Columns having all missing values¶
In [13]:
data.isnull().all(axis=0)
Out[13]:
Suburb False Address False Rooms False Type False Price False Method False SellerG False Date False Distance False Postcode False Bedroom2 False Bathroom False Car False Landsize False BuildingArea False YearBuilt False CouncilArea False Lattitude False Longtitude False Regionname False Propertycount False dtype: bool
Rows having all missing values¶
In [14]:
data.isnull().all(axis=1).sum()
Out[14]:
0
Summing up the missing values (column-wise) : Cal in %¶
In [15]:
data.isnull().sum().sort_values(ascending=False)/len(data)*100
Out[15]:
BuildingArea 57.455302 YearBuilt 50.991634 CouncilArea 33.511700 Landsize 26.062768 Price 21.875398 Car 19.645815 Bathroom 19.042766 Bedroom2 19.030025 Longtitude 18.278337 Lattitude 18.278337 Regionname 0.004247 Propertycount 0.004247 Postcode 0.004247 Distance 0.004247 Address 0.000000 Date 0.000000 SellerG 0.000000 Method 0.000000 Type 0.000000 Rooms 0.000000 Suburb 0.000000 dtype: float64
In [16]:
col=data.isnull().sum().sort_values(ascending=False).head(3).index.values
col
Out[16]:
array(['BuildingArea', 'YearBuilt', 'CouncilArea'], dtype=object)
Remoing the three columns where the max null value percentage¶
In [17]:
data=data.drop(col,axis='columns')
data
Out[17]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Abbotsford | 68 Studley St | 2 | h | NaN | SS | Jellis | 03-09-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 126.0 | -37.80140 | 144.99580 | Northern Metropolitan | 4019.0 |
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
3 | Abbotsford | 18/659 Victoria St | 3 | u | NaN | VB | Rounds | 04-02-2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 0.0 | -37.81140 | 145.01160 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23542 | Wyndham Vale | 25 Clitheroe Dr | 3 | u | NaN | PN | Harcourts | 26-08-2017 | 27.2 | 3024.0 | 3.0 | 1.0 | 0.0 | 552.0 | -37.90032 | 144.61839 | Western Metropolitan | 5262.0 |
23543 | Wyndham Vale | 19 Dalrymple Bvd | 4 | h | NaN | S | hockingstuart | 26-08-2017 | 27.2 | 3024.0 | NaN | NaN | NaN | NaN | -37.87882 | 144.60184 | Western Metropolitan | 5262.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | 4.0 | 3.0 | 2.0 | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 2.0 | 2.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
23547 rows × 18 columns
In [18]:
round(data.isnull().sum().sort_values(ascending=False)/len(data)*100,2)
Out[18]:
Landsize 26.06 Price 21.88 Car 19.65 Bathroom 19.04 Bedroom2 19.03 Longtitude 18.28 Lattitude 18.28 Postcode 0.00 Regionname 0.00 Propertycount 0.00 Distance 0.00 Address 0.00 Date 0.00 SellerG 0.00 Method 0.00 Type 0.00 Rooms 0.00 Suburb 0.00 dtype: float64
Check the rows where you have more then 5 missing Values¶
In [19]:
data[data.isnull().sum(axis=1) > 5]
Out[19]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
15 | Abbotsford | 217 Langridge St | 3 | h | 1000000.0 | S | Jellis | 08-10-2016 | 2.5 | 3067.0 | NaN | NaN | NaN | NaN | NaN | NaN | Northern Metropolitan | 4019.0 |
16 | Abbotsford | 18a Mollison St | 2 | t | 745000.0 | S | Jellis | 08-10-2016 | 2.5 | 3067.0 | NaN | NaN | NaN | NaN | NaN | NaN | Northern Metropolitan | 4019.0 |
19 | Abbotsford | 403/609 Victoria St | 2 | u | 542000.0 | S | Dingle | 08-10-2016 | 2.5 | 3067.0 | NaN | NaN | NaN | NaN | NaN | NaN | Northern Metropolitan | 4019.0 |
21 | Abbotsford | 25/84 Trenerry Cr | 2 | u | 760000.0 | SP | Biggin | 10-12-2016 | 2.5 | 3067.0 | NaN | NaN | NaN | NaN | NaN | NaN | Northern Metropolitan | 4019.0 |
22 | Abbotsford | 106/119 Turner St | 1 | u | 481000.0 | SP | Purplebricks | 10-12-2016 | 2.5 | 3067.0 | NaN | NaN | NaN | NaN | NaN | NaN | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23495 | Templestowe Lower | 1/207 Manningham Rd | 2 | u | 550000.0 | PI | Barry | 26-08-2017 | 12.4 | 3107.0 | NaN | NaN | NaN | NaN | NaN | NaN | Eastern Metropolitan | 5420.0 |
23499 | Thornbury | 1/128 Dundas St | 2 | t | 770000.0 | PI | McGrath | 26-08-2017 | 7.0 | 3071.0 | NaN | NaN | NaN | NaN | NaN | NaN | Northern Metropolitan | 8870.0 |
23502 | Thornbury | 111 Pender St | 2 | t | 858000.0 | S | Jellis | 26-08-2017 | 7.0 | 3071.0 | NaN | NaN | NaN | NaN | NaN | NaN | Northern Metropolitan | 8870.0 |
23508 | Toorak | 21/1059 Malvern Rd | 2 | u | 720000.0 | VB | Beller | 26-08-2017 | 4.1 | 3142.0 | NaN | NaN | NaN | NaN | NaN | NaN | Southern Metropolitan | 7217.0 |
23531 | Werribee | 91 Latham St | 4 | h | 540000.0 | PI | Triwest | 26-08-2017 | 14.7 | 3030.0 | NaN | NaN | NaN | NaN | NaN | NaN | Western Metropolitan | 16166.0 |
4278 rows × 18 columns
Count the number of rows having >5 missing values¶
In [20]:
len(data[data.isnull().sum(axis=1) > 5])
Out[20]:
4278
In [21]:
Greater5=data[data.isnull().sum(axis=1) > 5]
Greater5.shape
Out[21]:
(4278, 18)
Calculte the percentage¶
In [22]:
round(len(data[data.isnull().sum(axis=1) > 5])/len(data)*100,2)
Out[22]:
18.17
Retaining the rows having <=5 NaNs¶
Method 1:¶
In [23]:
data.dropna(thresh=5)
Out[23]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Abbotsford | 68 Studley St | 2 | h | NaN | SS | Jellis | 03-09-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 126.0 | -37.80140 | 144.99580 | Northern Metropolitan | 4019.0 |
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
3 | Abbotsford | 18/659 Victoria St | 3 | u | NaN | VB | Rounds | 04-02-2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 0.0 | -37.81140 | 145.01160 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23542 | Wyndham Vale | 25 Clitheroe Dr | 3 | u | NaN | PN | Harcourts | 26-08-2017 | 27.2 | 3024.0 | 3.0 | 1.0 | 0.0 | 552.0 | -37.90032 | 144.61839 | Western Metropolitan | 5262.0 |
23543 | Wyndham Vale | 19 Dalrymple Bvd | 4 | h | NaN | S | hockingstuart | 26-08-2017 | 27.2 | 3024.0 | NaN | NaN | NaN | NaN | -37.87882 | 144.60184 | Western Metropolitan | 5262.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | 4.0 | 3.0 | 2.0 | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 2.0 | 2.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
23547 rows × 18 columns
Method 2:¶
In [24]:
data=data[data.isnull().sum(axis=1) <=5]
data
Out[24]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Abbotsford | 68 Studley St | 2 | h | NaN | SS | Jellis | 03-09-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 126.0 | -37.80140 | 144.99580 | Northern Metropolitan | 4019.0 |
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
3 | Abbotsford | 18/659 Victoria St | 3 | u | NaN | VB | Rounds | 04-02-2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 0.0 | -37.81140 | 145.01160 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23542 | Wyndham Vale | 25 Clitheroe Dr | 3 | u | NaN | PN | Harcourts | 26-08-2017 | 27.2 | 3024.0 | 3.0 | 1.0 | 0.0 | 552.0 | -37.90032 | 144.61839 | Western Metropolitan | 5262.0 |
23543 | Wyndham Vale | 19 Dalrymple Bvd | 4 | h | NaN | S | hockingstuart | 26-08-2017 | 27.2 | 3024.0 | NaN | NaN | NaN | NaN | -37.87882 | 144.60184 | Western Metropolitan | 5262.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | 4.0 | 3.0 | 2.0 | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 2.0 | 2.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
19269 rows × 18 columns
In [25]:
data.isnull().sum().sort_values(ascending=False)/len(data)*100
Out[25]:
Price 21.708444 Landsize 9.647621 Car 1.806010 Bathroom 1.069075 Bedroom2 1.053506 Longtitude 0.134932 Lattitude 0.134932 Suburb 0.000000 Regionname 0.000000 Postcode 0.000000 Address 0.000000 Distance 0.000000 Date 0.000000 SellerG 0.000000 Method 0.000000 Type 0.000000 Rooms 0.000000 Propertycount 0.000000 dtype: float64
Removing NAN Price rows¶
In [26]:
data[data['Price'].notnull()]
Out[26]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
5 | Abbotsford | 40 Federation La | 3 | h | 850000.0 | PI | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 94.0 | -37.79690 | 144.99690 | Northern Metropolitan | 4019.0 |
6 | Abbotsford | 55a Park St | 4 | h | 1600000.0 | VB | Nelson | 04-06-2016 | 2.5 | 3067.0 | 3.0 | 1.0 | 2.0 | 120.0 | -37.80720 | 144.99410 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23540 | Williamstown | 8/2 Thompson St | 2 | t | 622500.0 | SP | Greg | 26-08-2017 | 6.8 | 3016.0 | 2.0 | 2.0 | 1.0 | NaN | -37.86393 | 144.90484 | Western Metropolitan | 6380.0 |
23541 | Williamstown | 96 Verdon St | 4 | h | 2500000.0 | PI | Sweeney | 26-08-2017 | 6.8 | 3016.0 | 4.0 | 1.0 | 5.0 | 866.0 | -37.85908 | 144.89299 | Western Metropolitan | 6380.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | 4.0 | 3.0 | 2.0 | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 2.0 | 2.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
15086 rows × 18 columns
Another Method¶
In [27]:
data[~np.isnan(data['Price'])]
Out[27]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
5 | Abbotsford | 40 Federation La | 3 | h | 850000.0 | PI | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 94.0 | -37.79690 | 144.99690 | Northern Metropolitan | 4019.0 |
6 | Abbotsford | 55a Park St | 4 | h | 1600000.0 | VB | Nelson | 04-06-2016 | 2.5 | 3067.0 | 3.0 | 1.0 | 2.0 | 120.0 | -37.80720 | 144.99410 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23540 | Williamstown | 8/2 Thompson St | 2 | t | 622500.0 | SP | Greg | 26-08-2017 | 6.8 | 3016.0 | 2.0 | 2.0 | 1.0 | NaN | -37.86393 | 144.90484 | Western Metropolitan | 6380.0 |
23541 | Williamstown | 96 Verdon St | 4 | h | 2500000.0 | PI | Sweeney | 26-08-2017 | 6.8 | 3016.0 | 4.0 | 1.0 | 5.0 | 866.0 | -37.85908 | 144.89299 | Western Metropolitan | 6380.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | 4.0 | 3.0 | 2.0 | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 2.0 | 2.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
15086 rows × 18 columns
In [28]:
data=data[data['Price'].notnull()]
data
Out[28]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
5 | Abbotsford | 40 Federation La | 3 | h | 850000.0 | PI | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 94.0 | -37.79690 | 144.99690 | Northern Metropolitan | 4019.0 |
6 | Abbotsford | 55a Park St | 4 | h | 1600000.0 | VB | Nelson | 04-06-2016 | 2.5 | 3067.0 | 3.0 | 1.0 | 2.0 | 120.0 | -37.80720 | 144.99410 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23540 | Williamstown | 8/2 Thompson St | 2 | t | 622500.0 | SP | Greg | 26-08-2017 | 6.8 | 3016.0 | 2.0 | 2.0 | 1.0 | NaN | -37.86393 | 144.90484 | Western Metropolitan | 6380.0 |
23541 | Williamstown | 96 Verdon St | 4 | h | 2500000.0 | PI | Sweeney | 26-08-2017 | 6.8 | 3016.0 | 4.0 | 1.0 | 5.0 | 866.0 | -37.85908 | 144.89299 | Western Metropolitan | 6380.0 |
23544 | Yallambie | 17 Amaroo Wy | 4 | h | 1100000.0 | S | Buckingham | 26-08-2017 | 12.7 | 3085.0 | 4.0 | 3.0 | 2.0 | NaN | -37.72006 | 145.10547 | Northern Metropolitan | 1369.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
23546 | Yarraville | 33 Freeman St | 4 | h | 1050000.0 | VB | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 2.0 | 2.0 | NaN | -37.81829 | 144.87404 | Western Metropolitan | 6543.0 |
15086 rows × 18 columns
In [29]:
round(data.isnull().sum().sort_values(ascending=False)/len(data)*100,2)
Out[29]:
Landsize 9.83 Car 1.76 Bathroom 1.07 Bedroom2 1.05 Longtitude 0.15 Lattitude 0.15 Suburb 0.00 Address 0.00 Regionname 0.00 Postcode 0.00 Distance 0.00 Date 0.00 SellerG 0.00 Method 0.00 Price 0.00 Type 0.00 Rooms 0.00 Propertycount 0.00 dtype: float64
Still Landsize columns has 9.83% nan data, Now Describe the row¶
In [30]:
data['Landsize'].describe()
Out[30]:
count 13603.000000 mean 558.116371 std 3987.326586 min 0.000000 25% 176.500000 50% 440.000000 75% 651.000000 max 433014.000000 Name: Landsize, dtype: float64
In [31]:
data=data[data['Landsize'].notnull()]
data
Out[31]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
5 | Abbotsford | 40 Federation La | 3 | h | 850000.0 | PI | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 94.0 | -37.79690 | 144.99690 | Northern Metropolitan | 4019.0 |
6 | Abbotsford | 55a Park St | 4 | h | 1600000.0 | VB | Nelson | 04-06-2016 | 2.5 | 3067.0 | 3.0 | 1.0 | 2.0 | 120.0 | -37.80720 | 144.99410 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23537 | Wheelers Hill | 12 Strada Cr | 4 | h | 1245000.0 | S | Barry | 26-08-2017 | 16.7 | 3150.0 | 4.0 | 2.0 | 2.0 | 652.0 | -37.90562 | 145.16761 | South-Eastern Metropolitan | 7392.0 |
23538 | Williamstown | 77 Merrett Dr | 3 | h | 1031000.0 | SP | Williams | 26-08-2017 | 6.8 | 3016.0 | 3.0 | 2.0 | 2.0 | 333.0 | -37.85927 | 144.87904 | Western Metropolitan | 6380.0 |
23539 | Williamstown | 83 Power St | 3 | h | 1170000.0 | S | Raine | 26-08-2017 | 6.8 | 3016.0 | 3.0 | 2.0 | 4.0 | 436.0 | -37.85274 | 144.88738 | Western Metropolitan | 6380.0 |
23541 | Williamstown | 96 Verdon St | 4 | h | 2500000.0 | PI | Sweeney | 26-08-2017 | 6.8 | 3016.0 | 4.0 | 1.0 | 5.0 | 866.0 | -37.85908 | 144.89299 | Western Metropolitan | 6380.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
13603 rows × 18 columns
In [32]:
data.isnull().sum().sort_values(ascending=False)
Out[32]:
Car 63 Longtitude 22 Lattitude 22 Bathroom 1 Suburb 0 Address 0 Regionname 0 Landsize 0 Bedroom2 0 Postcode 0 Distance 0 Date 0 SellerG 0 Method 0 Price 0 Type 0 Rooms 0 Propertycount 0 dtype: int64
Describe Lattitude and Longtitude and later Imputing Lattitude and Longtitude by Mean Values¶
In [33]:
data.loc[:,['Lattitude','Longtitude'] ].describe()
Out[33]:
Lattitude | Longtitude | |
---|---|---|
count | 13581.000000 | 13581.000000 |
mean | -37.809204 | 144.995221 |
std | 0.079257 | 0.103913 |
min | -38.182550 | 144.431810 |
25% | -37.856820 | 144.929600 |
50% | -37.802360 | 145.000100 |
75% | -37.756400 | 145.058320 |
max | -37.408530 | 145.526350 |
In [34]:
data['Lattitude'].mean()
Out[34]:
-37.809203506369194
In [35]:
data.loc[:,'Lattitude'].fillna(data['Lattitude'].mean(), inplace = True)
c:\users\karan\appdata\local\programs\python\python38\lib\site-packages\pandas\core\series.py:4433: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy return super().fillna(
In [36]:
data.loc[:,'Longtitude'].fillna(data['Longtitude'].mean(), inplace = True)
In [37]:
data.isnull().sum().sort_values(ascending=False)
Out[37]:
Car 63 Bathroom 1 Suburb 0 Address 0 Regionname 0 Longtitude 0 Lattitude 0 Landsize 0 Bedroom2 0 Postcode 0 Distance 0 Date 0 SellerG 0 Method 0 Price 0 Type 0 Rooms 0 Propertycount 0 dtype: int64
In [38]:
data.loc[:,['Bathroom','Car']].describe()
Out[38]:
Bathroom | Car | |
---|---|---|
count | 13602.000000 | 13540.000000 |
mean | 1.534921 | 1.610414 |
std | 0.691834 | 0.962244 |
min | 0.000000 | 0.000000 |
25% | 1.000000 | 1.000000 |
50% | 1.000000 | 2.000000 |
75% | 2.000000 | 2.000000 |
max | 8.000000 | 10.000000 |
In [39]:
data.groupby('Car').Car.count().sort_values(ascending=False)
Out[39]:
Car 2.0 5606 1.0 5515 0.0 1026 3.0 748 4.0 507 5.0 63 6.0 54 8.0 9 7.0 8 10.0 3 9.0 1 Name: Car, dtype: int64
In [40]:
data['Car'].astype('category').value_counts()
Out[40]:
2.0 5606 1.0 5515 0.0 1026 3.0 748 4.0 507 5.0 63 6.0 54 8.0 9 7.0 8 10.0 3 9.0 1 Name: Car, dtype: int64
In [41]:
data.loc[:,'Car'].fillna(2, inplace = True)
c:\users\karan\appdata\local\programs\python\python38\lib\site-packages\pandas\core\series.py:4433: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy return super().fillna(
In [42]:
data.isnull().sum().sort_values(ascending=False)/len(data)
Out[42]:
Bathroom 0.000074 Suburb 0.000000 Address 0.000000 Regionname 0.000000 Longtitude 0.000000 Lattitude 0.000000 Landsize 0.000000 Car 0.000000 Bedroom2 0.000000 Postcode 0.000000 Distance 0.000000 Date 0.000000 SellerG 0.000000 Method 0.000000 Price 0.000000 Type 0.000000 Rooms 0.000000 Propertycount 0.000000 dtype: float64
Now check for Bathroom¶
In [43]:
data.groupby('Bathroom').Bathroom.count().sort_values(ascending=False)
Out[43]:
Bathroom 1.0 7517 2.0 4987 3.0 921 4.0 106 0.0 34 5.0 28 6.0 5 7.0 2 8.0 2 Name: Bathroom, dtype: int64
In [44]:
data.loc[:,'Bathroom'].fillna(1, inplace = True)
In [45]:
data.isnull().sum().sort_values(ascending=False)/len(data)
Out[45]:
Suburb 0.0 Address 0.0 Regionname 0.0 Longtitude 0.0 Lattitude 0.0 Landsize 0.0 Car 0.0 Bathroom 0.0 Bedroom2 0.0 Postcode 0.0 Distance 0.0 Date 0.0 SellerG 0.0 Method 0.0 Price 0.0 Type 0.0 Rooms 0.0 Propertycount 0.0 dtype: float64
Check No of Row Retain¶
In [46]:
len(data)/len(OrgData)*100
Out[46]:
57.769567248481756
Data Analysis¶
In [47]:
data
Out[47]:
Suburb | Address | Rooms | Type | Price | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | Lattitude | Longtitude | Regionname | Propertycount | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Abbotsford | 85 Turner St | 2 | h | 1480000.0 | S | Biggin | 03-12-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | -37.79960 | 144.99840 | Northern Metropolitan | 4019.0 |
2 | Abbotsford | 25 Bloomburg St | 2 | h | 1035000.0 | S | Biggin | 04-02-2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | -37.80790 | 144.99340 | Northern Metropolitan | 4019.0 |
4 | Abbotsford | 5 Charles St | 3 | h | 1465000.0 | SP | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | -37.80930 | 144.99440 | Northern Metropolitan | 4019.0 |
5 | Abbotsford | 40 Federation La | 3 | h | 850000.0 | PI | Biggin | 04-03-2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 94.0 | -37.79690 | 144.99690 | Northern Metropolitan | 4019.0 |
6 | Abbotsford | 55a Park St | 4 | h | 1600000.0 | VB | Nelson | 04-06-2016 | 2.5 | 3067.0 | 3.0 | 1.0 | 2.0 | 120.0 | -37.80720 | 144.99410 | Northern Metropolitan | 4019.0 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
23537 | Wheelers Hill | 12 Strada Cr | 4 | h | 1245000.0 | S | Barry | 26-08-2017 | 16.7 | 3150.0 | 4.0 | 2.0 | 2.0 | 652.0 | -37.90562 | 145.16761 | South-Eastern Metropolitan | 7392.0 |
23538 | Williamstown | 77 Merrett Dr | 3 | h | 1031000.0 | SP | Williams | 26-08-2017 | 6.8 | 3016.0 | 3.0 | 2.0 | 2.0 | 333.0 | -37.85927 | 144.87904 | Western Metropolitan | 6380.0 |
23539 | Williamstown | 83 Power St | 3 | h | 1170000.0 | S | Raine | 26-08-2017 | 6.8 | 3016.0 | 3.0 | 2.0 | 4.0 | 436.0 | -37.85274 | 144.88738 | Western Metropolitan | 6380.0 |
23541 | Williamstown | 96 Verdon St | 4 | h | 2500000.0 | PI | Sweeney | 26-08-2017 | 6.8 | 3016.0 | 4.0 | 1.0 | 5.0 | 866.0 | -37.85908 | 144.89299 | Western Metropolitan | 6380.0 |
23545 | Yarraville | 6 Agnes St | 4 | h | 1285000.0 | SP | Village | 26-08-2017 | 6.3 | 3013.0 | 4.0 | 1.0 | 1.0 | 362.0 | -37.81188 | 144.88449 | Western Metropolitan | 6543.0 |
13603 rows × 18 columns
In [51]:
print("hello")
hello
In [52]:
%%javascript
IPython.notebook.save_notebook()
In [53]:
import os
os.system('jupyter nbconvert --to html yourNotebook.ipynb')
Out[53]:
0
In [55]:
!jupyter nbconvert yourNotebook.ipynb --to html --template basic --output output.html
[NbConvertApp] Converting notebook yourNotebook.ipynb to html [NbConvertApp] Writing 129203 bytes to output.html
In [ ]: