Pandas is a Python library used for working with data sets. It has functions for analyzing, cleaning, exploring, and manipulating data. The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008. It provides data structures and operations for manipulating numerical tables and time series. Pandas is free software released under the three-clause BSD license.
Introduction to Pandas¶
Pandas is a Python library for working with data sets, with functions for analyzing, cleaning, exploring, and manipulating data. It is fast and aims for both high performance and user productivity.
Why use Pandas?¶
- Pandas allows us to analyze large data sets and draw conclusions based on statistical theory.
- Pandas can clean messy datasets and make them readable.
- Pandas makes it easy to handle missing data (represented as NaN) in floating point as well as non-floating point data.
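As a rough illustration of the cleaning and missing-data points above, the following minimal sketch counts, drops, and fills missing values. The table, its column names (`temperature`, `city`), and the fill values are hypothetical and chosen only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a float column and a string column, each with one gap.
df = pd.DataFrame({
    "temperature": [21.5, np.nan, 19.0],
    "city": ["Oslo", None, "Lima"],
})

# isna() flags missing entries in both float and non-float columns.
missing_per_column = df.isna().sum()

# dropna() removes every row that contains at least one missing value.
complete_rows = df.dropna()

# fillna() substitutes a value per column; here the column mean and a placeholder string.
filled = df.fillna({"temperature": df["temperature"].mean(), "city": "unknown"})

print(missing_per_column)
print(filled)
```

Whether to drop or fill missing values depends on the analysis; both calls return a new DataFrame and leave `df` unchanged.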
Pandas deals with the following two data structures¶
- Series
- DataFrame
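To make the two structures concrete, here is a small sketch that builds one of each. A Series is a one-dimensional labeled array, and a DataFrame is a two-dimensional table whose columns are Series. The fruit names and numbers are made-up sample data.

```python
import pandas as pd

# A Series: one-dimensional values with an index of labels.
prices = pd.Series([1.2, 0.5, 2.0], index=["apple", "banana", "cherry"], name="price")

# A DataFrame: a table with labeled rows and columns.
fruit = pd.DataFrame(
    {"price": [1.2, 0.5, 2.0], "stock": [10, 25, 7]},
    index=["apple", "banana", "cherry"],
)

print(prices["banana"])              # look up a Series value by label
print(fruit.loc["cherry", "stock"])  # look up a DataFrame cell by row and column label
print(type(fruit["price"]))          # selecting one column yields a Series
```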
Pandas Installation¶
!pip install pandas
Import Pandas¶
import pandas
Pandas is usually imported under the pd alias¶
import pandas as pd
Check Python and Pandas version¶
import platform
import pandas as pd
print('Python version: ' + platform.python_version())
print('Pandas version: ' + pd.__version__)
Python version: 3.10.4
Pandas version: 1.4.2