Plot a Histogram Using Matplotlib
A histogram is a graphical representation of how a set of data points is distributed across user-defined ranges. Similar to a bar chart, a histogram condenses a data series into an easy-to-interpret visual by grouping the data points into ranges called bins.
To draw the examples below we will use:
numpy.random.normal(), which draws random samples from a normal (Gaussian) distribution. It has three parameters:
- loc – the mean of the distribution, i.e. where the peak of the bell curve is located.
- scale – the standard deviation, i.e. how widely the values are spread around the mean.
- size – the shape of the returned array.
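As a quick check of these three parameters, the sketch below draws samples with the same values used later in this notebook (loc=170, scale=10, size=250). The seed value is an arbitrary choice made here only so that the run is repeatable.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the run repeatable

# 250 samples centred on 170 with a standard deviation of 10
samples = np.random.normal(loc=170, scale=10, size=250)
print(samples.shape)   # (250,)
print(samples.mean())  # close to 170
print(samples.std())   # close to 10

# size can also be a tuple, which gives a multi-dimensional array
grid = np.random.normal(loc=0, scale=1, size=(2, 3))
print(grid.shape)      # (2, 3)
```

The sample mean and standard deviation only approximate loc and scale; the approximation tightens as size grows.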
The hist() function in the pyplot module of Matplotlib draws histograms. Its commonly used parameters include:
- x: the input data, given as a sequence or array.
- bins: optional; an integer (the number of equal-width bins), a sequence of bin edges, or a string naming a binning strategy.
- density: optional Boolean; if True, the histogram is normalized so that the total area of the bars is 1.
- alpha: a float between 0 and 1 that sets the transparency of the bars. The smaller the value, the more transparent the histogram.
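To see what these parameters produce, it can help to look at the values hist() returns. The sketch below uses the same ages list as the cells that follow; the choice of bins=5 and alpha=0.5 is only illustrative. The figure is closed rather than shown so the focus stays on the returned values.

```python
import matplotlib.pyplot as plt

ages = [18, 18, 21, 25, 26, 30, 32, 38, 45, 55]

fig, ax = plt.subplots()
# hist() returns the bar heights, the bin edges, and the drawn bar objects
counts, edges, patches = ax.hist(ages, bins=5, density=False,
                                 alpha=0.5, edgecolor='black')

print(counts)        # one height per bin
print(edges)         # one more edge than there are bins
print(len(patches))  # one bar per bin

plt.close(fig)  # discard the figure; only the return values are of interest here
```

With density=False the heights are plain counts, so they add up to the number of data points.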
In [4]:
import matplotlib.pyplot as plt
ages = [18, 18, 21, 25, 26, 30, 32, 38, 45, 55]
plt.hist(ages, edgecolor='black')
plt.title('Age of Respondents')
plt.xlabel('Ages')
plt.ylabel('Total Respondents')
plt.tight_layout()
plt.show()
In [36]:
import matplotlib.pyplot as plt
ages = [18, 18, 21, 25, 26, 30, 32, 38, 45, 55]
plt.hist(ages, bins=5, edgecolor='black')
plt.title('Age of Respondents')
plt.xlabel('Ages')
plt.ylabel('Total Respondents')
plt.tight_layout()
plt.show()
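When bins is an integer, the range from the smallest to the largest value is split into that many equal-width bins. The sketch below uses np.histogram, which the Matplotlib documentation names as the function hist() relies on for binning, to show the edges and counts for bins=5 without drawing anything.

```python
import numpy as np

ages = [18, 18, 21, 25, 26, 30, 32, 38, 45, 55]

# Five equal-width bins between min(ages) = 18 and max(ages) = 55
counts, edges = np.histogram(ages, bins=5)

print(edges)   # edges at 18, 25.4, 32.8, 40.2, 47.6 and 55
print(counts)  # 4, 3, 1, 1 and 1 values per bin
```

The edges rarely fall on round numbers this way, which is one reason to pass explicit bin edges instead.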
In [3]:
import matplotlib.pyplot as plt
ages = [18, 18, 21, 25, 26, 30, 32, 38, 45, 55]
# Explicit bin edges: 10-20, 20-30, 30-40, 40-50, 50-60
bins = [10, 20, 30, 40, 50, 60]
plt.hist(ages, bins=bins, edgecolor='black')
plt.title('Age of Respondents')
plt.xlabel('Ages')
plt.ylabel('Total Respondents')
plt.tight_layout()
plt.show()
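With explicit edges, a value that sits exactly on an edge is counted in the bin to its right, except for the last edge, which is included in the final bin. The sketch below checks this with np.histogram, which follows the same binning rules; the two-element test list is made up for the illustration.

```python
import numpy as np

ages = [18, 18, 21, 25, 26, 30, 32, 38, 45, 55]
bins = [10, 20, 30, 40, 50, 60]

counts, _ = np.histogram(ages, bins=bins)
print(counts)  # [2 3 3 1 1]; the age 30 lands in the 30-40 bin

# Illustrative values placed exactly on edges
edge_counts, _ = np.histogram([30, 60], bins=bins)
print(edge_counts)  # [0 0 1 0 1]; 60 is kept in the last bin
```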
Plot a Line on a Histogram
In [13]:
import matplotlib.pyplot as plt
import numpy as np
ages = [18, 18, 33, 33, 21, 25, 26, 30, 32, 38, 45, 55, 20, 34, 40, 41, 42, 35, 48]
bins = [10, 20, 30, 40, 50, 60]
median_age = np.median(ages)
plt.hist(ages, bins=bins, edgecolor='black')
# Mark the median with a vertical line
plt.axvline(median_age, color='red', linewidth=2, label='Age Median')
plt.legend()  # without this call the label above is not displayed
plt.title('Age of Respondents')
plt.xlabel('Ages')
plt.ylabel('Total Respondents')
plt.tight_layout()
plt.show()
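The position of the vertical line comes from np.median. As a minimal sketch of what that value is, the code below finds the middle element of the sorted list by hand and compares it with NumPy's result; the mean is printed too, purely for comparison.

```python
import numpy as np

ages = [18, 18, 33, 33, 21, 25, 26, 30, 32, 38, 45, 55, 20, 34, 40, 41, 42, 35, 48]

sorted_ages = sorted(ages)
middle = len(sorted_ages) // 2       # 19 values, so index 9 is the middle one
manual_median = sorted_ages[middle]

print(manual_median)    # 33
print(np.median(ages))  # 33.0
print(np.mean(ages))    # about 33.37
```

Taking the single middle element only works for an odd number of values; for an even count, np.median averages the two middle values.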
Best Fit Line for Histogram
In [29]:
import numpy as np
import matplotlib.pyplot as plt
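# 250 random samples with mean 170 and standard deviation 10
data = np.random.normal(170, 10, 250)
# density=True normalizes the bars so that their total area is 1
plt.hist(data, bins=25, density=True, alpha=0.6, color='b')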
plt.show()
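The density=True setting matters when a probability curve is drawn over the histogram, because both then share the same vertical scale. The sketch below uses np.histogram to check that the normalized bar heights times the bin widths add up to 1. The seed is an arbitrary choice for repeatability.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
data = rng.normal(170, 10, 250)

counts, edges = np.histogram(data, bins=25)
density, _ = np.histogram(data, bins=25, density=True)
widths = np.diff(edges)  # width of each bin

print(counts.sum())              # 250, the number of samples
print((density * widths).sum())  # 1.0 up to floating-point rounding
```

Each density value is the bin's count divided by the total count times the bin width.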
Plotting the Normal Distribution
In [39]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
# Plot between -30 and 30 with 0.1 steps.
x = np.arange(-30, 30, 0.1)
# Use the mean and standard deviation of the x values themselves as the curve's parameters
mean = np.mean(x)
sd = np.std(x)
plt.plot(x, norm.pdf(x, mean, sd))
plt.show()
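The curve drawn by norm.pdf follows the standard normal density formula, f(x) = exp(-(x - mean)^2 / (2 sd^2)) / (sd sqrt(2 pi)). The sketch below writes that formula out with NumPy only. The helper name normal_pdf and the parameter values mean=0 and sd=5 are assumptions made for the illustration, not part of the notebook.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z**2) / (sd * np.sqrt(2 * np.pi))

step = 0.1
x = np.arange(-30, 30, step)
pdf = normal_pdf(x, mean=0.0, sd=5.0)  # illustrative parameters

print(pdf.max())           # peak height, about 1 / (sd * sqrt(2 * pi))
print((pdf * step).sum())  # approximate area under the curve, close to 1
```

A smaller sd gives a taller, narrower bell, while the area under the curve stays at 1.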
Normal Distribution over Histogram
In [42]:
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt
data = np.random.normal(170, 10, 250)
# Fit a normal distribution to the data: mean and standard deviation
mean, std = norm.fit(data)
# Plot the histogram.
plt.hist(data, bins=25, density=True, alpha=0.6, color='b')
# Plot the PDF.
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = norm.pdf(x, mean, std)
plt.plot(x, p, 'k', linewidth=2)
title = "Fit Values: mean = {:.2f}, std = {:.2f}".format(mean, std)
plt.title(title)
plt.show()
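For a normal distribution, the maximum-likelihood estimates are the sample mean and the standard deviation computed with a divisor of n (ddof=0), and to the best of my understanding these are the two values norm.fit returns. The sketch below computes them with NumPy alone; the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed
data = rng.normal(170, 10, 250)

mean_est = data.mean()
std_est = data.std()             # ddof=0: divides by n
std_sample = data.std(ddof=1)    # divides by n - 1, shown for comparison

print(mean_est)    # close to 170
print(std_est)     # close to 10
print(std_sample)  # slightly larger than std_est
```

With 250 samples the two standard deviations differ only slightly, so either gives nearly the same curve.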