ML-Rainfall prediction using linear regression

Rainfall prediction

Rainfall prediction is the application of scientific knowledge and technological resources to estimate the amount of rain, measured in inches, that will fall at a particular location over a particular period of time. Rainfall prediction is vital for planning power production and crop irrigation, and for educating people about weather dangers.

Linear regression

Linear regression is a method that describes the relationship between a dependent variable and a set of independent variables. With a single input, the equation of the line is $y = wx + b$, where $y$ is the output, $w$ is the slope of the line, $x$ is the input, and $b$ is the y-intercept or bias. For rainfall, the method uses atmospheric variables such as cloud cover, humidity, wind, and average temperature as inputs. Once the line is fitted, an estimate of rainfall can be computed for any set of inputs, because the regression relies on the correlation observed in past data between the atmospheric variables and rainfall. With several inputs, the equation becomes: $y = b + w_{1}x_{1} + w_{2}x_{2} + \dots + w_{n}x_{n}$.
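To make the multi-variable equation concrete, the sketch below evaluates y = b + w1x1 + w2x2 for a single observation. The function name predict and the weight, bias, and input values are hypothetical, chosen only to show the arithmetic; they are not fitted to the data in this article.

```python
import numpy as np

def predict(features, weights, bias):
    """Return y = b + w1*x1 + ... + wn*xn for one observation."""
    return bias + float(np.dot(weights, features))

# Hypothetical values, used only to illustrate the arithmetic
weights = np.array([-1.0, 0.5])    # w1 (temperature), w2 (humidity)
bias = 2.0                         # b
features = np.array([10.0, 60.0])  # x1 = temperature, x2 = humidity

rain_estimate = predict(features, weights, bias)
print(rain_estimate)  # 2.0 + (-1.0 * 10.0) + (0.5 * 60.0) = 22.0
```

Each weight scales its own input, and the bias shifts the whole estimate up or down, so adding another atmospheric variable only means adding one more weight and one more input.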

Example

The following data was extracted from a Kaggle dataset:

Rainfall    Average Temp    Humidity
0           16.15           48.5
3.6         20.45           58
3.6         14.4            75.5
39.8        11.85           59
2.8         11.55           58.5
0           12.15           63.5
0.2         12.65           55
0           14.15           61
0           15.6            59
16.2        17.15           57
0           17.9            54
0.2         19              44.5
0           21.5            45.5
0           20.65           43.5

We will plot a line that fits our scatter plot (with minimum error) and shows the relationship between each atmospheric variable (temperature and humidity) and rainfall. The correlation coefficient r between each variable and rainfall will also be printed. Since the data was extracted and modified manually, it is entered by hand as Python lists. The necessary libraries (numpy, matplotlib, and scipy) are imported; they help to plot our diagrams and make technical computations.

#importing libraries
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
#create arrays that show the x (average temperature) and y (rainfall) values
x = [16.15, 20.45, 14.4, 11.85, 11.55, 12.15, 12.65, 14.15, 15.6, 17.15, 17.9, 19, 21.5, 20.65]
y = [0, 3.6, 3.6, 39.8, 2.8, 0, 0.2, 0, 0, 16.2, 0, 0.2, 0, 0]
#run the regression to get its key values: slope, intercept, correlation r, p-value and standard error
slope, intercept, r, p, std_err = stats.linregress(x, y)
#create a function that makes use of the slope and intercept to provide new values
def myfunc(x):
    return slope * x + intercept
#to create a new array with new values on the y-axis, run each value of the x array through the function
mymodel = list(map(myfunc, x))
#draw the scatter plot
plt.scatter(x, y)
#draw the line of regression
plt.plot(x, mymodel)
#label the x axis and y axis
plt.xlabel('Temperature')
plt.ylabel('Rainfall')
#save the diagram before showing it (the output folder must already exist)
plt.savefig('output/legend.png')
#show the diagram
plt.show()
#view the correlation between the x and y variables
print(r)
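As a cross-check on what stats.linregress returns for the temperature data, the slope, intercept, and r can be recomputed from the closed-form least-squares formulas. This is a minimal sketch; the names sxy, sxx, syy, and the manual_ variables are chosen here for illustration and do not come from the original code.

```python
import numpy as np
from scipy import stats

# Same average temperature (x) and rainfall (y) values as in the example above
x = np.array([16.15, 20.45, 14.4, 11.85, 11.55, 12.15, 12.65,
              14.15, 15.6, 17.15, 17.9, 19, 21.5, 20.65])
y = np.array([0, 3.6, 3.6, 39.8, 2.8, 0, 0.2, 0, 0, 16.2, 0, 0.2, 0, 0])

# Sums of squares and cross-products around the means
x_mean, y_mean = x.mean(), y.mean()
sxy = np.sum((x - x_mean) * (y - y_mean))
sxx = np.sum((x - x_mean) ** 2)
syy = np.sum((y - y_mean) ** 2)

# Closed-form least-squares estimates for the line y = w*x + b
manual_slope = sxy / sxx
manual_intercept = y_mean - manual_slope * x_mean
manual_r = sxy / np.sqrt(sxx * syy)

# Compare with scipy's result
result = stats.linregress(x, y)
print(manual_slope, result.slope)
print(manual_intercept, result.intercept)
print(manual_r, result.rvalue)
```

The two sets of numbers should agree up to floating-point rounding. The sign of the slope gives the direction of the relationship, and the size of r indicates how closely the points follow the line.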
#importing libraries
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
#create arrays that show the x (humidity) and y (rainfall) values
x = [48.5, 58, 75.5, 59, 58.5, 63.5, 55, 61, 59, 57, 54, 44.5, 45.5, 43.5]
y = [0, 3.6, 3.6, 39.8, 2.8, 0, 0.2, 0, 0, 16.2, 0, 0.2, 0, 0]
#run the regression to get its key values: slope, intercept, correlation r, p-value and standard error
slope, intercept, r, p, std_err = stats.linregress(x, y)
#create a function that makes use of the slope and intercept to provide new values
def myfunc(x):
    return slope * x + intercept
#to create a new array with new values on the y axis, run each value of the x array through the function
mymodel = list(map(myfunc, x))
#draw the scatter diagram
plt.scatter(x, y)
#draw the line of regression
plt.plot(x, mymodel)
#label the x and y axis
plt.xlabel('Humidity')
plt.ylabel('Rainfall')
#save the diagram before showing it (this reuses the file name of the temperature plot)
plt.savefig('output/legend.png')
#show the diagram
plt.show()
#view the correlation between the x and y variables
print(r)
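The two plots treat temperature and humidity one at a time. As a rough sketch of the multi-variable form of the equation from the Linear regression section, both inputs could be fitted together with NumPy's least-squares solver. This is an illustrative extension rather than part of the analysis above; the names X, coef, and predict_rainfall are assumptions, and with only 14 rows the fitted weights should be read with caution.

```python
import numpy as np

# Columns from the data table above
temperature = np.array([16.15, 20.45, 14.4, 11.85, 11.55, 12.15, 12.65,
                        14.15, 15.6, 17.15, 17.9, 19, 21.5, 20.65])
humidity = np.array([48.5, 58, 75.5, 59, 58.5, 63.5, 55,
                     61, 59, 57, 54, 44.5, 45.5, 43.5])
rainfall = np.array([0, 3.6, 3.6, 39.8, 2.8, 0, 0.2,
                     0, 0, 16.2, 0, 0.2, 0, 0])

# Design matrix: a column of ones for the bias b, then x1 and x2
X = np.column_stack([np.ones_like(temperature), temperature, humidity])

# Least-squares solution of X @ [b, w1, w2] ~= rainfall
coef, residuals, rank, singular_values = np.linalg.lstsq(X, rainfall, rcond=None)
b, w1, w2 = coef

def predict_rainfall(temp, hum):
    """Estimate rainfall as b + w1*temp + w2*hum using the fitted values."""
    return b + w1 * temp + w2 * hum

print(b, w1, w2)
# Hypothetical new observation: average temperature 15.0, humidity 60.0
print(predict_rainfall(15.0, 60.0))
```

Fitting both variables at once lets each weight describe the effect of its input while the other is held fixed, which the two separate one-variable fits cannot do.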

Conclusion

The results above suggest that precipitation tends to be lower when the temperature is high and higher when the temperature is low, since the fitted line for temperature slopes downward. Humidity shows the opposite pattern: the fitted line slopes upward, so precipitation tends to be higher with higher humidity and lower with lower humidity. The printed r values indicate how strong each relationship is. This goes to show that average temperature and average humidity can be used to estimate rainfall.