Analysis of Variance (ANOVA) is a statistical hypothesis test used to determine whether the means of three or more groups differ significantly. It does this by comparing the variance between the groups with the variance within each group. It is often used alongside regression to separate systematic effects from random variation, which helps us identify which variables affect the data the most.
The formula for ANOVA is:
$$F = \frac{\text{MST}}{\text{MSE}}$$
where,
F = ANOVA coefficient (the F-statistic)
MST = Mean square of treatment
MSE = Mean square of the error
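To make the ratio concrete, here is a minimal sketch that computes MST, MSE, and F by hand for R's built-in PlantGrowth data and cross-checks the result against aov. The variable names (sst, sse, f_manual, and so on) are illustrative choices, not part of any API.

```r
# Manual one-way F statistic on the built-in PlantGrowth data
plant <- PlantGrowth
k <- nlevels(plant$group)            # number of groups
n <- nrow(plant)                     # total number of observations

grand_mean  <- mean(plant$weight)
group_means <- tapply(plant$weight, plant$group, mean)
group_sizes <- tapply(plant$weight, plant$group, length)

# Between-group (treatment) and within-group (error) sums of squares
sst <- sum(group_sizes * (group_means - grand_mean)^2)
sse <- sum((plant$weight - group_means[as.character(plant$group)])^2)

mst <- sst / (k - 1)                 # mean square of treatment
mse <- sse / (n - k)                 # mean square of the error
f_manual <- mst / mse

# Cross-check against the F value reported by aov
f_aov <- summary(aov(weight ~ group, data = plant))[[1]][["F value"]][1]
print(c(manual = f_manual, aov = f_aov))
```

The two numbers should agree, since aov reports this same ratio in the "F value" column of its summary table.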
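There are two main types of ANOVA: one-way ANOVA, which tests the effect of a single independent variable (factor) on a dependent variable, and two-way ANOVA, which tests the effects of two independent variables and their interaction.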
To execute ANOVA in R, we can simply use the built-in aov function.
```r
# load dataset
plant = PlantGrowth
# visualize dataset
head(plant)
# conduct ANOVA with weight as dependent and group as independent variable
owanova = aov(plant$weight ~ factor(plant$group))
# visualize results
summary(owanova)
```
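In the output, the group factor is marked with one star (*), which means its p-value lies between 0.01 and 0.05 (here, about 0.016). At the 5% significance level, we therefore reject the null hypothesis that all three groups have the same mean weight, although the evidence is weaker than a two- or three-star result would indicate. Note that the stars measure statistical significance, not correlation, and that the test alone does not tell us which groups differ.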
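The star in the printed table is only a label for a p-value range. As a rough sketch, the p-value can be pulled out of the summary object and mapped to the same significance codes that R prints in its legend. The names anova_table, p_value, and stars are illustrative.

```r
# Fit the one-way model and extract the ANOVA table as a data frame
owanova <- aov(weight ~ group, data = PlantGrowth)
anova_table <- summary(owanova)[[1]]

# p-value for the group term (first row of the table)
p_value <- anova_table[["Pr(>F)"]][1]

# Map the p-value to R's significance codes:
# 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
stars <- as.character(cut(
  p_value,
  breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
  labels = c("***", "**", "*", ".", " "),
  include.lowest = TRUE
))

print(p_value)
print(stars)
```

Working with the numeric p-value directly is usually safer than reading stars, since it can be compared against whatever significance level the analysis calls for.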
Line 2: We copy R's built-in PlantGrowth dataset into plant. It contains the dried weight of plants grown under three conditions: a control and two treatments.
Line 4: We use the head function to preview the first six rows of the dataset.
Line 6: To conduct the ANOVA, we use the aov function. A one-way ANOVA requires two variables (one dependent and one independent), separated by the ~ operator. We add weight as our dependent variable on the left and group as our independent variable on the right. The function fits the model and returns the result.
Line 8: To display the results, we use the summary function.
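The model above refers to its columns with the $ operator. A common alternative, sketched here, is to name the columns in the formula and pass the data frame through the data argument of aov. Since group is already stored as a factor in PlantGrowth, the factor() wrapper is not needed. The object names fit_dollar and fit_data are illustrative.

```r
plant <- PlantGrowth

# Style used in the walkthrough: columns referenced with $
fit_dollar <- aov(plant$weight ~ factor(plant$group))

# Alternative: bare column names plus the data argument
fit_data <- aov(weight ~ group, data = plant)

# Both fits should report the same F statistic for the group term
f_dollar <- summary(fit_dollar)[[1]][["F value"]][1]
f_data   <- summary(fit_data)[[1]][["F value"]][1]
print(c(dollar = f_dollar, data = f_data))

# group is already a factor in this dataset
print(is.factor(plant$group))
```

The data-argument form keeps the row labels in the summary short (group instead of factor(plant$group)) and makes it easier to reuse the same formula on another data frame.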
```r
# load dataset
tooth = ToothGrowth
# visualize dataset
head(tooth)
# conduct ANOVA with len as dependent and supp and dose as independent variables
twanova = aov(tooth$len ~ factor(tooth$supp)*factor(tooth$dose))
# visualize results
summary(twanova)
```
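In the output, both supplement and dose are marked with three stars (***), meaning their p-values are below 0.001: each has a highly significant effect on tooth length. Their interaction term is marked with one star (*), so it is significant at the 5% level but with weaker evidence. A significant interaction indicates that the effect of the supplement type on length depends on the dose.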
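One informal way to see what an interaction between supplement and dose looks like is to compare the mean length in each supplement and dose combination. The sketch below uses tapply for this; cell_means and gap are illustrative names, and the comparison is descriptive only, not a formal test.

```r
tooth <- ToothGrowth

# Mean tooth length for every supplement x dose combination
cell_means <- tapply(
  tooth$len,
  list(supp = tooth$supp, dose = tooth$dose),
  mean
)
print(cell_means)

# Difference between the two supplement types at each dose level
gap <- cell_means["OJ", ] - cell_means["VC", ]
print(gap)
```

If the gap between the two supplement types changes noticeably from one dose to another, the effect of the supplement is not the same at every dose, which is what the interaction term in the ANOVA table tests.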
Line 2: We copy R's built-in ToothGrowth dataset into tooth. It contains the tooth length of guinea pigs that were given Vitamin C at different doses using one of two supplement types.
Line 4: We use the head function to preview the first six rows of the dataset.
Line 6: To conduct the ANOVA, we use the aov function. A two-way ANOVA requires three variables (one dependent and two independent). We add len as our dependent variable on the left of the ~ operator, and supp and dose as our independent variables on the right, joined by *, which includes both main effects and their interaction. Both are wrapped in factor so that they are treated as categorical variables. The function fits the model and returns the result.
Line 8: To display the results, we use the summary function.
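Two details of the model formula are easy to miss, and the following sketch illustrates them under the assumption that the built-in ToothGrowth data is used unchanged. First, a * b in an R formula is shorthand for a + b + a:b. Second, dose is stored as a number, so leaving out factor() makes aov treat it as a single numeric predictor instead of three categories. The column dose_f and the fit_* names are illustrative.

```r
tooth <- ToothGrowth
tooth$dose_f <- factor(tooth$dose)   # dose as a categorical variable

# a * b expands to both main effects plus the interaction a:b
fit_star <- aov(len ~ supp * dose_f, data = tooth)
fit_long <- aov(len ~ supp + dose_f + supp:dose_f, data = tooth)

# Without factor(), dose is treated as one numeric predictor
fit_numeric <- aov(len ~ supp * dose, data = tooth)

# Degrees of freedom per row: supp, dose, interaction, residuals
df_factor  <- summary(fit_star)[[1]][["Df"]]
df_numeric <- summary(fit_numeric)[[1]][["Df"]]
print(df_factor)
print(df_numeric)
```

With three dose levels, the categorical version spends two degrees of freedom on dose and two on the interaction, while the numeric version spends one on each, so the two tables answer slightly different questions.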