ANOVA Tables

Let’s get a brief overview of ANOVA tables.

R packages

We’ll use the following R packages in this chapter:

  • ggplot2
  • SemiPar
  • SMPracticals
  • arm

Introduction to ANOVA tables

A “back-to-the-basics” aspect of modern statistics is its strong preference for the estimation-based approach, which works with estimates and intervals. This approach is relatively straightforward to apply in simple situations like Darwin’s maize and the Janka timber hardness data. However, as designs become more complicated, the number of comparisons can become overwhelming, and it may become difficult to work with estimates and intervals alone.

For example, as designs get more elaborate, the number of potential pairwise t-tests increases rapidly. The risk of false positive results also increases; the more tests we make, the more likely we are to get a false positive. It’s often useful to employ a complementary approach that can check if there’s support for any differences among means before we dive into the detail of pairwise comparisons.
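To give a rough sense of how quickly this grows, here is a minimal sketch in base R. It counts the pairwise comparisons among k group means and the chance of at least one false positive if every test is run at the 5% level. The group counts and the assumption that the tests are independent are illustrative simplifications, not part of the original example; real pairwise tests on shared data are correlated, so treat the error rates as approximate.

```r
# Number of pairwise comparisons among k group means: "k choose 2"
k <- c(2, 3, 5, 10)            # illustrative numbers of groups
n_tests <- choose(k, 2)

# Chance of at least one false positive across all tests, assuming
# (as a simplification) independent tests at a 5% significance level
alpha <- 0.05
familywise <- 1 - (1 - alpha)^n_tests

data.frame(groups = k, tests = n_tests,
           familywise_error = round(familywise, 3))
```

With a single comparison, the familywise rate equals the per-test 5% level; it climbs as more groups, and therefore more pairs, are added.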

Ronald Fisher invented such an approach with the analysis of variance and the ANOVA tables that are used to present it. Although ANOVA tables tend to be associated primarily with analyses that have categorical explanatory variables (factors), these tables can be generalized to analyses that include continuous explanatory variables. (It may be more useful to think of them as linear model analysis tables.) While ANOVA is very useful with complex datasets, we’ll first apply it to Darwin’s maize data for instructional purposes.

ANOVA tables and Darwin’s maize

The following line of code shows our simple linear model for Darwin’s maize data that was developed in the Linear Models chapter:

ls1 <- lm(height ~ type, data = darwin)
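As a hedged sketch, the ANOVA table for this model can be requested with base R’s anova() function. The snippet assumes that the darwin data frame is the one shipped with the SMPracticals package and that it has the height and type columns used in the model formula:

```r
# Assumes the SMPracticals package is installed and that its `darwin`
# data frame has the `height` and `type` columns used in the model
library(SMPracticals)
data(darwin)

# Same model as above, refitted so that this snippet runs on its own
ls1 <- lm(height ~ type, data = darwin)

# One row for the explanatory variable (type) and one for the residuals
tab <- anova(ls1)
tab
```

The table reports the degrees of freedom, sums of squares, mean squares, F value, and p-value for each row.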

Since we only have one explanatory variable in this example, the model is called a one-way ANOVA. The general strategy of ANOVA is as follows:

  • Quantify the overall variability in the dataset.
  • Divide the overall variability into the variability between and within groups (self-pollinated and cross-pollinated).
  • Calculate a signal-to-noise ratio. The greater our ratio of signal to noise, the more confident we are that we’ve detected a real effect.
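The three steps above can be sketched by hand in base R. The toy data frame below is hypothetical (made-up numbers, not Darwin’s measurements) and is there only to make the arithmetic easy to follow; the last line checks the hand calculation against the table that anova() produces for the equivalent linear model.

```r
# Hypothetical toy data: two groups of four made-up measurements
toy <- data.frame(
  group = factor(rep(c("A", "B"), each = 4)),
  y     = c(10, 12, 11, 13, 14, 16, 15, 17)
)

# 1. Overall variability: total sum of squares around the grand mean
grand_mean <- mean(toy$y)
ss_total   <- sum((toy$y - grand_mean)^2)

# 2. Split it into between-group (signal) and within-group (noise) parts
group_means <- ave(toy$y, toy$group)   # each row's own group mean
ss_between  <- sum((group_means - grand_mean)^2)
ss_within   <- sum((toy$y - group_means)^2)

# 3. Signal-to-noise ratio: each sum of squares is divided by its
#    degrees of freedom to give a mean square, then the ratio is taken
df_between <- nlevels(toy$group) - 1
df_within  <- nrow(toy) - nlevels(toy$group)
f_ratio    <- (ss_between / df_between) / (ss_within / df_within)

# Compare with the table built from the equivalent linear model
anova(lm(y ~ group, data = toy))
```

The between-group and within-group sums of squares add up to the total, and f_ratio matches the F value in the table.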
