Summary
Get a summary of the essential points that were covered in this chapter.
Let’s summarize what we have learned in this chapter.
- Error distribution: We saw that data from a normal (Gaussian) error distribution are spread roughly evenly above and below the mean. Such data can take any real value, positive or negative.
- Normality tests: We learned that data can be tested for normality with a normality test, such as the Shapiro-Wilk test. Still, it’s often better to use a function like fitdistr() to estimate which predefined error distribution the data most closely resemble.
- Log-transformation: We saw that log-transformation is a common technique for bringing many kinds of biological data closer to normality.
- Non-parametric tests are not enough: We discussed that many older non-parametric statistical tests can analyze non-normally distributed data, but they have fallen out of favor with the rise of generalized linear models.
- T-tests: We learned that Student’s t-test is designed for analyzing the difference in means of normally distributed data between two categories. The t-test was designed for small sample sizes; with large sample sizes, it is equivalent to a linear model.
- Linear models: We also discussed that linear models are among the most commonly used statistical methods. They allow us to analyze a normally distributed response variable with any combination of categorical or continuous predictor variables.
- Useful functions: We also saw many useful functions for all linear models, such as summary(), Anova() from the car package, and plot(). Post-hoc tests can be accomplished using the emmeans package.
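
As a rough illustration of how these points fit together, the following R sketch walks through them on simulated data. The variable names (mass, group, dat), the sample sizes, and the distribution parameters are all made up for this example and are not taken from the chapter. It assumes the MASS package is available, and it only calls car and emmeans if they happen to be installed.

```r
# Simulated data: an assumption for illustration only.
set.seed(42)
group <- factor(rep(c("control", "treated"), each = 30))
# Log-normal response: right-skewed and strictly positive.
mass <- rlnorm(60, meanlog = ifelse(group == "treated", 1.4, 1.0), sdlog = 0.5)
dat <- data.frame(group = group, mass = mass, log_mass = log(mass))

# Normality test on one group, before and after log-transformation.
control <- dat$mass[dat$group == "control"]
shapiro.test(control)
shapiro.test(log(control))

# Compare candidate error distributions by AIC (lower is better).
library(MASS)  # provides fitdistr()
fit_normal    <- fitdistr(control, "normal")
fit_lognormal <- fitdistr(control, "lognormal")
AIC(fit_normal, fit_lognormal)

# Student's t-test (equal variances) and the matching linear model.
tt  <- t.test(log_mass ~ group, data = dat, var.equal = TRUE)
mod <- lm(log_mass ~ group, data = dat)
summary(mod)

# The p-value for the group effect is the same in both analyses.
p_lm <- summary(mod)$coefficients["grouptreated", "Pr(>|t|)"]
same_p <- isTRUE(all.equal(tt$p.value, p_lm))
same_p

# Diagnostic plots: residuals vs. fitted values and a normal Q-Q plot.
plot(mod, which = 1:2)

# Optional packages: run only if installed.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(mod))
}
if (requireNamespace("emmeans", quietly = TRUE)) {
  print(emmeans::emmeans(mod, pairwise ~ group))
}
```

With only two groups, the post-hoc comparison from emmeans repeats the single contrast already shown in summary(); it becomes more useful once a categorical predictor has three or more levels.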