Anscombe’s Quartet and Failures of the Correlation Coefficient
Learn about Anscombe’s Quartet and failures of the correlation coefficient.
As we noted earlier, the correlation coefficient does not capture non-linear relationships in the data, and it is heavily influenced by outliers. Anscombe’s quartet illustrates this well. The quartet consists of four artificial datasets with 11 observations each, constructed by Francis Anscombe in 1973. Across the four datasets, the two variables produce nearly identical correlation coefficients (about 0.82). When displayed in scatter plots, however, the relationship between the two variables differs dramatically from one dataset to the next.
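To make this concrete, the following minimal sketch computes the Pearson correlation coefficient for each of the four datasets. The values are the commonly published figures for Anscombe's quartet, typed in here by hand; the helper `pearson_r` and the variable names are illustrative choices, not part of any particular library.

```python
from math import sqrt

# Anscombe's quartet as commonly published (11 observations per dataset).
# Datasets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


correlations = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}

for name, r in correlations.items():
    print(f"Dataset {name}: r = {r:.3f}")
```

All four coefficients should come out at roughly 0.82, even though dataset II follows a curve, dataset III is a near-perfect line disturbed by one outlier, and dataset IV has a single x value apart from one extreme point. Plotting each pair, for example with a scatter plot per dataset, makes these differences visible in a way the single number cannot.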