Centering

Let’s learn about the details of centering the variables in this lesson.

Centering the explanatory variable

We can also look at the effects of arsenic concentration and distance to the nearest safe well in the same model. Before fitting the GLM, we can make life easier by centering the explanatory variables, that is, by subtracting the mean from each one. This has advantages when the regression intercept, which is the prediction when every explanatory variable equals zero, is unhelpful or doesn’t make sense. Here, a distance of zero meters taken literally would put the new and old wells in the same place. Centering also helps when examining interactions, as we do below:

wells$c.dist100 <- wells$dist100 - mean(wells$dist100) 
wells$c.arsenic <- wells$arsenic - mean(wells$arsenic)
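
As a quick check of what centering does, here is a minimal sketch on simulated data. The `toy` data frame and its values are assumptions made for illustration and are not the real wells data; only the column names mirror the ones used above.

```r
# Simulated stand-in for the wells data (hypothetical values, not the real survey)
set.seed(1)
toy <- data.frame(dist100 = runif(200, 0, 3),
                  arsenic = runif(200, 0.5, 5))

# Center each explanatory variable by subtracting its mean
toy$c.dist100 <- toy$dist100 - mean(toy$dist100)
toy$c.arsenic <- toy$arsenic - mean(toy$arsenic)

# The centered columns now average zero (up to floating-point error)
colMeans(toy[, c("c.dist100", "c.arsenic")])

# Centering shifts the values but leaves their spread unchanged
all.equal(sd(toy$dist100), sd(toy$c.dist100))
```

A value of zero on a centered variable now means "the average household" rather than a literal zero distance or zero arsenic.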

Considering arsenic and distance at the same time introduces the possibility of an interaction. For example, for a given distance, a household may be more likely to switch to another well if the level of arsenic is higher. The model that includes this interaction is named fit.5 (Gelman, A. & Hill, J., Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2006).
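
The following is a sketch of how such an interaction model might be fitted, not necessarily the exact call used for fit.5. It runs on simulated data: the `toy` data frame, the outcome column name `switch`, and the coefficients used to generate the outcome are all assumptions made for illustration.

```r
# Simulated stand-in for the wells data: the values and the coefficients
# used to generate them are made up for illustration only
set.seed(42)
n <- 1000
toy <- data.frame(dist100 = runif(n, 0, 3),
                  arsenic = runif(n, 0.5, 5))
toy$c.dist100 <- toy$dist100 - mean(toy$dist100)
toy$c.arsenic <- toy$arsenic - mean(toy$arsenic)

# Hypothetical 0/1 outcome: 1 if the household switched wells
p <- plogis(0.3 - 0.9 * toy$c.dist100 + 0.5 * toy$c.arsenic -
            0.2 * toy$c.dist100 * toy$c.arsenic)
toy$switch <- rbinom(n, size = 1, prob = p)

# Logistic regression with both centered main effects and their interaction
fit.5 <- glm(switch ~ c.dist100 + c.arsenic + c.dist100:c.arsenic,
             family = binomial(link = "logit"), data = toy)

# The same model on the uncentered variables, for comparison
fit.raw <- glm(switch ~ dist100 + arsenic + dist100:arsenic,
               family = binomial(link = "logit"), data = toy)

coef(fit.5)
coef(fit.raw)
```

Comparing the two sets of coefficients shows what centering buys us. The interaction coefficient is the same in both fits, but the main effects differ: in the centered model, each main effect is the effect of that variable when the other is at its mean, and `plogis(coef(fit.5)[1])` is the estimated probability of switching for a household at average distance and average arsenic level.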