Handling Overplotting and Outlier Values
Learn how to tackle overplotting and outlier values in datasets.
Suppose we now want to see the relationship between our variable, perc_pov_19, and population for the same year we have been working with. We want Population, total on the x-axis and perc_pov_19 on the y-axis.
We first create a subset of poverty in which year is equal to 2010 and is_country is True, and sort the values using Population, total:
df = (
    poverty[poverty['year'].eq(2010) & poverty['is_country']]
    .sort_values('Population, total')
)
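To make the filter-and-sort step concrete, here is a minimal sketch on a toy DataFrame. The rows and the country column are hypothetical stand-ins for the real poverty dataset; only year, is_country, and Population, total come from the lesson.

```python
import pandas as pd

# Hypothetical toy data standing in for the real poverty DataFrame.
poverty = pd.DataFrame({
    'country': ['A', 'B', 'World', 'C', 'A'],  # assumed column, for illustration only
    'year': [2010, 2010, 2010, 2010, 2011],
    'is_country': [True, True, False, True, True],
    'Population, total': [5_000_000, 1_200_000, 6_900_000_000,
                          300_000_000, 5_100_000],
})

# Boolean mask: rows from 2010 that are actual countries (not aggregates).
mask = poverty['year'].eq(2010) & poverty['is_country']

# Apply the mask, then sort ascending by population.
df = poverty[mask].sort_values('Population, total')

print(df['country'].tolist())  # ['B', 'A', 'C']
```

The aggregate row (World) and the 2011 row are dropped by the mask, and the remaining rows come back ordered from the smallest population to the largest.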
Now let’s see how to plot those two variables. Here is the code:
px.scatter(df,
           y=perc_pov_19,
           x='Population, total',
           title=' - '.join([perc_pov_19, '2010']),
           height=500)