Solution: Visualizing Datasets
Let's solve the challenge to check your understanding of visualizing datasets.
The solution below creates a bar chart and a scatterplot after building the required subsets and dropping the missing (NULL) values.
Solution
    import pandas as pd
    import plotly.graph_objects as go
    import plotly.express as px

    ## Required datasets
    poverty = pd.read_csv('../data/poverty.csv', low_memory=False)
    ## Code from here
    gini = 'GINI index (World Bank estimate)'
    year = 2016
    ## first plot
    df =\
    poverty[poverty['year']==year].sort_values(gini).dropna(subset=[gini])
    fig = go.Figure()
    fig = px.bar(df,
                 x='Country Name',
                 y=gini,
                 title=' - '.join([gini, str(year)]))
    fig.write_image("/usercode/output/abc.png", width=2000, height=500)
    fig.show()
    ## Second plot
    perc_pov_cols = poverty.filter(regex='Poverty gap').columns
    perc_pov_55 = perc_pov_cols[2]
    country = 'United States'
    mode = "markers"
    df = poverty[poverty['Country Name']==country][['year', perc_pov_55]].dropna()
    fig = go.Figure()
    fig.add_scatter(x=df['year'],
                    y=df[perc_pov_55],
                    text=df[perc_pov_55],
                    mode=mode)
    fig.layout.title = str(perc_pov_55) + ' in the ' + country + ' by Year '
    fig.show()
Explanation
- Lines 8–9: We create a gini variable and set it to 'GINI index (World Bank estimate)', and we create a year variable and set it to 2016.
- Lines 11–12: We create the subset df from the poverty DataFrame by keeping the rows for the chosen year, sorting them by the gini column, and then dropping the NULL values.
...
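The filter, sort, and drop chain from lines 11–12 can be tried without the real dataset. The sketch below is only an illustration: the four-row poverty DataFrame and its values are hypothetical stand-ins for poverty.csv, not data from the lesson.

```python
import pandas as pd

gini = 'GINI index (World Bank estimate)'
year = 2016

# Hypothetical miniature stand-in for poverty.csv; the values are made up
# purely to illustrate the filter -> sort -> dropna chain.
poverty = pd.DataFrame({
    'Country Name': ['A', 'B', 'C', 'D'],
    'year': [2016, 2016, 2016, 2015],
    gini: [40.0, None, 25.0, 30.0],
})

# Keep only the rows for the chosen year, order them by the GINI column,
# and drop the rows where the GINI value is missing.
df = poverty[poverty['year'] == year].sort_values(gini).dropna(subset=[gini])

print(df['Country Name'].tolist())  # ['C', 'A']
```

Country D is excluded by the year filter and country B by dropna. Because sort_values places missing values last by default, dropping them before or after the sort gives the same rows in the same order.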