Compare Total Reviews of 2016 and 2017
Learn to compare data in pandas and PySpark.
Comparison in Pandas
To compare the total reviews of 2016 and 2017, we first group the data by review year and month and count the number of asin values in each group, which gives the total reviews per month. Then we filter the aggregated DataFrame to create two separate DataFrames, one for 2016 and one for 2017. Finally, we join the two DataFrames on the month to create a wide DataFrame, where the total reviews for each month of 2016 and 2017 sit side by side.
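The steps above can be sketched in pandas as follows. This is a minimal illustration, not the lesson's own code: only the asin column name comes from the text, while the sample rows, the review_date column, and the names review_year, review_month, and total_reviews are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; only the `asin` column name comes from the lesson text.
reviews = pd.DataFrame({
    "asin": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "review_date": pd.to_datetime([
        "2016-01-05", "2016-01-20", "2016-02-11",
        "2017-01-09", "2017-02-02", "2017-02-25",
    ]),
})

# Step 1: derive the review year and month (assumed column names).
reviews["review_year"] = reviews["review_date"].dt.year
reviews["review_month"] = reviews["review_date"].dt.month

# Step 2: count the asin values for each year/month combination.
monthly = (
    reviews.groupby(["review_year", "review_month"], as_index=False)
    .agg(total_reviews=("asin", "count"))
)

# Step 3: filter the aggregated DataFrame into one DataFrame per year.
cols = ["review_month", "total_reviews"]
reviews_2016 = monthly.loc[monthly["review_year"] == 2016, cols]
reviews_2017 = monthly.loc[monthly["review_year"] == 2017, cols]

# Step 4: join on month so both years' totals sit side by side.
comparison = (
    reviews_2016.merge(
        reviews_2017,
        on="review_month",
        how="outer",  # keep months that appear in only one of the years
        suffixes=("_2016", "_2017"),
    )
    .sort_values("review_month")
    .reset_index(drop=True)
)

print(comparison)
```

An outer join is used here so that a month with reviews in only one of the two years still appears in the result, with a missing value for the other year. An inner join would be a reasonable alternative if only months present in both years matter.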