...

/

Comparing Ratings by Gender and Plotting Them

Comparing Ratings by Gender and Plotting Them

Learn how to filter the movie lens database by gender.

We'll cover the following...

Let’s see which movies are top-rated by females in the movie lens database.

Look at the highlighted line 26 of the code widget below. We’re sorting ratings_by_gender in the 'F’ column, which stands for female:

Press + to interact
import pandas as pd
import matplotlib.pyplot as plt
def movie(nogui=False, movielenspath=''):
user_columns = ['user_id', 'age', 'gender']
users = pd.read_csv(movielenspath + 'movie_lens/u.user', sep='|', names=user_columns, usecols=range(3))
rating_columns = ['user_id', 'movie_id', 'rating']
ratings = pd.read_csv(movielenspath + 'movie_lens/u.data', sep='\t', names=rating_columns, usecols=range(3))
movie_columns = ['movie_id', 'title']
movies = pd.read_csv(movielenspath + 'movie_lens/u.item', sep='|', names=movie_columns, usecols=range(2), encoding="iso-8859-1")
# create one merged DataFrame
movie_ratings = pd.merge(movies, ratings)
movie_data = pd.merge(movie_ratings, users)
ratings_by_title = movie_data.groupby('title').size()
popular_movies = ratings_by_title.index[ratings_by_title >= 250]
ratings_by_gender = movie_data.pivot_table('rating', index='title',columns='gender')
ratings_by_gender = ratings_by_gender.loc[popular_movies]
top_movies_individuals_tagged_as_female = ratings_by_gender.sort_values(by='F', ascending=False)
print("Top rated movies by individuals_tagged_as_female \n", top_movies_individuals_tagged_as_female.head())
print("\n")
if __name__ == "__main__":
movie()

As we can see, most ...