Search⌘ K

Comparing Ratings by Gender and Plotting Them

Learn how to filter the movie lens database by gender.

We'll cover the following...

Let’s see which movies are top-rated by females in the movie lens database.

Look at the highlighted line 26 of the code widget below. We’re sorting ratings_by_gender in the 'F’ column, which stands for female:

Python 3.8
import pandas as pd
import matplotlib.pyplot as plt
def movie(nogui=False, movielenspath=''):
user_columns = ['user_id', 'age', 'gender']
users = pd.read_csv(movielenspath + 'movie_lens/u.user', sep='|', names=user_columns, usecols=range(3))
rating_columns = ['user_id', 'movie_id', 'rating']
ratings = pd.read_csv(movielenspath + 'movie_lens/u.data', sep='\t', names=rating_columns, usecols=range(3))
movie_columns = ['movie_id', 'title']
movies = pd.read_csv(movielenspath + 'movie_lens/u.item', sep='|', names=movie_columns, usecols=range(2), encoding="iso-8859-1")
# create one merged DataFrame
movie_ratings = pd.merge(movies, ratings)
movie_data = pd.merge(movie_ratings, users)
ratings_by_title = movie_data.groupby('title').size()
popular_movies = ratings_by_title.index[ratings_by_title >= 250]
ratings_by_gender = movie_data.pivot_table('rating', index='title',columns='gender')
ratings_by_gender = ratings_by_gender.loc[popular_movies]
top_movies_individuals_tagged_as_female = ratings_by_gender.sort_values(by='F', ascending=False)
print("Top rated movies by individuals_tagged_as_female \n", top_movies_individuals_tagged_as_female.head())
print("\n")
if __name__ == "__main__":
movie()

As we can see, most ...