
Movie Genres Data Analysis Using Pandas
​
​​
​
Objective:
This project aims to analyze trends in the movie industry by focusing on genres, profitability, and key metrics such as budget, revenue and popularity.
The goal is to uncover insights related to movie performance and validate hypotheses about the relationship between various factors.
​
Tools Used:
​
Python, Pandas, Matplotlib, Seaborn
​
​
Process:
​
The project begins by loading the dataset and cleaning the data to ensure accuracy.
Duplicate rows are removed to guarantee that each movie is represented only once. Missing values in the genres column are dropped to maintain the integrity of genre-specific analyses.
A new column, profit, is created to capture the difference between revenue and budget, providing a clear measure of each movie's financial success.
To reduce unnecessary complexity, only the essential columns (such as popularity, budget, revenue, runtime, and vote_average) are kept.
Finally, the genres column is split so that a movie with several genres gets one row per genre, which makes detailed per-genre analysis possible.
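
The cleaning steps above could be sketched in Pandas roughly as follows. This is a minimal illustration, not the project's actual code: the tiny inline table, the original_title column, the helper name clean_movies, and the "|" separator inside genres are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical miniature dataset standing in for the real file.
# The "|" separator inside "genres" is an assumption about its format.
raw = pd.DataFrame({
    "original_title": ["A", "A", "B", "C"],
    "genres": ["Action|Adventure", "Action|Adventure", "Drama", None],
    "popularity": [9.1, 9.1, 2.3, 1.0],
    "budget": [100, 100, 10, 5],
    "revenue": [300, 300, 25, 1],
    "runtime": [120, 120, 95, 80],
    "vote_average": [7.0, 7.0, 7.8, 5.5],
})


def clean_movies(df, sep="|"):
    """Hypothetical helper mirroring the cleaning steps described above."""
    df = df.drop_duplicates()                # each movie appears only once
    df = df.dropna(subset=["genres"])        # drop rows with no genre
    df = df.assign(profit=df["revenue"] - df["budget"])
    keep = ["genres", "popularity", "budget", "revenue",
            "profit", "runtime", "vote_average"]
    df = df[keep]                            # keep only the essential columns
    # Turn "Action|Adventure" into a list, then emit one row per genre.
    df = df.assign(genres=df["genres"].str.split(sep)).explode("genres")
    return df.reset_index(drop=True)


movies = clean_movies(raw)
print(movies)
```

After the split, a movie with two genres contributes two rows, so per-movie totals should be computed before this step rather than after it.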

Research Questions & Hypothesis Testing:
​
The analysis explores three questions: which genres are most common, how each genre performs financially in terms of average budget, revenue, and profit, and how popular each genre is based on audience ratings and votes. Hypotheses about the relationships between variables such as budget, popularity, and profitability are also tested to see which factors are associated with successful movies.
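
As a rough sketch of how these questions might be answered, the example below counts genres, averages the financial metrics per genre, and computes one correlation. The inline table and the variable names are hypothetical, and Pearson correlation is used here only as one plausible way to check a budget and popularity hypothesis; the project may have used a different test.

```python
import pandas as pd

# Hypothetical table with one row per (movie, genre) pair,
# i.e. the shape produced after splitting the genres column.
movies = pd.DataFrame({
    "genres": ["Action", "Adventure", "Drama", "Drama"],
    "budget": [100, 100, 10, 30],
    "revenue": [300, 300, 25, 45],
    "profit": [200, 200, 15, 15],
    "popularity": [9.0, 9.0, 2.0, 4.0],
})

# Most common genres: number of rows per genre.
genre_counts = movies["genres"].value_counts()

# Average budget, revenue, profit, and popularity per genre,
# ordered from most to least profitable.
genre_means = (
    movies.groupby("genres")[["budget", "revenue", "profit", "popularity"]]
    .mean()
    .sort_values("profit", ascending=False)
)

# Pearson correlation between budget and popularity.
# Note: on a per-genre table, multi-genre movies are counted more than
# once, so a real test is better run on the table before the split.
budget_popularity_corr = movies["budget"].corr(movies["popularity"])

print(genre_counts)
print(genre_means)
print(budget_popularity_corr)
```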
​
Key Results:
​
The analysis shows that Drama and Comedy are the most commonly produced genres, followed by Thriller and Action, which reflects their broad appeal.
Adventure and Fantasy have the highest average budgets and revenues, often tied to large-scale blockbusters, while Action and Science-Fiction also perform well financially. Adventure is the most profitable genre, followed by Fantasy and Animation.
In terms of popularity, Adventure leads, with Science-Fiction and Fantasy also being highly favored by audiences.
Drama and Crime have the most highly-rated movies, while Documentaries, despite smaller production scales, are also highly praised.
​
​