Genre Analysis

Are you ready to look into all the genre facets? Want to learn how many movie genres actually exist and which topics shine in which genre? Then come along: Barbie will take your hand and lead your first dance...

Research Questions

The first part of our analysis focuses on genre and aims to answer the following research questions:

  • Do Barbie's and Oppenheimer’s plots differ enough from one another to draw a clear line of differentiation?
  • How well do the two movies fit their respective genres?
  • Can trends be discovered when looking into the sentiment of movies, their grossing, as well as ratings (number and value)?


LDA Technique and Plot Analysis

When constructing our Barbie/Oppenheimer datasets (plots and movie data), we realized one thing: according to Google, each movie has too many genres associated with it. We therefore decided to use Latent Dirichlet Allocation (LDA) to estimate how likely each movie is to fall within the categories Drama, Comedy, Thriller, Fantasy, Adventure and War. The topics obtained were clear, with little overlap. We then trained 6 models, one per genre, each looking for 6 topics in genre-specific plots only (Adventure, Comedy, Drama, War, Fantasy and Thriller). We thought these would be interesting genres to look at, since IMDb categorizes Oppenheimer as Drama and Biography, whereas Barbie is Comedy, Fantasy and Adventure.

Here is a light and breezy example of what the extracted topics look like for the Comedy genre. If you wish to learn more about the other genres' topics and the methods used in this section's analysis, visit our More Info page.


Now to the actual Analysis

If you have seen the movies, you could have guessed it: probability distributions are really interesting. Barbie and Oppenheimer seem to have quite different distributions across the board.

In terms of Comedy: Comedy itself is a rather odd genre since it is situational rather than action-based (without context, physical comedy could pass for action). Nonetheless, Barbie and Oppenheimer seem to belong to different types of Comedy: Barbie's is more adventurous (topics 1 and 3), whereas Oppenheimer's sits in a more professional setting (work or school, with decisions to be made; topics 4 and 5).

In terms of Fantasy, we are a bit surprised: Oppenheimer seems to coincide with more sci-fi-like Fantasy (topic 2), whereas Barbie is more even across the board. This would make sense since Barbie is more of a realist Fantasy (no mythical creatures, and Barbieland itself is a reflection of reality with different societal dynamics).

Adventure is a genre that suffered from our dataset. When running LDA on Adventure movies, we first noticed that some topics put great emphasis on tropical locations, which makes sense because adventure makes us think of ruin exploration (Indiana Jones) or treasure hunting with pirates. Topic 3 in Adventure is very urban and city-based, so it makes sense that it scores high for both movies. Still, Barbie does show other aspects of adventure, especially the relational aspect (topic 4), and does not fare too badly in naval adventure (topic 2).

Blinded by the bright pink and light topics of Barbie? Don't worry, Oppenheimer will take it from here... The next genres might just be a bit darker.

Drama is of utmost interest: Barbie coincides with Drama on the relational aspect (involvement of related entities, Barbie/Ken interactions and Barbie/Barbie friendships), but Oppenheimer coincides with a Thriller- and Crime-ridden Drama, which makes sense since the movie contains a trial.

Action and War were investigated as a bonus. Barbie is quite a dynamic movie: it is action-packed, but not enough to be considered an Action movie by IMDb. Oppenheimer is a fairly still movie, but it does check the destructive and conflictual aspects of Action.

For War, both movie plots contain part of the war lexicon: Barbie draws on the marine-related vocabulary, whereas Oppenheimer draws on duty and naval offense.

Overall, the analysis was revealing: Barbie and Oppenheimer are fundamentally different. Oppenheimer specializes in Drama and does so well. Barbie, on the other hand, offers a twist on previous Comedy, Adventure and Fantasy formulas by taking aspects of each genre and innovating enough to become its own category. We believe Barbie prevailed due to its dynamism.

Maybe you share some character traits with Oppenheimer and would like to see the plots. Here you go:

Sentiment Analysis

Note that the above hints at another potential problem: genres change over time, and so does the way viewers receive those changes. Genres evolve much like society does, and what is popular now may not be popular later. We therefore decided not to consider Fantasy in our analysis and to compare only Dramas against Comedy/Adventures. The plots below show the evolution of the scores over the years:

The results are interesting: over the years, the scores were fairly similar overall. The standard deviations tell a more interesting story: Comedies in general show a limited amount of negativity, whereas Dramas are a series of ups and downs. There is much more diversity in Dramas than in Comedies.
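As a rough illustration of how the yearly means and standard deviations behind such plots could be computed, here is a small sketch using only the Python standard library. The `rows` values and the `yearly_stats` helper are hypothetical and are not the project's data or code.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (genre, release_year, compound_score) rows; compound in [-1, 1].
rows = [
    ("Comedy", 1990, 0.6), ("Comedy", 1990, 0.4),
    ("Comedy", 1991, 0.5), ("Comedy", 1991, 0.3),
    ("Drama", 1990, 0.9), ("Drama", 1990, -0.7),
    ("Drama", 1991, 0.8), ("Drama", 1991, -0.6),
]


def yearly_stats(rows):
    """Map (genre, year) to the (mean, sample standard deviation) of its scores."""
    groups = defaultdict(list)
    for genre, year, score in rows:
        groups[(genre, year)].append(score)
    return {
        # A single movie has no sample standard deviation; report 0.0 instead.
        key: (mean(scores), stdev(scores) if len(scores) > 1 else 0.0)
        for key, scores in groups.items()
    }


stats = yearly_stats(rows)
comedy_mean, comedy_std = stats[("Comedy", 1990)]
drama_mean, drama_std = stats[("Drama", 1990)]
```

In this toy data the Drama scores swing between strongly positive and strongly negative, so `drama_std` comes out larger than `comedy_std`, which is the kind of "ups and downs" pattern described above.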


The following plot shows the t-test p-values for the positive, negative and compound scores:

Scroll right through the table below to see the years in which Comedy scored higher than Drama in terms of compound score:

Year            1900  1905  1910  1915  1920  1925  1930  1935  1940  1945  1950  1955  1960  1965  1970  1975  1980  1985  1990  1995  2000  2005  2010
Comedy > Drama  False False True  True  True  True  True  False True  True  True  True  True  True  False False False False False True  True  True  True

In general, Comedy/Adventures have a significantly higher compound score than Dramas, which was predictable. The exception is the period between 1970 and 1990, when Drama seems to have been more hopeful. Even more interesting, between 1970 and 1980 Drama had a statistically significantly higher compound score than Comedy, which may be due to more positive tones in movies around the end of the Vietnam War.

The negative score itself is much less interesting: Drama almost always prevails, and in the years where it does not, the difference is not statistically significant.

For future genre-based analysis, we will only include compound scores, to get a better idea of whether there is a relation between success and compound score.

Finally, a word on Barbie and Oppenheimer. Both movies differ significantly from the mean compound score of their respective genres: Barbie is extremely positive and Oppenheimer is extremely negative. This might explain why they are so memorable and successful, yet they do not deviate from the trends of their predecessors.
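One simple way to quantify how far a single movie sits from its genre is a z-score, that is, the distance from the genre mean expressed in genre standard deviations. The sketch below illustrates the idea; it is an assumption about how such a deviation could be measured, not necessarily the test used here, and all scores in it are made up.

```python
from statistics import mean, stdev


def z_score(movie_score, genre_scores):
    """Distance of one movie from its genre mean, in genre standard deviations."""
    return (movie_score - mean(genre_scores)) / stdev(genre_scores)


# Hypothetical compound scores, not the real data.
comedy_scores = [0.2, 0.4, 0.3, 0.5, 0.1]
drama_scores = [0.1, -0.2, 0.3, -0.1, 0.0]

z_positive = z_score(0.95, comedy_scores)   # a very positive plot
z_negative = z_score(-0.95, drama_scores)   # a very negative plot
```

A |z| well above 2 indicates a movie in the far tail of its genre's distribution, in whichever direction the sign points.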


Revenue Analysis

Let's have a look at the average revenue over the years:

For the most part, revenues seem similar over the years, except in more recent years, where Comedy/Adventure has been overtaking Drama by a wide margin. This may be because many more Drama movies are released, which decreases the average revenue. We will conduct an independent t-test over the years, binning them in groups of 5, to check whether the differences are statistically significant.
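The binned comparison can be sketched as follows, assuming SciPy's `scipy.stats.ttest_ind`. The helper names, the toy revenue figures, the 0.05 threshold and the choice of Welch's variant (`equal_var=False`) are assumptions of this sketch rather than details confirmed by the analysis.

```python
from collections import defaultdict

from scipy.stats import ttest_ind

BIN_SIZE = 5  # years per bin, as described in the text


def bin_start(year, size=BIN_SIZE):
    """First year of the bin containing `year` (e.g. 1993 -> 1990)."""
    return year - year % size


def binned_ttests(comedy, drama, alpha=0.05):
    """Compare two lists of (year, revenue) pairs bin by bin.

    Returns {bin_start: (comedy_mean_is_higher, p_value, is_significant)}.
    """
    groups = {"comedy": defaultdict(list), "drama": defaultdict(list)}
    for name, rows in (("comedy", comedy), ("drama", drama)):
        for year, value in rows:
            groups[name][bin_start(year)].append(value)

    results = {}
    for start in sorted(groups["comedy"].keys() & groups["drama"].keys()):
        a, b = groups["comedy"][start], groups["drama"][start]
        if len(a) < 2 or len(b) < 2:
            continue  # too few movies to estimate a variance
        # equal_var=False selects Welch's t-test (an assumption of this sketch).
        t_stat, p_value = ttest_ind(a, b, equal_var=False)
        results[start] = (bool(t_stat > 0), float(p_value), bool(p_value < alpha))
    return results


# Hypothetical revenues (in millions), not the real data.
comedy = [(2010, 120.0), (2011, 130.0), (2012, 125.0), (2014, 135.0),
          (1990, 50.0), (1991, 52.0), (1993, 48.0)]
drama = [(2010, 60.0), (2011, 65.0), (2013, 55.0), (2014, 62.0),
         (1990, 49.0), (1992, 51.0), (1994, 50.0)]

results = binned_ttests(comedy, drama)
```

With these toy numbers, the 2010 bin shows a clear gap in favor of Comedy, while the two genres are indistinguishable in the 1990 bin. The same function applies unchanged to compound scores in place of revenues.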


Analysis of average rating and votes

Let’s have a look at the average ratings of movies and their number of votes depending on the release year:

Drama movies seem to prevail in terms of ratings, with non-overlapping confidence intervals, even though the numbers of votes overlap. Since IMDb was created in the 1990s, it makes sense for old movies to have fewer votes and unusual average ratings. For more recent ones, we see that people are more expressive of their opinions and are more likely to seek out comedy movies by watching older ones. This would essentially mean that Barbie would prevail in terms of viewership but would suffer in terms of reviews. This would also likely translate to an increase in box office revenue.
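To make "non-overlapping confidence intervals" concrete, here is a small sketch that builds an approximate 95% interval for a mean rating and checks two intervals for overlap. It assumes a normal approximation (critical value 1.96), which may differ from how the plotted intervals were obtained, and the ratings are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

Z_95 = 1.96  # normal-approximation critical value for a 95% interval


def mean_ci(values, z=Z_95):
    """Return (low, high) of an approximate 95% confidence interval for the mean."""
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return m - half_width, m + half_width


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical ratings for one release year, not the real data.
drama_ratings = [7.1, 7.4, 6.9, 7.3, 7.2, 7.0]
comedy_ratings = [6.1, 6.4, 5.9, 6.2, 6.0, 6.3]

drama_ci = mean_ci(drama_ratings)
comedy_ci = mean_ci(comedy_ratings)
separated = not intervals_overlap(drama_ci, comedy_ci)
```

With few movies per year, a t-based critical value would give wider and more honest intervals than 1.96; the normal approximation is used here only to keep the sketch short.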

Overly positive or negative movies seem to perform better than their peers, especially negative ones. Movies with strong sentiments also seem to have more variable ratings: very hit or miss. Barbie and Oppenheimer thus appear well set to be blockbusters, even at different points in time.


Does pondering over diagrams and formulas suit you as well as it does Oppenheimer? It might just be a shared character trait. Speaking of character traits, come take a peek at the next Character Analysis to see how Barbenheimer characters indicate movie success.