The folks over at Polygraph have decided there's "all rhetoric and no data" in the debate around diversity in Hollywood, so they've taken 2,000 screenplays and put them through the magic data machine to produce lots of graphs with big blue bars and smaller red bars. Have a guess which is which.
Polygraph arranged the data according to how many lines each character has. What exactly is a "line"? It's not 100% clear, though Polygraph does talk about it more on its site. It's an important point, especially in the Disney section, where the movies feature songs, which naturally consist of long stretches of words from one character.
The post is worth sharing not just because of its scope, but because of its readability. Scrolling down the page will give you more easily readable data that you can put your mouse over to see exactly which films are on the extreme ends of things. No surprises which side of the spectrum The Craft is on, or Foxcatcher. It's also really easy to sort the data by different categories.
It's worth peeking at the methodology. As Polygraph states, a lot can change between the screenplay stage and the final product. Lines are cut or added, new actors are brought in, and genders are changed, just to name a few things. The study has taken what info it can from the screenplays and IMDb pages to determine gender and age. It openly admits individual pieces of data will be flawed, but maintains that the results are in the right ballpark. It's also worth noting that the creators absolutely do have an agenda, but it appears they've been open, honest, and competent with all the data.
Also check out the interesting data on age, which shows what seem like some pretty clear-cut trends.
Image via Shutterstock