Blog Archives

Quantitative methods and the study of film

UPDATE: By sheer coincidence the day on which I gave this talk in Glasgow was also the day on which the Korean research on movie types was published online by the Journal of Media Economics. You can find a link to the published paper here.

On 14 and 15 May I gave a talk and a workshop at the University of Glasgow on quantitative methods and the study of film. It was very gratifying to meet a group of researchers who were interested in using, were already using, or had used quantitative methods and were looking to develop this more, but were a little tentative about moving forward. One thing that occurred to me on the (long) train journeys back from Glasgow is that there are some researchers out there studying film (and other media) who are ready to kick on with developing their quantitative skills but need a push; someone to tell them that it’s OK to do this, that it’s not completely alien and that you don’t need anyone’s permission to do something that is the ‘core process’ of the discipline. In my talk I argued that a change of mindset away from ‘Film Studies’ to the ‘study of film’ is the first step to adding quantitative methods to our toolbox for understanding the cinema. The second step, it seems, should be building the confidence of researchers to sustain that momentum. Once you’ve got your toes wet you want to get in the pool – but you might need your arm bands for a few weeks.

No-one from Screen attended the talk or workshop.

The text of my talk can be accessed here:

Nick Redfern – Quantitative methods and the study of film

This talk addresses the analysis of film – its texts, its audiences, its political economy – in higher education, arguing for the abandonment of Film Studies as either a subject or a discipline and for approaching the cinema as a complex object of inquiry that demands an ecumenical methodological perspective in order that its numerous and various dimensions are fully comprehended. Though used widely by those studying the cinema beyond the narrow methodological confines of Film Studies, quantitative methods are at present underused by film scholars. To fix their place in the study of film and place the study of film in the wider world – particularly the BFI’s recent recognition of the importance of evidence-based policy making – I argue there is much to be gained from the application of quantitative methods in studying film and its audiences, and I illustrate this claim by drawing on a range of empirical studies.

This piece refers to some material available online.

The work on audiences and genre from KAIST can be accessed here: Shon, J.-H., Kim, Y.-G., & Yim, S.-J. (2012) Dissecting Movie Genres from an Audience Perspective: MTI Movie Classification Method, KAIST Business School Working Paper No. 2012-008.

Andrew McGregor Olney’s work on film genres can be accessed here: Olney, A.M. (2013) Predicting film genres with implicit ideals, Frontiers in Psychology 3: 565.

The summary of the 2011 Research and Policymaking symposium can be accessed here: Research and Policymaking for Film – A Symposium, 26 October 2011, Report of the Day.

My account of this symposium was published on this blog a week later and can be found here.

(The rhubarb crumble was also very good – and I say that as someone from Yorkshire, where all the world’s rhubarb comes from).

Age, gender, and television in the UK

UPDATE: This article has now been published – in a corrected form (see the comments below) – as Age, Gender, and Television in the United Kingdom, Journal of Popular Television 3 (1) 2015: 57-73. DOI: 10.1386/jptv.3.1.57_1. The post print of the article can be accessed here: Nick_Redfern – Age Gender and Television post print.

In December 2011 I published a post on genre preferences among UK cinema audiences, applying correspondence analysis to data from the BFI’s Opening Our Eyes report. You can read the article that was subsequently published in Participations last year here.

At the time I meant to write a follow-up piece on genre preferences for UK television audiences using data from the same source but I never quite got round to it. I have now finished this analysis and the draft article can be found in the pdf file attached to this post. I also look at how age and gender affect audiences’ perceptions of television as a medium.

We apply correspondence analysis to data produced for the BFI’s Opening Our Eyes report published in 2011 to discover how age and gender shape the experience of television for audiences in the UK. Age is an important factor in shaping how audiences perceive television, with older viewers describing the medium as ‘informative,’ ‘thought provoking,’ ‘artistic,’ ‘good for people’s self-development,’ and ‘escapist,’ while younger viewers are more likely to describe television as ‘exciting,’ ‘fashionable,’ and ‘sociable.’ Younger respondents are also more likely to describe the effect of television on people/society as negative. Variation in programme choice is highly structured in terms of age and gender, though the extent to which each of these factors determines audience choice varies greatly. Gender is the dominant factor in explaining preferences for some programme types with age a secondary factor in several cases, while age is the explanatory factor for other genres for which gender seemingly has little influence. Male audiences prefer sports, factual entertainment, and culture programmes and female audiences reality TV/talent shows, game/quiz/panel shows, chat shows, and soap operas. Older audiences prefer news, documentaries, and wildlife/nature programmes, while music shows/concerts and comedy/sitcoms are more popular with younger viewers.

The BFI report and the raw data can be accessed here.

Genre trends in five European countries, 2006 to 2010

This post is an updated and extended version of a piece I wrote last year on genre trends at the box office in five European countries, with the data cleaned up and new variables considered. Although the numbers have changed slightly from last year’s version, the original conclusions remain valid.

The pdf can be accessed here: Nick Redfern – Genre trends in five European countries


This paper analyses box office trends of the genres for the top 50 grossing films in each year from 2006 to 2010, inclusive, in five European countries – France, Germany, Italy, Spain, and the United Kingdom. We find that, generally, the frequency of genres is homogeneous and that the same types of films dominate the highest reaches of the box office charts; while the number of films unique to a country and the variation among production sources within a country is strongly associated with the distinction between international ‘technology-friendly’ films (action/adventure, fantasy/science fiction, and animated family films) and domestically produced ‘technology-unamenable’ genres (comedy, drama, crime/thriller, romance, and non-animated family films). The results suggest the concepts of national cinema and genre are closely interrelated, and that for audiences in these five European countries the decision about which films to see presents itself as a choice between genres that is often also a choice between Hollywood films and domestic films.


The romance genre at the box office in five European countries, 2006 to 2010

Assuming I have not been defeated by the rivers Wharfe, Aire, and Ouse I shall today be presenting a paper at the International Association for the Study of Popular Romance conference in York (though it is entirely possible that I’m stuck in York railway station). Below is the basic text of my presentation, from which I will have inevitably digressed enormously. The pdf file is below. This is based on the same data I used in an earlier post on genre and European box office, although it has been cleaned up a little so the results are slightly different; this does not have any impact on the conclusions.

Nick Redfern – The romance film at the box office in five European countries

We analyze the box office performance of romance films in five European countries – France, Germany, Italy, Spain, and the United Kingdom – from 2006 to 2010, inclusive, based on the top 50 grossing films in each country in each year. The results show that romance films account for only a small proportion of the films to reach the top 50 highest grossing films, and that there is no statistically significant variation in the proportion of romance films among the highest grossing films in each country. However, few romance films achieve a high box office ranking in more than one of these countries, indicating a lack of commonality across different markets with different audiences watching different romance films. Romance films achieving top 50 rankings in Germany, Spain, and the UK originate almost exclusively from outside these countries, whereas domestically produced films account for a larger proportion of romance films in France and Italy. Romance films perform consistently at the box office in three of the five countries, albeit lacking the very high grosses achieved by action/adventure, family, and fantasy/science fiction films; while this genre performs particularly poorly in Italy and Spain. Romance films emerge as a fixed part of the exhibition market in all five countries, but the variation in the films viewed, source of productions, and box office grosses indicates some important national differences.

Sequels and remakes in European cinema

In recent years there has been increasing interest in remakes and sequels in the cinema, reflected in Constantine Verevis’s Film Remakes (2006), Anat Zanger’s Film Remakes as Ritual and Disguise: From Carmen to Ripley (2006), and the essays in Andrew Horton and Stuart Y. McDougal’s Play It Again Sam: Retakes on Remakes (1998) and Jennifer Forrest and Leonard R. Koos’s Dead Ringers: The Remake in Theory and Practice (2002) on the one hand, and Carolyn Jess-Cooke’s Film Sequels: Theory and Practice from Hollywood to Bollywood (2009) and the essays in Carolyn Jess-Cooke and Constantine Verevis’s Second Takes: Critical Approaches to the Film Sequel (2010) on the other. See my earlier post on Hollywood remakes and sequels here.

In this post I look at the number of remakes and sequels to make the top 50 grossing films in France, Germany, and the UK from 2006 to 2010 (see here for a description of the sample).

To take remakes first, the most striking thing is that there are so few of them: seven in Germany, five in France, and nine in the UK. Given that the sample used here covers 250 films over a five-year period, it is clear that remakes constitute only a small proportion of the highest grossing films in these countries. Three action and adventure (AAD) films are common to each country (Casino Royale, Clash of the Titans, and The Karate Kid), while of the comedy (COM) films The Pink Panther features in both Germany and the UK. The Departed made the top 50 in all three countries, while Fun with Dick and Jane achieved a high ranking in Germany and the UK in the crime and thriller genre (CTH). Only one Fantasy and Science Fiction (FSF) remake made the top 50: the 2008 version of The Day the Earth Stood Still. The 2007 version of Hairspray made the top 50 in the UK. Interestingly, there are no remakes in the drama (DRA) genre. It is notable that these remakes are all Hollywood films. The only remake to make the top 50 in any of these countries that was not a Hollywood film was St. Trinian’s, which ranked in the UK only.
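To put a number on 'so few', the remake counts can be turned into percentages. The short Python sketch below does this, assuming the 250-film sample applies to each country separately (top 50 films in each of five years); that reading of the sample is an assumption on my part, and the variable and function names are purely illustrative.

```python
# Remake counts quoted in the text above.
remake_counts = {"Germany": 7, "France": 5, "UK": 9}

# Assumption: each country contributes 50 films per year for 5 years.
SAMPLE_SIZE = 50 * 5

def share_percent(count, total=SAMPLE_SIZE):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

remake_shares = {country: share_percent(n) for country, n in remake_counts.items()}

for country, share in remake_shares.items():
    print(f"{country}: remakes make up {share}% of the sample")
```

On this reading remakes never reach 4% of the top-grossing films in any of the three countries.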

Sequels account for 62 films in the total sample for Germany and the UK, and 54 in France. Figure 1 shows the percentage of sequels in each genre for each country. What is immediately apparent from Figure 1 is that sequels account for a large proportion of films in some genres but not others, and that the proportion of sequels in each genre is similar in each country with the exception of films classed as ‘other’ (OTH).

Figure 1 Percentage of sequels in eight genres in the top 50 grossing films from 2006 to 2010 in three European countries

Sequels account for between 43 and 52 percent of action and adventure films, and these are all Hollywood franchise films (The Dark Knight, Spider-man, Mission Impossible, Die Hard, Pirates of the Caribbean, Transformers, etc). Similarly, between 26 and 31 percent of fantasy and science fiction films are sequels from Hollywood franchises (Harry Potter, Terminator, The Chronicles of Narnia, etc). Although many of the films in these genres are Hollywood productions produced in Europe (and can thereby be classed as some sort of co-production), there are no sequels in the top 50 of these countries that can be classed as domestic productions.

Sequels also account for a substantial proportion of family films in these countries (between 26 and 34 percent). In France and Germany this includes some domestically produced films that belong to franchises (e.g. Asterix and Arthur in France and Die Wilden Kerle in Germany), though the majority of the sequels are films from Hollywood series (Garfield, Ice Age, Shrek, Toy Story, Madagascar, etc). In the UK family films that are sequels are all Hollywood films and there are no domestically produced series of family films.

Sequels account for a much smaller percentage of the other genres. Comedy film sequels in Germany and the UK are dominated by Hollywood films, but in France there are some domestically produced sequels (Camping 2, the OSS 117 series). Crime and thriller sequels are all Hollywood films (Ocean’s Thirteen, The Bourne Ultimatum) in each country. The single drama sequel in Germany and the UK is Elizabeth: The Golden Age. The sequels in the romance genre are exclusively Hollywood films (mostly Sex and the City and Twilight films), with the exception of Zweiohrküken in Germany. France has a much smaller percentage of sequels in the ‘other’ genre due to the lack of horror films and dance films. In both Germany and the UK films from the Saw and Final Destination franchises made the top 50, as did films such as Step Up 2 and Step Up 3D.

In summary, remakes comprise only a small proportion of films to make the top 50 in France, Germany, and the UK between 2006 and 2010, while genre is clearly important in understanding the frequency with which sequels occur in these countries. Though there are some remakes and sequels of European origin, the overwhelming majority of these films are from Hollywood and this accounts for the consistency of the proportion of films across the different countries. Some European films have produced sequels but many have not, and it is a key area of research on this type of film to understand why not. Another question to address is the lack of European remakes: why is it that Hollywood is able to remake both its own films as well as films from other countries while European film industries can do neither? It is perhaps the absence of European remakes and sequels that is the most interesting thing about them.

On researching genre

Last year I wrote a piece on genre trends at the US box office over the past two decades, which you can find here. I submitted this piece to the European Journal of American Culture, and having done some revisions I heard from the editor yesterday that it is likely to be published later in the year. This week I want to comment briefly on a point raised in the peer review process regarding the problems of researching genre.

In my paper I sorted films achieving high box office rankings into nine broad categories: ‘action/adventure,’ ‘comedy,’ ‘crime/thriller,’ ‘drama,’ ‘family,’ ‘fantasy/science fiction,’ ‘horror,’ ‘romance,’ and ‘other.’ The reviewer raised the following point:

… it was never clear to me, at least, on what basis the generic trends they isolated and analysed were identified, are they drawn from industry accepted classifications, or are they drawn from the authors’ observations? ‘Family,’ ‘romance,’ ‘comedy,’ ‘fantasy/science fiction’ maybe self-explanatory, but what’s the difference between action/adventure and the latter, or between it and crime/thriller? And what constitutes a “drama”? Perhaps a fuller discussion/review of the cycles of films that make up the trends they have identified would make classification less problematic …

This clearly relates to the four problems of genre definition described by Robert Stam (2000: 128-129):

  • Extension: generic labels are either too broad or too narrow;
  • Normativism: having preconceived ideas of criteria for genre membership;
  • Monolithic definitions: as if an item belonged to only one genre;
  • Biologism: a kind of essentialism in which genres are seen as evolving through a standardised life cycle.

To these we can add the ‘empiricist dilemma’ of analysing genre films to determine which genres they belong to and why only after we have first defined the genres themselves (Tudor 1974).

There are no simple definitions of genres, and trying to solve this riddle has probably driven several film scholars to despair. In fact, one of the two things that everyone agrees on when discussing genres is that no-one agrees about genre definitions. For example, in 1975 Douglas Pye warned against treating genres as Platonic forms that are ‘essentially definable’ and of approaching genre criticism ‘as in need of defining criteria’ (Pye 1975: 30, original emphasis). The same argument is made by David Bordwell 14 years later, arguing there is no fixed system of genre definitions in the film industry or film studies and that no strictly deductive set of principles is capable of explaining genre groupings (1989: 147). In 2008 Raphaëlle Moine writes of being in the ‘genre jungle’ that we are unable to clear with ‘a few machete blows as strong as they were lethal;’ and that not only are definitions of individual genres problematic, the very concept of genre itself and how it functions for producers and audiences is itself ‘neither definitive, nor perfect, nor incontestable’ (2008: 27).

If we consider film genres as categories of classification, one can only note the vitality of generic activity at an empirical level, and the impossibility of organizing cinema dogmatically into a definitive and universal typology of genres at a theoretical level. Categories exist but they are not impermeable. They may coincide at certain points, contradict one another, and are the product of different levels of differentiation or different frames of reference (Moine 2008: 24).

I think that this sums up the problems of researching genre very simply and very clearly. What it doesn’t do is help me with the reviewer’s comments. In fact, it makes them more complicated since we have to acknowledge that ‘family,’ ‘romance,’ ‘comedy,’ and ‘fantasy/science fiction’ are not as unproblematic as we might at first suspect. This is in fact obvious in the above comments: the reviewer immediately questions the distinction between ‘fantasy/science fiction’ and ‘action/adventure,’ and so there is clearly some doubt here. So what should I do?

One solution is to give up. We could simply admit that genres are undefinable, that it is pointless to even attempt any sort of genre analysis given that we cannot begin to describe the object of inquiry or to delineate any individual genres, and regard all genre scholarship as inherently flawed.

This is a ridiculous approach to take since genre categories are obviously widely used by the film industry and by audiences day-to-day in a diverse set of contexts. This is the other thing that everyone agrees upon: genre is important. And if it is important then it is definitely something that should be the subject of empirical analysis. So, again, what should I do?

The solution I arrived at was to recognise the subjective nature of genre definitions, but to also make a distinction between ‘subjective’ and ‘arbitrary.’ My inspiration in this was Bayesian probability theory. For a brief overview of Bayes’ theorem and a demonstration of its use see my earlier post on modelling narrative comprehension here. In Bayesian theory probabilities express an agent’s degree of belief in a statement: so a statement like ‘I think there is an 80% chance of rain this afternoon’ is my belief that it will rain after midday expressed as a probability [1]. The Bayesian approach assumes I am a rational agent who holds an opinion about the likelihood of an event based on the available information (the forecast is for rain, it’s the autumn, I live in the north of England, etc). As I acquire new information I can update this probability and revise the intensity of my belief by applying Bayes’ theorem. My belief is subjective but it is not arbitrary: Pierre-Simon Laplace referred to probability in this sense being ‘only good sense reduced to calculus.’
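To make the updating step concrete, here is a minimal sketch in Python of a single application of Bayes’ theorem to the rain example. The 80% prior comes from the text; the new piece of evidence (dark clouds at lunchtime) and both likelihood values are invented purely for illustration and are not data of any kind.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior P(H | E) from Bayes' theorem for a binary hypothesis H."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1 - prior)
    return numerator / evidence

prior_rain = 0.80  # degree of belief in rain, from the example in the text

# Hypothetical likelihoods (assumptions): how often dark clouds are seen
# at lunchtime on afternoons when it does and does not go on to rain.
p_clouds_if_rain = 0.90
p_clouds_if_dry = 0.30

posterior_rain = bayes_update(prior_rain, p_clouds_if_rain, p_clouds_if_dry)
print(f"Belief in rain after seeing dark clouds: {posterior_rain:.3f}")
```

Under these made-up numbers the belief in rain rises from 0.80 to roughly 0.92. A different agent could start from a different prior, but the rule for revising it in light of the same evidence is the same, which is the sense in which the belief is subjective without being arbitrary.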

A criticism of the Bayesian approach to probability is that it is subjective and that because different agents possess different amounts of information the probabilities they express tell us nothing about the world and refer only to the opinions themselves. We cannot therefore arrive at the same conclusions about data since we start at different places. The Bayesian argument against this is based on two principles:

  1. Our beliefs are based on defensible reasoning and evidence.
  2. Through an ongoing process of analysis (accumulation of data, reviewing methodologies and assumptions, etc.) differences in prior positions are resolved and consensus is reached.

Described in these terms, Bayesian probability is itself a model of an ongoing process of scientific inquiry in which differences of opinion are acknowledged and resolved by examining and re-examining data and methods so that clear conclusions may be reached because the weight attached to the evidence comes to carry more than our prior beliefs as we learn more and more about the system we are studying.
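The way evidence comes to outweigh prior beliefs can be sketched with a toy example in Python. Two hypothetical agents hold opposed prior beliefs about some proportion, modelled here (as an assumption) with Beta priors, and both update on the same invented stream of observations in which 60% are 'successes'. None of these numbers comes from any study; the point is only to show the gap between the agents shrinking as data accumulate.

```python
def posterior_mean(prior_a, prior_b, successes, failures):
    """Mean of the Beta posterior after observing binomial data."""
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)

# Hypothetical Beta priors (assumptions): prior means of 0.2 and 0.8.
SCEPTIC = (2, 8)
BELIEVER = (8, 2)

def gap_after(n, successes_per_ten=6):
    """Difference between the two agents' posterior means after n shared observations."""
    successes = successes_per_ten * n // 10
    failures = n - successes
    low = posterior_mean(*SCEPTIC, successes, failures)
    high = posterior_mean(*BELIEVER, successes, failures)
    return abs(high - low)

for n in (0, 10, 100, 1000):
    print(f"after {n:4d} observations the two agents differ by {gap_after(n):.3f}")
```

With no data the two agents differ by 0.6; after a thousand shared observations they differ by less than 0.01, even though neither has abandoned the prior they started with.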

The Bayesian argument is I think useful for thinking about researching genre. I’m not advocating that we should start calculating probabilities for our degrees of belief in genres; only that we should use this approach to reasoning as a model for understanding how we conduct research in situations where we do not have definite categories. The statistician CR Rao put it in the following terms: uncertain knowledge + knowledge of amount of uncertainty = useful knowledge. We want useful knowledge about genre, and we can get it despite our uncertainty about genres.

My study of recent genre trends at the US box office found that a limited range of special effects-based films from the action/adventure and fantasy/science fiction genres have come to dominate the US box office at the expense of character- and narrative-driven films (crime/thriller and drama films) that were previously identified as the most popular. These results are similar to those reported by Lu et al. (2005) and Ji and Waterman (2010) who found that the five most frequently occurring genres were action, adventure, comedy, thriller, and drama; and that all but the last of these had increased in frequency at the highest box office rankings while drama films had declined from being the most frequently occurring of these genres in 1967-1971 to the least frequently occurring in the period 2002-2004. These papers used a different method of assigning films to genres and yet my results broadly corroborate their conclusions. Now the authors of these studies and I both acknowledge that genre definition is a methodological problem, but since we now have some evidence and methods to evaluate we can start to pick out the key facts:

  1. the increasing dominance of spectacle-based technology-driven genres at the US box office
  2. the decline of ‘technology-unamenable’ genres

We can also pick out some points of difference. For example, my results indicate a decline in crime/thriller films, whereas these other studies do not. This may result from different ways in which films are classified, the different time periods covered by the studies (1960s-2000s or 1991-2010), or how deeply we go into the box office rankings (top 20 or top 50), and so on. But at least we can begin to understand why these differences occur and work towards resolving them because the papers give a description of their methodologies.

Thus, despite the fact that no-one agrees on genre definitions, we can come to some consensus about the main genre trends in the US. Not because we have plucked them out of thin air, but because we have a way of dealing with the inherent uncertainty with which researchers must cope. Despite the fact that we start from different places, we can arrive at similar conclusions and thereby establish a body of useful knowledge. This does not mean that we should view these studies as being mutually supporting since relying on the principle of non-contradiction as a basis for empirical research leads to all sorts of ridiculous arguments (see here). But it does mean that as we update our knowledge and review our methods we can begin to build consensus rather than bemoaning the lack of agreement about the definitions of genres. Just as producers and audiences use genre categories every day with seemingly few problems, so do film scholars; and any conclusions we may come to are far more interesting than a recitation of the problems described above. After all, there is quite a lot of research on genre in film studies.

When conducting empirical research on genre we should bear in mind the following:

  • The genre definitions used by scholars are subjective but they are not arbitrary, being based on defensible reasoning
  • Empirical studies of genre need to be replicated to test conclusions
  • Replication of studies is required to identify where differences do in fact occur
  • Film scholars need to spend less time thinking about the problems of genre and devote more effort to accounting for the methodologies they do use so that others may properly evaluate their conclusions
  • The study of genre is an ongoing reflexive process

Genre may be a matter of opinion, but it is orderly opinion based on reasoned judgements, and the empirical study of genre is a reflexive, scientific process that arrives at definite, useful, and interesting conclusions even though we often start from different places.


  1. Eric Rohmer’s Ma nuit chez Maud/My Night at Maud’s (1969) features a discussion of Pascal’s wager in an early scene between Jean-Louis and Vidal that includes the concepts of expectation and utility (‘Mathematical hope: potential gain divided by probability’), the expression of subjective (i.e. Bayesian) probabilities, and the terms ‘hypothesis,’ ‘likely,’ ‘chance,’ ‘odds,’ ‘probability,’ and ‘infinite.’


Bordwell D 1989 Making Meaning: Inference and Rhetoric in the Interpretation of Cinema. Cambridge, MA: Harvard University Press.

Ji S and Waterman D 2010 Production Technology and Trends in Movie Content: An Empirical Study. Working Paper, Department of Telecommunications, Indiana University, Bloomington, IN.

Lu W, Waterman D, and Yan MZ 2005 Changing markets, new technologies, and violent conduct: an economic study of motion picture genre trends, The 33rd Annual Telecommunications Policy Research Conference, 23-25 September 2005, Washington, DC.

Moine R 2008 Cinema Genre, trans. Alistair Fox and Hilary Radner. Malden, MA: Blackwell.

Pye D 1975 Genre and movies, Movie 20: 29-43.

Stam R 2000 Film Theory: An Introduction. Oxford: Blackwell.

Tudor A 1974 Theories of Film. London: Secker and Warburg.

Genre and the UK box office 2011

The top 50 grossing films in 2011 at the UK box office account for a total of $1264 million (approximately £813 million at £1=$1.5547). A breakdown of the total gross by genre is given in Table 1. (For consistency, I’ve employed the same genre classifications that I used in earlier posts).

The highest grossing film by quite some distance was Harry Potter and the Deathly Hallows (Part 2) with $117.2 million (~£75.4 million), easily outstripping The King’s Speech ($75.0 million/£48.2 million).
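The sterling figures quoted here follow from dividing each dollar gross by the stated exchange rate. A minimal Python sketch of that conversion is given below; the rate and the dollar grosses are those quoted above, while the function and variable names are just illustrative.

```python
USD_PER_GBP = 1.5547  # exchange rate quoted in the text (£1 = $1.5547)

def usd_to_gbp(usd_millions, rate=USD_PER_GBP):
    """Convert a gross in millions of US dollars to millions of pounds (1 d.p.)."""
    return round(usd_millions / rate, 1)

# Dollar grosses (in millions) quoted in the text.
grosses_usd = {
    "Top 50 total": 1264.0,
    "Harry Potter and the Deathly Hallows (Part 2)": 117.2,
    "The King's Speech": 75.0,
}

grosses_gbp = {title: usd_to_gbp(usd) for title, usd in grosses_usd.items()}

for title, gbp in grosses_gbp.items():
    print(f"{title}: £{gbp} million")
```

The same function can be applied to any of the other dollar grosses in this post to recover an approximate sterling value at the same rate.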

Table 1 Top 50 UK grossing films 2011 by genre (Source: Box Office Mojo)

Two of the top 10 films were action/adventure films: Pirates of the Caribbean: On Stranger Tides (3D) (4th) and Transformers 3 (7th). The performance of the third Transformers film is comparable to the first two (give or take an adjustment for inflation): Transformers grossed $49.9 million in 2007 and Revenge of the Fallen grossed $44.4 million in 2009 (these figures are in 2010 US dollars), while T3 grossed $45.1 million (in 2011 dollars). In contrast, Pirates of the Caribbean: On Stranger Tides grossed only $54.2 million (2011 dollars) compared to $106.8 million for Dead Man’s Chest in 2006 and $85.6 million for At World’s End in 2007 (both in 2010 dollars). Thus the Transformers franchise has maintained its level from film to film, whereas the gap between the 2007 and 2011 films and the loss of key cast members (Orlando Bloom, Keira Knightley) for On Stranger Tides has seen the Pirates franchise shed a substantial part of its value in the UK market.

2011 was comedy’s year. Comedy just beat out action/adventure as the second highest grossing genre and accounted for seven films in the top 50, but of these four made it into the top 10: The Inbetweeners Movie, The Hangover Part II, Bridesmaids, and Johnny English Reborn. The median gross for 54 comedy films to make the top 50 in the UK from 2006 to 2010, inclusive, is $12.84 million (in 2010 dollars); but the median gross last year (in 2011 dollars) was $32.0 million. The Inbetweeners Movie is the highest grossing comedy film in the UK in the past six years with $71.2 million/£45.8 million, easily beating Borat into second place (which grossed $49.8 million in 2006, in 2010 dollars). No matter how you look at it, that’s a big success for a movie based on a British TV show. Paul (21st), Horrible Bosses (27th), and Bad Teacher (37th) were less impressive, but comedy was the big story at the UK box office in 2011.

The most frequently occurring genre is family films, accounting for 15 films, which have not performed outstandingly well. In fact this genre did not perform even close to family films in recent years, when Toy Story 3, Shrek 3, Ice Age: Dawn of the Dinosaurs, and Up have been amongst the very highest grossing films in the UK. The highest grossing family film in 2011 was Tangled, which was only the ninth highest grossing film of the year. Eight of the family films grossed less than $15.54 million or £10 million. Why might this be the case? Well, if we look at the family films that made it into the top fifty (Table 2) we note that many of them are animated films while very few are live action films. It may be that the family genre suffered from a lack of variety with a glut of animation and too few other types of family films to attract a diverse audience. There is no Night at the Museum film in this year’s top 50, and Mr. Popper’s Penguins is too close to Happy Feet to make the difference worth noting. Horrid Henry seems to have performed particularly poorly. It is also interesting that The Lion King outperformed many new films, but then it would not be unfair to state that this year’s animated offerings were not as good as those of recent years. Certainly, there is no Ponyo or Up amongst those films listed in Table 2.

Table 2 Rank and total grosses of family films in the UK 2011 (Source: Box Office Mojo)

As noted above the top grossing film last year was a fantasy/science fiction film, but Harry Potter accounted for 65% of the total gross for this genre in the top 50. Rise of the Planet of the Apes performed respectably as the 11th highest grossing film, but the other three films (Super 8, Source Code, and The Immortals) all feature in the bottom 10 films. In fact, Source Code and The Immortals were ranked 49th and 50th respectively.

The King’s Speech accounts for 51% of the gross of drama films, with the three other films performing modestly. Black Swan ranked 15th, grossing $26.0 million (£16.7 million), but I can’t decide if this is a good performance for a film about ballet or a disappointment for an Academy Award winning film. 127 Hours (39th) and The Fighter (47th) also performed poorly despite Oscar nominations and awards.

Beyond these five genres, there is very little to note about the others.

The majority of the gross for romance films is accounted for by The Twilight Saga: Breaking Dawn Part 1, the 6th highest grossing film of the year. This Twilight film achieved similar rankings to Eclipse (2010 – 6th) and New Moon (2009 – 7th), and achieved similar grosses. The other romance films – One Day (38th) and Friends with Benefits (46th) – aren’t worth commenting on.

Only two horror films made the top 50: Paranormal Activity 3 (26th) and Insidious (42nd). Measured in 2010 dollars, Paranormal Activity grossed $16.3 million in 2009 and Paranormal Activity 2 grossed $17.5 million in 2010. The third instalment in the series grossed $17.0 million in the UK (in 2011 dollars), and so while this series is not troubling the upper reaches of the box office charts it is consistent in the level of its gross from film to film and year to year.

The one film classed as ‘other’ is the Coen Brothers’ version of True Grit, which ranked 35th.

Crime/thriller films are barely worth commenting on. The highest grossing film in this genre (if you don’t consider it to be an action/adventure movie) is Sherlock Holmes: A Game of Shadows (18th) and this film was only released on 16 December 2011. Tinker, Tailor, Soldier, Spy (23rd), Limitless (33rd), and Unknown (44th) did very little business. The television schedules in the UK are full to overflowing with crime dramas – Lewis (and the upcoming Endeavour), Midsomer Murders, New Tricks, Sherlock, and so on, along with masses of imports from America (CSI, Criminal Minds, NCIS, The Closer, etc) and Europe (The Killing, Wallander, Romanzo criminale) – so there is clearly an audience for producers to tap into. But no one makes crime movies anymore. Weird.

Correspondence analysis of genre preferences in UK film audiences

UPDATE: this piece has now been published as Correspondence Analysis of Genre Preferences in UK Film Audiences, Participations 9 (2) 2012: 45-55. The article can be downloaded here.

UPDATE: I’ve now done a similar analysis for genre preferences in UK television audiences using data from the same BFI study, which you can find on this blog post.

Genre provides viewers with a first reference point for a film, and functions as a ‘quasi-search’ characteristic through which audiences assess product traits without having seen a particular film (Hennig-Thurau et al. 2001). In a market place comprising a large number of unique cultural products with no unambiguous reference brand, audiences form experience-based norms at the aggregate level of genre rather than the specific level of individual films (Desai & Basuroy 2005). Consequently, genre is the means by which the film industry alerts viewers that pleasures similar to those previously enjoyed are available without compromising the need for novel products; and empirical research has shown that genre is an important factor – if not the most important – in audiences’ decision making about which film to see (Litman 1983, Da Silva 1998).

Understanding audience preferences for certain types of films is therefore a priority for film producers and distributors as this will be a factor in deciding which films to produce and how to market them effectively. In this short paper we analyze the genre preferences of UK film audiences, applying correspondence analysis to data produced by the British Film Institute’s research into the cultural contribution of film in the UK. Specifically, we focus on how genre preferences vary with gender and age when treated as a single composite variable.

The BFI dataset

In July 2011, the British Film Institute (BFI) published a report, Opening Our Eyes (Northern Alliance/Ipsos Media CT 2011), examining the cultural contribution of film in the UK [1]. This report analysed how audiences consume films and their attitudes to the impact of film, based on a series of qualitative ‘paired depth’ interviews and an online survey of 2036 UK adults aged between 15 and 74.

Question C.1 in the questionnaire invited respondents to express preferences for their favourite genres/types of films from a list comprising action/adventure, animation, art house/films with particular artistic value, comedy, comic book movie, classic films, documentary, drama, family film, fantasy, foreign language film, horror, musicals, romance, romantic comedy, science fiction, suspense/thriller, other, none, and don’t know. Respondents were able to select as many genres as they wished, and the data represents the number of respondents expressing a preference for that genre. Figure 7 in the final report presents the breakdown of genre preferences by gender, concluding that male audience members exhibit stronger preferences for science fiction, action/adventure, and horror films while women preferred romantic comedies, family films, romances, and musicals [2]. In an additional detailed summary made available online, genre preferences were broken down by age group. These results showed younger respondents were more likely to select comedy, horror, animation, and comic book as their favourite genres, whereas older audience members were more likely to select dramas, documentaries, and classic films.

The report did not present any findings regarding genre preferences based on the combination of the gender and the age of the subjects, and it is this interaction that is analysed here. In addition to publishing the final report the BFI has made the full set of result tables from the quantitative survey freely available to researchers online. Table 416 of this output contains the data on gender, age, and genre preferences, and is the basis for our correspondence analysis. We use nineteen of the categories listed above, with ‘don’t knows’ excluded from the analysis. Table 416 lists the additional genre categories of westerns, historical, war, and gangster films, and these have been included in the category ‘other.’

Correspondence analysis

Correspondence analysis (CA) is a multivariate technique for exploring and describing frequency data defined by two or more categorical variables in a contingency table. By calculating chi-square distances between the row and column profiles in a table, CA determines the (dis)similarity of the reported frequencies. CA aims to reveal the structure inherent in the data, and does not assume an underlying probability distribution. Consequently, CA requires that all of the relevant variables are included in the analysis and that the entries in the data matrix are nonnegative, but makes no other assumptions. CA does not support hypothesis testing, and cannot be used to determine the statistical significance of relationships between variables. Here we describe the outputs of the correspondence analysis and their interpretation, and the reader can find introductions to the theory and mathematics of CA in Clausen (1998), Beh (2004), and Greenacre (2007).

The first output of the correspondence analysis is a table describing the variation in the contingency table, referred to as the inertia. The total inertia in the table is equal to the chi-square statistic divided by the total sample size: Φ² = χ²/N. This variation is decomposed into the principal inertias of a set of dimensions, each accounting for a percentage of the total inertia. For an r × c table, the maximum number of dimensions is min(r-1, c-1). The number of dimensions retained for analysis may be based on the first k dimensions to cumulatively exceed a threshold (typically 80 or 90 per cent of the total inertia); on all those individual dimensions accounting for more than the average share of 100/(min[r, c] – 1) per cent of the total inertia; or on a scree plot of the inertias, to determine where the percentage accounted for by each successive dimension begins to drop away less rapidly. It is also dependent on our ability to give a meaningful interpretation to the dimensions selected. In selecting only a subset of the available dimensions we lose some of the information contained in the original table, but in discarding some dimensions we are able to see the structure of the data more clearly for as little cost as possible.
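The quantities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the analysis in this paper was run with the ca package in R, and the function names below (total_inertia, max_dimensions, dimensions_to_retain) are hypothetical helpers of my own, not part of any library. The sketch assumes a contingency table supplied as a list of rows with no empty rows or columns, and it implements only the cumulative threshold rule for retaining dimensions.

```python
def total_inertia(table):
    """Return (chi_square, total_inertia) for a contingency table.

    `table` is a list of rows of non-negative counts; every row and
    column total is assumed to be greater than zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    # Total inertia is the chi-square statistic divided by the sample size.
    return chi_square, chi_square / n


def max_dimensions(n_rows, n_cols):
    """Maximum number of dimensions for an r x c table."""
    return min(n_rows - 1, n_cols - 1)


def dimensions_to_retain(principal_inertias, threshold=0.8):
    """Smallest k whose cumulative share of the inertia reaches `threshold`."""
    total = sum(principal_inertias)
    cumulative = 0.0
    for k, inertia in enumerate(principal_inertias, start=1):
        cumulative += inertia
        if cumulative / total >= threshold:
            return k
    return len(principal_inertias)
```

The principal inertias themselves come from the decomposition performed by the correspondence analysis software, so dimensions_to_retain expects them as an input rather than computing them.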

As a form of geometric data analysis, correspondence analysis enables the information in a contingency table to be represented as clouds of points in low-dimensional graphical displays (see Le Roux & Rouanet 2005, Greenacre 2010: 79-88). The origin of the graph represents the average row (column) profile, and by assessing the distance of points from the centroid of the clouds we describe the variation within the table and the similarity of the categories. Row (column) points that lie close to the origin are similar to the average profile of the rows (columns). Data points that lie far from the origin indicate categories for which the observed counts differ from the expected values under independence and account for a larger portion of the inertia. Points from the same data set lying close together represent rows (columns) that have similar profiles, and data points that are distant from one another indicate that the rows (columns) have dissimilar profiles. The distance between row points and column points cannot be interpreted as meaningful as it does not represent a defined quantity. The angle (θ) subtended at the origin defines the association between row and column points: when the angle is acute (θ < 90°) points are interpreted as positively correlated, points are negatively correlated if the angle between them is obtuse (θ > 90°), and points that subtend a right angle (θ = 90°) are not associated (Pusha et al. 2009).
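The angle rule can be illustrated with a short Python sketch. The function names and the coordinates used to try it out are hypothetical and are not taken from the analysis below; the sketch simply assumes each point is given by its coordinates on the retained dimensions and that neither point sits exactly at the origin.

```python
import math


def angle_at_origin(p, q):
    """Angle in degrees subtended at the origin by points p and q."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_q)))
    return math.degrees(math.acos(cosine))


def association(p, q, tolerance=1e-9):
    """Classify the association between a row point and a column point."""
    theta = angle_at_origin(p, q)
    if abs(theta - 90.0) <= tolerance:
        return "not associated"
    return "positive" if theta < 90.0 else "negative"
```

For example, association((1.0, 0.0), (1.0, 1.0)) gives an acute angle and so a positive association, while association((1.0, 0.0), (-1.0, 1.0)) gives an obtuse angle and a negative one.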

In addition to the graphical displays, a detailed numerical summary of the correspondence analysis is produced. The mass of a row (column) indicates the proportion accounted for by that category with respect to all the rows (columns), and is simply the row (column) total divided by the total sample size; while the inertia of a data point is its contribution to the overall inertia. The squared correlation describes that part of the variation of a data point explained by a particular dimension. The quality of a data point measures how well it is represented by the graph, and is equal to the sum of the squared correlations of the dimensions retained for the analysis. The higher the quality of a data point the better the extracted dimensions represent it, and it ranges from 0 (completely unrepresentative) to 1 (perfectly represented). The absolute contribution of a data point describes the proportion of the inertia of each dimension it explains, and is determined by both the mass of the data point and its distance from the centroid.
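Two of these summary measures are simple enough to sketch directly. The Python below is illustrative only: row_masses and quality are hypothetical helper names, and the squared correlations are assumed to be supplied by the correspondence analysis software, ordered by dimension.

```python
def row_masses(table):
    """Mass of each row: the row total divided by the total sample size."""
    n = sum(sum(row) for row in table)
    return [sum(row) / n for row in table]


def quality(squared_correlations, retained=2):
    """Quality of a point: sum of its squared correlations over the
    first `retained` dimensions kept for the analysis."""
    return sum(squared_correlations[:retained])
```

Column masses follow in the same way by transposing the table first, for example row_masses(list(zip(*table))).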

Gender, age, and genre preferences

Table 416 of the BFI’s results output presents counts of genre preferences sorted by gender, by age, and by gender and age. As our interest lies in the variation of genre preferences (19 categories) among UK audiences based on both gender and age we use only this last part of the table, treating ‘gender-age’ as an interactively coded variable with 10 categories combining all the levels of the variables gender (2 categories) and age (5 categories) (Greenacre 2007: 121-128). We apply correspondence analysis to this table using the ca package (version 0.33; see Nenadić & Greenacre 2007) in R (version 2.13.0).

Table 1 presents the 10 × 19 cross-tabulation of ‘gender-age’ with genre. The chi-square statistic for this table is 1312.28 (N = 13086, df = 162, p < 0.01), and we therefore conclude that there is a statistically significant association between gender-age and genre preferences for UK film audiences. However, there is only a weak correlation between ‘gender-age’ and genre preference, with just 10% of the variation in Table 1 due to dependence: Φ² = χ²/N = 1312.28/13086 = 0.1003.
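The degrees of freedom and the total inertia reported here can be checked by hand. The short Python sketch below uses only the figures stated in the text (a 10 × 19 table, χ² = 1312.28, N = 13086); the variable names are my own.

```python
rows, cols = 10, 19               # gender-age categories by genre categories
chi_square = 1312.28              # chi-square statistic reported for Table 1
n = 13086                         # total number of expressed preferences

df = (rows - 1) * (cols - 1)      # degrees of freedom for an r x c table
phi_squared = chi_square / n      # total inertia

print(df, round(phi_squared, 4))  # 162 0.1003
```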

Table 1 Cross-tabulation of interactively-coded gender-age variable with genre. Cell counts represent the number of respondents in each group expressing a preference for a genre. Source: BFI/Northern Alliance/Ipsos Media CT.

Table 2 shows the principal inertias, percentages, and cumulative percentage of each dimension, with a scree plot of the inertias. The first two dimensions account for 90.6 per cent of the inertia and the scree plot flattens out after the second dimension. Consequently, these dimensions were retained for analysis and the remainder were discarded.

Table 2 Principal inertias of the correspondence analysis applied to Table 1 explained by dimensions with scree plot

Figure 1 is the resulting symmetric map based on these two dimensions. Tables 3a and 3b present the detailed numerical summary of the results for the rows (gender-age categories) and columns (genre categories), respectively.

Figure 1 Symmetric correspondence analysis map of interactively coded ‘gender-age’ cross-tabulated with genre for UK film audiences

Table 3a Detailed numerical summary of correspondence analysis by gender-age.

Table 3b Detailed numerical summary of correspondence analysis by genre.

From Table 3a and Figure 1 we see a clear horizontal separation between the male and female respondents, with points arranged vertically by age group from youngest to oldest within each gender category. Consequently, we interpret the principal axes in terms of the rows of Table 1, with the first dimension understood as gender and the second dimension as age. As gender accounts for 64.3 per cent of the total inertia compared to 26.3 per cent for age, this factor is dominant and explains the major part of the variation in Table 1. The quality for the gender-age groups is high (see Table 3a), and these factors are well represented in two dimensions. The points for all gender-age groups are distant from the origin, indicating that no group is close to the average profile in either dimension and that all the groups contribute to the overall inertia.

From Figure 1 we see the distance between the points representing male audience members grows greater as the age of the respondents increases. The points for males aged 15-24 and 25-34 are very close, indicating they have similar row profiles and, therefore, similar genre preferences. The two middle-aged groups are distant from both the youngest and the oldest, while also being remote from one another. Males over the age of 55 are remote from the other age groups, indicating that their genre preferences are substantially different from those of younger male audience members. The points representing female respondents show a similar pattern, with the middle-aged groups distant from both youngest and oldest and with over 55s remote from younger female audience members in their preferences. The greatest contrasts in genre preferences are observed when taking gender and age together: females over 55 are most different from males aged 15-24, and males aged 55+ are most different from young women.

A key difference between audience groups is how the importance of the factors of gender and age varies in explaining their genre preferences. Age becomes increasingly important in the representation of the points for male audience categories. The squared correlations for the three youngest male groups are greatest for dimension 1, indicating that their gender is more important in explaining their preferences than age; for males aged 45-54 gender is still the dominant component, albeit to a lesser extent than for younger cohorts, and the influence of age becomes more apparent in the raised squared correlation for dimension 2; while for males aged 55+ age is the dominant factor. This pattern is not evident for female respondents, and looking at the squared correlations in Table 3a we see the opposite pattern to male audience members. The squared correlations for women aged 35-44, 45-54, and 55+ are dominated by the dimension of gender, whereas age is the main factor for the two youngest groups. However, it should be noted that for the females aged 15-24, gender does contribute substantially to the representation of this point.

Although the correlation between gender-age and genre preference is low, it is clear from these results that the variation within Table 1 is highly structured in terms of the gender and age of the respondents. Describing the preferences of UK cinemagoers therefore requires taking both these factors into account and failure to do so leads to much useful information being obscured. The headline percentages reported by the BFI give only a partial picture of the genre preference of UK film audiences that fails to adequately capture that structure.

Turning to the genre categories themselves we see that the quality of these points is high (see Table 3b), indicating they are well represented in two dimensions and that gender and age are good predictors of the genre preferences of UK audiences. However, we note the quality of the representation for foreign (0.41) and art-house (0.14) films by these two dimensions is very low. This indicates gender and age do not explain variation in audience preferences for these types of films, and that some other factor should be considered. Based on other data available in the BFI’s results output, level of educational attainment is a better predictor of audience preference for these types of films: Table 20 of the results output cross-tabulates level of education and type of film most often watched, with 68 per cent of the respondents who selected foreign language films educated to degree level. These two categories are typically applied to films to distinguish them from mainstream cinema (i.e. Hollywood films), and may not function as genre labels in the same context as terms such as ‘comedy,’ ‘drama,’ etc.

The quality of the categories ‘other’ and ‘none’ is also much lower than that of the mainstream genres, but as these points represent indistinct categories we do not discuss them further.

Gender is the most important factor in determining genre preference, with the cloud of points representing genres orientated along the first principal axis. Family films, romance, and romantic comedies are all associated with female audiences. In fact, 83 per cent of respondents expressing a preference for romance films were female, and the corresponding figures are also high for family films (64%) and romantic comedies (72%). Musicals are also strongly associated with female audiences (71%), but this category is dominated by over 55s: over a quarter of respondents expressing a preference for this genre are in this age group. Drama also lies along the same direction as females over 55, indicating that this group is associated with this genre, but the distance from the origin is smaller, reflecting a smaller effect. The proportion of males over 55 selecting drama films as a preferred genre is also greater than that of younger male viewers, but not to the same extent as their female counterparts. In fact, female viewers in each age group expressed a stronger preference for drama films than male viewers of the same age.

Genres associated with male audiences tend to be action-based and technology-driven. Of respondents expressing a preference for science fiction films, 65 per cent were male and there is little variation between age groups within this gender category. Consequently, this genre is very well represented by the first principal axis and age is not a significant factor. This is also the case for action/adventure films (58%), albeit to a lesser degree as this point lies nearer the origin. Comic book, fantasy, and horror films are strongly correlated with male audiences, and lie along the same direction as males aged 15-24 and 25-34, indicating that age is also a key factor here. The squared correlations for gender are the dominant factors for these genres, but age also contributes a substantial part of these points’ representation.

It is interesting that genres we associate with male audiences appear to have broader appeal than genres we associate with female audiences. Dividing the cells by the column totals to give the proportion of respondents in each gender-age group expressing a preference for a genre, we see that no male age group accounts for more than 4 per cent of the total for romance films compared to the very large proportion for female audiences noted above. Although female associated, family films do not show the same extreme divide as romance films, romantic comedies, and musicals. For science fiction films, the female respondents account for a total of 35 per cent of the expressed preferences for this genre, with each age group within this gender category contributing between 5 and 8 per cent of the total. This is also the case for comic book and action/adventure films. We conclude that so-called ‘female genres’ hold very little appeal to male audiences; and that while similar patterns are certainly evident for ‘male genres’ the effect is much smaller.
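Dividing the cells by the column totals is a one-line operation, sketched here in Python. The function name column_proportions is a hypothetical helper and any table passed to it in testing is made up for illustration; the BFI counts in Table 1 are not reproduced here.

```python
def column_proportions(table):
    """Divide each cell by its column total.

    Rows are gender-age groups and columns are genres, so each column of
    the result gives the share of a genre's expressed preferences that
    comes from each group. Column totals are assumed to be non-zero.
    """
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [cell / col_totals[j] for j, cell in enumerate(row)]
        for row in table
    ]
```

Each column of the result sums to one, which makes it easy to compare how evenly a genre's audience is spread across the gender-age groups.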

Three genres show high squared correlations with age. In all these cases the contribution of the first principal axis is small, and we conclude that gender is relatively unimportant in explaining audience preferences for these films. Animation is associated with under 35s, though female viewers aged 35-44 account for 13 per cent of the column total in Table 1, possibly due to selecting these films for family viewing. Documentaries and classic films are associated with over 55s. Of those expressing a preference for documentaries, 18 per cent were males over 55 and 17 per cent were females in the same age group. There is no specific trend among the other age groups, which show roughly equal levels of interest in these films. It is noticeable that the proportion selecting classic films increases with age, though this may reflect the aging of the audience rather than a clear genre preference as the new films of one’s youth become classics with time.

Two genres – comedy and suspense/thriller – lie near the origin. These points also have the lowest quality of the mainstream genres, though both are still well represented in Figure 1. Both dimensions contribute to the representation of these points, indicating that gender and age are relevant factors. Gender makes a larger contribution to comedy than age, with males under 35 slightly more likely to express a preference for this genre than males over 35 or female viewers; while for suspense/thrillers over 55s of both genders account for a slightly greater proportion of the preferences expressed for this category. However, it is their closeness to the average profile that is most informative about these points, indicating that all gender-age groups enjoy these types of films. This does not mean that they are watching the same films within these genres – it is very unlikely males aged 15-24 are watching the same comedy films, for example, as women over 55; but the BFI’s data cannot help us to explore this aspect.


This study analyzed the genre preferences of British film audiences. We have replicated the results originally presented by the BFI, and have extended them to reveal additional patterns in the data. Correspondence analysis enables us to obtain an overview of how different sections of the audience for films in the UK relate to one another, and to assess the relative importance of different factors in explaining the variation among audiences and their genre preferences. The study showed that gender is the dominant factor in determining audience preferences, with age an important but secondary factor. Most genres can be identified as either ‘male’ or ‘female’ with clear age profiles evident within gender categories, though preferences for animated films, classic movies, and documentaries are determined by age alone. These factors do not adequately explain variation among audiences when applied to categories of films that lie outside mainstream cinema.


1. The report, the research questionnaire, the detailed summary, and the full set of result tables are available at, accessed 21 November, 2011.

2. The report also presents results based on respondents’ ethnic minority but these will not be discussed here.


Beh EJ 2004 Simple correspondence analysis: a bibliographic review, International Statistical Review 72 (2): 257-284.

Clausen S-E 1998 Applied Correspondence Analysis: An Introduction. Thousand Oaks, CA: Sage.

Da Silva I 1998 Consumer selection of motion pictures, in BR Litman (ed.) The Motion Picture Mega-industry. Boston: Allyn and Bacon: 144-171.

Desai KK and Basuroy S 2005 Interactive influence of genre familiarity, star power, and critics’ reviews in the cultural goods industry: the case of motion pictures, Psychology and Marketing 22 (3): 203-223.

Greenacre M 2007 Correspondence Analysis in Practice, second edition. Boca Raton, FL: Chapman & Hall/CRC.

Greenacre M 2010 Biplots in Practice. Bilbao: Fundación BBVA.

Hennig-Thurau T, Walsh G, and Wruck O 2001 An investigation into the factors determining the success of service innovations: the case of motion pictures, Academy of Marketing Science Review 6:, accessed 24 May 2011.

Le Roux B and Rouanet H 2005 Geometric Data Analysis: From Correspondence Analysis to Structural Data Analysis. Dordrecht: Kluwer Academic Publishers.

Litman BR 1983 Predicting success of theatrical movies: an empirical study, Journal of Popular Culture 16 (4): 159-175.

Nenadić O and Greenacre M 2007 Correspondence analysis in R, with two- and three-dimensional graphics: the ca package, Journal of Statistical Software 20 (3),, accessed 6 September 2011.

Northern Alliance/Ipsos Media CT 2011 Opening Our Eyes: How Film Contributes to the Culture of the UK, July 2011.

Pusha S, Gudi R, and Noronha S 2009 Polar classification with correspondence analysis for fault isolation, Journal of Process Control 19 (4): 656-663.

Genre trends at the US box office, 1991 to 2010

UPDATE: A revised version of this article has been published as Genre trends at the US box office, 1991 to 2010, European Journal of American Culture 31 (2) 2012: 145-167. DOI: 10.1386/ejac.31.2.145_1.

To carry on the theme of some recent posts, this week I present the first draft of an analysis of the genre trends at the US box office over the past twenty years.

The pdf can be accessed here: Nick Redfern – Genre trends at the US box office


This paper examines genre trends in the top 50 grossing films at the US box office each year from 1991 to 2010, focussing on the frequency and rank of different genres, the box office gross and release patterns of films in different genres, and the release profile of Hollywood studios. The results show a narrowing of the range of genres at the highest rankings, with fantasy/science fiction movies coming to dominate at the expense of comedy, crime/thriller, and drama films. There are also marginal increases in action/adventure and family films. Analysis of the opening and total gross for each film reveals that different genres are characterized by different release patterns, and notes the importance of awards in contributing to the box office gross of drama films. With one notable exception, there is no evidence of genre specialization among film studios in contemporary Hollywood cinema.

Genre and Hollywood studios, 1991 to 2010

Historically, particular movie studios were often associated with a specific genre of filmmaking as a strategy of differentiating their product in the marketplace (e.g. MGM and musicals, Universal and horror films, Warner Bros. and gangster films), whilst also ensuring that their product was sufficiently diverse to mitigate changes in audience taste and fashion. Table 1 lists the number of films in each of nine genres released by Hollywood studios that were ranked in the top 50 films at the US box office from 1991 to 2010, inclusive. This gives a total sample of 1000 films. See here for more on the sample used. This table is quite large, and can be seen better by opening it in a new window.

Table 1 Number of films in each genre released by Hollywood studios, 1991 to 2010 (minimum of 20 releases)

It is clear from the data that there is no evidence of genre specialisation among five of the six major studios (Fox, Paramount, Sony, Universal, and Warner Bros.). Fox has released fewer crime/thriller films than the other major studios, while releasing a greater number of fantasy/science fiction films. Paramount and DreamWorks have co-released 10 family films, which explains why Paramount’s own number of releases in this category is lower than that of the other major studios. The exception among the major studios is Buena Vista, its output dominated by and dominating the genre of family films. Of the 162 films released by the studio to make it into the top 50 between 1991 and 2010, 44% were family films; and this one firm accounts for 43% of the 164 films of this genre in the sample. This result is unsurprising, since Buena Vista is the releasing arm of the Walt Disney Corporation and reflects the corporate image of that company as a producer of safe, wholesome, family entertainment (Wasko 2001). Buena Vista has also diversified its product and the frequency with which it has released other types of film is generally consistent with the other majors, although it has released fewer crime/thriller films compared to most of the other studios.

The six majors account for a total of 778 films in the sample; and many of the smaller firms listed operate within their orbit. New Line was a part of the Time-Warner media conglomerate from 1993 until it merged with Warner Bros. in 2008; and DreamWorks has entered into production and/or distribution arrangements with Paramount and Disney. The only film amongst the highest grossing in this twenty year period not connected to one of the major media conglomerates is Newmarket’s The Passion of The Christ (2004), which was produced and distributed outside the traditional Hollywood mechanisms (Maresco 2004). Looking at the smaller firms in Table 1, we see that New Line’s output is dominated by comedy films, although its most profitable films were the Lord of the Rings trilogy; while half of MGM’s limited output is accounted for by action/adventure (and four of these five films are from the James Bond franchise), comedy, and crime/thriller films. Few films from the action/adventure and fantasy/science fiction genres are produced by firms other than the major studios. The budgets for these types of films tend to be higher than those of other genres, and this level of capital investment is typically beyond the scope of all but the largest studios.


Maresco PA 2004 Mel Gibson’s The Passion of the Christ: market segmentation, mass marketing and promotion, and the internet, Journal of Religion and Popular Culture 8:

Wasko J 2001 Understanding Disney: The Manufacture of Fantasy. Malden MA: Blackwell.