# Some (brief) notes on cinemetrics II

*If anyone is getting confused as to why my comments are appearing and disappearing at the end of posts, it’s simply because there is no one quite so indecisive as the author …*

## Power Laws and cinemetrics

In an earlier post I wrote about power laws and the distribution of mean relative frequencies (MRFs) of shot scales.

I think that MRFs have a useful role to play in statistically analysing film style – they can tell us if a group of films is dominated by a single scale (Lang’s German films) or if there is a more evenly spread usage of scales (the Hitchcock films and Lang in Hollywood). Looking at MRFs will not tell us which scale is dominant, but that is easy to find out.

However, having looked at this area in more depth, I would have to say that I do not think that there is much of a future in the power laws approach. Power regression does provide a good model – but it appears that it is not consistently the best. Exponential regression fits more consistently, and logarithmic regression is occasionally better too (though less reliably). For example, for five Swedish films produced between 1917 and 1920 (see Table 1 below), *R²* (power) = 0.9817, while *R²* (exponential) = 0.9897. Not a big difference, but enough to say that a power law probably is not the best explanation for the trend in this data.

This suggests that power laws are unlikely to be a good explanation for the distribution of MRFs. Of course, the problem now is that I’m also sceptical about the use of exponential regression – given a quick enough decline in distribution of MRFs, an exponential regression line will give a very similar result to a linear curve and so it will not be possible to clearly distinguish between them. Overall, then, I think it is probably best just to use the linear model and to see how far MRFs deviate from this. Essentially, this means stating whether the distribution is linear or not (irrespective of what it might actually be) and looking for patterns in this statistic only. This is, of course, a much quicker and simpler way to proceed than comparing two or more regression models for each group of films, and so it has that advantage as well.
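The comparison of regression models described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are my own, the example values are hypothetical and are not taken from Barry Salt's database, and each model is fitted by ordinary least squares after the usual linearising transformation (as most spreadsheet trendlines do), which may not match the procedure behind the figures quoted here.

```python
import math

def ols(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b, r_squared)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b, (sxy * sxy) / (sxx * syy)

def compare_models(mrfs):
    """R^2 of four candidate models for MRFs listed from rank 1 downwards."""
    ranks = list(range(1, len(mrfs) + 1))
    log_ranks = [math.log(r) for r in ranks]
    log_mrfs = [math.log(m) for m in mrfs]  # requires every MRF > 0
    return {
        "linear": ols(ranks, mrfs)[2],           # y = a*x + b
        "logarithmic": ols(log_ranks, mrfs)[2],  # y = a*ln(x) + b
        "exponential": ols(ranks, log_mrfs)[2],  # ln(y) = a*x + b
        "power": ols(log_ranks, log_mrfs)[2],    # ln(y) = a*ln(x) + b
    }

# Hypothetical MRFs for seven ranked shot scales (illustrative only).
example = [0.40, 0.24, 0.15, 0.09, 0.06, 0.04, 0.02]
for name, r2 in compare_models(example).items():
    print(f"{name:12s} R^2 = {r2:.4f}")
```

Note that the exponential and power fits report *R²* for ln(*y*), while the linear and logarithmic fits report it for *y* itself, so small differences between the two pairs should be read with caution.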

Power laws were worth a look, but I don’t see a future in them (or at least I see only an unnecessarily confusing one).

Table 1 presents some results for the distribution of MRFs for some groups of films when fitted to a linear curve using the model *y = ax + b*, where *a* is the slope and *b* is the intercept. As before, this data is from Barry Salt’s database at the Cinemetrics website. Two things stand out: (1) early silent films are poorly fitted by the linear model; and (2) the groups of films that do fit the linear model have similar values for the slope and intercept of the regression curve. In fact, the results for the first five groups in Table 1 can all be adequately modelled by *a* = -0.036 and *b* = 0.29. This is presented in Figure 1, where the red line is the regression model. Why this should be the case is a mystery – why are Thorold Dickinson’s films similar to Josef von Sternberg’s, Fritz Lang’s, and Alfred Hitchcock’s across time and different countries? We can conclude that shot scales are unlikely to be an indicator of authorship (although, as before, a larger study is needed to confirm this) [1]. Perhaps these regression coefficients crop up wherever continuity editing is used (which would not be the case before 1920 in Europe – hence the values for Lang in Germany and for the Swedish films), or wherever Hollywood has been a determining factor in the development of film style, as it has been in Europe (hence the British films’ similarity to Hollywood). This raises the question of how shot scales are used in non-Western cinema: I would love to see the distribution of MRFs for Japanese films of the 1930s. Nonetheless, it is a startling empirical regularity, and hopefully soon I will have some more systematic results to present.
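To make the common model concrete, the short sketch below evaluates *y* = -0.036*x* + 0.29 at each rank and measures how far a set of observed MRFs strays from it. It assumes that *x* is the rank of a shot scale (1 for the most frequent) and that there are seven scales; neither is stated above, and the function names are hypothetical.

```python
A = -0.036  # slope quoted in the text
B = 0.29    # intercept quoted in the text

def predicted_mrf(rank, a=A, b=B):
    """Value of the linear model y = a*x + b at a given rank."""
    return a * rank + b

def max_abs_deviation(observed, a=A, b=B):
    """Largest absolute gap between observed MRFs (rank 1 first) and the line."""
    return max(
        abs(y - predicted_mrf(rank, a, b))
        for rank, y in enumerate(observed, start=1)
    )

# Assumption: seven ranked shot scales.
ranks = range(1, 8)
fitted = [predicted_mrf(r) for r in ranks]
for r, y in zip(ranks, fitted):
    print(f"rank {r}: {y:.3f}")
print(f"sum = {sum(fitted):.3f}")  # prints "sum = 1.022"
```

Under the seven-scale assumption the fitted values sum to roughly 1.02, which is at least consistent with relative frequencies that should add up to one. A group of films could then be described by `max_abs_deviation` (or any similar measure of distance from the line) rather than by a choice among competing regression models.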

**Table 1** Linear regression of the distribution of the mean relative frequencies of shot scales in some motion pictures

**Figure 1** Linear regression of *y* = –0.036*x* + 0.29 on five groups of films

## PPCC Data

Barry Salt requested the results for the probability plot correlation coefficient (PPCC) for some 40 films I have looked at. These are presented in Table 2, while Table 3 includes a reel-by-reel breakdown for *Man with a Movie Camera* (Dziga Vertov, 1928) for the same statistic.
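For readers unfamiliar with the statistic, here is a minimal sketch of one common way to compute a PPCC for lognormality: take the logarithm of each shot length, sort the values, and correlate them with the normal quantiles of Filliben's approximate order-statistic medians. This is an assumption about the method, not a record of how the values in Tables 2 and 3 were produced; the function names and the sample shot lengths are hypothetical.

```python
import math
from statistics import NormalDist

def filliben_medians(n):
    """Filliben's approximate uniform order-statistic medians for a sample of size n."""
    if n < 3:
        raise ValueError("need at least three observations")
    medians = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    medians[-1] = 0.5 ** (1.0 / n)
    medians[0] = 1.0 - medians[-1]
    return medians

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def lognormal_ppcc(shot_lengths):
    """PPCC of log shot lengths against standard normal quantiles."""
    logs = sorted(math.log(s) for s in shot_lengths)  # lengths must be > 0
    normal = NormalDist()
    quantiles = [normal.inv_cdf(p) for p in filliben_medians(len(logs))]
    return pearson_r(quantiles, logs)

# Hypothetical shot lengths in seconds (illustrative only).
sample = [1.2, 1.8, 2.1, 2.9, 3.3, 4.0, 4.2, 5.6, 7.9, 12.4]
print(f"PPCC = {lognormal_ppcc(sample):.4f}")
```

A value close to 1 indicates that the log shot lengths fall near a straight line on a normal probability plot. Whether a given value counts as acceptably lognormal depends on a critical value for the sample size, which this sketch does not supply.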

**Table 2** PPCC data for 40 films from the Cinemetrics database

**Table 3** PPCC data for *Man with a Movie Camera* (1928)

## Notes

- To date I have found no empirical evidence to support auteurism at all, while I have repeatedly found evidence of group styles, whether those groups be defined by nation, studio, or era.

Posted on June 25, 2009, in Cinemetrics, Film Studies, Film Style. 7 Comments.

Thanks. As I read your results, there are 22 films out of 40 that meet your criterion for lognormality, and several others that come very close indeed. This seems a good reason to me to assume that the distribution of shot lengths is Lognormal for ASLs under 20 seconds, when investigating them. Also, there is a theory of how the Lognormal distribution arises, which gives a lead for investigating detailed causal factors, as I have mentioned in my books.

Scale of Shot distributions are very various indeed, particularly in the silent period, which may be the reason that they resisted your approach. I like to “see” the actual shape of distributions, for what it might suggest to me.

In the case of Hitchcock, this aspect of his style also changed when he went to the US, as can be readily seen from the histograms in “Film Style & Technology”. And of course Lang reverted to his German Scale of Shot for “Der Tiger von Eschnapur” at the end of his life.

Auteur style analysis covers more than just Scale of Shot, and I discuss examples in more detail in my books.

Oh, and a minor point. Was the version of the shot length record for “Verboten!” mine, which is frame-accurate, or that of John C?

There is a whole question about whether the approximations in the records obtained using the Cinemetrics programme affect the testing for Lognormality, and indeed other things.

The version of Verboten! referenced here is the one submitted by John C. I have found that a good recorder for cinemetrics can provide a reasonably accurate count (I added a comment about Chaplin to this effect on the Cinemetrics website under the discussion on making cinemetrics automatic). Unless the submission is wildly inaccurate, there is no reason to assume that sampling error would be so large as to change the shape of the distribution.

Interestingly, if you divide this version of Verboten! into roughly 30-minute sections, you find that for 1s–1768.2s (218 shots), median shot length = 2.9s; for 1768.2s–3837.8s (94 shots), median shot length = 4.2s; and for 3837.8s–5172.0s (190 shots), median shot length = 2.9s.

I don’t think the sampling error would change the shape of a distribution that much, but it might stop it qualifying as Lognormal under your very restrictive criterion. John C.’s recording of “Verboten” was the least accurate of the three I checked in my piece on “The Question of Accuracy” on the Cinemetrics website. Some of the entries in the mass student class submissions are very inaccurate, I judge from the ASLs quoted for films I have previously counted.

Incidentally, I have just noticed that I had already said on page 219 of “Film Style and Technology” that “In the case of Lang, there is a clear transition from his European films.. etc.”, and of course this is obvious from the histograms in that chapter, as I already remarked.

Got the title of my note on accuracy in Cinemetrics wrong in the previous comment. It’s called “Requiring Split-second Timing”, though it is easy to find on the Cinemetrics site anyway.
