# Blog Archives

## The Mann-Whitney U Test

There is a dire need for film scholars to understand elementary statistics if they intend to use statistical methods to analyse film style. See here for the problems a lack of statistical education creates.

This post will illustrate the use of the Mann-Whitney U test using the median shot lengths of silent and sound Laurel and Hardy short films produced between 1927 and 1933 (see here). I will also look at effect sizes for interpreting the result of the test. Before proceeding, it is important to note that the Mann-Whitney U test goes by many different names (Wilcoxon Rank Sum test, Wilcoxon-Mann-Whitney, etc.) but that these are all the same test and give the same results (although they may come in a slightly different format).

## The Mann-Whitney U test

The Mann-Whitney U test is a nonparametric statistical test to determine if there is a difference between two samples by testing if one sample is **stochastically superior** to the other (Mann and Whitney 1947). By **stochastic ordering** we mean that data values from one sample (X) are more likely to assume small values than the data values from another sample (Y) and that the data values in X are less likely to assume high values than Y. If *F*x(z) ≥ *F*y(z) for all z, where *F* is the cumulative distribution function, then X is stochastically smaller than Y.

We want to find out if there is a difference between the median shot lengths of silent and sound films featuring Laurel and Hardy. The null hypothesis for our experiment is that

the two samples are stochastically equal (H_{0}: F_{silent}(z) = F_{sound}(z) for all z).

In other words, we assume that there is no difference between the samples – the median shot lengths of the silent films of Laurel and Hardy are no more likely to be greater or less than the median shot lengths of the sound films of Laurel and Hardy. (See Callaert (1999) on the nonparametric hypotheses for the comparison of two samples).

In order to perform the Mann-Whitney U test we take our two samples – the median shot lengths of the silent and sound films – and we pool them together to form a single, large sample. We then order the data values from smallest to largest and assign a rank to each value. The film with the smallest median shot length has a rank of 1.0, the film with the second smallest median shot length has a rank of 2.0, and so on. If two or more films have a median shot length with the same value, then we give each of these films the average of the tied ranks. For example, in Table 1 we see that five films have a median shot length of 3.3 seconds and that these films are 5th, 6th, 7th, 8th, and 9th in the ordered list. Adding together these ranks and dividing by the number of tied films gives us the average rank of each film: (5 + 6 + 7 + 8 + 9)/5 = 7.0.
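As a sketch, the tie-handling described above is the 'average' ranking method, which `scipy.stats.rankdata` implements directly. The values below are illustrative (not the actual film data), apart from the five-way tie at 3.3s, which mirrors the Table 1 example:

```python
import numpy as np
from scipy.stats import rankdata

# A few median shot lengths (seconds); the five-way tie at 3.3s
# mirrors the example in Table 1, the other values are illustrative.
medians = np.array([2.8, 3.0, 3.1, 3.2, 3.3, 3.3, 3.3, 3.3, 3.3, 3.6])

# 'average' assigns tied values the mean of the ranks they occupy,
# exactly as described above: (5 + 6 + 7 + 8 + 9) / 5 = 7.0
ranks = rankdata(medians, method="average")
print(ranks)  # ranks: 1, 2, 3, 4, 7, 7, 7, 7, 7, 10
```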

**Table 1** Rank-ordered median shot lengths of Laurel and Hardy silent (n = 12) and sound (n = 20) films

Notice that in Table 1, the silent films (highlighted blue) tend to be at the top of the table with lower rankings than the sound films (highlighted green), which tend to be in the bottom half of the table with the higher rankings. This is a very simple way to visualise the stochastic superiority of the sound films in relation to the silent films. If the two samples were stochastically equal then we would see more mixing between the two colours.

Now all we need to do is to calculate the U statistic. First, we add up the ranks of the silent and sound films from Table 1:

Sum of ranks of silent films = R1 = 1.0 + 4.0 + 7.0 + 7.0 + 7.0 + 10.5 + 12.0 + 13.0 + 14.0 + 18.0 +18.0 +22.5 = 134.0

Sum of ranks of sound films = R2 = 2.0 + 3.0 + 7.0 + 7.0 + 10.5 + 18.0 + 18.0 + 18.0 + 18.0 +18.0 +22.5 +24.0 + 25.0 + 26.0 + 27.0 + 28.5 + 28.5 + 30.0 + 31.0 + 32.0 = 394.0

Next, we calculate the U statistics using the formulae

U1 = R1 – n1(n1 + 1)/2 and U2 = R2 – n2(n2 + 1)/2,

where n1 and n2 are the sizes of the two samples, and R1 and R2 are the sums of ranks above. For the above data this gives us

U1 = 134.0 – (12 × 13)/2 = 56.0 and U2 = 394.0 – (20 × 21)/2 = 184.0.

We want the smaller of these two values of U, and the test statistic is, therefore, U = 56.0. (Note that U1 + U2 = n1 × n2 = 240).
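The whole calculation is a few lines of arithmetic; a minimal sketch using the rank sums reported above:

```python
# U statistics from the rank sums reported above
n1, n2 = 12, 20          # silent and sound sample sizes
R1, R2 = 134.0, 394.0    # sums of ranks from Table 1

U1 = R1 - n1 * (n1 + 1) / 2   # 134 - 78  = 56
U2 = R2 - n2 * (n2 + 1) / 2   # 394 - 210 = 184

# The test statistic is the smaller of the two, and the pair
# always sums to n1 * n2
U = min(U1, U2)
print(U, U1 + U2)  # -> 56.0 240.0
```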

To find out if this result is statistically significant we can compare it to a critical value for the two sample sizes: as n1 = 12 and n2 = 20, the critical value when α = 0.05 is 69.0. We reject the null hypothesis if the value of U we have calculated is less than the critical value, and as 56.0 is less than 69.0 we can reject the null hypothesis of stochastic equality in this case and conclude that there is a statistically significant difference between the median shot lengths of the silent films and those of the sound films. As the median shot lengths of the sound films tend to be larger than the median shot lengths of the silent films we can say that they are stochastically superior.

Alternatively, if our sample is large enough then U follows a normal distribution and we can calculate an asymptotic p-value using the following formulae:

μ = (n1 × n2)/2, σ = √(n1 × n2 × (n1 + n2 + 1)/12), and z = (U – μ)/σ.

For the above data, U = 56.0, μ = 120.0, and σ = 25.69. Therefore z = -2.49, and we can find the p-value from a standard normal distribution. The two-tailed p-value for this experiment is 0.013. (Note that ‘large enough’ is defined differently in different textbooks – some recommend using the z-transformation when both sample sizes are at least 20 whilst others are more generous and recommend that both sample sizes are at least 10).
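A sketch of the normal approximation, reproducing the figures above with `scipy.stats.norm` (no continuity correction is applied):

```python
import math
from scipy.stats import norm

n1, n2, U = 12, 20, 56.0

mu = n1 * n2 / 2                                  # 120.0
sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # sqrt(660) = 25.69
z = (U - mu) / sigma                              # -2.49

# two-tailed p-value from the standard normal distribution
p = 2 * norm.cdf(-abs(z))
print(round(z, 2), round(p, 3))  # -> -2.49 0.013
```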

If some more restrictive conditions are applied to the design of the experiment, then the Mann-Whitney U test is a test of a shift function (Y = X + Δ) for the sample medians and is an alternative to the *t*-test for the two-sample location problem. Compared to the *t*-test, the Mann-Whitney U test is slightly less efficient when the samples are large and normally distributed (ARE = 0.95), but may be substantially more efficient if the data is non-normal.

The Mann-Whitney U test should be preferred to the *t*-test for comparing the median shot lengths of two groups of films even if the samples are normal because the former is a test of stochastic superiority, while the latter is a test of a shift model and this is not an appropriate hypothesis for the design of our experiment. It simply doesn’t make sense to speak of the median shot length of a sound film in terms of a shift function as the median shot length of a silent film plus the impact of sound technology. You cannot take the median shot length of *Steamboat Bill, Jr* (X), add Δ number of seconds to it, and come up with the median shot length of *Dracula* (Y = X + Δ). Any such argument would be ridiculous, and only the null hypothesis of stochastic equality is meaningful in this context.

## The probability of superiority

A test of statistical significance is only a test of the plausibility of the model represented by the null hypothesis. As such the Mann-Whitney U test cannot tell us how important a result is. In order to interpret the meaning of the above result we need to calculate the effect size.

A simple effect size that can be quickly calculated from the Mann-Whitney U test statistic is the **probability of superiority**, ρ or PS.

Think of PS in these terms:

You have two buckets – one red and one blue. In the red bucket you have 12 red balls, and on each ball is written the name of a silent Laurel and Hardy film and its median shot length. In the blue bucket you have 20 blue balls, and on each ball is written the name of a sound Laurel and Hardy film and its median shot length. You select at random one red ball and one blue ball and note down which has the larger median shot length. Replacing the balls in their respective buckets, you draw two more balls – one from each bucket – and note down which has the larger median shot length. You repeat this process again, and again, and again.

Eventually, after a large number of repetitions, you will have an estimate of the probability with which a silent film will have a median shot length greater than that of a sound film. (On Bernoulli trials see here).

The probability of superiority can be estimated without going through the above experiment: all we need to do is to divide the U statistic we got from the Mann-Whitney test by the product of the two sample sizes – PS = U/(n1 × n2). This is equal to the probability that the median shot length of a silent film (X) is greater than the median shot length of a sound film (Y) plus half the probability that the median shot length of a silent film is equal to the median shot length of a sound film: PS = Pr[X > Y] + (0.5 × Pr[X = Y]).
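A sketch of this equivalence, using two small hypothetical samples (illustrative values, not the actual film data): PS = U/(n1 × n2) computed from the test statistic matches the direct count of Pr[X > Y] + 0.5 × Pr[X = Y].

```python
from scipy.stats import mannwhitneyu

# Hypothetical samples standing in for the two groups of median shot lengths
x = [3.1, 3.3, 3.5, 3.5, 3.8]   # "silent" medians (illustrative values)
y = [3.3, 3.9, 4.1, 4.4]        # "sound" medians (illustrative values)

# U for the first sample, then PS = U / (n1 * n2)
U, _ = mannwhitneyu(x, y, alternative="two-sided")
PS = U / (len(x) * len(y))

# The same quantity counted directly: Pr[X > Y] + 0.5 * Pr[X = Y]
wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
print(PS, wins / (len(x) * len(y)))  # -> 0.175 0.175
```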

If the median shot lengths of all the silent films were greater than the median shot lengths of all the sound films, then the probability of randomly selecting a silent film with a median shot length greater than the median shot length of a sound film is 1.0.

Conversely, if the median shot lengths of all the silent films were less than the median shot lengths of all the sound films, then the probability of randomly selecting a silent film with a median shot length greater than the median shot length of a sound film is 0.0.

If the two samples overlap one another completely, then the probability of randomly selecting a silent film with a median shot length greater than the median shot length of a sound film is equal to the probability of randomly selecting a silent film with a median shot length less than the median shot length of a sound film, and is equal to 0.5.

So if there is no effect PS = 0.5, and the further away PS is from 0.5 the larger the effect we have observed.

There are no hard and fast rules regarding what values of PS are ‘small,’ ‘medium,’ or ‘large.’ These terms need to be interpreted within the context of the experiment.

For the Laurel and Hardy data, we have U = 56.0, n1 = 12, and n2 = 20. Therefore, PS = 56/(12 × 20) = 56/240 = 0.2333.

Let us now compare the effect size for the Laurel and Hardy paper with the effect size from my study on the impact of sound in Hollywood in general (access the paper here). For the Laurel and Hardy data PS = 0.2333, whereas for the Hollywood data PS = 0.0558. In both studies I identified a statistically significant difference in the median shot lengths of silent and sound films, but it is clear that the effect size is larger in the case of the Hollywood films than for the Laurel and Hardy films.

## The Hodges-Lehmann estimator

If we have designed our experiment to understand the impact of sound technology on shot lengths in Laurel and Hardy films around a null hypothesis of stochastic equality, then it makes no sense to subtract the sample median of the silent films from the sample median of the sound films because this implies a shift function and therefore a different experimental design and a different null hypothesis.

If we are not going to test for a classical shift model, how can we estimate the impact of sound technology on the cinema in terms of a slowing in the cutting rate?

To answer this question, we turn to the **Hodges-Lehmann estimator for two samples (HLΔ)**, which is the median of all the possible differences between the values in the two samples.

In Table 2, the median shot length of each of the Laurel and Hardy silent films is subtracted from the median shot length of each of the sound films. This gives us a total set of 240 differences (n1 × n2 = 12 × 20 = 240).

**Table 2** Pairwise differences between the median shot lengths of Laurel and Hardy silent films (n = 12) and sound films (n = 20)

If we take the median of these 240 differences we have our estimate of the *typical* difference between the median shot length of a silent film and the median shot length of a sound film. Therefore, the average difference between the median shot lengths of the silent Laurel and Hardy films and the median shot lengths of the sound Laurel and Hardy films is estimated to be 0.5s (95% CI: 0.1, 1.1). (I won’t cover the calculation of the (Moses) confidence interval for the estimator HLΔ in this post, but for an explanation see here).

The sample median of the silent films is 3.5s and for the sound films it is 3.9s, and the difference between the two is 0.4s, but as the shift function is an inappropriate design for our experiment this actually tells us nothing. Now it would appear that the difference between the two sample medians and HLΔ are approximately equal: 0.4s and 0.5s, respectively. But it is important to remember that they represent different things and have different interpretations. The difference between the sample medians represents a shift function, whereas the Hodges-Lehmann estimator is the average difference between the median shot lengths.

Note that we can calculate the Mann-Whitney U test statistic directly from the above table. If we count the number of times a silent film has a median shot length greater than that of a sound film (i.e. Δ < 0, the green-highlighted numbers) and add this to half the number of times the silent and sound films have equal median shot lengths (i.e. Δ = 0, the red-highlighted numbers), then we have the Mann-Whitney U statistic that we derived above: U1 = 47 + (0.5 × 18) = 56. Equally, if we add the number of times a silent film has a median shot length less than that of a sound film (i.e. Δ > 0, the blue-highlighted numbers) to half the number of times the medians are equal, then we have U2 = 175 + (0.5 × 18) = 184.
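A sketch of the pairwise-difference construction, reusing the same small hypothetical samples as earlier (not the actual film data): both HLΔ and the U statistic fall out of the one table of differences.

```python
import numpy as np

# Hypothetical samples standing in for the silent and sound medians
silent = np.array([3.1, 3.3, 3.5, 3.5, 3.8])
sound = np.array([3.3, 3.9, 4.1, 4.4])

# All n1 x n2 pairwise differences, sound minus silent
diffs = (sound[None, :] - silent[:, None]).ravel()

# Hodges-Lehmann estimate: the median of the pairwise differences
HL = np.median(diffs)

# U recovered from the signs of the differences, as in Table 2:
# silent > sound (diff < 0) plus half the ties (diff == 0)
U = np.sum(diffs < 0) + 0.5 * np.sum(diffs == 0)
print(HL, U)
```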

## Bringing it all together

Once we have performed our hypothesis test, calculated the effect size, and estimated the effect we can present our results:

The median shot lengths of silent (n = 12, median = 3.5s [95% CI: 3.2, 3.7]) and sound (n = 20, median = 3.9s [95% CI: 3.5, 4.3]) short films featuring Laurel and Hardy produced between 1927 and 1933 were compared using a Mann-Whitney U test, with a null hypothesis of stochastic equality. The results show that there is a statistically significant but small difference of HLΔ = 0.5s (95% CI: 0.1, 1.1) between the two samples (U = 56.0, p = 0.013, PS = 0.2333).

These two sentences provide a great deal of information to the reader in a simple and economical format – we have the experimental design, the result of the test, and the practical significance of the result.

Note that at no point in conducting this test have we employed a ‘dazzling array’ of mathematical operations – in fact the most complicated thing in the whole process was to find the square root in the equation for σ above, and everything else was numbering items in a list, addition, subtraction, multiplication, or division.

## Summary

The Mann-Whitney U test is ideally suited to our needs in comparing the impact of sound technology on film style, and has numerous advantages over the alternative statistical methods:

- it is covered in pretty much every statistics textbook you are ever likely to read
- it is a standard feature in statistical software (though you will have to check which name is used) and so you won’t even have to do the basic maths described above
- it is easy to calculate
- it is easy to interpret
- it allows us to test for stochastic superiority rather than a shift model
- it is robust against outliers
- it does not depend on the distribution of the data
- it can be used to determine an effect size (PS) that is easy to calculate and simple to understand
- we have a simple estimate of the effect (HLΔ) that is consistent with the test statistic

If you want to compare more than two groups of films, then the non-parametric *k*-sample test is the Kruskal-Wallis ANOVA test (see here). The Mann-Whitney U test can also be applied as post-hoc test for pairwise comparisons.
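A sketch of this k-sample workflow, using `scipy.stats.kruskal` followed by pairwise Mann-Whitney post-hoc tests on three hypothetical groups of median shot lengths:

```python
from scipy.stats import kruskal, mannwhitneyu

# Three illustrative groups of median shot lengths (hypothetical values)
g1 = [3.1, 3.3, 3.5, 3.6]
g2 = [3.9, 4.1, 4.2, 4.6]
g3 = [5.0, 5.2, 5.5, 5.9]

# Kruskal-Wallis tests whether any group differs from the others
H, p = kruskal(g1, g2, g3)
print(round(H, 3), round(p, 4))  # -> 9.846 0.0073

# Post-hoc pairwise comparisons with Mann-Whitney U tests
for a, b, name in [(g1, g2, "g1 vs g2"), (g1, g3, "g1 vs g3"), (g2, g3, "g2 vs g3")]:
    U, p_pair = mannwhitneyu(a, b, alternative="two-sided")
    print(name, U, round(p_pair, 4))
```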

## References and Links

**Callaert H** 1999 Nonparametric hypotheses for the two-sample location problem, *Journal of Statistics Education* 7 (2): www.amstat.org/publications/jse/secure/v7n2/callaert.cfm.

**Mann HB and Whitney DR** 1947 On a test of whether one of two random variables is stochastically larger than the other, *The Annals of Mathematical Statistics* 18 (1): 50-60.

The Wikipedia page for the Mann-Whitney U test can be accessed here, and the page for the Hodges-Lehmann estimator is here.

For an online calculator of the Mann-Whitney U test you can visit Vassar’s page here.

For the critical values of the Mann-Whitney U test for samples sizes up to n1 = n2 = 20 and α = 0.05 or 0.01, see here.

## Some notes on cinemetrics IV

In the 1970s, Barry Salt proposed that the mean shot length could be used to describe and compare the style of motion pictures. Many other scholars have followed him, and average shot lengths are now commonly cited in film studies texts. Unfortunately, a worse choice of a statistic of film style could not have been made – shot lengths are not normally distributed and the mean does not accurately locate the middle of the data. This means that a large part of film studies research is utterly useless because it is based on an elementary mistake in the methodology that could have been avoided with only a middle school maths education. Quite simply, the mean is not an appropriate measure of location for a skewed dataset with a number of outliers. It never has been; it never will be; and quoting this as a statistic of film style leads to fundamentally flawed inferences about film style, as can be seen here.
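The point about the mean and skewed data is easy to demonstrate with a simulated right-skewed sample (the parameters here are purely illustrative): the long upper tail drags the mean well above the middle of the data, while the median stays put.

```python
import numpy as np

rng = np.random.default_rng(42)

# A right-skewed, lognormal-ish sample standing in for shot lengths (seconds)
shots = rng.lognormal(mean=1.2, sigma=0.9, size=1000)

# The mean is pulled upward by the long upper tail; the median is not
print(round(float(np.mean(shots)), 2), round(float(np.median(shots)), 2))
```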

This does not mean that Salt has decided to give up on the mean shot length. He has subsequently asserted – but not proven – that shot length distributions are lognormally distributed, and that the mean shot length should be retained because the ratio of the mean shot length to the median shot length can be used to derive the shape factor of a lognormal distribution that adequately describes the distribution of shot lengths in a motion picture. (Actually Salt refers to the median-to-mean ratio, but this is just a different way of writing the same information – each ratio is the reciprocal of the other. For convenience in later calculations I refer only to the mean-to-median ratio). The ratio of the mean to the median is a measure of the skew of a dataset – symmetrical distributions have a ratio of approximately 1 – and is used widely in economics to represent imbalances in income. *If a distribution is lognormal*, there is a relation between the mean-to-median ratio and the shape factor of a lognormal distribution. As I have shown elsewhere on this blog, the assumption of lognormality is not justified – applying a normality test to the log-transformed data I have found that the null hypothesis of lognormality is rejected in between 50% and 80% of cases. The proportion of silent films for which this null hypothesis is rejected appears to be greater than the proportion of sound films.

Undeterred, Salt persists with the assertion that shot lengths are lognormally distributed and has cooked up a new scheme to justify this assertion by arguing that titles should be removed from the shot length data of silent films and the remaining data then analysed as being lognormal. No suggestion is made regarding the seemingly large proportion of sound films that also do not appear to be lognormally distributed. As is typical in Salt’s work, this argument is simply asserted as being true without any methodological justification and is supported – as we shall see – by some dubious evidence.

What is the methodological justification for removing the titles from the shot length data? One possible reason for removing this data is that the titles are not original and have been updated so that they no longer accurately reflect the original structure of the film. However, the fact that the titles may not be original does not automatically mean that the titles are inaccurate or that their time on screen is not an accurate reflection of the original tempo of the film. It may be that a conservator has meticulously restored the film and respected the way the film was originally put together. We should certainly feel free to include the titles in the data if they are known to be properly restored, are based directly on the original film, or are reasonable approximations based on documentary evidence for the film’s production, historical context, etc. Salt’s suggestion appears to be a blanket ban on all titles in shot length data for silent films, but this would rule out much otherwise useful data. A further consideration appears in the memoir of the projectionist Louis J Mannix (whom I discussed in an earlier post), who noted that it was a practice of projectionists to slow the film when a title came onto the screen for the ease of reading by the audience – there is nothing we can do as statisticians to control for this type of situation-specific variability, but it is very interesting as film history. The use of titles is certainly a methodological concern for analysts of film style, and it does need to be discussed as part of the methodology of the statistical analysis of film style. This would, however, mean going beyond mere assertion.

Salt’s method involves linking two shots that were previously separated by a title into a single shot, but again there is no methodological justification for this. The decision to put a title in the middle of a shot is itself an aesthetic decision by the filmmakers for the purposes of narrative communication, and should be respected as such. If we combine the shots in the manner Salt suggests, can the data be said to reflect the film as it was made? The tempo of the film is changed, and we can no longer make any direct comparison between silent films, or between silent films and sound films. Salt also states that the resulting analysis will provide very different results if the shots are not combined in this way, but he does not say why we should prefer his method over the alternative of not combining the shots.

Separating titles from the rest of the shot length data for a film is not in itself a bad idea – it would allow us to look more closely at how a film was put together, and to make inferences about how audiences understand silent films or text on screen in general. However, Salt appears to want to remove this data to make it fit a lognormal distribution, and that is a bad idea. It is back to front: the transformation of the data is suggested to make it fit a preconceived theoretical distribution, even though there is no evidence that this assumption is justified in general. If the method of combining shots is to be preferred to not combining them for the purpose of generating a better lognormal fit, then this is clearly problematic. In the absence of a proper methodological basis, this smacks of both desperation and data manipulation. Nonetheless, Salt has stated that this approach can be termed ‘experimental film analysis’ similar to experimental archaeology. The whole thing can be read here.

*Little Annie Rooney* has been held up as an example of how the fit to a lognormal distribution is improved after removing the titles. The data for this film (without titles) is here. However, closer examination of the data reveals that the mean-to-median ratio leads to a poor estimate of the shape factor and provides a substantially poorer fit than the maximum likelihood estimates (MLE). Recalling that a random variable X (such as the length of a shot) is lognormally distributed if its logarithm is normally distributed, Figure 1 presents the histogram of the shot length data transformed using the natural logarithm and three density estimates.

**Figure 1** Density estimation of shot lengths for *Little Annie Rooney* (minus titles)

The red curve is the kernel density estimate, using an Epanechnikov kernel and a bandwidth of 0.5, and is a nonparametric density estimate that makes no assumption about the shape of the distribution and depends on the data alone. This is the empirical distribution of the log-transformed data, and is used as a part of exploratory data analysis. From the histogram and the kernel density estimate we can see that even after the data has been log-transformed there is still some skewness and a heavy upper tail. We should therefore be sceptical about the assertion that this data is lognormally distributed. (For a kernel density calculator see here).
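A minimal sketch of a kernel density estimate with an Epanechnikov kernel and bandwidth 0.5, as described above, applied to hypothetical log-transformed shot lengths rather than the actual *Little Annie Rooney* data:

```python
import numpy as np

def epanechnikov_kde(data, grid, bandwidth=0.5):
    """KDE with the Epanechnikov kernel K(u) = 0.75(1 - u^2) for |u| <= 1."""
    u = (grid[:, None] - data[None, :]) / bandwidth
    K = np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)
    return K.sum(axis=1) / (len(data) * bandwidth)

# Hypothetical log-transformed shot lengths (natural log of seconds)
rng = np.random.default_rng(7)
log_sl = rng.normal(1.2, 0.73, size=500)

grid = np.linspace(log_sl.min() - 1, log_sl.max() + 1, 400)
density = epanechnikov_kde(log_sl, grid, bandwidth=0.5)

# Sanity check: a density is non-negative and should integrate to ~1
dx = grid[1] - grid[0]
print(density.min() >= 0, round(float(density.sum() * dx), 2))
```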

The black curve is the normal distribution specified by the maximum likelihood estimators of the log-transformed shot lengths – i.e. the mean (μ) and standard deviation (σ) of the logarithms of the shot lengths. (Note that μ is the arithmetic mean of the log-transformed data and the geometric mean of the data in its original scale). For this data, μ = 1.2078 and σ = 0.7304. The probability plot correlation coefficient (PPCC) using a Blom plotting position is 0.9776 and the null hypothesis that the data (n = 1066) is lognormally distributed is rejected for α = 0.05. Figure 2 is the normal probability plot for this data with the parameters of the black curve. (Recall that if the lognormal distribution is a good fit, the data will lie along the red line).

**Figure 2** Normal probability plot for *Little Annie Rooney* (minus titles): LN[X]~N(1.2078, 0.7304)
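The PPCC calculation used here can be sketched as follows: sort the log-transformed data, compute Blom plotting positions (i - 0.375)/(n + 0.25), and correlate the ordered data with the corresponding standard normal quantiles. The data below are simulated lognormal values (not the actual shot lengths), so the coefficient should come out close to 1:

```python
import numpy as np
from scipy.stats import norm

def ppcc_lognormal(x):
    """PPCC for lognormality: correlate the ordered log-data with
    normal quantiles at Blom plotting positions (i - 0.375)/(n + 0.25)."""
    logs = np.sort(np.log(x))
    n = len(logs)
    positions = (np.arange(1, n + 1) - 0.375) / (n + 0.25)
    quantiles = norm.ppf(positions)
    return np.corrcoef(logs, quantiles)[0, 1]

# Hypothetical shot lengths: a genuinely lognormal sample scores near 1
rng = np.random.default_rng(11)
sample = rng.lognormal(1.2, 0.73, size=1000)
print(round(ppcc_lognormal(sample), 4))
```

The computed coefficient would then be compared against the critical value of the PPCC test for the given sample size and α.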

The green curve in Figure 1 is the normal distribution defined if we take the median shot length and the estimate of σ derived from the mean-to-median ratio, as Salt recommends. According to Salt, the mean-to-median ratio for shot length data is equal to the exponential of half the variance (μ/med = exp (σ^{2}/2)) and from this we can estimate σ. As we know the value of σ is 0.7304, this can be tested for *Little Annie Rooney*. The mean-to-median ratio for this film is 4.6/2.9 = 1.5862, while exp (0.7304^{2}/2) = 1.3057. The mean-to-median ratio overestimates the true value by 21.5%. Inevitably, this leads to a poor estimate of σ: if μ/med = exp (σ*^{2}/2) then σ* = √ (2 × LN (μ/med)), and for *Little Annie Rooney* (minus titles) this produces an estimate of σ* = 0.9606. (It is perhaps not clear from the font used here, but √ is ‘square root’). The estimated value of the shape factor is greater than its MLE value by 31.5%. Looking at the curve LN[X]~N(1.0647, 0.9606) in Figure 1, we can see that it provides a better fit to the upper tail of the data and is very close to the kernel density estimate. At the same time, it provides a very poor fit below the median, and is actually worse than the MLE parameters. This can be seen more clearly by looking at Figure 3, which is the normal probability plot assuming LN[X]~N(1.0647, 0.9606). (This already poor fit can be made worse by substituting μ for the median).

**Figure 3** Normal probability plot for *Little Annie Rooney* (minus titles) LN[X]~N(1.0647, 0.9606)

From this we can conclude that (1) the shot length data for *Little Annie Rooney* (minus titles) is not lognormally distributed; (2) that the mean-to-median ratio does not equal exp (σ^{2}/2); and (3) that using the mean-to-median ratio to derive σ* provides a very poor estimate of the shape factor. (Conclusion 1 should also lead us to question the method by which Salt claims to measure goodness of fit).
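The arithmetic behind these comparisons can be checked directly; the inputs below are the values reported above for *Little Annie Rooney* (minus titles):

```python
import math

# Values reported above for Little Annie Rooney (minus titles)
sigma_mle = 0.7304      # MLE of the lognormal shape factor
ratio = 4.6 / 2.9       # mean-to-median ratio = 1.5862

# Under strict lognormality the ratio would equal exp(sigma^2 / 2)...
implied = math.exp(sigma_mle ** 2 / 2)        # 1.3057

# ...and inverting the relation gives the ratio-derived shape factor
sigma_star = math.sqrt(2 * math.log(ratio))   # 0.9606

print(round(implied, 4), round(sigma_star, 4))  # -> 1.3057 0.9606
```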

This same process cannot be applied to the shot length data available on the Cinemetrics website for *Little Annie Rooney* with titles, as this data includes a shot length (presumably rounded down) of 0.0 seconds. (The logarithm of X ≤ 0 does not exist). This shot length does not appear in the data after the titles have been removed, and I find it hard to believe that this film had a title card that was present on screen for less than 0.05 of a second. The accuracy of this data with or without titles is questionable.

If we examine the shot length distributions of the silent short films of Laurel and Hardy (both with and without titles) we again find that (1) the assumption of lognormality is not justified, (2) the mean-to-median ratio does not provide reasonable estimates of exp (σ^{2}/2), and (3) σ* does not provide reasonable estimates of σ.

Calculating the probability plot correlation coefficient for these films with titles using a Blom plotting position and α = 0.05, the null hypothesis that the data is lognormally distributed is rejected for 10 of the 12 films. Repeating this process with the titles removed, the null hypothesis is rejected for 11 films. (Recall that a statistical hypothesis test is a test of plausibility of the null hypothesis for a given set of data – failure to reject the null hypothesis indicates only that there is insufficient evidence to reject [and does *not* prove] H_{0}). These results are presented in Table 1. The assumption of lognormality is not justified and removing the titles from the data does not affect this conclusion.

**Table 1** Probability plot correlation coefficient for the silent films of Laurel and Hardy with and without titles

Table 2 includes the mean, median, and the standard deviation of the log-transformed data (σ). Using this information, we can test Salt’s other claims regarding the mean-to-median ratio. (Actually this is all rather redundant as we already know that lognormality is not a plausible model for this data). *Early to Bed* was excluded from this part of the study as the log-transformed data exhibits bimodality.

**Table 2** Mean, median, and σ for the silent films of Laurel and Hardy with and without titles

First, let us ask if the mean-to-median ratio is equal to exp (σ^{2}/2) for these films. The results are presented in Table 3, and it is immediately clear that for only two films – *The Second Hundred Years* and *Angora Love* – does μ/med provide a reasonable estimate of exp (σ^{2}/2) when we include the titles in the data, and the PPCC test failed to reject the null hypothesis of lognormality for both these films. For every other film, μ/med overestimates the true value by ~10% or more. Once the titles are removed, we do not get the improvement Salt claims will be evident by censoring the data in this way. Generally, the change in the estimate once the titles are removed is small, although both *The Second Hundred Years* and *Angora Love* show much larger errors after the data has been censored due to an increase in the skew of the data.

**Table 3** Mean-to-median ratio and exp (σ^{2}/2) for the silent films of Laurel and Hardy with and without titles

Table 4 presents the maximum likelihood estimate of σ and the estimate derived by using σ* = √ (2 × LN (μ/med)) for the Laurel and Hardy films, both with and without titles. For the shot length data including titles σ* provides a poor estimate for those films that rejected the null hypothesis of lognormality in the PPCC test, and consistently overestimates σ by at least 12%. Again this is not surprising, as μ/med = exp (σ^{2}/2) is only valid if the data are lognormal, which is not the case here. Turning to the shot length data after the titles have been excluded, we see that σ* is a poor estimate of σ for all the films in the sample.

**Table 4** σ and σ* for the silent films of Laurel and Hardy with and without titles

From these results we can conclude that:

- The methodological justification for removing titles from the shot length data of silent films is obscure, and lacks a theoretical basis.
- There is no evidence to justify the assumption that shot length data is lognormally distributed.
- There is no evidence that removing the titles from silent films will improve the fit to a lognormal distribution, and may in fact produce a poorer fit.
- The mean-to-median ratio does not provide a good estimate of exp (σ^{2}/2).
- Using the mean-to-median ratio to estimate the shape factor does not provide reliable results.

In other words, the approach suggested by Salt is wrong in every possible way.

Do not take my word for it. Do not blindly accept what someone tells you with scientific sounding words no matter how confident they sound. Learn to do it for yourself – it really is not that difficult to pick up enough statistics to be able to properly evaluate a research paper. Get some data and do your own testing. If you still get stuck then ask a statistician.

If you want to repeat the Laurel and Hardy tests performed above, I have added a spreadsheet to the Laurel and Hardy post (here) that includes the data with titles indicated.

## Shot length distributions in the films of Laurel and Hardy

UPDATE: 28 June 2012 – this article has now been published as Shot length distributions in the short films of Laurel and Hardy, 1927 to 1933, *Cine Forum* 14 2012: 37-71.

This week I put up the first draft of my analysis of the impact of sound technology on the distribution of shot lengths in the short films of Laurel and Hardy from 1927 to 1933. The pdf file can be accessed here: Nick Redfern – Shot length distributions in the short films of Laurel and Hardy.

Abstract

Stan Laurel and Oliver Hardy were one of the few comedy acts to successfully make the transition from the silent era to sound cinema in the late 1920s. The impact of sound technology on Laurel and Hardy films is analysed by comparing the median shot lengths and the dispersion of shot lengths of silent shorts (n = 12) produced from 1927 to 1929, inclusive, and sound shorts (n = 20) produced from 1929 to 1933, inclusive. The results show that there is a significant difference (U = 56.0, p = 0.0128, PS = 0.2333) between the median shot lengths of the silent films (median = 3.5s [95% CI: 3.2, 3.7]) and those of the sound films (median = 3.9s [95% CI: 3.5, 4.3]); this represents an increase in shot lengths in the sound films of HLΔ = 0.5s (95% CI: 0.1, 1.1). The comparison of Q_{n} for the silent films (median = 2.4s [95% CI: 2.1, 2.7]) with the sound films (median = 3.0s [95% CI: 2.6, 3.4]) reveals a statistically significant increase in the dispersion of shot lengths (U = 54.5, p = 0.0109, PS = 0.2271) estimated to be HLΔ = 0.6s (95% CI: 0.1, 1.1). Although statistically significant, these differences are smaller than those reported in other quantitative analyses of film style and sound technology, and this may be attributed to Hal Roach's commitment to pantomime; the working methods of Laurel, Hardy, and their writing/producing team; and the continuity of personnel in Roach's unit mode of production, which did not change substantially with the introduction of sound.
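The statistics reported in the abstract – the Mann-Whitney U, the probability of superiority PS = U/(n₁n₂), and the Hodges-Lehmann shift estimate HLΔ (the median of all pairwise differences between the two samples) – can all be computed with standard tools. A minimal Python sketch, using small made-up samples of median shot lengths rather than the actual Laurel and Hardy data:

```python
import numpy as np
from scipy import stats

# Hypothetical median shot lengths (seconds) for two small samples;
# illustrative values only, not the data analysed in the paper.
silent = np.array([3.1, 3.3, 3.3, 3.5, 3.5, 3.6, 3.7, 3.8])
sound = np.array([3.5, 3.8, 3.9, 4.0, 4.1, 4.3, 4.5, 4.8])

# Two-sided Mann-Whitney U test; u is the statistic for the first sample.
u, p = stats.mannwhitneyu(silent, sound, alternative="two-sided")

# Probability of superiority: the estimated P(silent > sound).
ps = u / (len(silent) * len(sound))

# Hodges-Lehmann estimate of the shift: median of all pairwise
# differences (sound - silent).
hl = np.median(np.subtract.outer(sound, silent))

print(f"U = {u}, p = {p:.4f}, PS = {ps:.4f}, HL delta = {hl:.2f}s")
```

Note that with n₁ = 12 and n₂ = 20 the abstract's PS of 0.2333 is exactly 56/240, i.e. U divided by the number of sample pairs.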

UPDATE: 25 November 2010. WordPress have now very helpfully made it possible to upload Excel spreadsheets to blogs, and so I have replaced the Word file with an Excel file that is much easier to use. This data now also includes information on which shots are titles (as indicated by a T in an adjacent column). I accept no liability for any problems you may have when downloading and using Excel spreadsheets on your computer. The data used in this study can be accessed in the form of an Excel .xls file here: Nick Redfern Laurel and Hardy shot length data. The methodology behind the sources and collection of this data is described in the above paper.

## Robust measures of scale for shot length distributions

This week I have written a short paper on robust measures of scale for shot length distributions. The statistical analysis of film style has typically focussed on questions of location rather than the dispersion of shot lengths in a motion picture – understanding how the variation in shot lengths has changed is as important as understanding how editing has speeded up or slowed down over time. Just as we need robust measures of location (e.g. the median shot length), we also need robust measures of dispersion, and in this paper I look at six possible statistics that could be used. The paper can be downloaded here as a pdf file:

Nick Redfern – Robust measures of scale for shot length distributions
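Two of the robust scale estimators discussed in the papers below are straightforward to compute: the (normal-consistent) median absolute deviation, and Rousseeuw and Croux's Q_{n}. A Python sketch with hypothetical shot length data, using a simple O(n²) implementation of Q_{n} that omits the small-sample correction factors for clarity:

```python
from itertools import combinations

import numpy as np
from scipy import stats


def qn_scale(x):
    """Rousseeuw and Croux's Qn: a constant times the k-th order
    statistic of the pairwise absolute differences, with h = n//2 + 1
    and k = h*(h-1)/2. Small-sample corrections are omitted here."""
    x = np.asarray(x)
    n = len(x)
    diffs = np.sort([abs(a - b) for a, b in combinations(x, 2)])
    h = n // 2 + 1
    k = h * (h - 1) // 2
    return 2.2219 * diffs[k - 1]


# Hypothetical shot lengths (seconds) with one large outlier.
shots = np.array([2.1, 2.8, 3.1, 3.3, 3.6, 4.0, 4.4, 5.2, 6.8, 11.5])

# MAD scaled by 1.4826 so it is consistent with the standard
# deviation for Gaussian data.
mad = 1.4826 * stats.median_abs_deviation(shots)

print(f"MAD = {mad:.2f}, Qn = {qn_scale(shots):.2f}, "
      f"SD = {np.std(shots, ddof=1):.2f}")
```

The outlier inflates the sample standard deviation well beyond either robust estimate, which is the practical argument the paper makes for preferring them on heavily skewed shot length data.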

The shot length data for the three Laurel and Hardy films that I refer to was collected by me as part of a larger study, and when I finally finish it off I will post the draft of my Laurel and Hardy essay along with the complete shot length data for all the films I have looked at.

Many of the papers on statistical methodology that I cite can be accessed for free over the internet, and if anyone is interested in the statistical analysis of film style then I recommend reading the papers on robust statistics before proceeding, as this will save you a lot of trouble in the long run. The references, with links to online versions of the papers, are:

**Croux C and Rousseeuw PJ** 1992 Time-efficient algorithms for two highly robust estimators of scale, *Computational Statistics* 1: 411-428.

**Daszykowski M, Kaczmarek K, Vander Heyden Y, and Walczak B** 2007 Robust statistics in data analysis – a review: basic concepts, *Chemometrics and Intelligent Laboratory Systems* 85: 203-219.

**Gorard S** 2004 Revisiting a 90-year-old debate: the advantages of the mean deviation, British Educational Research Association Annual Conference, University of Manchester, 16-18 September 2004: http://www.leeds.ac.uk/educol/documents/00003759.htm, accessed 15 July 2010.

**Rousseeuw PJ** 1991 Tutorial to robust statistics, *Journal of Chemometrics* 5: 1-20.

**Rousseeuw PJ and Croux C** 1993 Alternatives to median absolute deviation, *Journal of the American Statistical Association* 88: 1273–1283.