Blog Archives

Film style and narration in Rashomon

UPDATE: 13 April 2014: The revised version of this article has now been published as Film Style and Narration in Rashomon, Journal of Japanese and Korean Cinema 5 (1-2) 2013: 21-36. DOI: 10.1386/jjkc.5.1-2.21_1.

A post-print of the article can be downloaded here: Nick_Redfern – Film style and narration in Rashomon (post print)

And so after a long (and much enjoyed) break I return to the blogosphere with the first draft of a paper on film style and narration in Rashomon. This paper differs from the other statistical analyses of film style I have published on this site, and from other studies of film style and narration, because it uses multivariate analysis to look at several different aspects of film style together. The method used is multiple correspondence analysis, and you can find a good introductory chapter on MCA here. The software I used is FactoMineR for R, and the website explaining how to do the analysis can be found here.
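For anyone who wants to try the method, the sketch below shows how a multiple correspondence analysis is run with FactoMineR. The data frame and its column names are hypothetical stand-ins for the kind of categorical shot variables analysed in the paper, not the Rashomon data itself.

```r
# A toy data frame standing in for the categorical shot variables
# (all values below are invented, not the Rashomon data).
library(FactoMineR)

shots <- data.frame(
  scale    = c("CU", "MS", "LS", "MS", "CU", "LS", "MS", "LS"),
  movement = c("static", "track", "static", "pan", "static", "track", "pan", "static"),
  angle    = c("eye", "low", "eye", "high", "eye", "low", "high", "eye"),
  type     = c("POV", "reverse", "axial", "POV", "reverse", "axial", "POV", "reverse"),
  stringsAsFactors = TRUE
)

# MCA() returns the eigenvalues and category coordinates, and by default
# draws the factor maps used to look for associations between the variables.
res <- MCA(shots, graph = TRUE)
summary(res)
```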

Multivariate analysis has been used in the quantitative study of literature for some time (see the links below the abstract), but this is the first time multivariate analysis has been applied to film style and it appears to work very well. I am currently looking at some other applications, particularly in distinguishing between the different parts of portmanteau horror films (which is a proper scholarly endeavour and not simply an excuse to watch lots of portmanteau horror films).

An Excel file containing the data used in the analysis can be accessed here: Nick Redfern – Rashomon. This file contains two worksheets: the first is the shot length data for the film, and the second is the data used in the multiple correspondence analysis.
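A sketch of how that workbook might be read into R and the editing pace of the four accounts compared is given below. The file name, sheet order, and column names ("SL" for shot length in seconds, "account" for the narrating account) are my assumptions and should be checked against the file itself; the distribution-free test is one way of comparing the accounts, not necessarily the comparison used in the paper.

```r
library(readxl)

# Read the two worksheets described above (file name and sheet order assumed).
sl  <- read_excel("Nick Redfern - Rashomon.xlsx", sheet = 1)  # shot length data
mca <- read_excel("Nick Redfern - Rashomon.xlsx", sheet = 2)  # data for the MCA

# Compare the editing pace of the accounts with median shot lengths and a
# distribution-free test (column names "SL" and "account" are assumptions).
with(sl, tapply(SL, account, median))
kruskal.test(SL ~ factor(account), data = sl)
```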

Abstract

This article analyses the use of film style in Rashomon (1950) to determine if the different accounts of the rape and murder provided by the bandit, the wife, the husband, and the woodcutter are formally distinct, by comparing shot length data and using multiple correspondence analysis to look for relationships between shot scale, camera movement, camera angle, and the use of point-of-view shots, reverse-angle cuts, and axial cuts. The results show that the four accounts of the rape and the murder in Rashomon differ not only in their content but also in the way they are narrated. The editing pace varies so that although the action of the film is repeated the presentation of events to the viewer is different each time. There is a distinction between presentational (shot scale and camera movement) and perspectival (shot types) aspects of style depending on their function within the film, while other elements (camera angle) fulfil both these functions. Different types of shot are used to create the narrative perspectives of the bandit, the wife, and the husband that mark them out as either active or passive narrators, reflecting their level of narrative agency within the film, while the woodcutter’s account exhibits both active and passive aspects to create an ambiguous mode of narration. Rashomon is a deliberately and precisely constructed artwork in which form and content work together to create an epistemological puzzle for the viewer.

On the multivariate analysis of literature see the following:

Hoover DL 2003 Multivariate analysis and the study of style variation, Literary and Linguistic Computing 18 (4): 341-360.

Stewart LL 2003 Charles Brockden Brown: quantitative analysis and literary style, Literary and Linguistic Computing 18 (2): 129-138.

Tabata T 1995 Narrative style and the frequencies of very common words: a corpus-based approach to Dickens’s first person and third person narratives, English Corpus Studies 2: 91-109.

Revealing narrative structure through aesthetic analysis

This week I present some papers relating to the discovery of narrative structure in motion pictures based on the patterns of aesthetic elements. But first, many of the papers on the statistical analysis of film style in this post, and in many others across this blog, are co-authored by Svetha Venkatesh of Curtin University’s Computing department, and her home page – with links to much research relevant to film studies – can be accessed here.

Adams B, Venkatesh S, Bui HH, and Dorai C 2007 A probabilistic framework for extracting narrative act boundaries and semantics in motion pictures, Multimedia Tools and Applications 27: 195-213.

This work constitutes the first attempt to extract the important narrative structure, the 3-Act storytelling paradigm in film. Widely prevalent in the domain of film, it forms the foundation and framework in which a film can be made to function as an effective tool for storytelling, and its extraction is a vital step in automatic content management for film data. The identification of act boundaries allows for structuralizing film at a level far higher than existing segmentation frameworks, which include shot detection and scene identification, and provides a basis for inferences about the semantic content of dramatic events in film. A novel act boundary likelihood function for Act 1 and 2 is derived using a Bayesian formulation under guidance from film grammar, tested under many configurations and the results are reported for experiments involving 25 full-length movies. The result proves to be a useful tool in both the automatic and semi-interactive setting for semantic analysis of film, with potential application to analogues occurring in many other domains, including news, training video, sitcoms.

Chen H-W, Kuo J-H, Chu W-T, Wu J-L 2004 Action movies segmentation and summarization based on tempo analysis, 6th ACM SIGMM International Workshop on Multimedia Information Retrieval, New York, NY, 10-16 October 2004.

With the advances of digital video analysis and storage technologies, and the progress of the entertainment industry, movie viewers hope to gain more control over what they see. Therefore, tools that enable movie content analysis are important for accessing, retrieving, and browsing information close to a human perceptive and semantic level. We proposed an action movie segmentation and summarization framework based on movie tempo, which represents the delivery speed of important segments of a movie. In the tempo-based system, we combine techniques of the film domain related knowledge (film grammar), shot change detection, motion activity analysis, and semantic context detection based on audio features to grasp the concept of tempo for story unit extraction, and then build a system for action movies segmentation and summarization. We conduct some experiments on several different action movie sequences, and demonstrate an analysis and comparison according to the satisfactory experimental results.

Hu W, Xie N, Li L, Zeng X, and Maybank S 2011 A survey on visual content-based video indexing and retrieval, IEEE Transactions on Systems, Man, and Cybernetics—Part C: Applications and Reviews 41 (6): 797-819.

Video indexing and retrieval have a wide spectrum of promising applications, motivating the interest of researchers worldwide. This paper offers a tutorial and an overview of the landscape of general strategies in visual content-based video indexing and retrieval, focusing on methods for video structure analysis, including shot boundary detection, key frame extraction and scene segmentation, extraction of features including static key frame features, object features and motion features, video data mining, video annotation, video retrieval including query interfaces, similarity measure and relevance feedback, and video browsing. Finally, we analyze future research directions.

Moncrieff S and Venkatesh S 2006 Narrative structure detection through audio pace, IEEE Multimedia Modeling 2006, Beijing, China, 4-6 January 2006.

We use the concept of film pace, expressed through the audio, to analyse the broad level narrative structure of film. The narrative structure is divided into visual narration, action sections, and audio narration, plot development sections. We hypothesise that changes in the narrative structure signal a change in audio content, which is reflected by a change in audio pace. We test this hypothesis using a number of audio feature functions that reflect the audio pace to detect changes in narrative structure for 8 films of varying genres. The properties of the energy were then used to determine the audio pace feature corresponding to the narrative structure for each film analysed. The method was successful in determining the narrative structure for 7 of the films, achieving an overall precision of 76.4% and recall of 80.3%. We map the properties of the speech and energy of film audio to the higher level semantic concept of audio pace. The audio pace was in turn applied to a higher level semantic analysis of the structure of film.

Murtagh F, Ganz A, and McKie S 2009 The structure of narrative: the case of film scripts, Pattern Recognition 42 (2): 302-312.

We analyze the style and structure of story narrative using the case of film scripts. The practical importance of this is noted, especially the need to have support tools for television movie writing. We use the Casablanca film script, and scripts from six episodes of CSI (Crime Scene Investigation). For analysis of style and structure, we quantify various central perspectives discussed in McKee’s book, Story: Substance, Structure, Style, and the Principles of Screenwriting. Film scripts offer a useful point of departure for exploration of the analysis of more general narratives. Our methodology, using Correspondence Analysis and hierarchical clustering, is innovative in a range of areas that we discuss. In particular this work is groundbreaking in taking the qualitative analysis of McKee and grounding this analysis in a quantitative and algorithmic framework.
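As a rough sketch of the general pipeline described here, correspondence analysis followed by hierarchical clustering, and not the authors’ actual implementation, the following R code runs both steps on an invented word-by-scene count matrix:

```r
library(FactoMineR)

# Invented word-by-scene count matrix standing in for a film script;
# rows are words, columns are scenes (all values are made up).
counts <- matrix(c(5, 1, 0, 2,
                   0, 4, 3, 1,
                   2, 0, 6, 0,
                   1, 3, 1, 5),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("love", "letters", "airport", "police"),
                                 paste0("scene", 1:4)))

# Correspondence analysis of the word-by-scene table.
ca <- CA(as.data.frame(counts), graph = FALSE)

# Hierarchical clustering of the scenes on their factor coordinates.
hc <- hclust(dist(ca$col$coord), method = "ward.D2")
plot(hc)
```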

Phung DQ, Duong TV, Venkatesh S, and Bui HH 2005 Topic transition detection using hierarchical hidden Markov and semi-Markov models, 13th Annual ACM International Conference on Multimedia, 6-11 November 2005, Singapore.

In this paper we introduce a probabilistic framework to exploit hierarchy, structure sharing and duration information for topic transition detection in videos. Our probabilistic detection framework is a combination of a shot classification step and a detection phase using hierarchical probabilistic models. We consider two models in this paper: the extended Hierarchical Hidden Markov Model (HHMM) and the Coxian Switching Hidden semi-Markov Model (S-HSMM) because they allow the natural decomposition of semantics in videos, including shared structures, to be modeled directly, and thus enable efficient inference and reduce the sample complexity in learning. Additionally, the S-HSMM allows the duration information to be incorporated, consequently the modeling of long-term dependencies in videos is enriched through both hierarchical and duration modeling. Furthermore, the use of Coxian distribution in the S-HSMM makes it tractable to deal with long sequences in video. Our experimentation of the proposed framework on twelve educational and training videos shows that both models outperform the baseline cases (flat HMM and HSMM) and performances reported in earlier work in topic detection. The superior performance of the S-HSMM over the HHMM verifies our belief that the duration information is an important factor in video content modelling.

Pfeiffer S and Srinivasan U 2002 Scene determination using auditive segmentation models of edited video, in C Dorai and S Venkatesh (eds.) Computational Media Aesthetics. Boston: Kluwer Academic Publishers: 105-130.

This chapter describes different approaches that use audio features for determination of scenes in edited video. It focuses on analysing the sound track of videos for extraction of higher-level video structure. We define a scene in a video as a temporal interval which is semantically coherent. The semantic coherence of a scene is often constructed during cinematic editing of a video. An example is the use of music for concatenation of several shots into a scene which describes a lengthy passage of time such as the journey of a character. Some semantic coherence is also inherent to the unedited video material such as the sound ambience at a specific setting, or the change pattern of speakers in a dialogue. Another kind of semantic coherence is constructed from the textual content of the sound track revealing for example the different stories contained in a news broadcast or documentary. This chapter explains the types of scenes that can be constructed via audio cues from a film art perspective. It continues on a discussion of the feasibility of automatic extraction of these scene types and finally presents existing approaches.

Weng C-Y, Chu W-T, and Wu J-L 2009 RoleNet: movie analysis from the perspective of social networks, IEEE Transactions on Multimedia 11(2): 256-271.

With the idea of social network analysis, we propose a novel way to analyze movie videos from the perspective of social relationships rather than audiovisual features. To appropriately describe role’s relationships in movies, we devise a method to quantify relations and construct role’s social networks, called RoleNet. Based on RoleNet, we are able to perform semantic analysis that goes beyond conventional feature-based approaches. In this work, social relations between roles are used to be the context information of video scenes, and leading roles and the corresponding communities can be automatically determined. The results of community identification provide new alternatives in media management and browsing. Moreover, by describing video scenes with role’s context, social-relation-based story segmentation method is developed to pave a new way for this widely-studied topic. Experimental results show the effectiveness of leading role determination and community identification. We also demonstrate that the social-based story segmentation approach works much better than the conventional tempo-based method. Finally, we give extensive discussions and state that the proposed ideas provide insights into context-based video analysis.
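A toy illustration of the underlying idea, a character co-occurrence network built from shared scenes, can be put together in R with igraph. The incidence matrix below is invented, and this is a sketch of the general approach rather than the authors’ RoleNet implementation.

```r
library(igraph)

# Invented scene-by-character incidence matrix: 1 if the character
# appears in the scene (a toy example, not RoleNet itself).
appears <- matrix(c(1, 1, 0, 0,
                    1, 0, 1, 0,
                    0, 1, 1, 1,
                    1, 1, 0, 1),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("scene", 1:4),
                                  c("A", "B", "C", "D")))

# Characters are linked by the number of scenes they share.
co <- t(appears) %*% appears
diag(co) <- 0
g <- graph_from_adjacency_matrix(co, mode = "undirected", weighted = TRUE)

strength(g)        # weighted degree as a crude 'leading role' measure
cluster_louvain(g) # community detection over the co-appearance graph
```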

Analysing film texts

The statistical analysis of literary style was initiated by Augustus De Morgan in 1851, when he observed that ‘I should expect to find that one man writing on two different subjects agrees more nearly with himself than two different men writing on the same subject’ and suggested that average word length would be an appropriate indicator of style. This was followed up by TC Mendenhall, who analysed the works of William Shakespeare and Sir Francis Bacon by looking at the frequency distributions of word lengths.
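Mendenhall’s basic device, the word-length frequency distribution, is easy to reproduce. A minimal sketch in R, using two arbitrary snippets of text rather than the Shakespeare and Bacon corpora:

```r
# Word-length frequency distribution in the manner of Mendenhall:
# split a text into words, count characters per word, and tabulate.
word_lengths <- function(txt) {
  words <- unlist(strsplit(tolower(txt), "[^a-z]+"))
  table(nchar(words[words != ""]))
}

text_a <- "To be or not to be that is the question"
text_b <- "It was the best of times it was the worst of times"

word_lengths(text_a)
word_lengths(text_b)
```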

It may seem that focussing on literary style will be of little use when dealing with films, but there is a body of research that examines film scripts and audio descriptions in order to understand the structure of narrative cinema. This post presents links to some of this material. I had intended to include this research in some of the earlier posts on empirical studies of film style, but it never quite seemed to fit (and I may have forgotten on more than one occasion). Besides, it deserves a post of its own.

The best place to start is probably Andrew Vassiliou’s PhD thesis:

Vassiliou A 2006 Analysing Film Content: A Text Based Approach. University of Surrey, unpublished PhD thesis.

The aim of this work is to bridge the semantic gap with respect to the analysis of film content. Our novel approach is to systematically exploit collateral texts for films, such as audio description scripts and screenplays. We ask three questions: first, what information do these texts provide about film content and how do they express it? Second, how can machine-processable representations of film content be extracted automatically in these texts? Third, how can these representations enable novel applications for analysing and accessing digital film data? To answer these questions we have analysed collocations in corpora of audio description scripts (AD) and screenplays (SC), developed and evaluated an information extraction system and outlined novel applications based on information extracted from AD and SC scripts.

We found that the language used in AD and SC contains idiosyncratic repeating word patterns, compared to general language. The existence of these idiosyncrasies means that the generation of information extraction templates and algorithms can be mainly automatic. We also found four types of event that are commonly described in audio description scripts and screenplays for Hollywood films: Focus_of_Attention, Change_of_Location, Non-verbal_Communication and Scene_Change events. We argue that information about these events will support novel applications for automatic film content analysis. These findings form our main contributions. Another contribution of this work is the extension and testing of an existing, mainly-automated method to generate templates and algorithms for information extraction; with no further modifications, these performed with around 55% precision and 35% recall. Also provided is a database containing information about four types of events in 193 films, which was extracted automatically. Taken as a whole, this work can be considered to contribute a new framework for analysing film content which synthesises elements of corpus linguistics, information extraction, narratology and film theory.

These papers present different aspects of the approach, using written texts to distinguish between film genres, to explore the clustering of narrative events, and to examine the emotional responses of viewers.

Salway A, Lehane B, and O’Connor NE 2007 Associating characters with events in films, 6th ACM International Conference on Image and Video Retrieval, 9-11 July 2007, Amsterdam.

The work presented here combines the analysis of a film’s audiovisual features with the analysis of an accompanying audio description. Specifically, we describe a technique for semantic-based indexing of feature films that associates character names with meaningful events. The technique fuses the results of event detection based on audiovisual features with the inferred on-screen presence of characters, based on an analysis of an audio description script. In an evaluation with 215 events from 11 films, the technique performed the character detection task with Precision = 93% and Recall = 71%. We then go on to show how novel access modes to film content are enabled by our analysis. The specific examples illustrated include video retrieval via a combination of event-type and character name and our first steps towards visualization of narrative and character interplay based on characters occurrence and co-occurrence in events.

Salway A, Vassiliou A, and Ahmad K 2005 What happens in films?, IEEE International Conference on Multimedia and Expo, 6-8 July 2005, Amsterdam.

This paper aims to contribute to the analysis and description of semantic video content by investigating what actions are important in films. We apply a corpus analysis method to identify frequently occurring phrases in texts that describe films – screenplays and audio description. Frequent words and statistically significant collocations of these words are identified in screenplays of 75 films and in audio description of 45 films. Phrases such as `looks at’, `turns to’, `smiles at’ and various collocations of `door’ were found to be common. We argue that these phrases occur frequently because they describe actions that are important story-telling elements for filmed narrative. We discuss how this knowledge helps the development of systems to structure semantic video content.
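The kind of frequency count behind phrases such as ‘looks at’ and ‘turns to’ can be sketched in a few lines of R. The snippet of audio-description-style text below is invented, and a real corpus study would add significance testing (for example, log-likelihood) over full scripts:

```r
# Tabulate adjacent word pairs (bigrams) in a short invented snippet of
# audio-description-style text.
txt <- "She looks at him. He turns to the door. She looks at the letter and smiles at him."
words <- unlist(strsplit(tolower(txt), "[^a-z]+"))
words <- words[words != ""]
bigrams <- paste(head(words, -1), tail(words, -1))
head(sort(table(bigrams), decreasing = TRUE), 5)
```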

Vassiliou A, Salway A, and Pitt D 2004 Formalizing stories: sequences of events and state changes, IEEE International Conference on Multimedia and Expo, 27-30 June 2004, Taipei, Taiwan.

An attempt is made here to synthesise ideas from theories of narrative and computer science in order to model high level semantic video content, especially for films. A notation is proposed for describing sequences of interrelated events and states in narratives. The investigation focuses on the idea of modelling video content as a sequence of states: sequences of characters’ emotional states are considered as a case study. An existing method for extracting information about emotion in film is formalised and extended with a metric to compare the distribution of emotions in two films.

Finally, a PowerPoint presentation by Andrew Salway that covers the topic fairly extensively can be accessed here.

Empirical research on narratives

A few weeks ago I published a post that looked at mathematical models of narrative comprehension that could be used for empirical research into how viewers understand narrative films. I also complained that the lack of empirical research on narrative comprehension exists simply because no-one in film studies has undertaken such research, whereas in many other disciplines an empirically based approach is fundamental. You can read the earlier post here. This week I include some abstracts and links to papers that do look at the relationship between agents and narratives empirically. It is also worth noting in many of these papers the way in which concepts of film style have moved into other media, while thinking about narrative in virtual environments and interactive fiction provides new ways of thinking about narrative in the cinema.

As ever, the versions of these papers linked here may not be the final published versions.

Bizzocchi J 2005 Run, Lola, Run: film as a narrative database, Media in Transition 4: The Work of Stories, May 6-8, 2005, MIT, Cambridge, MA. [I’ll have more to say about this paper and the topic of database narratives at a later date in a piece on paraconsistency in narrative cinema].

Clarke A and Mitchell G 2001 Film and the development of interactive narrative, International Conference on Virtual Storytelling: Using Virtual Reality Technologies for Storytelling, 27-28 September 2001, Avignon, France.

This paper explores narration in film and in videogames/virtual environments/interactive narratives. Particular attention is given to their use of the continuity of time, space and action and this is used as a means of classifying different types of work. The authors argue that the creators of these videogames etc. need to have more authorial presence and that this can only be done through abandoning their traditional reliance on the continuity of time, space and action.

Johnson K and Bizzocchi J forthcoming Lost Cause: an interactive film project, Journal of the International Digital Media and Arts.

The paper describes the design, the aesthetics, and the experience of the interactive film Lost Cause. The film is examined from several theoretical perspectives: cinematic roots, narrative construction, interface design, and new media artifact. Lost Cause extends the complex plot structure used by filmmakers such as Altman or Tarantino into an explicitly interactive format. The plot has three interrelated and synchronous threads which are represented in a multiscreen user interface. It culminates in an ending determined by the history of user navigation choices. The paper analyzes the work to reveal critical insights into database narrative, expressive interface design, user agency, and the construction of micronarrative.

Marsh T, Nitsche M, Liu W, Chung P, Bolter JD, and Cheok AD 2008 Film informing design for contemplative gameplay, Sandbox Symposium, 9-10 August 2008, Los Angeles, California.

Borrowing from film and filmmaking styles, techniques and devices that manipulate spectators’ attention and experience, this paper proposes an approach to inform design of games and gameplay to manipulate player’s focus of attention and encourage contemplation — in design features, characters, story elements, etc. or even break the player’s engaged attention in the game/virtual world altogether — to provide meaning, experience and opportunities for learning. Focusing on film styles alternative to the continuity style of Hollywood filmmaking, we discuss examples of design for contemplative gameplay in game-based learning environments/serious games, machinima and augmented and mixed reality games in previous, current and future projects. We propose that one goal of game design is to establish a rhythm between contemplation and engagement, and the appropriate rhythm is determined largely by a game’s genre, platform and/or narrative.

May J and Barnard PJ 1995 Cinematography and interface design, in K Nordby, PH Helmersen, DJ Gilmore, and SA Arnesen (eds.) Human-Computer Interaction: Interact’95. London: Chapman and Hall: 26-31. [NB: there isn’t a direct URL for this paper, but if you google the title you should find the pdf version easy enough].

Interface designers are increasingly relying on craft based approaches to compensate for a perceived lack of relevant theory. One such source is cinematography, where film-makers succeed in helping viewers follow the narrative across cuts which change the information on the screen. Cinematography has evolved over the last century, and its rules of thumb cannot be applied directly to interface design. We analyse film-makers’ techniques with a cognitive theory (ICS) and show that they work by preserving thematic continuity across cuts. Expressing this theoretically allows us to extrapolate away from film, applying it to screen changes in interface design.

Nath S 2004 Narrativity in user action: emotion and temporal configurations of narrative, 4th International Conference on Computational Semiotics for Games and New Media, 14-16 September 2004, Split, Croatia.

One of the core problems in Narrative Intelligence is maintaining the narrative nature of event sequences that emerge owing to user participation. This paper challenges the common premises and assumptions about the nature of human action and experience that underlie common approaches to finding a solution to the problem of narrative structuration. An in-depth analysis of the temporality of human action and experience provides important indicators on how the problem can be approached. It is argued that user emotion is not just a by-product of narrative structure, but a critical factor in maintaining narrativity. Finally, it is indicated as to how patterning of emotions can regulate user action and the creation of a subjective experience.

Rowe JP and Lester JC 2010 Modelling user knowledge with dynamic Bayesian networks in interactive narrative environments

Recent years have seen a growing interest in interactive narrative systems that dynamically adapt story experiences in response to users’ actions, preferences, and goals. However, relatively little empirical work has investigated runtime models of user knowledge for informing interactive narrative adaptations. User knowledge about plot scenarios, story environments, and interaction strategies is critical in a range of interactive narrative contexts, such as mystery and detective genre stories, as well as narrative scenarios for education and training. This paper proposes a dynamic Bayesian network approach for modelling user knowledge in interactive narrative environments. A preliminary version of the model has been implemented for the CRYSTAL ISLAND interactive narrative-centred learning environment. Results from an initial empirical evaluation suggest several future directions for the design and evaluation of user knowledge models for guiding interactive narrative generation and adaptation.

This paper by Rowe and Lester is from the Intellimedia group at North Carolina State University, which publishes a wide range of papers on human-computer interaction, virtual learning environments, and narrative interaction. Their website can be accessed here.