Empirical studies in film style V
This week, another set of papers that analyse the style of films. This time, they all use the analysis of sound as a means of revealing the structure and form of motion pictures.
Austin A, Moore E, Gupta U, and Chordia P 2010 Characterization of movie genre based on music score, IEEE International Conference on Acoustics Speech and Signal Processing, 14-19 March 2010, Dallas, Texas.
While it is clear that the full emotional effect of a movie scene is carried through the successful interpretation of audio and visual information, music still carries a significant impact for interpreting the director’s intent and style. The intent of this study was to provide a preliminary understanding, using a new database, of the impact of timbral and selected rhythm features in characterizing the differences among movie genres based on their film scores. For this study, a database of film scores from 98 movies was collected, containing instrumental (non-vocal) music from 25 romance, 25 drama, 23 horror, and 25 action movies. Both pair-wise genre classification and classification with all four genres were performed using support vector machines (SVM) in a ten-fold cross-validation test. The results of the study support the notion that high-intensity movies (i.e., Action and Horror) have musical cues that are measurably different from the musical scores for movies with more measured expressions of emotion (i.e., Drama and Romance).
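The ten-fold cross-validation protocol used by Austin et al. is easy to make concrete. A minimal sketch in Python of just the fold partition: the feature extraction and the SVM itself are omitted, and the function name is my own rather than anything from the paper.

```python
def ten_fold_splits(n_items, n_folds=10):
    """Partition item indices into n_folds contiguous test folds and
    yield a (train_indices, test_indices) pair for each round."""
    indices = list(range(n_items))
    base, extra = divmod(n_items, n_folds)  # spread the remainder over the first folds
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 98 film scores, as in the paper's database
fold_sizes = [len(test) for _, test in ten_fold_splits(98)]
print(fold_sizes)  # → [10, 10, 10, 10, 10, 10, 10, 10, 9, 9]
```

With 98 scores this gives eight test folds of ten and two of nine, so every score is classified exactly once across the ten rounds.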
Jain S and Jadon RS 2008 Audio based movies characterization using neural network, International Journal of Computer Science and Applications 1 (2): 87-90.
In this paper we propose a neural-network learning based method for characterizing movies using audio information. We first extract the audio streams from the movie clips, and computable audio features are then extracted from these audio streams. We then use a neural-network-based classifier to classify the movie clips using these audio features. The features extracted from the clips are volume root mean square, volume standard deviation, volume dynamic range, zero crossing rate and silence ratio. We demonstrate the effectiveness of our approach in characterizing the movie clips as action or non-action.
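The five clip-level features Jain and Jadon list are all simple statistics of the waveform. A minimal sketch, assuming mono samples in [-1, 1] and treating the whole clip as a single window (a real system would compute these over short frames and average); the `silence_threshold` value is a hypothetical choice, not one taken from the paper:

```python
import math

def audio_features(samples, silence_threshold=0.01):
    """Compute the five clip-level features named in the paper from a
    mono sample sequence in [-1, 1]."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)          # volume root mean square
    mean_abs = sum(abs(s) for s in samples) / n
    std = math.sqrt(sum((abs(s) - mean_abs) ** 2 for s in samples) / n)  # volume standard deviation
    dyn_range = max(samples) - min(samples)                   # volume dynamic range
    zcr = sum(1 for a, b in zip(samples, samples[1:])
              if (a < 0) != (b < 0)) / (n - 1)                # zero crossing rate
    silence = sum(1 for s in samples
                  if abs(s) < silence_threshold) / n          # silence ratio
    return {"rms": rms, "std": std, "range": dyn_range,
            "zcr": zcr, "silence": silence}

# toy signal: an alternating burst followed by a silent tail
clip = [0.5, -0.5] * 4 + [0.0] * 8
print(audio_features(clip))
```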
Moncrieff S, Dorai C, and Venkatesh S 2001 Affect computing in film through sound energy dynamics, 9th ACM International Conference on Multimedia 2001, 30 September – 5 October, 2001, Ottawa, Ontario, Canada.
We develop an algorithm for the detection and classification of affective sound events underscored by specific patterns of sound energy dynamics. We relate the portrayal of these events to the proposed high-level affect or emotional coloring of the events. In this paper, four possible characteristic sound energy events are identified that convey well-established meanings through their dynamics, portraying and delivering certain affect and sentiment related to the horror film genre. Our algorithm is developed with the ultimate aim of automatically structuring sections of films that contain distinct shades of emotion related to horror themes for nonlinear media access and navigation. An average of 82% of the energy events, obtained from the analysis of the audio tracks of sections of four sample films, corresponded correctly to the proposed affect. While the discrimination between certain sound energy event types was low, the algorithm correctly detected 71% of the occurrences of the sound energy events within the audio tracks of the films analyzed, and thus forms a useful basis for determining affective scenes characteristic of horror in movies.
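The underlying idea of sound energy dynamics can be illustrated by computing a short-time energy envelope and labelling how it moves from frame to frame. This is only a toy sketch of the general idea, not Moncrieff et al.’s algorithm; the frame length and the `min_ratio` threshold are hypothetical values of my own.

```python
def short_time_energy(samples, frame_len=4):
    """Mean squared amplitude over non-overlapping frames: a crude energy envelope."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def energy_events(energy, min_ratio=2.0):
    """Label each frame-to-frame transition of the envelope as a rise,
    a fall, or steady, using a hypothetical ratio threshold."""
    labels = []
    for prev, cur in zip(energy, energy[1:]):
        if cur > prev * min_ratio:
            labels.append("rise")
        elif prev > cur * min_ratio:
            labels.append("fall")
        else:
            labels.append("steady")
    return labels

# a quiet passage, a loud burst, then quiet again: one rise, one fall
clip = [0.1] * 4 + [0.8] * 4 + [0.1] * 4
print(energy_events(short_time_energy(clip)))  # → ['rise', 'fall']
```

Runs of such labels (a sustained rise, a sudden fall, and so on) are the kind of energy pattern the paper maps onto affective categories.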
Moncrieff S, Dorai C, and Venkatesh S 2001 Analysis of environmental sounds as indexical signs in film, IEEE Pacific Rim Conference on Multimedia, 24-26 October 2001, Beijing, China.
In this paper, we investigate the problem of classifying a subset of environmental sounds in movie audio tracks that indicate specific indexical semiotic use. These environmental sounds are used to signify and enhance events occurring in film scenes. We propose a classification system for detecting the presence of violence and car chase scenes in film by classifying ten different environmental sounds that form the constituent audio events of these scenes, using a number of old and new audio features. Experiments with our classification system on pure test sounds resulted in a correct event classification rate of 88.9%. We also present the results of the classifier on the mixed audio tracks of several scenes taken from The Mummy and Lethal Weapon 2. The classification of sound events is the first step towards determining the presence of these complex sound scenes within film audio and describing the thematic content of the scenes.
Pfeiffer S, Lienhart R, and Effelsberg W 1999 Scene determination based on video and audio features, IEEE International Conference on Multimedia Computing and Systems, 7-11 June 1999, Florence, Italy.
Determining automatically what constitutes a scene in a video is a challenging task, particularly since there is no precise definition of the term “scene”. It is left to the individual to set the attributes shared by consecutive shots which group them into scenes. Certain basic attributes such as dialogs, similar settings and continuing sounds are consistent indicators. We have therefore developed a scheme for identifying scenes which clusters shots according to detected dialogs, similar settings and similar audio. Results from experiments show automatic identification of these types of scenes to be reliable.
The web page of Rainer Lienhart, with access to his publications on video segmentation, is here.
Wang J, Li B, Hu W, and Wu O 2010 Horror movie scene recognition based on emotional perception, IEEE 17th International Conference on Image Processing, 26-29 September 2010, Hong Kong.
The number of video clips available online is growing at a tremendous pace. Meanwhile, video scenes of pornography, violence and horror permeate the whole Web. Horror videos, whose threat to children’s health is no less than that of pornographic videos, are sometimes neglected by existing Web filtering tools. Consequently, an effective horror video filtering tool is necessary for preventing children from accessing these videos. In this paper, by introducing color emotion and color harmony theories, we propose a horror video scene recognition algorithm. Firstly, the video scenes are decomposed into a set of shots. Then we extract the visual features, audio features and color emotion features of each shot. Finally, by combining the three feature sets, the horror video scenes are recognized by a Support Vector Machine (SVM) classifier. According to the experimental results on diverse video scenes, the proposed scheme based on emotional perception deals effectively with horror video scene recognition and achieves promising results.
Xu M, Chia L-T, and Jin J 2005 Affective content analysis in comedy and horror videos by audio emotional event detection, IEEE International Conference on Multimedia and Expo, 6-9 July 2005, Amsterdam.
We study the problem of affective content analysis. In this paper we think of affective content as those video/audio segments which may cause an audience’s strong reactions or special emotional experiences, such as laughing or fear. Those emotional factors are related to the users’ attention, evaluation, and memories of the content. The modeling of affective effects depends on the video genre. In this work, we focus on comedy and horror films and extract the affective content by detecting a set of so-called audio emotional events (AEE), such as laughing, horror sounds, etc. These AEE can be modeled by various audio processing techniques, and they can directly reflect an audience’s emotion. We use the AEE as a clue to locate the corresponding video segments. Domain knowledge is employed to some extent at this stage. Our experimental dataset consists of a 40-minute comedy video and a 40-minute horror film. An average recall and precision of above 90% are achieved. It is shown that, in addition to rich visual information, an appropriate use of special audio cues is an effective way to assist affective content analysis.
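The recall and precision figures Xu et al. report compare the set of detected audio emotional events against a ground-truth set. A minimal sketch with hypothetical segment labels:

```python
def precision_recall(detected, relevant):
    """Precision = fraction of detections that are correct;
    recall = fraction of true events that were detected."""
    detected, relevant = set(detected), set(relevant)
    tp = len(detected & relevant)  # true positives
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

# hypothetical run: the detector fires on four segments, three of which are real events
print(precision_recall({"s1", "s2", "s3", "s4"}, {"s2", "s3", "s4", "s5"}))  # → (0.75, 0.75)
```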