Empirical studies in film style IV
It has been a while since I listed some research on the empirical analysis of film style – I could have sworn I did a post on this just before Christmas, but apparently not.
First, a couple of general papers that outline the principles of video content analysis (VCA) and the research that has been done in this area. This piece (here) is a set of PowerPoint slides by Alan Hanjalic (see below), in which he summarises the goals of VCA, its applications, and the different approaches that have been adopted by researchers. A literature survey of work in VCA is given in the following paper:
Brezeale D and Cook DJ 2008 Automatic video classification: a survey of the literature, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 38 (3): 416-430. DOI: 10.1109/TSMCC.2008.919173.
The papers referred to below all cover the relationship between emotion, style, and video content.
Arifin S and Cheung PYK 2006 User attention based arousal content modelling, IEEE International Conference on Image Processing, 8 November 2006, Atlanta, Georgia, USA.
The affective content of a video is defined as the expected amount and type of emotion that are contained in a video. Utilizing this affective content will extend the current scope of application possibilities. The dimensional approach to representing emotion can play an important role in the development of an affective video content analyzer. The three basic affect dimensions are defined as valence, arousal and control. This paper presents a novel FPGA-based system for modeling the arousal content of a video based on user saliency and film grammar. The design is implemented on a Xilinx Virtex-II xc2v6000 on board a RC300 board.
The poster for this paper can be accessed here.
Hanjalic A 2006 Extracting moods from pictures and sounds, IEEE Signal Processing Magazine 23 (2): 90-100. DOI: 10.1109/MSP.2006.1621452.
From the introduction:
Intensive research efforts in the field of multimedia content analysis in the past 15 years have resulted in an abundance of theoretical and algorithmic solutions for extracting the content-related information from audiovisual signals. The solutions proposed so far cover an enormous application scope and aim at enabling us to easily access the events, people, objects, and scenes captured by the camera, to quickly retrieve our favorite themes from a large music video archive (e.g., a pop/rock concert database), or to efficiently generate comprehensive overviews, summaries, and abstracts of movies, sports TV broadcasts, surveillance, meeting recordings, and educational video material. However, what about the task of finding exciting parts of a sports TV broadcast or funny and romantic excerpts from a movie? What about locating unpleasant video clips we would be reluctant to let our children watch? This article considers how we feel about the content we see or hear. As opposed to the cognitive content information composed of the facts about the genre, temporal content structure (shots, scenes) and spatiotemporal content elements (objects, persons, events, topics) we are interested in obtaining the information about the feelings, emotions, and moods evoked by a speech, audio, or video clip. We refer to the latter as the affective content, and to the terms such as “happy” or “exciting” as the affective labels of an audiovisual signal.
Hanjalic A and Xu L 2005 Affective video content and representation modelling, IEEE Transactions on Multimedia 7 (1): 143-154. DOI: 10.1109/TMM.2004.840618.
This paper looks into a new direction in video content analysis – the representation and modelling of affective video content. The affective content of a given video clip can be defined as the intensity and type of feeling or emotion (both are referred to as affect) that are expected to arise in the user while watching that clip. The availability of methodologies for automatically extracting this type of video content will extend the current scope of possibilities for video indexing and retrieval. For instance, we will be able to search for the funniest or the most thrilling parts of a movie, or the most exciting events of a sport program. Furthermore, as the user may want to select a movie not only based on its genre, cast, director and story content, but also on its prevailing mood, the affective content analysis is also likely to contribute to enhancing the quality of personalizing the video delivery to the user. We propose in this paper a computational framework for affective video content representation and modelling. This framework is based on the dimensional approach to affect that is known from the field of psychophysiology. According to this approach, the affective video content can be represented as a set of points in the two-dimensional (2-D) emotion space that is characterized by the dimensions of arousal (intensity of affect) and valence (type of affect). We map the affective video content onto the 2-D emotion space by using the models that link the arousal and valence dimensions to low-level features extracted from video data. This results in the arousal and valence time curves that, either considered separately or combined into the so-called affect curve, are introduced as reliable representations of expected transitions from one feeling to another along a video, as perceived by a viewer.
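To make the idea of arousal and valence time curves a little more concrete, here is a rough sketch of how low-level features might be combined into smoothed curves and then paired up into an affect curve. This is not Hanjalic and Xu's actual model: the feature names (`motion`, `cut_rate`, `sound_energy`, `pitch`), the equal weights, the moving-average smoothing, and all the numbers are assumptions made purely for illustration.

```python
def moving_average(values, window=3):
    """Smooth a sequence with a centred moving average (window clipped at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def weighted_curve(features, weights, window=3):
    """Combine per-frame feature tracks (each scaled to [0, 1]) into one smoothed curve."""
    total = sum(weights.values())
    n = len(features[next(iter(weights))])
    raw = [
        sum(weights[name] * features[name][t] for name in weights) / total
        for t in range(n)
    ]
    return moving_average(raw, window)


def affect_curve(arousal, valence):
    """Pair the two time curves into (valence, arousal) points in the 2-D emotion space."""
    return list(zip(valence, arousal))


# Hypothetical feature tracks for five time steps (made-up values, not real data).
arousal_features = {
    "motion": [0.1, 0.2, 0.9, 0.8, 0.3],
    "cut_rate": [0.0, 0.1, 0.7, 0.9, 0.2],
    "sound_energy": [0.2, 0.2, 0.8, 0.7, 0.4],
}
arousal_weights = {"motion": 1.0, "cut_rate": 1.0, "sound_energy": 1.0}
valence_features = {"pitch": [0.6, 0.5, 0.2, 0.3, 0.7]}

arousal = weighted_curve(arousal_features, arousal_weights)
valence = weighted_curve(valence_features, {"pitch": 1.0})
curve = affect_curve(arousal, valence)
```

Each point in `curve` is one moment of the clip placed in the emotion space, so following the points in order traces the expected transitions from one feeling to another.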
Machajdik J and Hanbury A 2010 Affective image classification using features inspired by psychology and art theory, ACM Multimedia Conference 25-29 October 2010, Firenze, Italy.
Images can affect people on an emotional level. Since the emotions that arise in the viewer of an image are highly subjective, they are rarely indexed. However there are situations when it would be helpful if images could be retrieved based on their emotional content. We investigate and develop methods to extract and combine low-level features that represent the emotional content of an image, and use these for image emotion classification. Specifically, we exploit theoretical and empirical concepts from psychology and art theory to extract image features that are specific to the domain of artworks with emotional expression. For testing and training, we use three data sets: the International Affective Picture System (IAPS); a set of artistic photography from a photo sharing site (to investigate whether the conscious use of colors and textures displayed by the artists improves the classification); and a set of peer rated abstract paintings to investigate the influence of the features and ratings on pictures without contextual content. Improved classification results are obtained on the International Affective Picture System (IAPS), compared to state of the art work.
This paper does not relate specifically to film, but I include it anyway because it is interesting to read alongside the other papers listed here and in the context of cognitive film theory. The PDF linked to for this paper is over 10MB, so it may be quite slow to download.
Soleymani M, Chanel G, Kierkels JJK, and Pun T 2008 Affective characterization of movie scenes based on multimedia content analysis and user’s physiological emotional responses, IEEE International Symposium on Multimedia, 15-17 December 2008, Berkeley, California, USA [Abstract only].
In this paper, we propose an approach for affective representation of movie scenes based on the emotions that are actually felt by spectators. Such a representation can be used for characterizing the emotional content of video clips for e.g. affective video indexing and retrieval, neuromarketing studies, etc. A dataset of 64 different scenes from eight movies was shown to eight participants. While watching these clips, their physiological responses were recorded. The participants were also asked to self-assess their felt emotional arousal and valence for each scene. In addition, content-based audio- and video-based features were extracted from the movie scenes in order to characterize each one. Degrees of arousal and valence were estimated by a linear combination of features from physiological signals, as well as by a linear combination of content-based features. We showed that a significant correlation exists between arousal/valence provided by the spectator’s self-assessments, and affective grades obtained automatically from either physiological responses or from audio-video features. This demonstrates the ability of using multimedia features and physiological responses to predict the expected affect of the user in response to the emotional video content.
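The two steps described in this abstract, estimating arousal or valence as a linear combination of features and then checking the correlation with self-assessments, can be sketched as follows. This is a minimal illustration using ordinary least squares, which is my assumption and not necessarily the estimation method the authors used; the function names and the toy numbers are hypothetical, and the solver assumes the features are not collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_weights(rows, targets):
    """Least-squares weights (intercept first) for targets ~ w0 + w . row."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, row):
    """Estimate the affect grade for one scene as a linear combination of its features."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Toy data: two made-up features per scene and made-up self-assessed arousal.
scene_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
self_assessed = [1.0, 3.1, 3.9, 6.0, 2.4]

weights = fit_weights(scene_features, self_assessed)
estimated = [predict(weights, row) for row in scene_features]
r = pearson(estimated, self_assessed)
```

In a real study the weights would of course be fitted on some scenes and the correlation checked on others; fitting and testing on the same five rows, as here, only shows the mechanics.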
Yoo HW and Cho SB 2007 Video scene retrieval with interactive genetic algorithm, Multimedia Tools and Applications 34 (3): 317-336. DOI: 10.1007/s11042-007-0109-8.
This paper proposes a video scene retrieval algorithm based on emotion. First, abrupt/gradual shot boundaries are detected in the video clip of representing a specific story. Then, five video features such as “average colour histogram,” “average brightness,” “average edge histogram,” “average shot duration,” and “gradual change rate” are extracted from each of the videos, and mapping through an interactive genetic algorithm is conducted between these features and the emotional space that a user has in mind. After the proposed algorithm selects the videos that contain the corresponding emotion from the initial population of videos, the feature vectors from them are regarded as chromosomes, and a genetic crossover is applied to those feature vectors. Next, new chromosomes after crossover and feature vectors in the database videos are compared based on a similarity function to obtain the most similar videos as solutions of the next generation. By iterating this process, a new population of videos that a user has in mind are retrieved. In order to show the validity of the proposed method, six example categories of “action,” “excitement,” “suspense,” “quietness,” “relaxation,” and “happiness” are used as emotions for experiments. This method of retrieval shows 70% of effectiveness on the average over 300 commercial videos.
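One generation of the retrieval loop described in this abstract might look something like the sketch below. It is only an approximation under stated assumptions: each video is reduced to a short list of numbers (in the paper the histogram features are themselves vectors), the crossover is a simple one-point crossover, similarity is negated Euclidean distance, and the function names and toy database are hypothetical. It also expects at least two user-selected videos with at least two features each.

```python
import random


def crossover(parent_a, parent_b, point):
    """One-point crossover: swap the tails of two feature vectors at `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def similarity(u, v):
    """Similarity as negated Euclidean distance (an assumed choice of function)."""
    return -(sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5)


def next_generation(selected, database, size, rng):
    """Breed the user-selected vectors, then return the names of the `size`
    database videos that lie closest to any of the resulting children."""
    children = []
    for i in range(0, len(selected) - 1, 2):
        point = rng.randrange(1, len(selected[i]))
        children.extend(crossover(selected[i], selected[i + 1], point))
    scored = {
        name: max(similarity(vec, child) for child in children)
        for name, vec in database.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:size]


# Toy database: five made-up feature values per video, all scaled to [0, 1].
database = {
    "clip_a": [0.9, 0.8, 0.7, 0.2, 0.1],
    "clip_b": [0.8, 0.9, 0.6, 0.3, 0.2],
    "clip_c": [0.1, 0.2, 0.3, 0.9, 0.8],
    "clip_d": [0.2, 0.1, 0.2, 0.8, 0.9],
}
# Pretend the user marked clip_a and clip_b as matching the emotion in mind.
chosen = [database["clip_a"], database["clip_b"]]
shortlist = next_generation(chosen, database, size=2, rng=random.Random(0))
```

The interactive part is the user: after each generation the user picks again from the shortlist, and the loop repeats until the retrieved videos match the emotion the user has in mind.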
Finally, a report from a couple of years ago in IEEE Spectrum about a jacket that lets you “feel the movies” by adding a sense of touch to the emotional events in a film.
The jacket contains 64 independently controlled actuators distributed across the arms and torso. The actuators are arrayed in 16 groups of four and linked along a serial bus; each group shares a microprocessor. The actuators draw so little current that the jacket could operate for an hour on its two AA batteries even if the system was continuously driving 20 of the motors simultaneously.
So what can the jacket make you feel? Can it cause a viewer to feel a blow to the ribs as he watches Bruce Lee take on a dozen thugs? No, says Lemmens. Although the garment can simulate outside forces, translating kicks and punches is not what the actuators are meant to do. The aim, he says, is investigating emotional immersion.
The article can be accessed here.
Posted on January 27, 2011, in Cinemetrics, Cognitive Film Theory, Emotion, Film Analysis, Film Studies, Film Style, Film Technology, Film Theory and tagged Cinemetrics, Cognitive Film Theory, Film Analysis, Film Studies, Film Style, Film Technology, Film Theory.