The Ear That Dreams: Eye Tracking Sound in the Moving Image

Curator's Note

Scholars and practitioners of visual culture have often suggested that we live in an ocular age organized around the primacy and potency of looking and seeing (Berger, 1972; Evans and Hall, 1999). So pervasive is vision in everyday life, and in visual making practices and processes, that we forge identities out of shared visual signs and get drunk on the consumption of images.

In film studies, the centrality of vision to the way spectators engage with the text has been very well documented (Mulvey, 1975, 1989; Stacey, 1994; Mayne, 2002), although its primacy has always been challenged through a recognition of the synaesthetic and embodied qualities of the moving image (Sobchack, 1992, 2004; Marks, 2000). Film is a replete visual canvas, made of light, colour and movement, but every frame, every shot, is accompanied by the poetics of impressionable sound, even or especially when the frame falls silent.


In this respect, film studies scholars have acknowledged that one only has to turn off the sound while watching a film to understand how its absence impacts on the viewing experience (Doane, 1980, 1994; Whittington, 2009). With sound on and our ears attuned, we come to realize that film sound is bi-sensorial, both “a sonorous figure in the ears, and a vibration felt in the skin and the bones” (Chion, 1994: 221). Sound not only localizes and animates the moving image but registers at an emotional and embodied level on the viewer. We feel sound.

Vivian Sobchack suggests that film sound has particularly immersive and co-synesthetic affects, and can lead viewers to create “shapes” out of their hearing and feeling alone (2005). Indeed, following this focus on the affordances of sound, Elsaesser and Hagener have argued that film viewers can “hear around corners and through walls, in complete darkness and blinding brightness, even when we cannot see anything…The spectator is…a bodily being enmeshed acoustically, spatially and affectively in the filmic texture” (2010: 131-132).

While there exists a body of work that empirically engages with the way spectators respond to a film’s mise-en-scene, narrative, characterization, and ideological content, there is little research on what the eyes actually attend to, or how sound, voice, dialogue, and music impact the way a text is perceived and made meaningful by audiences. What do we hear when we see? What do we feel when we listen?


The empirical research that exists on the qualities and effects of sound comes largely from music studies, social psychology, and cognitive science (see Juslin and Laukka, 2004). Moreover, a great deal of audience research within film studies has historically either rested on an imagined viewer or involved qualitative research based on memory work, interviews, and focus groups. The rise of the video essay as a mode of research and method of analysis has tended to continue this dominant concern with the primacy of the image.

There exists, then, an empirical research gap around the specifics of viewing and hearing film, and the complex relationship between the two, that this audio-visual essay takes a step towards filling. The eye tracking technology employed affords the opportunity to generate new empirical data about what viewers actually gaze at, for what length of time, and with what levels of intensity. It has also allowed us to quantitatively map the relationship between sound and vision, hearing and seeing, and to determine how and where a relationship emerges between what was heard and what spectators gazed at. There are limitations, nonetheless, in the use of eye tracking technology, which this audio-visual essay also explores.

Drawing on cinematic theories of sound, and neuroscientific understandings of attention, comprehension, and the gaze, this video essay employs eye tracking technology in a sound on/off comparative analysis of the first five minutes of the Omaha Beach landing scene from Saving Private Ryan (Spielberg, 1998). The film was chosen as a case study because it involves complex sound design, moments of perceptual shock, internal diegetic sound, spatial and temporal shifts in sound, and heightened sonic agency.  

Six viewers were eye tracked at the Eye Tracking Lab at La Trobe University, Melbourne, and the data were analyzed through a combination of close textual analysis and the statistical interpretation of aggregate gaze patterns. The viewers were shown the sequence twice: once with its normal audio field playing, and once with the sound taken out.

In this video essay I interpret this data to answer the following questions:

To what extent do viewers’ eyes follow narrative-based sound cues?

How does the soundtrack affect viewer engagement and attention to detail?

Is there an element of prediction and predictability in the way a viewer sees and hears?

Do viewers’ eyes ‘wander’ when there is no sound to guide them where to look?

Ultimately, I ask how important sound is to the cinematic experience of vision: does the ear dream?

[1] An extended written version of this video essay has been published as: Redmond, S., Pink, S., Stadler, J., Robinson, J., Verhagen, D., & Rassell, A. (2016). Seeing, Sensing Sound: Eye Tracking Soundscapes in Saving Private Ryan and Monsters, Inc. In C. Reinhard & C. Olson (Eds.), Making Sense of Cinema: Empirical Studies into Film Spectators and Spectatorship (pp. 139-164). New York, NY: Bloomsbury.

Response to Reviewers' Comments

I would like to thank [in]Transition's peer reviewers, who reviewed the work upon submission, as well as the colleagues who offered helpful feedback prior to submission. The final published piece is thus indebted to the reviewers and colleagues who viewed and responded to the work.  

This final version has responded to all the technical and delivery questions raised and is stronger for it. However, the substantive question raised by one reviewer with regard to its overall structure, and its use of the term ‘dreaming’, has not been addressed. Here I have relied on both my critical autonomy and the highly positive view of the other peer reviewer to hold true to the ambitions of my work. Dreaming is not being drawn upon in this video essay to resurrect Metz, for example, or to draw on psychoanalytical film theory. Dreaming is being used poetically, as a creative tool to make inferences about the way we hear and see, and to complicate the reading of the empirical data generated by the eye tracking results.


Berger, J. (1972). Ways of Seeing. London: British Broadcasting Corporation and Penguin Books.

Chion, M. (1994). Audio-Vision: Sound on Screen. New York: Columbia University Press.

Doane, M. A. (1980). ‘The Voice in the Cinema: The Articulation of Body and Space’. Yale French Studies 60: 33-50.

Elsaesser, T., & Hagener, M. (2010). Film Theory: An Introduction through the Senses. London: Routledge.

Evans, J., & Hall, S. (eds.) (1999). Visual Culture: The Reader. London: Sage.

Juslin, P. N., & Laukka, P. (2004). ‘Expression, Perception, and Induction of Musical Emotions: A Review and a Questionnaire Study of Everyday Listening’. Journal of New Music Research 33(3): 217-238.

Marks, L. U. (2000). The Skin of the Film: Intercultural Cinema, Embodiment, and the Senses. Durham: Duke University Press.

Mayne, J. (2002). Cinema and Spectatorship. London: Routledge.

Mulvey, L. (1975). ‘Visual Pleasure and Narrative Cinema’. Screen 16(3): 6-18.

Mulvey, L. (1989). Visual and Other Pleasures. London: Macmillan.

Sobchack, V. C. (1992). The Address of the Eye: A Phenomenology of Film Experience. New Jersey: Princeton University Press.

Sobchack, V. (2005). ‘When the Ear Dreams: Dolby Digital and the Imagination of Sound’. Film Quarterly 58(4): 2-14.

Stacey, J. (1994). Star Gazing: Hollywood Cinema and Female Spectatorship. London/New York: Routledge.

Whittington, W. (2009). Sound Design and Science Fiction. Austin: University of Texas Press.