The Poetics of the Explanatory Audiovisual Essay

Curator's Note



How Little We Know: An Essay Film about Hoagy Carmichael (VIDEO 1, to the left) was made in 2013 primarily as a teaching resource for a new course on audiovisual film and television criticism. Intrigued by the possibilities of the audiovisual essay through my viewing of other people's work, I wanted to make one of my own, in order to learn relevant production skills and also to provide an example for students of the kind of practice they might themselves attempt (in terms of process and the final product). This video was conceived, therefore, as a training exercise that would also produce a number of pedagogic tools (my instructor was the filmmaker, Ian Robertson [see endnote i]).

These background factors affected the nature of the finished product. In terms of training, I wanted to familiarise myself with techniques that I felt lent themselves particularly well to the audiovisual essay form - for example, split screen, spotlighting, freeze-framing and slow motion. As a model for the work I expected students on the course to produce, I tried to make a video essay that fulfilled the ‘scholarly’ function identified by Erlend Lavik in his 2012 reflection on these emergent forms: ‘the ability to not just engage with complex thought, but to pull it into focus, and to articulate and communicate those ideas clearly’ [see endnote ii]. Consequently, in relation to the schema that has been so influential in the discussion of the video essay, my film would no doubt be placed much nearer the 'explanatory' than 'poetical' end of the scale [see endnote iii]. However, this does not mean that a consideration of the style of my video is inappropriate. Indeed, the following piece of critical reflection concentrates on the aesthetic choices that were made within an explanatory discourse.

In terms of the audiovisual essay's content, I chose to draw on research on the film performances of Hoagy Carmichael that I had recently disseminated in written form, in the online academic journal Movie [see endnote iv]. The fact that the video has its roots in a more traditional piece of academic writing allows me to consider through comparison what the specific qualities of the audiovisual essay might be, in terms of its potential to articulate scholarly arguments. In this essay, I want to make two types of comparison: one between aspects of the written article and their reinvention in the audiovisual essay; and the other between the source material of the movies I discuss and its repurposing in the context of the video essay.

When adapting the written essay into audiovisual form, I was keen to avoid simply offering an 'illustrated reading' of the article. In the video, the discussion of Cricket's first appearance in To Have and Have Not offers a good example of my attempt to adapt the written word into a voiceover that is choreographed appropriately with the other audiovisual content.

[Watch VIDEO 2, to the left]

This voiceover contains just over 100 words, but is adapted from a written passage that is over three times as long [see endnote v]. The reason for this is that, in the audiovisual essay, I wanted there to be a choreography between my words and the images and sounds under review, so the pace and phrasing of the voiceover have to conform with the pacing of the sequence. The adaptation of the original writing to key the narration into the rhythm of the sequence is evidenced in the difference between the first two lines of the written passage and of the voiceover. In the article, the lines read:

The camera has dollied in towards Harry who is sitting on his own in the bar. Cricket alerts his attention when he strikes up the song’s opening melody, and Harry looks up, cueing a shot of the band as he would see them from his table. (46 words)

In the voiceover, I say:

The camera dollies in towards Harry, sitting on his own in the bar. Cricket’s playing causes Harry to look up, and this cues a shot of the band from his viewpoint. (31 words)

The changes in the first line are small ones - ‘dollies in’ instead of ‘has dollied in’ and the omission of ‘who is’. Nevertheless, the changes were necessary as they got this introductory line out of the way in time for me to be able to choreograph the words ‘look up’ exactly with the moment that Harry does just that. The point being made in this passage of the audiovisual essay is that Cricket's ceding of control to the main characters is insinuated visually, even before his deference to Slim is indicated very obviously through his handing over of vocal duties. At this moment, I wanted my voiceover to be particularly synchronised to the visual cues, in order to suggest the significance of seemingly routine aspects of the characters' gestures and To Have and Have Not's editing scheme.

By the end of this passage, I wanted the commentary about the images to join up again with a discussion of the music. There is a short pause in my narration in this extract, in which you get to hear a snippet of Cricket’s playing and singing without my commentary. The reason for this is to allow the final words of this section of my voiceover – ‘the band’s easy willingness to harmonise’ – to lead in precisely to the moment at which the drummer joins in with the song: I wanted my observation to be reinforced precisely by a representation of willing musical collaboration.

The changes I have described exploit a potential in the audiovisual essay that is not available to the written essay. This is the ability to orchestrate the voiceover with images and sounds from the film, in a choreography designed to reinforce the argument being made. However, other aspects of the written passage that have not survived the cut in the translation to voiceover might signal something that the audiovisual essay is not so adept at doing. In the written passage, I make use of two quotations from other writers (George M. Wilson and Charles Emge) to suggest an alternative way of reading the scene. Could their absence in the video essay indicate that the form is not particularly suited to the quotation and exploration of other writers’ ideas in the point-counterpoint style characteristic of academic writing? When the voiceover is going with the flow of the film’s images and sounds, how can there be room for other voices to be heard?

In this example, the linear flow of the original film's images and sounds is completely preserved (though clearly the sound levels are manipulated to allow room for the voiceover). Indeed, given that the argument revolves around the sequencing of visual details, it would seem particularly dishonest at this point to manipulate the original material to suit the voiceover. However, on other occasions, the audiovisual essay does make changes to particular moments from To Have and Have Not. This brings to the fore differences between the way film moments are evoked in written form, on the one hand, and in the context of the audiovisual essay, on the other. The following clip uses split screen to show how one moment from To Have and Have Not differs from its representation in the audiovisual essay.

[Watch VIDEO 3, to the left]

In the first part of this passage from the audiovisual essay, the voiceover works ‘honestly’ in the way it choreographs with the material upon which it comments: I say the song’s ‘lyrical tone is inspired completely by the maverick qualities of Harry and Slim’, at which point the close-up of Harry appears, in its 'proper' place, and my narration pauses to allow the viewer to hear the chorus line. In this way, I make a suggestion about Harry and Slim's ownership of the song and then pause to allow the film to show how that is expressed: through the couple acknowledging each other within the space of the song, just as it reaches its key refrain.

The second part of the extract is not so honest, as it cuts in another close-up of Harry, actually taken from later in the original sequence. In this shot, Harry nods, as if to indicate that he understands how Slim intends the song to be heard by him, as a ‘sly declaration of love’. But this is my claim about Slim’s intentions rather than a definitive fact. In the audiovisual essay, the original film is manipulated to engineer a moment which shows Humphrey Bogart giving my argument the nod of approval, as much as it shows Harry gesturing to Slim. In the video essay there is then a fade-to-black, to stop the sequence short before it reaches the point at which the nodding close-up of Harry was originally placed.

I think the sly editing here can be justified: it represents in moving image form the type of summarising of the rhetoric of a scene that is typical in written commentary. Indeed, in my article, I make just such a summary of Harry’s involvement in this scene: ‘Slim sings ['How Little We Know'] as a typically cool reaction to her preceding declaration of love to Harry, her directing of lines towards him [provoking] reaction shots registering his wry response’ [see endnote vi].

However, in my writing, it is clear that I am summarising the sequence: I note the ‘reaction shots’ without attempting to place them in the second-by-second unfolding of the scene. In the comparable section of the audiovisual essay, there is no indication to the viewer that I am compressing the original sequence – in the video essay form, there is no standard way of expressing that you are editing out material from or paraphrasing the original footage. I do not think I am guilty of misrepresentation here, but there certainly could be instances where that criticism could be made. In any case, this example suggests that editing in the scholarly video essay is an ethical matter, a balancing act between the representation of original audiovisual material and the advancing of an original argument.

So far, I have discussed, first, the way the audiovisual essay ‘tampers’ with my original writing and, second, the manner in which it ‘tampers’ with the original film. In both cases, the adjustments I have considered are quite small ones: a few words here; a re-ordered shot there. I will finish by considering a more obviously bold transformation of the original film material, in order to consider what the unique rhetorical possibilities of the audiovisual essay may be. The opening of the video essay makes use primarily of material from the first action scene of To Have and Have Not, combined with music from the film's opening credits.

[Watch VIDEO 4, to the left]

The use of the opening credit music is meant to lend urgency to my narration and to suggest a movie-trailer quality. The intention here is for the audiovisual essay to seem initially to ‘succumb’ to the allure of the film’s stars and its eventful narrative, but then to redirect its attention to a character who might easily be neglected. This redirection is suggested by the winding down of the opening credit music and the emergence of ‘How Little We Know’ at the same time as slow-motion is used: a literal cue that the video essay is going to slow the film down in order to analyse it more closely.

In this opening, then, there is an effort to organise disparate materials in a way that leads the viewer into the contemplative world of the audiovisual essay. Unlike the other examples I have cited, there is no obvious point of comparison between this opening and a passage from the written article. Rather it offers an audiovisual essay-specific repurposing of existing sounds and moving images to express, through style, the adoption of a particular critical perspective.



[i] The training process involved Ian consulting with me over editing choices, but he performed the actual sound and picture editing, explaining what he was doing as he did so and recording tutorials that ran through particular techniques (which I could then practise in my own time). Ian also had a key input into editorial decisions that affected the way the film's argument was put across.

[ii] Erlend Lavik, "The Video Essay: The Future of Academic Film and Television Criticism?", Frames Cinema Journal, Issue 1, July 2012. Online at

[iii] See Christian Keathley, "La Caméra-stylo: Notes on video criticism and cinephilia", in The Language and Style of Film Criticism, ed. Andrew Klevan and Alex Clayton (London and New York: Routledge, 2011).

[iv] Ian Garwood, "Play It Again Butch, Cricket, Chick, Smoke, Happy… The Performances of Hoagy Carmichael as a Hollywood Barroom Pianist", Movie - a Journal of Film Criticism, Issue 4, 2013. Online at:

[v] The relevant passage is found on p35 of my article (from 'The camera has dollied in towards Harry…' in the final full paragraph in the left-hand column to 'the band's easy willingness to harmonise' in the upper half of the right-hand column).

[vi] Ibid., p35.


Adrian Martin

Voice-Over Rewriting & Image Re-Editing

Many thanks for your article, Ian, it is super-interesting and illuminating, and resonates with many of the experiences Cristina Álvarez and I have had in working with re-editing films and (most recently) using voice-over. It occurs to me that it would be interesting to look at the most sophisticated examples of TV and documentary work on films & filmmakers (a large tradition which can be seen as another precursor of the audiovisual essay) to gauge how they deal with such re-writing and re-editing, as they surely have done and do all the time. I came upon an interview with the (sadly recently deceased) great critic Michael Henry Wilson, co-director of Scorsese’s PERSONAL JOURNEY THROUGH AMERICAN CINEMA (and their now unfinished British Cinema companion), and he verifies this: he says that, working in tandem with editor Thelma Schoonmaker as she cut/reworked the ‘quoted’ clips, he would ‘rewrite the narration up to 20 times to make it fit the image exactly’ (the original interview is in French), which he describes as ‘truly surgical work’! Of course, to many viewers, this work would be basically invisible, they see only the end result - which is why your piece is so valuable, in taking us ‘behind the scenes’ of the reworking process. Thanks again.
Kevin Lee

poetics and ethics

I am extremely appreciative of this essay’s sensitivity and attentiveness to what can be described as the ethical implications of this sort of video essay work and its relationship to its source materials. It’s a fascinating paradox that, on the one hand, videographic techniques allow us to directly “quote” or represent the media in question, offering a kind of direct engagement that is more vivid and even “authentic” than textual descriptions or citations… and yet your meticulous account casts aspersions on such notions and raises a heightened awareness of how someone practicing this work should not take their interpolations for granted, especially if it is to be valued as scholarship. Thank you.
Ian Garwood

Long-form documentaries and close analysis

Thank you very much for these comments, Adrian and Kevin. After Adrian mentioned Martin Scorsese’s Personal Journey through American Cinema, I started re-watching it, precisely to see how clips had been re-edited and how the voiceover was choreographed with them. I haven’t reviewed all of the documentary, but what strikes me is the general tendency to talk alongside the clip rather than engage with specific details - the clips are used representatively rather than broken down on a moment-by-moment level. I do think there is something about the short form of the audiovisual essay, of the kind hosted on Audiovisualcy, for example, that lends itself well to a more intensive form of close film analysis. Having said that, I’m sure there will be examples of long-form work that engages consistently with the details of moments, managing to combine breadth of reference with analytical depth (in fact, the films of Mark Cousins come to mind as I write this). However, there might still be something unique about the experience of seeing a moment returned to ‘obsessively’ in a compressed amount of time.
Cristina Álvarez López

Ethics and Transformation

As a self-punishment for my overlong previous comment, I will offer only a very brief observation. Like Kevin, I think that the question of ethics posed by Ian is an important one that every maker should face at some point. I think it’s essential to ask ourselves these questions, and to evaluate them in each particular case. But I also think that the audiovisual essay is, in itself, a highly transformative practice. It always implies, to a lesser or greater degree, a transformation of the source. Sometimes, such as in the case of the use of a voice-over, this transformation is more evident; other times (such as in a subtle re-edit of the kind mentioned by Ian), it isn’t. That’s why I think we need to emphasise this transformative character of the audiovisual essay, and to make clear that the fact that we are using ‘quoted fragments’ in order to produce knowledge about films doesn’t necessarily mean that the audiovisual essay is more “objective” or “faithful” than the written text. Having said this, I find it extremely helpful and enlightening to have all the materials that can help the viewer understand the decisions, transformations and approaches taken in the process of making an audiovisual essay: written texts, reflections about the process being part of the audiovisual essay itself (as Kevin and others have done), Drew Morton’s publication of the drafts of his “Bad Dads” piece, or the incorporation of extra audiovisual material in the form of comparison, as in this case.