Sound(scape) design in virtual production: using location sound in the studio

Dr. Jodi Nelson-Tabor
8 min read · Jun 28, 2021


Written by Ian Thompson

Ian Thompson, Sound Designer, on the set of the ‘How To Be Good’ project in the Stockwell Street Studios at the University of Greenwich in London

Recording and designing sound for the micro short film ‘How to Be Good’ (H2BG) involved a similar strategy to pre-visualisation, in that sounds to accompany computer-generated location images were created during pre-production. These were then replayed in the studio during filming, enhancing the sense of virtual presence and immersion in the virtual environment on set.

A scene in Unreal Engine which was the location setting for ‘How to Be Good’

How Sound Contributes to VP

Technologies and practices underpinning virtual production (VP) are image-led. The abiding aim of VP is to create perfect motion-tracked, real-time visualisation. This requires professional collaboration between cinematographers, lighting designers, coders, hardware developers, digital artists and imaging technicians. New inter-disciplinary approaches to workflow (from film) and pipeline (from computer game development) are emerging specifically for VP, notably around the pre-visualisation (pre-viz) stage of film production.

Despite widespread coverage of technical achievements in imaging, considerations of how sound might contribute to VP seem conspicuously absent from the discourse. In conversations with several VP professionals during production of H2BG, remarks such as “I know nothing about sound” and “we just hand it over to a sound designer” were common. This is surprising, given the importance of sound in representing a film’s location on screen — and representation of location is, ultimately, what studio-based VP aims to achieve on set.

Unlike traditional film-making, where a sound designer might start in pre-production by interrogating the script and working with the writer and director to imagine how scenes will look and sound in the final cut, here the precise look and feel of the locations had already been created in Unreal Engine (UE) as part of pre-viz. This was a perfect reference point from which to begin recording and creating essential sound design assets for the final cut even before shooting began.

Location sounds were recorded and processed to accompany the computer-generated virtual environment and replayed in the studio in sync during the shoot. An off-screen sound cue was also used, triggered using QLab during filming of a scene in which the protagonist, Lily, is alerted to a sound inside a nearby hangar.

A local industrial estate served as the location recording site for the virtual set’s soundscape

Location Soundscape

Drawing on the World Soundscape Project’s taxonomy of soundscape (WSP 1977), location and other sounds created for H2BG pre-viz were categorised as:

- Keynote (the prevailing background sound in a scene)

- Soundmarks (specific sonic features such as birdsong or transport, analogous with landmarks)

- Sound signals (sirens, bells, alerts, etc.)

H2BG is set in an abandoned post-apocalyptic industrial area at night. Hence a keynote sound was relatively easy to create by recording the ambience of an architecturally similar location on a local industrial estate. This was captured using a Røde NT-SF1 ambisonic microphone, mindful that UE (from where the keynote would be played on set) supports the spatialisation of first-order ambisonics.

Having a visual reference of exactly how the environment would appear in the finished film was highly beneficial. Not only could the optimum position be selected for recording, but the audio could also be edited to remove sounds such as traffic and air-conditioning units, and processed further to match the sound aesthetically to the image, minimising the need to work on this in post.

Once edited, the keynote sound was attached to the main background layer of the environment in UE and set to play as a loop accompanying the scene in each take. For additional control, the audio output from the PC running UE in the studio was routed to stereo loudspeakers via a small mixing desk. Thus, the location sound could be muted or attenuated as necessary between takes.

Schematic of the basic on-set audio playback setup for VP

Although some spatialisation and object occlusion were applied to the keynote in UE, these were unnecessary as very little movement through the virtual environment needed to be filmed; most takes were relatively static. Shifts in spatial perspective were also too subtle to be heard on set; fixed stereo playback was sufficient.
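For illustration, here is a minimal sketch (not the tooling used on this production) of how a first-order B-format keynote recording can be folded down to a fixed stereo pair using virtual cardioid microphones. The file names are hypothetical, and an AmbiX-style file (ACN channel order W, Y, Z, X; SN3D normalisation) is assumed; the exact gains depend on the convention of the recording.

```python
# Sketch: decode a first-order ambisonic (AmbiX) keynote to fixed stereo
# for on-set playback. Assumes ACN order (W, Y, Z, X) and SN3D normalisation.
import numpy as np
import soundfile as sf

def ambix_to_stereo(infile, outfile, mic_angle_deg=45.0):
    """Fold first-order AmbiX down to stereo via a pair of virtual cardioids."""
    b_format, sample_rate = sf.read(infile)   # shape: (samples, 4)
    w, y, z, x = b_format.T                   # ACN order: W, Y, Z, X

    theta = np.radians(mic_angle_deg)
    # Virtual cardioid aimed at +theta (left) and -theta (right):
    # 0.5 * (W + cos(theta)*X + sin(theta)*Y) under SN3D normalisation.
    left = 0.5 * (w + np.cos(theta) * x + np.sin(theta) * y)
    right = 0.5 * (w + np.cos(theta) * x - np.sin(theta) * y)

    stereo = np.stack([left, right], axis=1)
    stereo /= max(1.0, np.max(np.abs(stereo)))  # avoid clipping
    sf.write(outfile, stereo, sample_rate)

# Hypothetical file names
ambix_to_stereo("keynote_ambix.wav", "keynote_stereo.wav")
```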

The only soundmark suggested by the very desolate location was an electric mains hum, attached to a power transformer object in the UE environment (post-apocalyptic setting notwithstanding, artistic license allowed us to suggest that power is coming from somewhere). This sound was also spatially localised relative to a game player’s point of view while moving through the scene, but of course VP is a different experience and the transformer was out of shot for most takes; hearing it on set was distracting rather than helpful, so it was removed.

A sound signal needed for on-set playback was a pan being dropped inside a large hangar, heard off screen. This cue alerts the character, Lily, to the potentially threatening presence of life, and she responds accordingly. The pan effect was recorded and processed using convolution reverb to represent the internal acoustic of the hangar. Again, being able to enter the virtual environment in UE and use that accurate visual reference while creating the effect was very helpful.
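The principle behind convolution reverb is simply to convolve the dry recording with an impulse response of a suitable space. A minimal sketch follows, assuming hypothetical file names and a generic large-space impulse response rather than the exact plugin chain used on H2BG.

```python
# Sketch: place the close-miked pan drop "inside" the hangar by convolving
# it with an impulse response of a large reverberant space.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

dry, sr = sf.read("pan_drop_dry.wav")        # close-miked pan effect (hypothetical)
ir, sr_ir = sf.read("hangar_impulse.wav")    # impulse response of a large space (hypothetical)
assert sr == sr_ir, "resample the IR to match the dry recording first"

# Work in mono for simplicity
if dry.ndim > 1:
    dry = dry.mean(axis=1)
if ir.ndim > 1:
    ir = ir.mean(axis=1)

wet = fftconvolve(dry, ir)                   # dry signal plus full reverberant tail
wet /= np.max(np.abs(wet))                   # normalise to avoid clipping

mix = 0.4                                    # wet/dry balance, tuned by ear
out = (1 - mix) * np.pad(dry, (0, len(wet) - len(dry))) + mix * wet
sf.write("pan_drop_hangar.wav", out, sr)
```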

Unlike the keynote, which played as part of the projected scene, the pan cue wasn’t attached in UE but was instead triggered live during the take. A laptop running QLab software was used, with the effect fired manually at precisely the right moment, enabling the actor playing Lily to respond as realistically as possible.
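On H2BG the cue was fired by hand, but the same cue could also be triggered programmatically over OSC, for example from a script or show-control surface. A minimal sketch follows, assuming QLab’s OSC control is enabled on its default port (53000) and that the pan effect is cue number 1 (the cue number and host address are hypothetical).

```python
# Sketch: remotely start a QLab cue over OSC using python-osc.
from pythonosc.udp_client import SimpleUDPClient

QLAB_HOST = "127.0.0.1"   # machine running QLab (hypothetical address)
QLAB_PORT = 53000         # QLab's default OSC port

client = SimpleUDPClient(QLAB_HOST, QLAB_PORT)
client.send_message("/cue/1/start", [])   # start cue 1: the off-screen pan drop
```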

Dialogue Recording

Dialogue was captured on set using lavalier microphones. The potential problem to overcome was an acoustic mismatch between the virtual environment and the studio, so miking closely was essential (good practice on a traditional set also). Actions and other sounds were covered with a shotgun microphone on a boom. Unfortunately, the VP hardware provider was unable to supply or receive timecode, meaning that each take had to be slated and marked in the traditional way with a clapper (somewhat ironic in this high-tech setting!).

Writer/Producer/Co-Director Jodi Nelson-Tabor watches a shot being set up on the monitor

Playing the keynote sound on set allowed dialogue to be captured without sound spill being an issue, since the exact same location sound would be used in the final soundtrack anyway. Additionally, compared with the experience of traditional location sound recording, where planes, traffic, and other noises are problematic, pre-recording the location sound eliminated the need to stop mid-take because of noise problems, to the obvious benefit of cast and crew. And, being in a studio, radio mic interference was also reduced.

Post Production

Although the main elements of the location soundscape and atmosphere were produced ahead of filming, post-production still called for basic Foley recording of footsteps, handling of objects, and other movements. Capturing clean dialogue recordings in the studio eliminated the need for ADR; however, the acoustics of the spaces in which speech is heard in the film had to be created artificially using convolution reverb. Although this affords a higher degree of control than might be the case with traditional location recording, dialogue editing remains a significant part of the post-production workflow. Also, although some time was saved in post by recording soundscape elements in advance, these still needed to be added to the final mix, along with music and other design elements.

Reflections on Sound Design in VP

In summary, after using H2BG as an exercise to consider how sound design can become part of the VP workflow, the main findings are:

- Using a keynote sound to represent the virtual environment on set adds to the sense of immersion in a scene and, as reported by one of the actors, is a helpful cue to enter ‘performance mode’ at the start of a take as the sound is faded up.

- Dynamic spatialisation in the game engine is not strictly necessary, as most takes are relatively static and don’t require sound to shift subtly in perspective in the way the imagery does.

- Routing keynote audio from the VP PC via a mixer adds essential control on set; that background sound can become tiresome after a while!

- Playing sound cues on set can help orientate actors and prompt them to respond appropriately.

- Close miking of actors in the studio potentially eliminates the need for ADR, provided caveats such as low noise levels on set and a relatively dry acoustic are met.

- The ability of VP hardware to generate and share linear timecode with camera and sound is essential if time savings are to be fully realised.

In terms of workflow, the received wisdom with VP is that post-production will, one day, be replaced by pre-viz. This is perhaps optimistic, but if the trend is towards increasing resources for pre-production, there is a risk that high-quality sound design becomes compromised, as it is still very much a post-production activity. However, by working closely with the creative team during the build of virtual environments from the outset, appropriate sound design choices can be made much sooner, and some even realised prior to filming, so there are potential workflow efficiencies to be gained.

Next Steps for Sound in VP

The on-set playback for H2BG was rather basic, with just a stereo pair of powered PA monitors. Higher quality loudspeakers and more channels would afford more sophisticated use of spatial audio, thereby enhancing the illusion of presence in the virtual environment.

Making H2BG offered a valuable insight into how sound design practice can contribute to VP. The desolate setting meant that only a location soundscape with a few elements needed to be created, enough to test basic ideas around workflow and the implementation of sound playback on set. This research can be developed further in more sophisticated VP projects featuring more vibrant soundscapes and a greater variety of soundmarks and signals.

A team of filmmakers and academics at the University of Greenwich have created a micro-short film entitled How To Be Good, in collaboration with industry leaders at Storyfutures Academy and Mo-Sys engineering, to explore and document workflows in virtual production. In this first article of a series, principal investigator Dr Jodi Nelson-Tabor discusses what virtual production means in different contexts and how producing How To Be Good sheds important light on how VP can be managed and harnessed to create films that would otherwise be cost-prohibitive and complex to shoot.

To follow the series, click on the following: 1, 2, 3, 4, 5, 6, 7, 8, 9

Follow the series/thread at #H2BG on Twitter @GREdesignSchool.

World Soundscape Project (WSP): https://www.sfu.ca/sonic-studio-webdav/WSP/index.html


Dr. Jodi Nelson-Tabor

Dr Jodi Nelson-Tabor is the Business Development and Training Manager for Final Pixel.