


The Magic Of Mixing

The immersive experience is becoming increasingly important in audio for film, television and games. Mixing sound for these formats requires professionals to produce high-quality audio regardless of the final playback system. It was once simply a matter of adding sound effects, music and atmosphere to accompany visual media, but today’s sound mixers need to provide high-quality audio that works on playback systems ranging from speakers and sound bars through to Dolby Atmos. Then there is immersive audio for games and the explosion in AR (augmented reality) and VR (virtual reality) applications. How do mix professionals approach providing quality audio for these various media?


The format in which TV shows and films are mixed comes down to the demands of the individual production, both in terms of the budget and the principal market for that particular show. A low-budget film might be mixed in 5.1 at cinema/theatrical level with a view to screenings at film festivals and a cinema release. It would not make sense to mix in 7.1 or Dolby Atmos unless the film contains a lot of material that would benefit from those formats’ capabilities.


Dan Johnson, Senior Re-Recording Mixer at Molinaire, explains: “As a mixer it would be my preferred option to mix in the highest resolution, whether that be Atmos, 5.1 theatrical (with its high dynamic range), 5.1 broadcast or stereo broadcast. It is often easier and more efficient to down mix to lower-resolution formats than to up mix, which often requires unpicking and redoing work already done in order to achieve the optimum result and make full use of the format. But very often the best option is to do all the primary work, and get the mix signed off, in the format it is most likely to be seen and heard. This way the majority of the creative work is done in the format that the majority of people will hear it in. Any other deliverable versions can then use this as a starting point.”
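For the technically minded, the down mix Johnson describes can be sketched in a few lines of Python. This is an illustration only: the -3 dB centre and surround coefficients follow the common ITU-style fold-down, not any particular facility’s chain, and the channel labels are assumed.

```python
import numpy as np

def downmix_51_to_stereo(ch):
    """Fold a 5.1 mix down to stereo using common ITU-style
    coefficients (centre and surrounds attenuated by ~3 dB).
    `ch` maps channel names to equal-length sample arrays."""
    g = 10 ** (-3 / 20)  # -3 dB, roughly 0.708
    left = ch["L"] + g * ch["C"] + g * ch["Ls"]
    right = ch["R"] + g * ch["C"] + g * ch["Rs"]
    return left, right  # the LFE is usually discarded in a stereo fold-down

# Toy example: centre-only dialogue folds equally into both channels.
n = 4
silent = np.zeros(n)
mix = {"L": silent, "R": silent, "C": np.ones(n),
       "Ls": silent, "Rs": silent, "LFE": silent}
L, R = downmix_51_to_stereo(mix)
```

A centre-only dialogue signal appears identically in both stereo channels at roughly -3 dB, which is why deriving the lower-resolution version is the easy direction; going the other way, there is no mechanical rule for un-summing those channels.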


The vast majority of TV shows and films are recorded as multi-track digital files; these files will usually contain one or more boom recordings, from zero to six radio microphone (or iso) recordings and a mix track. “It is impractical for the picture editor to edit with all of the multi-track files, and they do not have time to choose microphones, so they will usually use the mix track,” continues Johnson. “When the dialogue editor starts work it is often required of them to conform the mix track back to the multi-track files to allow access to all the microphones.”


The dialogue editor uses their skill, technique and experience to get the most from the audio rushes, using alternative takes, editing, noise reduction and a whole host of other techniques to make the dialogue clear, smooth and appropriate to the production. “As a mixer we have many tools at our disposal to change and process the production audio if required. Part of the skill is being able to do very focused work to enable the maximum benefit while doing the least damage. Common tools involve compression for controlling the dynamic range; this could, for example, stop the audience having to continually change the volume of the TV while watching a programme. EQ can be used in two ways, both as a technical and also as a creative tool for effect or to make things sound pleasant. Very often when recordings of quiet dialogue are turned up for clarity it makes particular sounds more prominent than they usually are. It is common to use a de-esser to reduce these sounds and make the dialogue ‘softer’ and less tiring to listen to. Other techniques include the use of reverb to control perspective and to create acoustic spaces to match the locations on screen.”
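As a rough illustration of the dynamic-range control described above, here is a deliberately simplified static compressor. The threshold and ratio are arbitrary example values; a real dialogue compressor adds attack/release smoothing and make-up gain.

```python
import numpy as np

def compress(x, threshold_db=-20.0, ratio=4.0):
    """Very simplified static compressor: any sample whose level
    exceeds the threshold is reduced according to the ratio.
    (Illustrative only -- no attack/release envelope, no make-up gain.)"""
    eps = 1e-12  # avoid log of zero
    level_db = 20 * np.log10(np.abs(x) + eps)
    over = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)  # reduce the overshoot by 3/4 at 4:1
    return x * 10 ** (gain_db / 20)

x = np.array([0.01, 0.5, 1.0])  # quiet, loud and very loud samples
y = compress(x)
```

Loud samples are pulled back towards the threshold while quiet ones pass untouched, which is exactly the behaviour that saves the audience from continually riding the TV volume.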


When discussing mixing, Johnson could write a book! “As a TV and film mixer the main thing I consider when mixing is dialogue. The job of the audio department is to use sound to tell the story as effectively as possible, and the majority of the storytelling information usually comes from the dialogue. I want the dialogue to be, at the very least, clear and audible, and I also want to hear the nuances of the actor’s performance. Evolution has left us very sensitive to human speech and (just as a colourist will pay particular attention to skin tone) this means I pay particular attention to the tonal qualities and clarity of the dialogue recordings. I will always mix the dialogues first, and getting these right gives a really solid basis on which to build the rest of the mix. The music, sound effects and foley can then be mixed around and against the dialogue, using the dialogue as the reference or anchor point for the mix.”


“It is not unusual, particularly on a feature film, to deliver many different mixes of the film in a considerable number of formats. This could be from Dolby Atmos with a very high dynamic range all the way down to a stereo broadcast compliant mix with a much reduced dynamic range and everything in between. It is not always practical or possible to review or mix on many different playback systems. An experienced mixer will probably do most of their mixing and decision making on some good, high-quality monitoring that they know well and this will be a good reference. Because of their experience they will be able to make good judgements about how the various mixes will translate to different types of playback system. It never does any harm to check mixes on small speakers and make adjustments. However, the better the monitoring and the more transparent it is, the better the chance of making good mixing choices that will translate to as many different systems as possible.”


Douglas Sinclair of Bang Post Production works mainly in stereo and 5.1 for TV, and 5.1 or 7.1 for film; however, he is finding that Dolby Atmos is now being requested more often for both TV and film. “When I receive the audio files, the AAF (Advanced Authoring Format) from the edit is matched in Pro Tools to the rushes to make them workable,” says Sinclair. “We select the best channel for the scene or character and then further ‘clean’ that if needed. The amount of treatment varies depending on the quality of the location audio, from none on quiet location or studio recordings to extensive for noisy locations or sets. The tracks are then pre-mixed (EQ, compression, reverb, level), and each group of premixes (dialogue, atmospheres, FX, music) is final mixed to give the overall soundtrack.”


When mixing and mastering, Sinclair has a slightly different approach. “For drama we need to tell the story. Within that, dialogue intelligibility is key, along with dynamic use of FX and music to propel the emotion. For documentary we again need clear dialogue and commentary, but with the overall emphasis on factual realism, so the viewer feels they can trust the information they are being given.”


For score mixing and music editing, the composer often delivers the full mix and stems in either stereo or 5.1. “The mix is always used first as that is how the composer intends the music to be heard,” adds Sinclair. “However, occasionally something may not work quite as intended, so having the stems allows the music mix to be altered in a way that will work more sympathetically within the overall sound mix and so get the best out of everything. If a cue has to be edited, either because of a duration change in the pictures or to get a different emotional feel, then it’s best to consult the composer first if possible. They can often offer something quite quickly as they have all the original material. Otherwise, editing the stems is the next best thing, as ‘bumpy’ edits can be avoided by cross-fading different elements of the cue at different points to help smooth the transitions.”
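The cross-fading Sinclair mentions can be sketched as below. This is a minimal illustration using an equal-power fade; the clip contents and fade length are arbitrary example values.

```python
import numpy as np

def crossfade(a, b, fade_len):
    """Join two audio clips with an equal-power crossfade over
    `fade_len` samples, avoiding the click of a hard 'bumpy' edit."""
    t = np.linspace(0.0, np.pi / 2, fade_len)
    fade_out, fade_in = np.cos(t), np.sin(t)  # equal-power curves
    overlap = a[-fade_len:] * fade_out + b[:fade_len] * fade_in
    return np.concatenate([a[:-fade_len], overlap, b[fade_len:]])

# Toy example: join a 100-sample clip to an 80-sample clip over 20 samples.
a = np.ones(100)
b = np.ones(80) * 0.5
out = crossfade(a, b, 20)
```

Applying the fade to each stem at a slightly different point, as Sinclair describes, scatters the joins across the cue so no single moment exposes the edit.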


With the ever-increasing range of playback systems, Sinclair ensures he gets it right by listening to the mix on the medium it’s intended for. It’s also important to check on the different systems as the mix progresses: if it’s primarily for TV, for example, A/B the mix between the main monitors and a TV (or various TVs). The same applies if it’s going to be streamed or played on a laptop. When mixing, Sinclair’s favourite bits of kit are the FabFilter EQ and de-esser. “These are very smooth, transparent and musical. They are excellent corrective and creative tools.”


Games are often mixed in stereo, 5.1 or 7.1 surround, as most people will play on headphones or on TVs. Jim Croft, Head of Audio at Frontier Developments, says: “In terms of software, we mix in Wwise. Wwise is a third-party audio engine and interfacing tool made by Audiokinetic and used widely in games today. We run Wwise alongside our game and mix on the fly while playing the game. It’s a lengthy process, an end mix usually taking around 2 to 4 weeks depending on its complexity.”


What is traditionally considered end-of-project mixing needs, to an extent, to happen throughout a game’s development. In games there is less certainty about how all the sounds in the mix relate to one another, and no way to predict what happens next: the game has to be treated as “live” all the time because it is driven entirely by user input. “One pillar in game mixing is to make it rule based,” adds Croft. “Distance to an object, how important/informative it is to a player, even relative azimuth to the camera is used as a guideline for automated ‘run-time’ mixing. For the final pass, we mix in one of our two 7.1-capable mix rooms. Having a pro-level mix space is as important for us as it is for mixers in music or film. Our games are played on a multitude of speaker types and headphones, so we need to know that what we are mixing on is absolutely accurate.”
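Croft’s rule-based approach could look something like the sketch below. The weighting function and its constants are purely illustrative assumptions, not Frontier’s actual rules, but they show distance, importance and azimuth feeding a single run-time gain.

```python
import math

def runtime_gain(distance, importance, azimuth_deg,
                 ref_dist=1.0, rolloff=1.0):
    """Sketch of a rule-based 'run-time' mix gain: quieter with
    distance (inverse-distance law), scaled by how important the
    event is to the player, and slightly ducked when the source
    sits behind the camera. All weights here are illustrative."""
    dist_gain = ref_dist / max(distance * rolloff, ref_dist)
    focus = 0.5 + 0.5 * math.cos(math.radians(azimuth_deg))  # 1 ahead, 0 behind
    behind_duck = 0.7 + 0.3 * focus  # never fully mute rear sources
    return dist_gain * importance * behind_duck

# An important event dead ahead at reference distance plays at full gain;
# the same event ten units away, or directly behind the camera, is attenuated.
front_near = runtime_gain(1.0, 1.0, 0.0)
front_far = runtime_gain(10.0, 1.0, 0.0)
rear_near = runtime_gain(1.0, 1.0, 180.0)
```

Because the function is evaluated continuously from game state, the mix rebalances itself as the player moves, which is what makes a “live”, user-driven mix tractable.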


Mixing an interactive soundscape comes with many challenges, because interactive environments are changing all the time. Frontier makes games, including Planet Coaster and Jurassic World Evolution, that allow the player to shift perspective at will. “We have to deal with that,” said Croft. “We have to make a game that always sounds good, from any perspective and under any circumstances. This is a hugely challenging task. We sometimes envy our linear media cousins, who only have to worry about one perspective at a time! We have to create audio systems that can deal with both scale and detail and still sound great at both ends of the spectrum.”


With AR and VR becoming increasingly important, Croft has to ensure that his mixes support the evolving playback systems. “The biggest effect that VR has on Elite Dangerous audio is on a technical implementation level. The stereo field is now also influenced by head movement, and we’ve had to make sure that UI (user interface) items exist on positional emitters so that they reinforce their visual positions in space. We have never set out to create VR-specific content, though, because our surround mixes are already very detailed and we have positional emitters on pretty much everything. Occasionally we need to offset an emitter’s position if a UI element feels too close. Without offsetting, that can feel claustrophobic and also lead to strangeness when a player leans their whole body ‘through’ the plane of the UI element. Technically, VR physically gets in the way of mixing. So we still prioritise a good mix first and foremost, then decide which parts of the mix require HRTF-related positioning and which can be positioned conventionally.”
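The emitter offset Croft describes might be sketched like this. The minimum-distance threshold is an assumed illustrative value, not a figure from Elite Dangerous.

```python
import numpy as np

def offset_ui_emitter(emitter_pos, head_pos, min_dist=0.5):
    """If a positional UI emitter sits closer to the listener's head
    than `min_dist`, push it back out along the same direction so the
    element doesn't feel claustrophobic when the player leans in.
    (Illustrative sketch; the threshold is an assumption.)"""
    v = emitter_pos - head_pos
    d = float(np.linalg.norm(v))
    if d >= min_dist or d == 0.0:  # far enough away, or no usable direction
        return emitter_pos
    return head_pos + v * (min_dist / d)

# A UI emitter 0.1 units from the head gets pushed out to the threshold.
ui = offset_ui_emitter(np.array([0.1, 0.0, 0.0]), np.zeros(3))
```

The direction to the element is preserved, so its sound still reinforces its visual position; only the uncomfortable proximity is removed.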


There has been a shift towards better audio in games, and expectations have grown exponentially over the years. Quality comparisons are now made with film audio, but with the added challenge of a non-linear medium. “Ironically, audio has been lagging behind in terms of hardware,” adds Croft. “It took a long while before sound in games reached current Blu-ray quality and allowed enough ‘voices’ to accurately represent all the visuals on screen. Middleware like Wwise and FMOD has allowed sound designers greater creative freedom and the ability to use processing in real time. We are now at a stage where we can pretty much create what we want, and at very high quality.”


As technology moves forward Croft reflects on where it may go next. “I think whatever the delivery medium we will always require strong sonic storytelling and technical innovation. The hardware is only as good as the software running through it. Of course, the more immersive the delivery medium, the better. I’m all for that.”


As technology moves forward from the current 5.1, 7.1, Dolby Atmos, AR and VR, we have to ask where it will go next and how it will impact the mixing process. From a practical perspective, one of the biggest challenges is the number of deliverables, i.e. full mixes, M&Es, stems, etc. Outputting all the various assets required by broadcasters or distributors can take as much work as the mix itself, or more.


The process is becoming more streamlined, and Dolby have done some great work to make it much easier to derive the 7.1, 5.1 and stereo down mixes from the Atmos mix, so mixing natively in Atmos will become the norm for some productions. One of the areas in which we will see technology moving forward in the mixing arena is the use of much more automation. There are already many pieces of software that aim to perform elements of audio processing automatically and quickly. Beyond that, it is suggested that more control over the ‘mix’ will be handed to the consumer audience. As streaming and metadata become increasingly sophisticated, and as game audio and post-production audio technology converge, it’s likely consumers will soon be able to tailor the content, look and sound of whatever they watch to their own taste.