Category Archives: Audio Mixing

The sounds of Spider-Man: Homecoming

By Jennifer Walden

Columbia Pictures and Marvel Studios’ Spider-Man: Homecoming, directed by Jon Watts, casts Tom Holland as Spider-Man, a role he first played in 2016 for Marvel Studios’ Captain America: Civil War (directed by Joe and Anthony Russo).

Homecoming reprises a few key character roles, like Tony Stark/Iron Man (Robert Downey Jr.) and Aunt May Parker (Marisa Tomei), and it picks up a thread of Civil War’s storyline. In Civil War, Peter Parker/Spider-Man helped Tony Stark’s Avengers in their fight against Captain America’s Avengers. Homecoming picks up after that battle, as Parker settles back into his high school life while still fighting crime on the side to hone his superhero skills. He seeks to prove himself to Stark but ends up becoming entangled with the supervillain Vulture (Michael Keaton).

Steven Ticknor

Spider-Man: Homecoming supervising sound editors/sound designers Steven Ticknor and Eric A. Norris — working at Culver City’s Sony Pictures Post Production Services — both brought Spidey experience to the film. Ticknor was a sound designer on director Sam Raimi’s Spider-Man (2002) and Norris was supervising sound editor/sound designer on director Marc Webb’s The Amazing Spider-Man 2 (2014). Having worked on two different versions of Spider-Man, Ticknor and Norris together brought a well-rounded knowledge of the superhero’s sound history to Homecoming. They knew what had worked in the past, and what it would take to make this Spider-Man sound fresh. “This film took a ground-up approach but we also took into consideration the magnitude of the movie,” says Ticknor. “We had to keep in mind that Spider-Man is one of Marvel’s key characters and he has a huge fan base.”

Web Slinging
Because Homecoming is a sequel, Ticknor and Norris honored the sound of Spider-Man’s web slinging ability that was established in Captain America: Civil War, but they also enhanced it to create a subtle difference between Spider-Man’s two suits in Homecoming. There’s the teched-out Tony Stark-built suit that uses the Civil War web-slinging sound, and then there’s Spider-Man’s homemade suit. “I recorded a couple of 5,000-foot magnetic tape cores unraveling very fast, and to that I added whooshes and other elements that gave a sense of speed. Underneath, I had some of the web sounds from the Tony Stark suit. That way the sound for the homemade suit had the same feel as the Stark suit but with an old-school flair,” explains Ticknor.

One new feature of Spider-Man’s Stark suit is that it has expressive eye movements. His eyes can narrow or grow wide with surprise, and those movements are articulated with sound. Norris says, “We initially went with a thin servo-type sound, but the filmmakers were looking for something less electrical. We had the idea to use the lens of a DSLR camera to manually zoom it in and out, so there’s no motor sound. We recorded it close up in the quiet environment of an unused ADR stage. That’s the primary sound for his eye movement.”

Droney
Another new feature is the addition of Droney, a small reconnaissance drone that pops off of Spider-Man’s suit and flies around. The sound of Droney was one of director Watts’ initial focus points. He wanted it to sound fun and have a bit of personality. He wanted Droney “to be able to vocalize in a way, sort of like Wall-E,” explains Norris.

Ticknor had the idea of creating Droney’s sound using a turbo toy — a small toy that has a mouthpiece and a spinning fan. Blowing into the mouthpiece makes the fan spin, which generates a whirring sound. The faster the fan spins, the higher the pitch of the generated sound. By modulating the pitch, they created a voice-like quality for Droney. Norris and sound effects editor Andy Sisul performed and recorded an array of turbo toy sounds to use during editorial. Ticknor also added in the sound of a reel-to-reel machine rewinding, which he sped up and manipulated “so that it sounded like Droney was fluttering as it was flying,” Ticknor says.

The Vulture
Supervillain the Vulture offers a unique opportunity for sound design. His alien-tech enhanced suit incorporates two large fans that give him the ability to fly. Norris, who was involved in the initial sound design of Vulture’s suit, created whooshes using Whoosh by Melted Sounds — a whoosh generator that runs in Native Instruments Reaktor. “You put individual samples in there and it creates a whoosh by doing a Doppler shift and granular synthesis as a way of elongating short sounds. I fed different metal ratcheting sounds into it because Vulture’s suit almost has these metallic feathers. We wanted to articulate the sound of all of these different metallic pieces moving together. I also fed sword shings into it and came up with these whooshes that helped define the movement as the Vulture was flying around,” he says. Sound designer/re-recording mixer Tony Lamberti was also instrumental in creating Vulture’s sound.
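To illustrate the two techniques Norris mentions (granular elongation of a short sound and a Doppler-style pitch sweep), here is a rough Python/NumPy sketch. It is not the Melted Sounds plug-in or the production workflow; the parameter values and the metal_ratchet_sample name are assumptions for demonstration only.

```python
import numpy as np

def granular_stretch(x, stretch=4.0, grain=2048):
    """Elongate a short mono sound (at least one grain long) by overlap-adding Hann-windowed grains."""
    hop_in = grain // 4                        # step through the source in small increments
    hop_out = int(hop_in * stretch)            # space the grains further apart in the output
    win = np.hanning(grain)
    n_grains = max(1, (len(x) - grain) // hop_in)
    out = np.zeros(n_grains * hop_out + grain)
    for i in range(n_grains):
        out[i * hop_out : i * hop_out + grain] += x[i * hop_in : i * hop_in + grain] * win
    return out / (np.max(np.abs(out)) + 1e-9)  # normalize

def doppler_sweep(x, rate_start=1.3, rate_end=0.7):
    """Fake a fly-by by resampling with a playback rate that falls over the sound's length."""
    rates = np.linspace(rate_start, rate_end, len(x))
    read_pos = np.cumsum(rates)                # time-varying read position, in samples
    read_pos = read_pos[read_pos < len(x) - 1]
    return np.interp(read_pos, np.arange(len(x)), x)

# usage (metal_ratchet_sample is a hypothetical mono NumPy array):
# whoosh = doppler_sweep(granular_stretch(metal_ratchet_sample))
```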

Alien technology is prevalent in the film. For instance, it’s a key ingredient to Vulture’s suit. The film’s sound needed to reflect the alien influence but also had to feel realistic to a degree. “We started with synthesized sounds, but we then had to find something that grounded it in reality,” reports Ticknor. “That’s always the balance of creating sound design. You can make it sound really cool, but it doesn’t always connect to the screen. Adding organic elements — like wind gusts and debris — makes it suddenly feel real. We used a lot of synthesized sounds to create Vulture, but we also used a lot of real sounds.”

The Washington Monument
One of the big scenes that Ticknor handled was the Washington Monument elevator sequence. Spider-Man stands on the top of the Washington Monument and prepares to jump over a helicopter that looms ever closer. He clears the helicopter’s blades and shoots a web onto the helicopter’s skid, using that to sling himself through a window just in time to shoot another web that grabs onto the compromised elevator car that contains his friends. “When Spider-Man jumps over the helicopter, I couldn’t wait to make that work perfectly,” says Ticknor. “When he is flying over the helicopter blades it sounds different. It sounds more threatening. Sound creates an emotion but people don’t realize how sound is creating the emotion because it is happening so quickly sometimes.”

To achieve a more threatening blade sound, Ticknor added in scissor slicing sounds, which he treated using a variety of tools like zPlane Elastique Pitch 2 and plug-ins from FabFilter and Soundtoys, all within the Avid Pro Tools 12 environment. “This made the slicing sound like it was about to cut his head off. I took the helicopter blades and slowed them down and added low-end sweeteners to give a sense of heaviness. I put all of that through the plug-ins and basically experimented. The hardest part of sound design is experimenting and finding things that work. There’s also music playing in that scene as well. You have to make the music play with the sound design.”

When designing sounds, Ticknor likes to generate a ton of potential material. “I make a library of sound effects — it’s like a mad science experiment. You do something and then wonder, ‘How did I just do that? What did I just do?’ When you are in a rhythm, you do it all because you know there is no going back. If you just do what you need, it’s never enough. You always need more than you think. The picture is going to change and the VFX are going to change and timings are going to change. Everything is going to change, and you need to be prepared for that.”

Syncing to Picture
To help keep the complex soundtrack in sync with the evolving picture, Norris used Conformalizer by Cargo Cult. Using the EDL of picture changes, Conformalizer makes the necessary adjustments in Pro Tools to resync the sound to the new picture.
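Conformalizer’s internals and file formats aren’t described here, but the underlying conform idea is simple enough to sketch. Below is a minimal, hypothetical Python illustration, assuming a change list that maps spans of the old cut to their positions in the new cut; every sound event whose start time falls inside a mapped span is shifted by that span’s offset.

```python
from dataclasses import dataclass

@dataclass
class Change:
    old_start: float   # seconds in the previous picture version
    old_end: float
    new_start: float   # where that span begins in the new picture version

def conform(event_starts, changes):
    """Return new start times for sound events; events outside any mapped span are left alone."""
    conformed = []
    for t in event_starts:
        for c in changes:
            if c.old_start <= t < c.old_end:
                conformed.append(t + (c.new_start - c.old_start))
                break
        else:
            conformed.append(t)  # falls in removed or untouched material; flag for manual review
    return conformed

# usage: a 12-second insert was added at 60s, so later events slide down by 12 seconds
print(conform([10.0, 95.5], [Change(0.0, 60.0, 0.0), Change(60.0, 1000.0, 72.0)]))  # [10.0, 107.5]
```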

Norris explains some key benefits of Conformalizer. “First, when you’re working in Pro Tools you can only see one picture at a time, so you have to go back and forth between the two different pictures to compare. With Conformalizer, you can see the two different pictures simultaneously. It also does a mathematical computation on the two pictures in a separate window, a difference window, which shows the differences in white. It highlights all the subtle visual effects changes that you may not have noticed.

Eric Norris

For example, in the beginning of the film, Peter leaves school and heads out to do some crime fighting. In an alleyway, he changes from his school clothes into his Spider-Man suit. As he’s changing, he knocks into a trash can and a couple of rats fall out and scurry away. Those rats were CG and they didn’t appear until the end of the process. So the rats in the difference window were bright white while everything else was a dark color.”

Another benefit is that the Conformalizer change list can be used on multiple Pro Tools sessions. Most feature films have the sound effects, including Foley and backgrounds, in one session. For Spider-Man: Homecoming, it was split into multiple sessions, with Foley and backgrounds in one session and the sound effects in another.

“Once you get that change list you can run it on all the Pro Tools sessions,” explains Norris. “It saves time and it helps with accuracy. There are so many sounds and details that match the visuals and we need to make sure that we are conforming accurately. When things get hectic, especially near the end of the schedule, and we’re finalizing the track and still getting new visual effects, it becomes a very detail-oriented process and any tools that can help with that are greatly appreciated.”

Creating the soundtrack for Spider-Man: Homecoming required collaboration on a massive scale. “When you’re doing a film like this, it just has to run well. Unless you’re really organized, you’ll never be able to keep up. That’s the beautiful thing, when you’re organized you can be creative. Everything was so well organized that we got an opportunity to be super creative and for that, we were really lucky. As a crew, we were so lucky to work on this film,” concludes Ticknor.


Jennifer Walden is a New Jersey-based audio engineer and writer. Follow her on Twitter @audiojeney.

Nugen adds 3D Immersive Extension to Halo Upmix

Nugen Audio has updated its Halo Upmix with a new 3D Immersive Extension, adding further options beyond the existing Dolby Atmos bed track capability. The 3D Immersive Extension now provides ambisonic-compatible output as an alternative to channel-based output for VR, game and other immersive applications. This makes it possible to upmix, re-purpose or convert channel-based audio for an ambisonic workflow.

With this 3D Immersive Extension, Halo fully supports Avid’s newly announced Pro Tools v12.8, now with native 7.1.2 stems for Dolby Atmos mixing. The combination of Pro Tools 12.8 and the Halo 3D Immersive Extension can provide a more fluid workflow for audio post pros handling multi-channel and object-based audio formats.

Halo Upmix is available immediately at a list price of $499 for both OS X and Windows, with support for Avid AAX, AudioSuite, VST2, VST3 and AU formats. The new 3D Immersive Extension replaces the Halo 9.1 Extension and can now be purchased for $199. Owners of the existing Halo 9.1 Extension can upgrade to the Halo 3D Immersive Extension for no additional cost. Support for native 7.1.2 stems in Avid Pro Tools 12.8 is available on launch.


Behind the Title: Nylon Studios creative director Simon Lister

NAME: Simon Lister

COMPANY: Nylon Studios

CAN YOU DESCRIBE YOUR COMPANY?
Nylon Studios is a New York- and Sydney-based music and sound house offering original composition and sound design for films and commercials. I am based in the Australian location.

WHAT’S YOUR JOB TITLE?
Creative Director

WHAT DOES THAT ENTAIL?
I help manage and steer the company, while also serving as a sound designer, client liaison, soundtrack creative and thinker.

WHAT WOULD SURPRISE PEOPLE THE MOST ABOUT WHAT FALLS UNDER THAT TITLE?
People are constantly surprised with the amount of work that goes into making a soundtrack.

WHAT TOOLS DO YOU USE?
I use Avid Pro Tools and some really cool plug-ins.

WHAT’S YOUR FAVORITE PART OF THE JOB?
My favorite part of the job is being able to bring a film to life through sound.

WHAT’S YOUR LEAST FAVORITE?
At times, clients can be so stressed and make things difficult. However, sometimes we just need to sit back and look at how lucky we are to be in such a fun industry. So in that case, we try our best to make the client’s experience with us as relaxing and seamless as possible.

WHAT IS YOUR FAVORITE TIME OF THE DAY?
Lunchtime.

IF YOU DIDN’T HAVE THIS JOB, WHAT WOULD YOU BE DOING INSTEAD?
Anything that involves me having a camera in my hand and taking pictures.

HOW EARLY ON DID YOU KNOW THIS WOULD BE YOUR PATH?
I was pretty young. I got a great break when I was 19 years old in one of the best music studios in New Zealand and haven’t stopped since. Now, I’ve been doing this for 31 years (cough).

Honda Civic spot

CAN YOU NAME SOME RECENT PROJECTS YOU HAVE WORKED ON?
In the last couple of months I think I’ve counted several different car brand spots we’ve worked on, including Honda, Hyundai, Subaru, Audi and Toyota. All great spots to sink our teeth and ears into.

We have also been working on the great wildlife series Tales by Light, which airs on National Geographic and Netflix.

For Every Child

WHAT IS THE PROJECT THAT YOU ARE MOST PROUD OF?
It would be having the opportunity to film and direct my own commercial, For Every Child, for UNICEF’s global rebranding TVC. We had the amazing voiceover of Liam Neeson and the incredible singing voice of Lisa Gerrard (Gladiator, Heat, Black Hawk Down).

NAME THREE PIECES OF TECHNOLOGY YOU CAN’T LIVE WITHOUT.
My camera, my computer and my motorbike.

WHAT DO YOU DO TO DE-STRESS FROM IT ALL?
I ride motorbikes through Morocco, Baja, the Himalayas, Mongolia, Vietnam, Thailand and New Zealand, and in the traffic of India.


VR Audio — Differences between A Format and B Format

By Claudio Santos

A Format and B Format. What is the difference between them after all? Since things can get pretty confusing, especially with such non-descriptive nomenclature, we thought we’d offer a quick reminder of what each is in the spatial audio world.

A Format and B Format are two audio signal formats that are part of the ambisonics workflow.

A Format is the raw recording of the four individual cardioid capsules in an ambisonics microphone. Since each microphone model uses different capsules at slightly different distances, A Format is somewhat specific to that model.

B Format is the standardized format derived from the A Format. The first channel carries the amplitude information of the signal, while the other channels determine the directionality through phase relationships between each other. Once you get your sound into B Format you can use a variety of ambisonic tools to mix and alter it.
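As a rough illustration of that derivation, here is a minimal Python/NumPy sketch of the classic idealized first-order A-to-B matrix for a tetrahedral microphone. The capsule names (FLU, FRD, BLD, BRU) describe one common arrangement, and real microphones also need manufacturer-supplied correction filters that are omitted here.

```python
import numpy as np

def a_to_b(flu, frd, bld, bru):
    """Four capsule signals (A Format) in, W/X/Y/Z channels (B Format) out."""
    w = flu + frd + bld + bru      # omnidirectional pressure
    x = flu + frd - bld - bru      # front-back figure-of-eight
    y = flu - frd + bld - bru      # left-right figure-of-eight
    z = flu - frd - bld + bru      # up-down figure-of-eight
    return np.stack([w, x, y, z])
```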

It’s worth remembering that the B Format also has a few variations on the standard itself; the most important to understand are Channel Order and Normalization standards.

Ambisonics in B Format consists of four channels of audio — one channel carries the amplitude signal while the others represent the directionality in a sphere through phase relationships. Since this can only be achieved by combining the channels, it is important that:

– The channels follow a known order
– The relative level between the amplitude channel and the others is known, so the channels can be properly combined

Each of these characteristics has a few variations, with the most notable ones being:

– Channel Order
  – Furse-Malham standard
  – ACN standard

– Normalization (level)
  – MaxN standard
  – SN3D standard

The combination of these variations results in two different B Format standards:
– Furse-Malham – Older standard that is still supported by a variety of plug-ins and other ambisonic processing tools
– AmbiX – Modern standard that has been widely adopted by distribution platforms such as YouTube

Regardless of the format you will deliver your ambisonics file in, it is vital to keep track of the standards you are using in your chain and make the necessary conversions when appropriate. Otherwise rotations and mirrors will end up in the wrong direction and the whole soundsphere will break down into a mess.
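For first-order material, converting between the two standards amounts to a channel reorder plus a gain change on the W channel. Here is a minimal Python/NumPy sketch of that idea, assuming first-order content only; higher orders need additional per-channel factors not shown here.

```python
import numpy as np

def fuma_to_ambix_first_order(b):
    """b: array of shape (4, n_samples) in FuMa channel order W, X, Y, Z."""
    w, x, y, z = b
    w = w * np.sqrt(2.0)           # undo FuMa's -3 dB scaling of the W channel (MaxN -> SN3D)
    return np.stack([w, y, z, x])  # reorder to the ACN channel order: W, Y, Z, X
```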


Claudio Santos is a sound editor and spatial audio mixer at Silver Sound. Slightly too interested in technology and workflow hacks, he spends most of his waking hours tweaking, fiddling and tinkering away on his computer.


Audio post vet Rex Recker joins Digital Arts in NYC

Rex Recker has joined the team at New York City’s Digital Arts as a full-time audio post mixer and sound designer. Recker, who co-founded NYC’s AudioEngine after working as VP and audio post mixer at Photomag recording studios, is an award-winning mixer with a long list of credits. Over the span of his career he has worked on countless commercials with clients including McCann Erickson, JWT, Ogilvy & Mather, BBDO, DDB, HBO and Warner Books.

Over the years, Recker has developed a following of clients who seek him out for his audio post mixer talents — they seek his expertise in surround sound audio mixing for commercials airing via broadcast, Web and cinemas. In addition to spots, Recker also mixes long-form projects, including broadcast specials and documentaries.

Since joining the Digital Arts team, Recker has already worked on several commercial campaigns, promos and trailers for such clients as Samsung, SlingTV, Ford, Culturelle, Orvitz, NYC Department of Health, and HBO Documentary Films.

Digital Arts, owned by Axel Ericson, is an end-to-end production, finishing and audio facility.


Netflix’s The Last Kingdom puts Foley to good use

By Jennifer Walden

What is it about long-haired dudes strapped with leather, wielding swords and riding horses alongside equally fierce female warriors charging into bloody battles? There is a magic to this bygone era that has transfixed TV audiences, as evidenced by the success of HBO’s Game of Thrones, History Channel’s Vikings series and one of my favorites, The Last Kingdom, now on Netflix.

The Last Kingdom, based on a series of historical fiction novels by Bernard Cornwell, is set in late 9th century England. It tells the tale of Saxon-born Uhtred of Bebbanburg who is captured as a child by Danish invaders and raised as one of their own. Uhtred gets tangled up in King Alfred of Wessex’s vision to unite the three separate kingdoms (Wessex, Northumbria and East Anglia) into one country called England. He helps King Alfred battle the invading Danes, but Uhtred’s real desire is to reclaim his rightful home of Bebbanburg from his duplicitous uncle.

Mahoney Audio Post
The sound of the series is gritty and rich with leather, iron and wood elements. The soundtrack’s tactile quality is the result of extensive Foley work by Mahoney Audio Post, who has been with the series since the first season. “That’s great for us because we were able to establish all the sound for each character, village, environment and more, right from the first episode,” says Foley recordist/editor/sound designer Arran Mahoney.

Mahoney Audio Post is a family-operated audio facility in Sawbridgeworth, Hertfordshire, UK. Arran Mahoney explains the studio’s family ties. “Clare Mahoney (mum) and Jason Swanscott (cousin) are our Foley artists, with over 30 years of experience working on high-end TV shows and feature films. My brother Billy Mahoney and I are the Foley recordists and editors/sound designers. Billy Mahoney, Sr. (dad) is the founder of the company and has been a dubbing mixer for over 40 years.”

Their facility, built in 2012, houses a mixing suite and two separate audio editing suites, each with Avid Pro Tools HD Native systems, Avid Artist mixing consoles and Genelec monitors. The facility also has a purpose-built soundproof Foley stage featuring 20 different surfaces including grass, gravel, marble, concrete, sand, pebbles and multiple variations of wood.

Foley artists Clare Mahoney and Jason Swanscott.

Their mic collection includes a Røde NT1-A cardioid condenser microphone and a Røde NTG3 supercardioid shotgun microphone, which they use individually for close-micing or in combination to create more distant perspectives when necessary. They also have two other studio staples: a Neumann U87 large-diaphragm condenser mic and a Sennheiser MKH-416 short shotgun mic.

Going Medieval
Over the years, the Mahoney Foley team has collected thousands of props. For The Last Kingdom specifically, they visited a medieval weapons maker and bought a whole armory of items: swords, shields, axes, daggers, spears, helmets, chainmail, armor, bridles and more. And it’s all put to good use on the series. Mahoney notes, “We cover every single thing that you see on-screen as well as everything you hear off of it.” That includes all the feet (human and horses), cloth, and practical effects like grabs, pick-ups/put downs, and touches. They also cover the battle sequences.

Mahoney says they use 20 to 30 tracks of Foley just to create the layers of detail that the battle scenes need. Starting with the cloth pass, they cover the Saxon chainmail and the Vikings’ leather and fur armor. Then they do basic cloth and leather movements to cover non-warrior characters and villagers. They record a general weapons track, played at low volume, to provide a base layer of sound.

Next they cover the horses from head to hoof, with bridles and saddles, and Foley for the horses’ feet. When asked what’s the best way to Foley horse hooves, Mahoney asserts that it is indeed with coconuts. “We’ve also purchased horseshoes to add to the stable atmospheres and spot FX when required,” he explains. “We record any abnormal horse movements, i.e. crossing a drawbridge or moving across multiple surfaces, and sound designers take care of the rest. Whenever muck or gravel is needed, we buy fresh material from the local DIY stores and work it into our grids/pits on the Foley stage.”

The battle scenes also require Foley for all the grabs, hits and bodyfalls. For the blood and gore, they use a variety of fruit and animal flesh.

Then there’s a multitude of feet to cover the storm of warriors rushing at each other. All the boots they used were wrapped in leather to create an authentic sound that’s true to the time. Mahoney notes that they didn’t want to capture “too much heel in the footsteps, while also trying to get a close match to the sync sound in the event of ADR.”

Surfaces include stone and marble for the Saxon castles of King Alfred and the other noble lords. For the wooden palisades and fort walls, Mahoney says they used a large wooden base accompanied by wooden crates, plinths, boxes and an added layer of controlled creaks to give an aged effect to everything. On each series, they used 20 rolls of fresh grass, lots of hay for the stables, leaves for the forest, and water for all the sea and river scenes. “There were many nights cleaning the studio after battle sequences,” he says.

In addition to the aforementioned props of medieval weapons, grass, mud, bridles and leather, Mahoney says they used an unexpected prop: “The Viking cloth tracks were actually done with samurai suits. They gave us the weight needed to distinguish the larger size of a Danish man compared to a Saxon.”

Their favorite scenes to Foley, and by far the most challenging, were the battle scenes. “Those need so much detail and attention. It gives us a chance to shine on the soundtrack. The way that they are shot/edited can be very fast paced, which lends itself well to micro details. It’s all action, very precise and in your face,” he says. But if they had to pick one favorite scene, Mahoney says it would be “Uhtred and Ragnar storming Kjartan’s stronghold.”

Another challenging-yet-rewarding opportunity for Foley was during the slave ship scenes. Uhtred and his friend are sold into slavery as rowers on a Viking ship, which holds a crew of nearly 30 men. The Mahoney team brought the slave ship to life by building up layers of detail. “There were small wood creaks with small variations of wood and big creaks with larger variations of wood. For the big creaks, we used leather and a broomstick to work into the wood, creating a deep creak sound by twisting the three elements against each other. Then we would pitch shift or EQ to create size and weight. When you put the two together it gives detail and depth. Throw in a few tracks of rigging and pulleys for good measure and you’re halfway there,” says Mahoney.

For the sails, they used a two-mic setup to record huge canvas sheets to create a stereo wrap-around feel. For the rowing effects, they used sticks, brooms and wood rubbing, bouncing, or knocking against large wooden floors and solid boxes. They also covered all the characters’ shackles and chains.

Foley is a very effective way to draw the audience in close to a character or to help the audience feel closer to the action on-screen. For example, near the end of Season 2’s finale, a loyal subject of King Alfred has fallen out of favor. He’s eventually imprisoned and prepares to take his own life. The sound of his fingers running down the blade and the handling of his knife make the gravity of his decision palpable.

Mahoney shares another example of using Foley to draw the audience in — during the scene when Sven is eaten by Thyra’s wolves (following Uhtred and Ragnar storming Kjartan’s stronghold). “We used oranges and melons for Sven’s flesh being eaten and for the blood squirts. Then we created some tracks of cloth and leather being ripped. Specially manufactured claw props were used for the frantic, ravenous wolf feet,” he says. “All the action was off-screen so it was important for the audience to hear in detail what was going on, to give them a sense of what it would be like without actually seeing it. Also, Thyra’s reaction needed to reflect what was going on. Hopefully, we achieved that.”


VR audio terms: Gaze Activation v. Focus

By Claudio Santos

Virtual reality brings a lot of new terminology to the post process, and we’re all having a hard time agreeing on the meaning of everything. It’s tricky because clients and technicians sometimes have different understandings of the same term, which is a guaranteed recipe for headaches in post.

Two terms that I’ve seen being confused a few times in the spatial audio realm are Gaze Activation and Focus. They are both similar enough to be put in the same category, but at the same time different enough that most of the time you have to choose completely different tools and distribution platforms depending on which technology you want to use.

Field of view

Focus
Focus is what the Facebook Spatial Workstation calls this technology, but it is a tricky one to name. As you may know, ambisonics represents a full sphere of audio around the listener. Players like YouTube and Facebook (which uses ambisonics inside its own proprietary .tbe format) can dynamically rotate this sphere so the relative positions of the audio elements are accurate to the direction in which the audience is looking. But the sounds don’t change noticeably in level depending on where you are looking.

If we take a step back and think about “surround sound” in the real world, it actually makes perfect sense. A hair clipper isn’t particularly louder when it’s in front of our eyes as opposed to when it’s trimming the back of our head. Nor can we ignore the annoying person who is loudly talking on their phone on the bus by simply looking away.

But for narrative construction, it can be very effective to emphasize what your audience is looking at. That opens up possibilities, such as presenting the viewer with simultaneous yet completely unrelated situations and letting them choose which one to pay attention to simply by looking in the direction of the chosen event. Keep in mind that in this case, all events are happening simultaneously and will carry on even if the viewer never looks at them.

This technology is not currently supported by YouTube, but it is possible in the Facebook Spatial Workstation with the use of high Focus Values.
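To make the idea concrete, here is a hedged Python/NumPy sketch of one way a focus control could weight a source’s level by how closely it aligns with the viewer’s gaze. It is not the Facebook Spatial Workstation algorithm; the function name, the attenuation curve and the parameter values are illustrative assumptions.

```python
import numpy as np

def focus_gain(gaze_dir, source_dir, focus_amount=0.5, max_atten_db=-12.0):
    """Linear gain for one source, given 3D direction vectors for the gaze and the source."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    source_dir = source_dir / np.linalg.norm(source_dir)
    alignment = (np.dot(gaze_dir, source_dir) + 1.0) / 2.0  # 1 = straight ahead, 0 = directly behind
    atten_db = (1.0 - alignment) * max_atten_db * focus_amount
    return 10.0 ** (atten_db / 20.0)

# usage: with full focus, a source directly behind the viewer is attenuated by 12 dB
print(focus_gain(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, -1.0]), focus_amount=1.0))
```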

Gaze Activation
When we talk about Focus, the key thing to keep in mind is that all the events happen regardless of whether the viewer is looking at them or not. If instead you want a certain sound to only happen when the viewer looks at a certain prop, regardless of the time, then you are looking for Gaze Activation.

This concept is much more akin to game audio than to film sound because of the interactivity element it presents. Essentially, you are using the direction of the gaze and potentially the length of the gaze (if you want your viewer to look in a direction for x amount of seconds before something happens) as a trigger for a sound/video playback.

This is very useful if you want to make it impossible for your audience to miss something because they were looking in the “wrong” direction. Think of a jump scare in a horror experience. It’s not very scary if you’re looking in the opposite direction, is it?

This is currently only supported if you build your experience in a game engine or as an independent app with tools such as InstaVR.
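As a sketch of how such a trigger might be scripted inside a game engine, here is a small, hypothetical Python example; the threshold angle, dwell time and class name are assumptions rather than any specific engine’s API.

```python
import numpy as np

class GazeTrigger:
    def __init__(self, target_dir, angle_deg=15.0, dwell_s=2.0):
        self.target = np.asarray(target_dir, float) / np.linalg.norm(target_dir)
        self.cos_threshold = np.cos(np.radians(angle_deg))
        self.dwell_s = dwell_s
        self.held = 0.0
        self.fired = False

    def update(self, gaze_dir, dt):
        """Call once per frame; returns True on the frame the sound should start playing."""
        gaze = np.asarray(gaze_dir, float) / np.linalg.norm(gaze_dir)
        if np.dot(gaze, self.target) >= self.cos_threshold:
            self.held += dt            # viewer is holding their gaze on the prop
        else:
            self.held = 0.0            # viewer looked away, so the dwell timer resets
        if not self.fired and self.held >= self.dwell_s:
            self.fired = True
            return True                # e.g. start the jump-scare cue here
        return False
```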

Both concepts are very closely related and I expect many implementations will make use of both. We should all keep an eye on the VR content distribution platforms to see how these tools will be supported and make the best use of them in order to make 360 videos even more immersive.


Claudio Santos is a sound editor and spatial audio mixer at Silver Sound. Slightly too interested in technology and workflow hacks, he spends most of his waking hours tweaking, fiddling and tinkering away on his computer.


Post developments at the AES Berlin Convention

By Mel Lambert

The AES Convention returned to Berlin after a three-year absence, and once again demonstrated that the Audio Engineering Society can organize a series of well-attended paper programs, seminars and workshops, in addition to an exhibition of familiar brands, for the European tech-savvy post community. 

Held at the Maritim Hotel in the creative heart of Berlin in late May, the 142nd AES Convention was co-chaired by Sascha Spors from the University of Rostock in Germany and Nadja Wallaszkovits from the Austrian Academy of Sciences. According to AES executive director Bob Moses, attendance was 1,800 — a figure at least 10% higher than last year’s gathering in Paris — with post professionals from several overseas countries, including China and Australia.

During the opening ceremonies, current AES president Alex Case stated that “AES conventions represent an ideal interactive meeting place,” whereas “social media lacks the one-on-one contact that enhances our communications bandwidth with colleagues and co-workers.” Keynote speaker Dr. Alex Arteaga, whose research integrates aesthetic and philosophical practices, addressed the thorny subject of “Auditory Architecture: Bringing Phenomenology, Aesthetic Practices and Engineering Together,” arguing that when considering the differences between audio soundscapes, “our experience depends upon the listening environment.” His underlying message was that a full appreciation of the various ways in which we hear immersive sounds requires a deeper understanding of how listeners interact with that space.

As part of his Richard C. Heyser Memorial Lecture, Prof. Dr. Jorg Sennheiser outlined “A Historic Journey in Audio-Reality: From Mono to AMBEO,” during which he reviewed the basis of audio perception and the interdependence of hearing with other senses. “Our enjoyment and appreciation of audio quality is reflected in the continuous development from single- to multi-channel reproduction systems that are benchmarked against sonic reality,” he offered. “Augmented and virtual reality call for immersive audio, with multiple stakeholders working together to design the future of audio.”

Post-Focused Technical Papers
There were several interesting technical papers that covered the changing requirements of the post community, particularly in the field of immersive playback formats for TV and cinema. With the new ATSC 3.0 digital television format scheduled to come online soon, including object-based immersive sound, there is increasing interest in techniques for capturing surround material and then delivering the same to consumer audiences.

In a paper titled “The Median-Plane Summing Localization in Ambisonics Reproduction,” Bosun Xie from the South China University of Technology in Guangzhou explained that, while one aim of Ambisonics playback is to recreate the perception of a virtual source in arbitrary directions, practical techniques are unable to recreate correct high-frequency spectra in binaural pressures that are referred to as front-back and vertical localization cues. Current research shows that changes of interaural time difference/ITD that result from head-turning for Ambisonics playback match those of a real source, and hence provide a dynamic cue for vertical localization, especially in the median plane. In addition, the LF virtual source direction can be approximately evaluated by using a set of panning laws.

“Exploring the Perceptual Sweet Area in Ambisonics,” presented by Matthias Frank from the University of Music in Graz, Austria, described how the theoretical sweet-spot area does not match the larger listening area needed in the real world. Frank presented a method to experimentally determine the perceptual sweet area, which is not limited to assessing the localization of both dry and reverberant sound using different Ambisonic encoding orders.

Another paper, “Perceptual Evaluation of Synthetic Early Binaural Room Impulse Responses Based on a Parametric Model,” presented by Philipp Stade from the Technical University of Berlin, described how an acoustical environment can be modeled using sound-field analysis plus spherical head-related impulse response/HRIRs — and the results compared with measured counterparts. Apparently, the selected listening experiment showed comparable performance and, in the main, was independent from room and test signals. (Perhaps surprisingly, the synthesis of direct sound and diffuse reverberation yielded almost the same results as for the parametric model.)

“Influence of Head Tracking on the Externalization of Auditory Events at Divergence between Synthesized and Listening Room Using a Binaural Headphone System,” presented by Stephan Werner from the Technical University of Ilmenau, Germany, reported on a study using a binaural headphone system that considered the influence of head tracking on the localization of auditory events. Recordings were conducted of impulse responses from a five-channel loudspeaker set-up in two different acoustic rooms. Results revealed that head tracking increased sound externalization, but that it did not overcome the room-divergence effect.

Heiko Purnhagen from Dolby Sweden, in a paper called “Parametric Joint Channel Coding of Immersive Audio,” described a coding scheme that can deliver channel-based immersive audio content in such formats as 7.1.4, 5.1.4, or 5.1.2 at very low bit rates. Based on a generalized approach for parametric spatial coding of groups of two, three or more channels using a single downmix channel, together with a compact parametrization that guarantees full covariance re-instatement in the decoder, the coding scheme is implemented using Dolby AC-4’s A-JCC standardized tool.

Hardware Choices for Post Users
Several manufacturers demonstrated compact near-field audio monitors targeted at editorial suites and pre-dub stages. Adam Audio focused on their new near/mid-field S Series, which uses the firm’s ART (Accelerating Ribbon Technology) ribbon tweeter. The five models comprise the S2V, S3H, S3V, S5V and S5H, for horizontal or vertical orientation. The firm’s newly developed LF and mid-range drivers with custom-designed waveguides for the tweeter — and MF driver on the larger, multiway models — are powered by a new DSP engine that “provides crossover optimization, voicing options and expansion potential,” according to the firm’s head of marketing, Andre Zeugner.

The Eve Audio SC203 near-field monitor features a three-inch LF/MF driver plus an AMT ribbon tweeter, and is supplied with a v-shaped rubberized pad that allows the user to decouple the loudspeaker from its base and reduce unwanted resonances while angling it flat or at a 7.5- or 15-degree angle. An adapter enables mounting directly on any microphone or speaker stand with a 3/8-inch thread. Integral DSP and a passive radiator located at the rear are said to reinforce LF reproduction to provide a response down to 62Hz (-3dB).

Genelec showcased The Ones, a series of point-source monitors comprising the current three-way Model 8351 plus the new two-way Model 8331 and three-way Model 8341. All three units include a co-axial MF/HF driver plus two acoustically concealed LF drivers for vertical and horizontal operation. A new Minimum Diffraction Enclosure/MDE is featured together with the firm’s loudspeaker management and alignment software via a dedicated Cat5 network port.

The Neumann KH-80 DSP near-field monitor is designed to offer automatic system alignment using the firm’s control software that is said to “mathematically model dispersion to deliver excellent detail in any surroundings.” The two-way active system features a four-inch LF/MF driver and one-inch HF tweeter with an elliptical, custom-designed waveguide. The design is described as offering a wide horizontal dispersion to ensure a wide sweet spot for the editor/mixer, and a narrow vertical dispersion to reduce sound reflections off the mix console.

To handle multiple monitoring sources and loudspeaker arrays, the Trinnov D-Mon Series controllers enable stereo to 7.1-channel monitoring from both analog and digital I/Os using Ethernet- and/or MIDI-based communication protocols and a fast-switching matrix. An internal mixer creates various combinations of stems, main or aux mixes from discrete inputs. An Optimizer processor offers tuning of the loudspeaker array to match studio acoustics.

Unveiled at last year’s AES Convention in Paris, the Eventide H9000 multichannel/multi-element processing system has been under constant development during the past 12 months with new functions targeted at film and TV post, including EQ, dynamics and reverb effects. DSP elements can be run in parallel or in series to create multiple, fully-programmable channel strips per engine. Control plug-ins for Avid Pro Tools and other DAWs are being finalized, together with Audinate Dante, Thunderbolt, Ravenna/AES67 and AVB networking.

Filmton, the German association for film sound professionals, explained to AES visitors its objective “to reinforce the importance of sound at an elemental level for the film community.” The association promotes the appreciation of film sound, together with the local film industry and its policy toward the public, while providing “an expert platform for technical, creative and legal issues.”

Philipp Sehling

Lawo demonstrated the new mc²96 Grand Audio production console, an IP-based networkable design for video post production, available with up to 200 on-surface faders. Innovative features include automatic gain control across multiple channels and miniature TFT color screens above each fader that display LiveView thumbnails of the incoming channel sources.

Stage Tec showed new processing features for its Crescendo Platinum TV post console, courtesy of v4.3 software, including an automixer based on gain sharing that can be used on every input channel, loudness metering to EBU R128 for sum and group channels, a de-esser on every channel path, and scene automation with individual user-adjustable blend curves and times for each channel.

Avid demonstrated native support for the new 7.1.2 Dolby Atmos channel-bed format — basically the familiar 7.1-channel bed with two height channels — for editorial suites and consumer remastering, plus several upgrades for Pro Tools, including new panning software for object-based audio and the ability to switch between automatable object and buss outputs. Pro Tools HD is said to be the only DAW natively supporting in-the-box Atmos mixing for this 10-channel 7.1.2 format. Full integration for Atmos workflows is now offered for control surfaces such as the Avid S6.

Jon Schorah

There was a new update to Nugen Audio’s popular Halo Upmix plug-in for Pro Tools — in addition to stereo to 5.1, 7.1 or 9.1 conversion, it is now capable of delivering 7.1.2-channel mixes for Dolby Atmos soundtracks.

A dedicated Dante Pavilion featured several manufacturers that offer network-capable products, including Solid State Logic, whose Tempest multi-path processing engine and router is now fully Audinate Dante-capable for T Series control surfaces with unique arbitration and ownership functions; Bosch RTS intercom systems featuring Dante connectivity with OCA system control; HEDD/Heinz Electrodynamic Designs, whose Series One monitor speakers feature both Dante and AES67/Ravenna ports; Focusrite, whose RedNet series of modular pre-amps and converters offer “enhanced reliability, security and selectivity” via Dante, according to product specialist for EMEA/Germany, Dankmar Klein; and NTP Technology’s DAD Series DX32R and RV32 Dante/MADI router bridges and control room monitor controllers, which are fully compatible with Dante-capable consoles and outboard systems, according to the firm’s business development manager Jan Lykke.

What’s Next For AES
The next European AES convention will be held in Milan during the spring of 2018. “The society also is planning a new format for the fall convention in New York,” said Moses, as the AES is now aligning with the National Association of Broadcasters. “Next January we will be holding a new type of event in Anaheim, California, to be titled AES @ NAMM.” Further details will be unveiled next month. He also explained there will be no West Coast AES Convention next year. Instead the AES will return to New York in the autumn of 2018 with another joint AES/NAB gathering at the Jacob K. Javits Convention Center.


Mel Lambert is an LA-based writer and photographer. He can be reached at mel.lambert@content-creators.com. Follow him on Twitter @MelLambertLA.


Recording live musicians in 360

By Luke Allen

I’ve had the opportunity to record live musicians in a couple of different in-the-field scenarios for 360 video content. In some situations — such as the ubiquitous 360 rock concert video — simply having access to the board feed is all one needs to create a pretty decent spatial mix (although the finer points of that type of mix would probably fill up a whole different article).

But what if you’re shooting in an acoustically interesting space where intimacy and immersion are the goal? What if you’re in the field in the middle of a rainstorm without access to AC power? It’s clear that in most cases, some combination of ambisonic capture and close micing is the right approach.

What I’ve found is that in all but a few elaborate set-ups, a mobile ambisonic recording rig (in my case, built around the Zaxcom Nomad and Soundfield SPS-200) — in addition to three to four omni-directional lavs for close micing — is more than sufficient to achieve excellent results. Last year, I had the pleasure of recording a four-piece country ensemble in a few different locations around Ireland.

Micing a Pub
For this particular job, I had the SPS and four lavs. For most of the day I had planted one Sanken COS-11 on the guitar, one on the mandolin, one on the lead singer and a DPA 4061 inside the upright bass (which sounded great!). Then, for the final song, the band wanted to add a fiddle to the mix — yet I was out of mics to cover everything. We had moved into the partially enclosed porch area of a pub with the musicians perched in a corner about six feet from the camera. I decided to roll the dice and trust the SPS to pick up the fiddle, which I figured would be loud enough in the small space that a lav wouldn’t be used much in the mix anyway. In post, the gamble paid off.

I was glad to have kept the quieter instruments mic’d up (especially the singer and the bass) while the fiddle lead parts sounded fantastic on the ambisonic recordings alone. This is one huge reason why it’s worth it to use higher-end Ambisonic mics, as you can trust them to provide fidelity for more than just ambient recordings.

An Orchestra
In another recent job, I was mixing for a 360 video of an orchestra. During production we moved the camera/sound rig around to different locations in a large rehearsal stage in London. Luckily, on this job we were able to also run small condensers into a board for each orchestra section, providing flexibility in the mix. Still, in post, the director wanted the spatial effect to be very perceptible and dynamic as we jump around the room during the lively performance. The SPS came in handy once again; not only does it offer good first-order spatial fidelity but a wide enough dynamic range and frequency response to be relied on heavily in the mix in situations where the close-mic recordings sounded flat. It was amazing opening up those recordings and listening to the SPS alone through a decent HRTF — it definitely exceeded my expectations.

It’s always good to be as prepared as possible when going into the field, but you don’t always have the budget or space for tons of equipment. In my experience, one high-quality and reliable ambisonic mic, along with some auxiliary lavs and maybe a long shotgun, are a good starting point for any field recording project for 360 video involving musicians.


Sound designer and composer Luke Allen is a veteran spatial audio designer and engineer, and a principal at SilVR in New York City. He can be reached at luke@silversound.us.

Nutmeg and Nickelodeon team up to remix classic SpongeBob songs

New York creative studio Nutmeg Creative was called on by Nickelodeon to create trippy music-video-style remixes of some classic SpongeBob SquarePants songs for the kids network’s YouTube channel. Catchy, sing-along kids’ songs have been an integral part of SpongeBob since its debut in 1999.

Though there are dozens of unofficial fan remixes on YouTube, Nickelodeon frequently turns to Nutmeg for official remixes: vastly reimagined versions accompanied by trippy, trance-inducing visuals that inevitably go viral. It all starts with the music, and the music is inspired by the show.

Infused with the manic energy of classic Warner Bros. Looney Tunes, SpongeBob is simultaneously slapstick and surreal with an upbeat vibe that has attracted a cult-like following from the get-go. Now in its 10th season, SpongeBob attracts fans that span two generations: kids who grew up watching SpongeBob now have kids of their own.

The show’s sensibility and multi-generational audience informs the approach of Nutmeg sound designer, mixer and composer JD McMillin, whose remixes of three popular and vintage SpongeBob songs have become viral hits: Krusty Krab Pizza and Ripped My Pants from 1999, and The Campfire Song Song (yes, that’s correct) from 2004. With musical styles ranging from reggae, hip-hop and trap/EDM to stadium rock, drum and bass and even Brazilian dance, McMillin’s remixes expand the appeal of the originals with ear candy for whole new audiences. That’s why, when Nickelodeon provides a song to Nutmeg, McMillin is given free rein to remix it.

“No one from Nick is sitting in my studio babysitting,” he says. “They could, but they don’t. They know that if they let me do my thing they will get something great.”

“Nickelodeon gives us a lot of creative freedom,” says executive producer Mike Greaney. “The creative briefs are, in a word, brief. There are some parameters, of course, but, ultimately, they give us a track and ask us to make something new and cool out of it.”

All three remixes have collectively racked up hundreds of thousands of views on YouTube, with The Campfire Song Song remix generating 655K views in less than 24 hours on the SpongeBob Facebook page.

McMillin credits the success to the fact that Nutmeg serves as a creative collaborative force: what he delivers is more reinvention than remix.

“We’re not just mixing stuff,” he says. “We’re making stuff.”

Once Nick signs off on the audio, that approach continues with the editorial. Editors Liz Burton, Brian Donnelly and Drew Hankins each bring their own unique style and sensibility, with graphic effects designer Stephen C. Walsh adding the finishing touches.

But Greaney isn’t always content with cut, shaken and stirred clips from the show, going the extra mile to deliver something unexpected. Case in point: he recently donned a pair of red track pants and high-kicked in front of a greenscreen to add a suitably outrageous element to the Ripped My Pants remix.

In terms of tools used for audio work, Nutmeg used Ableton Live, Native Instruments Maschine and Avid Pro Tools. For editorial they called on Avid Media Composer, Sapphire and Boris FX. Graphics were created in Adobe After Effects and Mocha Pro.