
Sonic Union adds Bryant Park studio targeting immersive, broadcast work

New York audio house Sonic Union has launched a new studio and creative lab. The uptown location, which overlooks Bryant Park, will focus on emerging spatial and interactive audio work, as well as continued work for broadcast clients. The expansion is led by principal mix engineer/sound designer Joe O'Connell, who co-founded and led the sound company Blast before joining Sonic Union. He is now partnered with original Sonic Union founders/mix engineers Michael Marinelli and Steve Rosen, whose staff will work out of both the Union Square and Bryant Park locations.

In other staffing news, mix engineer Owen Shearer will now also serve as technical director, with an emphasis on VR and immersive audio. Former Blast EP Carolyn Mandlavitz has joined as studio director for Sonic Union Bryant Park. Executive creative producer Halle Petro, formerly senior producer at Nylon Studios, will support both locations.

The new studio, which features three Dolby Atmos rooms, was created and developed by Ilan Ohayon of IOAD (architect of record), with architectural design by Raya Ani of RAW-NYC. Ani also designed Sonic Union's Union Square studio.

“We’re installing over 30 of the new ‘active’ JBL 7 Series speakers,” reports O’Connell. “Our order includes some of the first of these amazing self-powered speakers. JBL flew a technician from Indianapolis to personally inspect each one on site to ensure it performs as intended for our launch. Additionally, we created our own proprietary mounting hardware for the installation, as JBL’s own is still in development. We’ll also be running the latest release of Pro Tools (12.8), featuring tools for Dolby Atmos and other immersive applications. These types of installations really are not easy as retrofits. By building from scratch, we have been able to do something really unique, flexible and highly functional.”

Working as one team across the two locations, this emerging creative audio production arm will also draw on a roster of talent beyond the core staff engineering roles. The integrated team will handle non-traditional immersive VR, AR and experiential audio planning and coding, in addition to casting, production music supervision, extended sound design and production assignments.

Main Image Caption: (L-R) Halle Petro, Steve Rosen, Owen Shearer, Joe O’Connell, Adam Barone, Carolyn Mandlavitz, Brian Goodheart, Michael Marinelli and Eugene Green.


SMPTE: The convergence of toolsets for television and cinema

By Mel Lambert

While the annual SMPTE Technical Conferences normally put a strong focus on things visual, there is no denying that these gatherings offer a number of interesting sessions for sound pros from the production and post communities. According to Aimée Ricca, who oversees marketing and communications for SMPTE, pre-registration included “nearly 2,500 registered attendees hailing from all over the world.” This year’s conference, held at the Loews Hollywood Hotel and Ray Dolby Ballroom from October 24-27, also attracted more than 108 exhibitors in two exhibit halls.

Setting the stage for the celebration of SMPTE's 2016 centenary, the opening keynotes addressed the dramatic changes that have occurred within the motion picture and TV industries during the past 100 years, particularly with the advent of multichannel immersive sound. The two co-speakers — SMPTE president Robert Seidel and filmmaker/innovator Doug Trumbull — chronicled advances in audio playback since, respectively, the advent of TV broadcasting after WWII and the introduction of film soundtracks in 1927 with The Jazz Singer.

Robert Seidel

ATSC 3.0
Currently VP of CBS Engineering and Advanced Technology, with responsibility for TV technologies at the CBS and CW networks, Seidel headed the team that assisted WRAL-HD, the CBS affiliate in Raleigh, North Carolina, in becoming the first TV station to transmit HDTV, in July 1996. The transition included adding the ability to carry 5.1-channel sound using Advanced Television Systems Committee (ATSC) standards and Dolby AC-3 encoding.

The 45th Grammy Awards ceremony, broadcast by CBS Television in February 2003, marked the first scheduled HD broadcast with a 5.1 soundtrack. The emergent ATSC 3.0 standard reportedly will provide increased bandwidth efficiency and compression performance. The drawback is the lack of backwards compatibility with current technologies, resulting in a need for new set-top boxes and TV receivers.

As Seidel explained, the upside of ATSC 3.0 will be immersive soundtracks, using either Dolby AC-4 or MPEG-H coding, together with audio objects that can carry alternate dialog and commentary tracks, plus other consumer features to be refined with companion 4K UHD, high-dynamic-range and high-frame-rate images. In June, WRAL-HD launched an experimental ATSC 3.0 channel carrying the station's programming in 1080p with 4K segments. In mid-summer, South Korea adopted ATSC 3.0 and plans to begin broadcasts with immersive audio and object-based capabilities next February, in anticipation of hosting the 2018 Winter Olympics. And the 2016 World Series games between the Cleveland Indians and the Chicago Cubs marked the first live ATSC 3.0 broadcast of a major sporting event, on experimental station Channel 31, with an immersive-audio simulcast on the Tribune Media-owned Fox affiliate WJW-TV.

Immersive audio will enable enhanced spatial resolution for 3D sound-source localization and therefore provide an increased sense of envelopment throughout the home listening environment, while audio “personalization” will include level control for dialog elements, alternate audio tracks, assistive services, other-language dialog and special commentaries. ATSC 3.0 also will support loudness normalization and contouring of dynamic range.
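For a sense of the arithmetic behind that loudness normalization, here is a minimal numpy sketch. It is illustrative only: real ATSC 3.0 loudness management (as carried in AC-4 and MPEG-H) relies on the ITU-R BS.1770 K-weighted, gated measurement, whereas this stand-in measures plain RMS and simply applies the dB offset needed to reach the ATSC A/85 dialog-loudness anchor of -24 LKFS.

```python
import numpy as np

def simple_loudness_db(x):
    """Very rough loudness estimate (dB RMS re full scale).

    Real broadcast loudness uses the ITU-R BS.1770 K-weighted,
    gated measurement; plain RMS is used here only to show the
    normalization arithmetic.
    """
    rms = np.sqrt(np.mean(np.square(x)))
    return 20.0 * np.log10(max(rms, 1e-9))

def normalize_to_target(x, target_db=-24.0):
    """Scale a signal so its (rough) loudness hits the target.

    -24 LKFS is the ATSC A/85 anchor; the gain is just the dB
    difference between target and measured loudness.
    """
    gain_db = target_db - simple_loudness_db(x)
    return x * (10.0 ** (gain_db / 20.0))

# Example: a quiet 1 kHz tone pulled up to the target level.
fs = 48000
tone = 0.05 * np.sin(2 * np.pi * 1000 * np.arange(fs) / fs)
normalized = normalize_to_target(tone)
```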

Doug Trumbull

Higher Frame Rates
Trumbull's wide-ranging experience in filmmaking and entertainment technologies includes visual effects supervision on 2001: A Space Odyssey, Close Encounters of the Third Kind, Star Trek: The Motion Picture and Blade Runner; he also directed Silent Running and Brainstorm, as well as special-venue offerings. He won an Academy Award for his Showscan process for high-speed 70mm cinematography, helped develop IMAX technologies and now runs Trumbull Studios, which is innovating a new MAGI process to offer 4K 3D at 120fps. High production costs and a lack of playback environments meant that Showscan never really got off the ground, which was "a crushing disappointment," he conceded to the SMPTE audience.

Meanwhile, responding to falling box-office receipts during the '50s and '60s, Hollywood added audience attractions such as large-screen presentations and surround sound, although the movie industry also began to rely on income from TV broadcasters for the rights to popular cinema releases.

As Seidel added, “The convergence of toolsets for both television and cinema — including 2K, 4K and eventually 8K — will lead to reduced costs, and help create a global market around the world [with] a significant income stream.” He also said that “cord cutting” — replacing cable subscriptions with services such as Amazon, Hulu, iTunes and Netflix — is bringing people back to over-the-air broadcasting.

Trumbull countered that TV will continue at 60fps “with a live texture that we like,” whereas film will retain its 24fps frame rate “that we have loved for years and which has a ‘movie texture.’ Higher frame rates for cinema, such as the 48fps used by Peter Jackson for the Hobbit films, have too much of a TV look. Showscan at 120fps and a 360-degree shutter avoided that TV look, which is considered objectionable.” (Early reviews of director Ang Lee’s upcoming 3D film Billy Lynn’s Long Halftime Walk, which was shot in 4K at 120fps, have been critical of its video look and feel.)

Next-Gen Audio for Film and TV
During a series of “Advances in Audio Reproduction” conference sessions, chaired by Chris Witham, director of digital cinema technology at Walt Disney Studios, three presentations covered key design criteria for next-generation audio for TV and film. In his talk, “Building the World’s Most Complex TV Network — A Test Bed for Broadcasting Immersive & Interactive Audio,” Robert Bleidt, GM of Fraunhofer USA’s audio and multimedia division, provided an overview of a complete end-to-end broadcast plant built to test operational features developed by Fraunhofer, Technicolor and Qualcomm. The tests evaluated an immersive/object-based audio system based on MPEG-H for use during planned ATSC 3.0 broadcasting in Korea.

“At the NAB Convention we demonstrated The MPEG Network,” Bleidt stated. “It is perhaps the most complex combination of broadcast audio content ever made in a single plant, involving 13 different formats.” These include mono, stereo, 5.1-channel and other sources. “The network was designed to handle immersive audio in both channel- and HOA-based formats, using audio objects for interactivity. Live mixes from a simulated sports remote were connected to a network operations center, with distribution to affiliates, and then sent to a consumer living room, all using the MPEG-H audio system.”

Bleidt presented an overview of system and equipment design, together with details of a critical AMAU (audio monitoring and authoring unit) that will be used to mix immersive audio signals using existing broadcast consoles limited to 5.1-channel assignment and panning.

Dr. Jan Skoglund, who leads a team at Google developing audio signal processing solutions, addressed the subject of “Open-source Spatial Audio Compression for VR Content,” including the importance of providing realistic immersive audio experiences to accompany VR presentations and 360-degree 3D video.

“Ambisonics have reemerged as an important technique in providing immersive audio experiences,” Skoglund stated. “As an alternative to channel-based 3D sound, Ambisonics represent full-sphere sound, independent of loudspeaker location.” His fascinating presentation considered the ways in which open-source compression technologies can transport audio for various species of next-generation immersive media. Skoglund compared the efficacy of several open-source codecs for first-order Ambisonics, and also reviewed the progress being made toward higher-order Ambisonics (HOA), whose additional channels further enhance the experience for VR content delivered via the internet.
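To make the loudspeaker-independence point concrete, here is a minimal numpy sketch (my illustration, not Google's code) of encoding a mono source into first-order Ambisonics using the AmbiX convention (ACN channel order, SN3D normalization) common in VR delivery. The four resulting signals describe the full sphere and can be decoded later to any loudspeaker layout or binaural renderer.

```python
import numpy as np

def encode_foa_ambix(mono, azimuth_deg, elevation_deg):
    """Encode a mono signal to first-order Ambisonics (AmbiX:
    ACN order W, Y, Z, X with SN3D weights). Azimuth is
    counterclockwise from straight ahead; elevation is up from
    the horizontal plane. Returns a (4, N) array."""
    az = np.radians(azimuth_deg)
    el = np.radians(elevation_deg)
    gains = np.array([
        1.0,                      # W: omnidirectional
        np.sin(az) * np.cos(el),  # Y: left/right
        np.sin(el),               # Z: up/down
        np.cos(az) * np.cos(el),  # X: front/back
    ])
    return gains[:, None] * np.asarray(mono)[None, :]

# A broadband source placed 45 degrees left and 30 degrees up:
fs = 48000
sig = 0.1 * np.random.randn(fs)
bformat = encode_foa_ambix(sig, azimuth_deg=45.0, elevation_deg=30.0)
```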

Finally, Paul Peace, who oversees loudspeaker development for cinema, retail and commercial applications at JBL Professional — and designed the Model 9350, 9300 and 9310 surround units — discussed “Loudspeaker Requirements in Object-Based Cinema,” including a valuable in-depth analysis of the acoustic delivery requirements in a typical movie theater that accommodates object-based formats.

Peace is proposing the use of a new metric for surround loudspeaker placement and selection when the layout relies on venue-specific immersive rendering engines for Dolby Atmos and Barco Auro-3D soundtracks, with object-based overhead and side-wall channels. “The metric is based on three foundational elements as mapped in a theater: frequency response, directionality and timing,” he explained. “Current set-up techniques are quite poor for a majority of seats in actual theaters.”

Peace also discussed the new loudspeaker requirements and layout criteria necessary to ensure more consistent sound coverage throughout such venues, so they can more accurately replay material re-recorded on typical dub stages, which are often smaller, and of different width/length/height proportions, than most multiplex auditoriums.


Mel Lambert, who also gets photo credit for pictures from the show, is principal of Content Creators, an LA-based copywriting and editorial service. He can be reached at mel.lambert@content-creators.com. Follow him on Twitter @MelLambertLA.


AES Paris: A look into immersive audio, cinematic sound design

By Mel Lambert

The Audio Engineering Society (AES) came to the City of Light in early June with a technical program and companion exhibition that attracted close to 2,600 pre-registrants, including some 700 full-pass attendees. “The Paris International Convention surpassed all of our expectations,” AES executive director Bob Moses told postPerspective. “The research community continues to thrive — there was great interest in spatial sound and networked audio — while the business community once again embraced the show, with a 30 percent increase in exhibitors over last year’s show in Warsaw.” Moses confirmed that next year’s European convention will be held in Berlin, “probably in May.”

Tom Downes

Getting Immersed
There were plenty of new techniques and technologies targeting the post community. One presentation, in particular, caught my eye, since it posed some relevant questions about how we perceive immersive sound. In the session, “Immersive Audio Techniques in Cinematic Sound Design: Context and Spatialization,” co-authors Tom Downes and Malachy Ronan — both of whom are AES student members currently studying at the University of Limerick’s Digital Media and Arts Research Center, Ireland — questioned the role of increased spatial resolution in cinematic sound design. “Our paper considered the context that prompted the use of elevated loudspeakers, and examined the relevance of electro-acoustic spatialization techniques to 3D cinematic formats,” offered Downes. The duo brought with them a scene from writer/director Wolfgang Petersen’s submarine classic, Das Boot, to illustrate their thesis.

Using the university’s Spatialization and Auditory Display Environment (SpADE) linked to an Apple Logic Pro 9 digital audio workstation and a 7.1.4 playback configuration — with four overhead speakers — the researchers correlated visual stimuli with audio playback. (A 7.1-channel horizontal playback format was determined by the DAW’s I/O capabilities.) Different dynamic and static timbre spatializations were achieved by using separate EQ plug-ins assigned to horizontal and elevated loudspeaker channels.

“Sources were band-passed and a 3dB boost applied at 7kHz to enhance the perception of elevation,” Downes continued. “A static approach was used on atmospheric sounds to layer the soundscape using their dominant frequencies, whereas bubble sounds were also subjected to static timbre spatialization; the dynamic approach was applied when attempting to bridge the gap between elevated and horizontal loudspeakers. Sound sources were split, with high frequencies applied to the elevated layer, and low frequencies to the horizontal layer. By automating the parameters within both sets of equalization, a top-to-bottom trajectory was perceived. However, although the movement was evident, it was not perceived as immersive.”
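The approach is straightforward to prototype outside of Logic. The scipy sketch below is my reconstruction of the signal flow Downes describes, not the authors' code: it band-splits a source between the horizontal and elevated layers, applies the roughly 3dB peaking boost at 7kHz to the elevated feed, and automates a crossfade between the layers to produce the top-to-bottom trajectory.

```python
import numpy as np
from scipy.signal import butter, sosfilt, lfilter

fs = 48000

def split_layers(x, crossover_hz=2000.0):
    """Static split: high frequencies feed the four elevated
    speakers, low frequencies the horizontal 7.1 layer."""
    sos_hi = butter(4, crossover_hz, btype="highpass", fs=fs, output="sos")
    sos_lo = butter(4, crossover_hz, btype="lowpass", fs=fs, output="sos")
    return sosfilt(sos_hi, x), sosfilt(sos_lo, x)

def peaking_boost(x, f0=7000.0, gain_db=3.0, q=1.0):
    """RBJ-cookbook peaking EQ: the ~3 dB boost at 7 kHz used to
    enhance the perception of elevation."""
    a_gain = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * a_gain, -2 * np.cos(w0), 1 - alpha * a_gain])
    a = np.array([1 + alpha / a_gain, -2 * np.cos(w0), 1 - alpha / a_gain])
    return lfilter(b / a[0], a / a[0], x)

# Automated top-to-bottom trajectory: the elevated feed fades out
# while the horizontal feed fades in over the length of the clip.
src = 0.1 * np.random.randn(2 * fs)            # 2 s stand-in source
elevated, horizontal = split_layers(src)
elevated = peaking_boost(elevated)
ramp = np.linspace(1.0, 0.0, src.size)
elevated_out = elevated * ramp
horizontal_out = horizontal * (1.0 - ramp)
```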

The paper concluded that although multi-channel electro-acoustic spatialization techniques are seen as a rich source of ideas for sound designers, without sufficient visual context they are limited in the types of techniques that can be applied. “Screenwriters and movie directors must begin to conceptualize new ways of utilizing this enhanced spatial resolution,” said Downes.

Rich Nevens

Tools
Merging Technologies demonstrated immersive-sound applications for the v.10 release of its Pyramix DAW software, with up to 30.2-channel routing and panning, including compatibility with Barco Auro, Dolby Atmos and other surround formats without the need for additional plug-ins or apps. Avid, meanwhile, showcased additions to the modular S6 Assignable Digital Console, including a Joystick Panning Module and a new Master Film Module with PEC/DIR switching.

“The S6 offers improved ergonomics,” explained Avid’s Rich Nevens, director of worldwide pro audio solutions, “including enhanced visibility across the control surface, and full Ethernet connectivity between eight-fader channel modules and the Pro Tools DSP engines.” Reportedly, more than 1,000 S6 systems have been sold worldwide since its introduction in December 2013, including two recent installations at Sony Pictures Studios in Culver City, California.

Finally, Eventide came to the Paris AES Convention with a remarkable new multichannel/multi-element processing system that was demonstrated by invitation only to selected customers and distributors; it will be formally introduced during the upcoming AES Convention in Los Angeles in October. Targeted at film/TV post production, the rackmount device features 32 inputs and 32 discrete outputs per DSP module, thereby allowing four multichannel effects paths to be implemented simultaneously. A quartet of high-speed ARM processors mounted on plug-in boards can be swapped out when more powerful DSP chips become available.

Joe Bamberg and Ray Maxwell

“Initially, effects will be drawn from our current H8000 and H9 processors — with other EQ, dynamics plus reverb effects in development — and can be run in parallel or in series, to effectively create a fully-programmable, four-element channel strip per processing engine,” explained Eventide software engineer Joe Bamberg.
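That series/parallel channel-strip architecture is easy to picture in code. The sketch below is purely illustrative and has nothing to do with Eventide's firmware: four effect "elements" per processing engine, composable either one after another or side by side.

```python
import numpy as np

# Two toy "elements" standing in for effects blocks; each maps an
# audio buffer to an audio buffer.
def gain(db):
    g = 10.0 ** (db / 20.0)
    return lambda x: g * x

def soft_clip(drive=2.0):
    return lambda x: np.tanh(drive * x) / np.tanh(drive)

def series(*elements):
    """Chain elements in series: a channel strip."""
    def process(x):
        for fx in elements:
            x = fx(x)
        return x
    return process

def parallel(*elements):
    """Run elements in parallel and average their outputs."""
    return lambda x: sum(fx(x) for fx in elements) / len(elements)

# A four-element strip for one engine:
strip = series(gain(-6.0), soft_clip(), gain(3.0), soft_clip(1.2))
out = strip(0.3 * np.random.randn(512))
```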

“Remote control plug-ins for Avid Pro Tools and other DAWs are in development,” said Eventide’s VP of sales and marketing, Ray Maxwell. The device can also be used via a stand-alone application for Apple iPad tablets or Windows/Macintosh PCs.

Multi-channel I/O and processing options will enable object-based EQ, dynamics and ambience processing for immersive-sound production. End-user pricing for the codenamed product, which will also feature Audinate Dante, Thunderbolt, Ravenna/AES67 and AVB networking, has yet to be announced.

Mel Lambert is principal of Content Creators, an LA-based copywriting and editorial service, and can be reached at mel.lambert@content-creators.com. Follow him on Twitter @MelLambertLA.

SMPTE 2015: The impact of immersive audio formats

By Mel Lambert

“In the very near future, immersive audio will be everywhere,” stated William Redmann, Technicolor’s director of standards for immersive media technologies. “It will enhance our dramas, seat us in sports venues and put us on the field [for outdoor events]. Now it’s just a matter of making it happen!”

Technicolor’s technologist was chairing a fascinating session during the SMPTE Technical Conference & Exhibition, held at Loews Hotel in the heart of Hollywood in late October, focusing on new sound capture and production techniques for cinema and broadcast that will be crucial for the effective authoring of object-based immersive audio.

“Support of these premium experiences requires new mixing skills, tools, tests, monitors and a pervasive respect for the artist’s intent at all levels of presentation… and that includes legacy formats,” Redmann stressed. “Nobody is predicting that stereo will disappear!”

Steven Silva

Steven Silva, VP of technology and strategy at Twentieth Century Fox, provided a succinct overview of “Object-Based Audio for Live TV Production,” with a focus on the development of the new ATSC 3.0 specification for terrestrial TV broadcasting, which is expected to include multichannel immersive opportunities, using object-based and scene-based audio to enhance the consumer experience.

“The new format will offer qualitative improvements over the current ATSC standard,” Silva stressed, “with more than the current six-channel configuration at lower bit rates, yet with enhanced loudness and dynamic-range capabilities.”

Focusing on scene-based audio, Dr. Nils Peters (pictured in our main image) — a staff research engineer at Qualcomm and co-chair of the AES technical committee on spatial audio — presented an interesting overview of Higher Order Ambisonics (HOA), a technology that can be used to create “holistic descriptions of captured sound scenes, independent from particular loudspeaker layouts.”

As Peters explained, immersive sound is carried by a series of compressed/uncompressed digital channels that contain predominant sounds and companion ambiences. “This is different from conventional channel-based formats, which send one signal for each loudspeaker output; up- or down-mixing would be needed if another speaker configuration is used.” Scene-based audio is said to enable immersive sound with objects at bit rates comparable to current formats.
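To put "independent from particular loudspeaker layouts" in concrete terms, a scene-based stream carries spherical-harmonic coefficient signals, so its channel count is fixed by the Ambisonic order $N$ rather than by the playback system. The standard Ambisonics relationship is:

$$C_{\mathrm{HOA}}(N) = (N+1)^2, \qquad \text{e.g.}\ C(1) = 4,\ C(3) = 16,\ C(6) = 49.$$

A channel-based mix, by contrast, needs one signal per loudspeaker and a fresh up- or down-mix for every new layout; the renderer at the receiver maps the same HOA coefficients to 5.1, 7.1.4 or a full dome.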

Nuno Fonseca, a professor at the Polytechnic Institute of Leiria, Portugal, and an invited professor at the Lisbon School of Music, described his ongoing research into enhanced sound design and 3D mixing, including a proprietary Sound Particles app which, reportedly, is being evaluated at several Hollywood studios. As Fonseca stressed, VFX graphics software is used to generate moving 3D objects within a virtual space, with a moving virtual camera being responsible for rendering that scene. “Unfortunately, the same concept is not used for audio post production,” he stated. Instead of using a conventional DAW — in reality, the audio equivalent of video editing software — Fonseca is developing a particle-based program for creating virtual 3D audio scenes.
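The particle metaphor can be illustrated with a toy renderer. The sketch below is purely illustrative and unrelated to the actual Sound Particles implementation: it scatters copies of a short sound grain through a virtual room, and the listener at the origin hears each copy with distance attenuation, a propagation delay and a simple pan, much as a VFX particle system renders thousands of instanced objects through a camera.

```python
import numpy as np

rng = np.random.default_rng(7)
fs, c = 48000, 343.0                  # sample rate; speed of sound (m/s)

def render_particles(grain, n_particles=200, room=20.0):
    """Toy particle renderer: each particle plays a copy of `grain`
    from a random 3D position; the listener at the origin hears it
    with 1/r distance attenuation, an r/c propagation delay and a
    crude constant-power pan derived from the particle's x position."""
    max_delay = int(fs * np.sqrt(3) * (room / 2) / c) + 1
    out = np.zeros((2, grain.size + max_delay))
    for _ in range(n_particles):
        pos = rng.uniform(-room / 2, room / 2, size=3)
        r = max(np.linalg.norm(pos), 0.5)
        d = int(fs * r / c)                      # propagation delay
        pan = 0.5 * (1.0 + pos[0] / (room / 2))  # 0 = left, 1 = right
        for ch, w in enumerate((np.sqrt(1 - pan), np.sqrt(pan))):
            out[ch, d:d + grain.size] += (w / r) * grain
    return out / n_particles

burst = rng.standard_normal(2400) * np.hanning(2400)  # 50 ms noise grain
scene = render_particles(burst)                       # stereo render
```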

Brian Claypool

The “Listening Test Methodology for Object-Based Audio Rendering Interoperability” session comprised a three-way presentation from current innovators of immersive sound systems. As Brian Claypool, senior director of strategic business development at Barco Audio Technologies, explained, work is progressing on the development of listening tests that will ensure the “preservation and representation of artistic intent.” These criteria are the basic focus of SMPTE Technology Committee 25CSS, which is developing an interoperable audio-creation workflow and a single DCP that can be used regardless of which playback configuration — Dolby Atmos, Barco Auro 3D or DTS MDA — has been installed in the exhibition space.

Other participants included Dr. Markus Mehnert, head of technology at Barco Audio Technologies, which markets the Auro 3D format to the motion picture community, and Bert Van Daele, chief technology officer with Auro Technologies NV, which, like Barco, is based in Belgium. “These subjective, objective and combined tests will check the render compatibility of competitive immersive sound systems replaying the proposed SMPTE single-file [interoperable] format,” Van Daele stressed. “We are working on test procedures that will check object size/spread, object positions and other parameters to retain the director’s artistic intent.”

In “Monitoring and Authoring of 3D Immersive Next-Generation Audio Formats,” Peter Poers, managing director of marketing and sales at Junger Audio, Germany, stressed that easy adoption of next-generation immersive audio formats will necessitate changes in audio production workflows to accommodate additional audio channels and object-based formats. “Monitoring the audio material, along with authoring and verification of dynamic metadata, will become a new challenge,” Poers emphasized. New procedures for managing object-based content need to be established, along with personalization services that will enable the selection of, for example, alternative audio objects — such as commentator languages — as well as loudness control.

Mel Lambert is principal of LA-based Content Creators. He can be reached at mel.lambert@content-creators.com. Follow him on Twitter @MelLambertLA.