By Mel Lambert
While SMPTE’s annual Technical Conferences normally put a strong focus on things visual, they also offer a number of interesting sessions for sound pros from the production and post communities. According to Aimée Ricca, who oversees marketing and communications for SMPTE, pre-registration included “nearly 2,500 registered attendees hailing from all over the world.” This year’s conference, held October 24-27 at the Loews Hollywood Hotel and Ray Dolby Ballroom, also attracted more than 108 exhibitors in two exhibit halls.
Setting the stage for the 2016 celebration of SMPTE’s Centenary, opening keynotes addressed the dramatic changes that have occurred within the motion picture and TV industries during the past 100 years, particularly with the advent of multichannel immersive sound. The two co-speakers, SMPTE president Robert Seidel and filmmaker/innovator Doug Trumbull, chronicled the advances in audio playback since, respectively, the advent of TV broadcasting after WWII and the introduction of film soundtracks in 1927 with The Jazz Singer.
Currently VP of CBS Engineering and Advanced Technology, with responsibility for TV technologies at CBS and the CW networks, Seidel headed up the team that in July 1996 helped WRAL-HD, the CBS affiliate in Raleigh, North Carolina, become the first TV station to transmit HDTV. The transition included adding the ability to carry 5.1-channel sound using Advanced Television Systems Committee (ATSC) standards and Dolby AC-3 encoding.
The 45th Grammy Awards Ceremony broadcast by CBS Television in February 2003 marked the first scheduled HD broadcast with a 5.1 soundtrack. The emergent ATSC 3.0 standard reportedly will provide increased bandwidth efficiency and compression performance. The drawback is the lack of backwards compatibility with current technologies, resulting in a need for new set-top boxes and TV receivers.
As Seidel explained, the upside for ATSC 3.0 will be immersive soundtracks, using either Dolby AC-4 or MPEG-H coding, together with audio objects that can carry alternate dialog and commentary tracks, plus other consumer features to be refined with companion 4K UHD, high dynamic range and high frame rate images. In June, WRAL-HD launched an experimental ATSC 3.0 channel carrying the station’s programming in 1080p with 4K segments. In mid-summer, South Korea adopted ATSC 3.0; it plans to begin broadcasts with immersive audio and object-based capabilities next February in anticipation of hosting the 2018 Winter Olympics. The 2016 World Series games between the Cleveland Indians and the Chicago Cubs marked the first live ATSC 3.0 broadcast of a major sporting event on experimental station Channel 31, with an immersive-audio simulcast on the Tribune Media-owned Fox affiliate WJW-TV.
Immersive audio will enable enhanced spatial resolution for 3D sound-source localization and therefore provide an increased sense of envelopment throughout the home listening environment, while audio “personalization” will include level control for dialog elements, alternate audio tracks, assistive services, other-language dialog and special commentaries. ATSC 3.0 also will support loudness normalization and contouring of dynamic range.
Higher Frame Rates
Trumbull brings a wide range of experience in filmmaking and entertainment technologies, including visual effects supervision on 2001: A Space Odyssey, Close Encounters of the Third Kind, Star Trek: The Motion Picture and Blade Runner. He also directed Silent Running and Brainstorm, as well as special venue offerings. He won an Academy Award for his Showscan process for high-speed 70mm cinematography, helped develop IMAX technologies and now runs Trumbull Studios, which is developing a new MAGI process to offer 4K 3D at 120fps. High production costs and a lack of playback environments meant that Trumbull’s Showscan format never really got off the ground, which was “a crushing disappointment,” he conceded to the SMPTE audience.
Meanwhile, responding to falling box office receipts during the ’50s and ’60s, Hollywood added more consumer features, including large-screen presentations and surround sound, although the movie industry also began to rely on income from the TV community for broadcast rights to popular cinema releases.
As Seidel added, “The convergence of toolsets for both television and cinema, including 2K, 4K and eventually 8K, will lead to reduced costs, and help create a global market around the world [with] a significant income stream.” He also said that “cord cutting” (replacing cable subscription services with Amazon.com, Hulu, iTunes, Netflix and the like) is bringing people back to over-the-air broadcasting.
Trumbull countered that TV will continue at 60fps “with a live texture that we like,” whereas film will retain its 24fps frame rate “that we have loved for years and which has a ‘movie texture.’ Higher frame rates for cinema, such as 48fps used by Peter Jackson for several of [The Hobbit] films, have too much of a TV look. Showscan at 120fps and a 360-degree shutter avoided that TV look, which is considered objectionable.” (Early reviews of director Ang Lee’s upcoming 3D film Billy Lynn’s Long Halftime Walk, which was shot in 4K at 120fps, have been critical of its video look and feel.)
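For readers who want to put numbers on the frame rate and shutter figures quoted above, here is a minimal sketch of the standard rotary-shutter relationship: per-frame exposure time equals the shutter angle divided by 360, divided by the frame rate. The function name `exposure_time` is hypothetical, and the 180-degree shutter used for the 24fps case is an assumed conventional value, not a figure from the keynote.

```python
def exposure_time(fps: float, shutter_angle_deg: float) -> float:
    """Return per-frame exposure in seconds for a rotary-shutter model.

    exposure = (shutter angle / 360) / frame rate
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    if not 0 < shutter_angle_deg <= 360:
        raise ValueError("shutter angle must be in (0, 360] degrees")
    return (shutter_angle_deg / 360.0) / fps


if __name__ == "__main__":
    # Assumed comparison cases: the 180-degree value for 24fps is a
    # common convention, not something stated in the article.
    cases = [
        ("24fps film, 180-degree shutter (assumed)", 24, 180),
        ("120fps, 360-degree shutter (as quoted)", 120, 360),
    ]
    for label, fps, angle in cases:
        ms = exposure_time(fps, angle) * 1000.0
        print(f"{label}: {ms:.2f} ms per frame")
```

With a 360-degree shutter the exposure spans the entire frame interval, so there is no unexposed gap between successive frames.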
Next-Gen Audio for Film and TV
During a series of “Advances in Audio Reproduction” conference sessions, chaired by Chris Witham, director of digital cinema technology at Walt Disney Studios, three presentations covered key design criteria for next-generation audio for TV and film. In his presentation, “Building the World’s Most Complex TV Network — A Test Bed for Broadcasting Immersive & Interactive Audio,” Robert Bleidt, GM of Fraunhofer USA’s audio and multimedia division, provided an overview of a complete end-to-end broadcast plant that was built to test various operational features developed by Fraunhofer, Technicolor and Qualcomm. These tests were used to evaluate an immersive/object-based audio system based on MPEG-H for use in Korea during planned ATSC 3.0 broadcasting.
“At the NAB Convention we demonstrated The MPEG Network,” Bleidt stated. “It is perhaps the most complex combination of broadcast audio content ever made in a single plant, involving 13 different formats.” These include mono, stereo, 5.1-channel and other sources. “The network was designed to handle immersive audio in both channel- and HOA-based formats, using audio objects for interactivity. Live mixes from a simulated sports remote were connected to a network operating center, with distribution to affiliates, and then sent to a consumer living room, all using the MPEG-H audio system.”
Bleidt presented an overview of system and equipment design, together with details of a critical AMAU (audio monitoring and authoring unit) that will be used to mix immersive audio signals using existing broadcast consoles limited to 5.1-channel assignment and panning.
Dr. Jan Skoglund, who leads a team at Google developing audio signal processing solutions, addressed the subject of “Open-source Spatial Audio Compression for VR Content,” including the importance of providing realistic immersive audio experiences to accompany VR presentations and 360-degree 3D video.
“Ambisonics have reemerged as an important technique in providing immersive audio experiences,” Skoglund stated. “As an alternative to channel-based 3D sound, Ambisonics represent full-sphere sound, independent of loudspeaker location.” His fascinating presentation considered the ways in which open-source compression technologies can transport audio for various types of next-generation immersive media. Skoglund compared the efficacy of several open-source codecs for first-order Ambisonics, and also described the progress being made toward higher-order Ambisonics (HOA) for VR content delivered via the internet, including the enhanced experience that HOA provides.
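To illustrate what “independent of loudspeaker location” means in practice, the following is a minimal sketch of first-order Ambisonic encoding: a mono sample and a source direction become four channels that describe the sound field itself rather than feeds for particular speakers. It assumes the AmbiX convention (ACN channel order W, Y, Z, X with SN3D normalization, azimuth measured counter-clockwise from straight ahead). The presentation as reported here does not specify a convention, and the function name `encode_foa` is hypothetical.

```python
import math


def encode_foa(sample: float, azimuth_deg: float, elevation_deg: float):
    """Encode one mono sample into first-order Ambisonics.

    Assumed convention: AmbiX (ACN order W, Y, Z, X; SN3D gains).
    Azimuth is counter-clockwise from front; elevation is up from horizontal.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left/right axis
    z = sample * math.sin(el)                 # up/down axis
    x = sample * math.cos(az) * math.cos(el)  # front/back axis
    return (w, y, z, x)


if __name__ == "__main__":
    # Illustrative source directions only.
    for name, az, el in [("front", 0, 0), ("left", 90, 0), ("overhead", 0, 90)]:
        w, y, z, x = encode_foa(1.0, az, el)
        print(f"{name:9s} W={w:+.3f} Y={y:+.3f} Z={z:+.3f} X={x:+.3f}")
```

No loudspeaker layout appears anywhere in the encoding step. A separate decoder, not shown, would map the four channels to whatever speaker array or headphone renderer is available at playback.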
Finally, Paul Peace, who oversees loudspeaker development for cinema, retail and commercial applications at JBL Professional — and designed the Model 9350, 9300 and 9310 surround units — discussed “Loudspeaker Requirements in Object-Based Cinema,” including a valuable in-depth analysis of the acoustic delivery requirements in a typical movie theater that accommodates object-based formats.
Peace proposed a new metric for surround loudspeaker placement and selection when the layout relies on venue-specific immersive rendering engines for Dolby Atmos and Barco Auro-3D soundtracks, with object-based overhead and side-wall channels. “The metric is based on three foundational elements as mapped in a theater: frequency response, directionality and timing,” he explained. “Current set-up techniques are quite poor for a majority of seats in actual theaters.”
Peace also discussed the new loudspeaker requirements and layout criteria needed to ensure more consistent sound coverage throughout such venues, so that they can more accurately replay material re-recorded on typical dub stages, which are often smaller than most multiplex environments and differ in their width/length/height dimensions.
Mel Lambert, who also gets photo credit on pictures from the show, is principal of Content Creators, an LA-based copywriting and editorial service, and can be reached at firstname.lastname@example.org. Follow him on Twitter @MelLambertLA.