If there were any overriding themes at this year’s AES Convention, they were immersive audio and audio networking. On the exhibit floor, and especially in the workshops, tutorials and panel sessions, these two topics were the talk of the show.
For a while referred to as “3D audio,” multi-speaker presentation formats have been around for years. Visual 3D TV failed to catch on, but with the commercial success of newer immersive cinematic audio formats such as Auro-3D, Dolby Atmos and DTS MDA, soundtracks incorporating height information have gained a new lease on life. The first Blu-ray Disc with a Dolby Atmos soundtrack was released at the end of September, and more than one presenter at the convention hinted that the broadcast and streaming industries could begin delivering immersive audio to consumers soon.
“Is it marketing, so that hardware manufacturers can sell a lot more stuff? Maybe,” said Tom Ammermann, general manager of New Audio Technology in Germany, who presented tutorials on producing 3D audio for music, film and games, and for 3D headphones. “But the reason [for embracing immersive audio] for those of us who make content is that it makes it more emotional; we have more fun with it.”
Ammermann believes the new immersive formats produce a compelling experience when applied not only to film but also games and music. Indeed, the music industry “missed a real chance” to provide consumers with the 5.1 experience, he believes, but with the emerging immersive audio tools, “We have another chance now to do new mixes.”
Headphone virtualization of the immersive formats could be the key to wider consumer acceptance, especially with so many people listening on handheld devices. MPEG-H 3D Audio, the newest MPEG standard, includes data compression and rendering that allows the delivery of immersive sound formats—channel- and object-based, as well as Higher Order Ambisonics—into the home and to mobile platforms.
“You produce once and it permits you to consume in many formats—different loudspeaker locations, headphones, sound bars—all decoded in the same bitstream,” according to Schuyler Quackenbush of Audio Research Labs. MPEG-H can virtualize a surround speaker environment from a sound bar or a loudspeaker configuration with fewer speakers than are accommodated in the content, for example.
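The “produce once, consume anywhere” idea can be illustrated with a downmix: content authored for more loudspeakers is mapped to a layout with fewer through a rendering matrix. The sketch below uses illustrative ITU-style coefficients, not MPEG-H’s actual renderer, to fold a 5.1 signal into stereo.

```python
import numpy as np

# Illustrative 5.1-to-stereo downmix matrix (common ITU-style
# coefficients; NOT the MPEG-H renderer's actual matrix).
# Columns: L, R, C, LFE, Ls, Rs. Rows: stereo left, stereo right.
DOWNMIX = np.array([
    [1.0, 0.0, 0.7071, 0.0, 0.7071, 0.0],   # stereo left
    [0.0, 1.0, 0.7071, 0.0, 0.0, 0.7071],   # stereo right
])

def render_to_stereo(frames_51):
    """Map 5.1 channel frames (shape 6 x N) to stereo (2 x N)."""
    return DOWNMIX @ np.asarray(frames_51)

# A signal panned to the center channel lands equally, attenuated
# by ~3 dB, in both stereo outputs.
frames = np.zeros((6, 4))
frames[2, :] = 1.0          # center channel only
stereo = render_to_stereo(frames)
```

The same content could instead be rendered to a sound bar or headphones; only the receiver-side matrix changes, not the bitstream.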
The flexible rendering engine within MPEG-H can also deliver a spatialized immersive surround experience to a headphone listener using head-related transfer function (HRTF) or binaural room impulse response methods. An interface that allows the measurement and input of personal HRTF data has been incorporated into MPEG-H.
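At its simplest, HRTF-based binaural rendering convolves a source signal with a measured impulse response for each ear. A minimal sketch, using placeholder impulse responses rather than measured HRIR data:

```python
import numpy as np

def binauralize(source, hrir_left, hrir_right):
    """Render a mono source for headphones by convolving it with
    a head-related impulse response (HRIR) per ear.

    In a real renderer the HRIRs come from measurements (ideally
    the listener's own, as the personalization interface in the
    article suggests); here they are toy placeholders.
    """
    left = np.convolve(source, hrir_left)
    right = np.convolve(source, hrir_right)
    return np.stack([left, right])

# Toy example: an identity response on the left ear, simple
# attenuation on the right, standing in for real HRIR pairs.
source = np.array([1.0, 0.5, -0.25])
ears = binauralize(source, np.array([1.0]), np.array([0.5]))
```

Binaural room impulse responses work the same way but fold in room reflections as well as the head and ear geometry, so the impulse responses are much longer.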
Object-based immersive formats could potentially bring new audio services to streaming and broadcast in the not-too-distant future. According to Quackenbush, the metadata that is associated with the audio objects could support a level of audience interaction. For example, a user could select alternate languages, commentary for the visually impaired, alter the balance between commentators and the stadium crowd, swap perspective from one end of the stadium to the other, or create a personal mix based on the available audio objects.
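The personal-mix idea can be sketched as receiver-side summation of separately delivered objects, with metadata defaults the listener may override. The object names and structure below are illustrative, not MPEG-H’s actual metadata format.

```python
import numpy as np

def render_mix(objects, user_gains=None):
    """Sum audio objects into one channel, letting the listener
    override the per-object default gains carried in metadata.

    `objects` is a list of (name, samples, default_gain) tuples;
    this layout is hypothetical, chosen only to illustrate
    object-based delivery.
    """
    user_gains = user_gains or {}
    length = max(len(sig) for _, sig, _ in objects)
    mix = np.zeros(length)
    for name, sig, default_gain in objects:
        gain = user_gains.get(name, default_gain)
        mix[:len(sig)] += gain * np.asarray(sig)
    return mix

objects = [
    ("commentary", [1.0, 1.0], 1.0),
    ("crowd",      [0.5, 0.5], 1.0),
]
broadcast_mix = render_mix(objects)                       # defaults
quieter = render_mix(objects, {"commentary": 0.2})        # user choice
```

Swapping an alternate-language commentary object, or muting commentary entirely, is the same operation: the broadcaster ships the objects, and the mix decision moves to the receiver.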
The network audio program at the AES Convention doubled in size this year over previous years, and included several presentations promoting AVB, Dante, Ravenna and the relatively new AES67 Audio-over-IP interoperability standard. Chaired by Tim Shuttleworth, the network audio track investigated the growing LAN and WAN applications in the audio industry.
Shuttleworth moderated a panel detailing the implementation of the large-scale Ethernet AVB audio network unveiled at ESPN’s new Digital Center 2 (DC2) production and distribution facility in Bristol, CT earlier this year. DC2 was built initially to take over from the sports broadcaster’s 10-year-old DC1, which will now be refreshed with similar technology to DC2.
The main goal of John Pannaman, senior director of technology for ESPN, the panel reported, was to have a single production infrastructure for all of the company’s media outlets. It needed to be a dedicated solution, easily expandable and 4K-capable.
Interoperability was also a big concern, according to Christian Diehl from Riedel, enabling flexible vendor choice for the AVB-enabled equipment on the network. Scalability was also important; DC1 handles 700 audio streams, but the new DC2 must eventually be capable of managing up to 16,000 sources.
AVB, a layer 2 protocol, is implemented as a Class A network at ESPN, which means that latency is only 2 ms across seven hops. AVnu Alliance, the consortium of 80—and growing—manufacturers and users behind the protocol, is currently working on creating a layer 3 version, according to Warren Belkin from network switch supplier Arista.
The show also saw the launch of the Media Networking Alliance, a group of 20 pro audio and broadcast companies advocating the adoption of AES67, which is designed to enable interoperability between audio networking standards, such as Ravenna, Livewire, Q-LAN and Dante. The inaugural membership meeting at the convention was hosted by steering committee members Bill Scott, Bosch Communications Systems; Terry Holton, Yamaha; Stefan Ledergerber, Lawo Group; Marty Sacks, Axia Audio; and Rich Zwiebel, QSC Audio. The same panel presented a session detailing how the alliance intends to support AES67 adopters.