Delivering Two Channels As Five

by Steve Harvey.

The countrywide switchover from NTSC analog to ATSC digital television transmission might have been expected to bring standardization to the production and post-production audio processes. Yet there remains some uncertainty over issues relating to upmixing and downmixing, and the deliverables that the networks and cable channels expect.

Roger Charlesworth, executive director of the DTV Audio Group and president of Charlesworth Media, encourages content creators to grasp the idea of “single-threaded delivery,” whereby the broadcaster transmits a high-definition picture and 5.1 audio that can then be automatically reformatted and downmixed, according to the associated metadata, in the home. “Everyone understands that we’re sending an HD picture, and we’re making the SD picture where needed. The same thing is true of the audio. What is built into the ATSC spec is the idea that the stereo will be derived as a Pro Logic downmix of the 5.1.”

Jim Starzynski, principal engineer and audio architect, NBC Universal (NBCU) Advanced Engineering, elaborates, “Any time a television station transmits in 5.1, the two-channel version that is heard by the consumer is done within the set-top box, the integrated television/receiver, or in the home theater receiver. That’s automatically the way that it works. That downmix that creates the 2.0 version from the 5.1 is under control of the metadata that is transmitted by the television station. This process is an inherent part of the ATSC system.”
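The metadata-controlled downmix Starzynski describes can be sketched in a few lines. This is a minimal illustration, not a decoder implementation. It assumes the Lo/Ro stereo downmix equations and the center and surround mix-level choices defined for AC-3 in ATSC A/52, and it leaves out dialnorm, dynamic range control, and any overload scaling a real receiver applies. The function name `downmix_lo_ro` and the dict-based sample layout are hypothetical, chosen only for readability.

```python
# Mix-level choices an AC-3 stream can signal, as linear gains.
CENTER_MIX_LEVELS = {"-3dB": 0.707, "-4.5dB": 0.595, "-6dB": 0.500}
SURROUND_MIX_LEVELS = {"-3dB": 0.707, "-6dB": 0.500, "off": 0.0}


def downmix_lo_ro(frame, center_mix="-3dB", surround_mix="-3dB"):
    """Fold one 5.1 sample frame down to a stereo (Lo, Ro) pair.

    frame is a dict with keys L, R, C, LFE, Ls, Rs (floats, full scale = 1.0).
    The LFE channel is left out of the stereo downmix.
    """
    clev = CENTER_MIX_LEVELS[center_mix]      # gain applied to the center channel
    slev = SURROUND_MIX_LEVELS[surround_mix]  # gain applied to each surround channel
    lo = frame["L"] + clev * frame["C"] + slev * frame["Ls"]
    ro = frame["R"] + clev * frame["C"] + slev * frame["Rs"]
    return lo, ro
```

Under these assumptions, a dialog sample that sits only in the center channel at 0.5 of full scale arrives in each stereo channel at about 0.35 with the -3 dB center setting and at 0.25 with the -6 dB setting, which is why the same 5.1 mix can sound different in stereo depending on the metadata that travels with it.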

“Knowing this means that you need to be producing in discrete 5.1,” stresses Charlesworth. “It’s an inescapable conclusion of realizing that you’re in a single-threaded world; it’s the only way you can really QC what the stereo is going to be. So it has implications for people who are mixing TV shows. If we can get out of this schizophrenic thing of having two standards that are side by side—but they really aren’t—it will make life easier for people.”

During the long transition to exclusively ATSC operation, content creators would generate both a 5.1-channel mix and a stereo mix and deliver them on the eight tracks of the Dolby E stream or on discrete videotape channels to cover both HD and SD transmissions. As Starzynski observes, they must now get out of the habit of thinking the stereo deliverable will go to air, even though NBCU’s current deliverables spec, for example, requires content to be on Sony HDCAM SR tape with both a 5.1 and a stereo mix.

“If they supply us with a 5.1 channel soundtrack, the two-channel version that they’ve mixed is only there for archive and protection purposes, just in case we need to screen that tape in an environment that doesn’t have 5.1. The two-channel version that the audience hears is the on-the-fly downmix that occurs in the set-top box,” Starzynski explains.

Consequently, he continues, “What becomes extremely important is that the content provider has to audition both the 5.1 that they’re doing as the primary deliverable, and a metadata-controlled 2.0 downmix of the 5.1, so that they understand what both are going to sound like.

“We post those metadata figures in our program spec. We make this process very clear in the main audio section as well—that content suppliers really need to audition the 5.1 and the 2.0 version that is created from the 5.1 that’s under metadata control to make certain both meet their expectations.”
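One small part of that audition could be checked numerically before anyone listens. The sketch below is hypothetical and is not taken from any network's spec: it folds a 5.1 mix down to stereo using assumed linear center and surround gains and reports the peak level of the result, because summing channels can push the downmix over full scale even when no single 5.1 channel clips. The gain pairs and the test samples are placeholders. Real values would come from the metadata figures in the broadcaster's program spec.

```python
import math


def downmix_peak_dbfs(frames, clev, slev):
    """Return the peak level, in dBFS, of an Lo/Ro fold-down of 5.1 frames.

    frames: iterable of (L, R, C, LFE, Ls, Rs) tuples, full scale = 1.0.
    clev, slev: linear center and surround mix gains (assumed values here).
    """
    peak = 0.0
    for l, r, c, _lfe, ls, rs in frames:
        lo = l + clev * c + slev * ls  # LFE is not folded into the stereo pair
        ro = r + clev * c + slev * rs
        peak = max(peak, abs(lo), abs(ro))
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")


# Hypothetical test material: no single channel exceeds 0.6 of full scale.
frames = [(0.6, 0.6, 0.6, 0.0, 0.6, 0.6), (0.1, -0.2, 0.3, 0.5, 0.0, 0.0)]
for clev, slev in [(0.707, 0.707), (0.595, 0.5), (0.5, 0.0)]:
    peak = downmix_peak_dbfs(frames, clev, slev)
    flag = "over full scale" if peak > 0.0 else "ok"
    print(f"clev={clev} slev={slev}: {peak:+.2f} dBFS ({flag})")
```

A check like this only flags level problems. It says nothing about balance or intelligibility, so it complements the listening pass rather than replacing it.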

Michael Abbott of All Ears, who specializes in technical audio design for high-profile entertainment broadcast events, including the annual Grammy Awards show, notes that in his experience, “Nine-tenths of the stereo stuff we do upmixes better than it downmixes. A stereo mix upmixed is better than a 5.1 downmixed, because of the algorithms utilized.”

Abbott advocates more consistency between networks and even within programming. On one recent primetime show for a major network, he discovered at the 11th hour that the initial taped portion was in 5.1, while the live segment he was working on was in stereo and was being upmixed for transmission. The late notice didn’t allow time to insert up- and downmix processing into the monitor chain before the show went live.

With some networks standardizing on certain models of up- and downmix processors, he adds, “I would much rather have a stereo mix delivered and know that it’s going to get put through one of three or four processors that will upmix it. If the processor is native to that network and the algorithms are matched on the back and front ends, then you have a better chance.”

Further complicating the issue, “There are still a lot of production companies that don’t even want to be bothered by 5.1,” says Abbott.