Dolby’s DP600 Program Optimizer
By Steve Harvey
The original date for the federally mandated analog television switch-off has come and gone, replaced by a June 12 deadline imposed to give consumers more time to prepare for ATSC-only transmission. In truth, the four-month postponement also gives broadcasters a little extra time to prepare, as they mull the pros and cons of audio processing, most particularly to control loudness discrepancies between programs, between programming and commercial pods, and even from channel to channel.
Further complicating the issue is the growing number of viewers who have invested in some form of 5.1 listening setup, and who expect all six speakers to be working full time. With viewers quick to fire off an e-mail complaint to their local TV station or–worse–reach for their remotes to change channels, broadcasters have every reason to worry about the best way to maintain a consistent viewing experience.
For those broadcasters that do not yet have the plant infrastructure to support 5.1 audio, synthesis of a 6-channel stream from stereo–upmixing–can ensure that they are not leaving the presentation to chance–that is, to the settings in viewers’ receivers. As Tim Carroll, founder and president of processor manufacturer Linear Acoustic, points out, “If you tag it as stereo you have no idea what’s going to happen at the final point in the chain.”
Upmixing may be controversial, but it has supporters, Carroll observes. “It’s finding its place as a tool in production, to help people get 5.1 mixes up faster, and at stations where they might not have metadata to tell them what to do.” Those stations include network affiliates; according to Carroll, the three major networks do not currently send out metadata alongside their program stream. Unless a local affiliate injects metadata or chooses to upmix 2-channel programs, the audio stream is subject to the idiosyncrasies of the home receiver.
Although the majority of viewers are still only listening in stereo, consultant Roger Charlesworth, who has worked extensively with NBC at its New York headquarters, has long advocated that broadcasters should move to a 5.1 workflow even if they are transmitting in 2.0. “Despite the fact that it was heretical to say it a few years ago, you really need to have something that upmixes that which is not 5.1, now that we’re in a mostly 5.1 mode and we understand why it’s really bad to jump back and forth,” he says, then elaborates, “You have to be very careful with the AC3 stream if you do switch on the fly. In different receivers and in different circumstances it hiccups. Some receivers mute, and others make an awful snap, and some are just fine.”
Plus, he adds, “You also have disappearing dialog. If you’re switching between 2.0 and 5.1 and you make a mistake, you lose the dialog.” Consequently, he says, “There are compelling reasons not to switch format metadata. If you’re not switching format metadata then you really want to stay in a 5.1 mode and be center-channel-compatible.”
“The only network that is keeping 5.1 up constantly and filling the channels is Fox,” reports Carroll, noting that PBS also upmixes to a certain extent. “Anything that isn’t 5.1 [Fox] turns into 5.1, and they tell content producers: You can provide it to us, or we’ll make it.”
There is no evidence that the adoption of upmixing is slowing down the generation of 5.1 local programming, Carroll believes. “KPIX in San Francisco does its news in 5.1; that’s all discrete mixing. There are other stations in the Bay Area that do the same thing. WRAL [in North Carolina] does a lot of local 5.1-channel production.”
In a complete turnabout from just a year or two ago, he continues, “During a typical network broadcast–and this is totally not scientific–most every commercial played during a network break is 5.1. You can almost tell now when you’re going back to the local insertion because it’s usually stereo.”
Bob Orban, founder and chief engineer of the Orban processor company, is less than impressed with some of the audio he hears on television. “Some of this processing sounds very bad to me,” he says. For example: “Surround synthesis that spreads dialog across the three front speakers, sometimes even beyond the edges. I have heard this occur in a frequency-dependent way such that, for a given talker, sibilance appears at one point in the surround soundfield and vowel sounds appear elsewhere, giving the aural impression that the talker’s head is several feet wide.”
Loudness processing can also sound less than stellar, says Orban, enumerating some examples he has heard: “Background ambience slowly pumped up in loudness until it is almost at the level of dialog; dialog ambience pumping quickly up and down like an automatic ‘ducker’; harsh-sounding dialog whose high and low frequencies are exaggerated, thereby pulling down midrange and reducing intelligibility; and a very ‘dead’ presentation in terms of dynamics, evidently produced by aggressive multi-band compression more appropriate for FM radio than for television.”
He continues, “Having recently completed the design and tuning of Orban’s surround processor, Optimod 8585, I have become very aware of potential pitfalls like these and have developed techniques to prevent them from occurring in automatic loudness processing. It may be that stations have been so anxious to control problems of inconsistent loudness–particularly loud commercials–that their staffs have been willing to put up with unnecessary side effects caused by some online loudness controllers. However, none of these side effects are inevitable, and none should be tolerated.”
With the digital switchover approaching fast, Orban still sees loudness control as a work in progress. “Some stations still have unacceptable differences between program material and commercials. Some stations have ‘thrown the baby out with the bath water’ to correct these problems. Some stations are sounding very good, which is encouraging because it means that controlling loudness problems without objectionable side effects is not only possible but practical.”
Asked about the current state of loudness control implementation, Carroll’s initial response is unprintable. “I think there’s a fundamental lack of understanding by all of us about how complex this can be,” he says, a situation that is exacerbated by the FCC’s choice of language. “The way the rules read in the United States, your dialnorm value shall indicate the average loudness of typical dialog in a program.” Some people can’t get past that, he says, especially for programming that contains no dialog. “In reality, it should probably be called what it’s called internationally in other coding standards: program reference level.”
“A lot of it is education,” agrees Jeff Riedmiller, broadcast product manager for Dolby Laboratories. “It’s taken a long time but finally there is enough momentum, tools and understanding.”
According to Riedmiller, Dolby’s DP600 file-based program optimizer has enjoyed exponential growth over the last year, finding an eager audience among premium cable programmers and especially among MSOs such as Time Warner, Comcast and Charter. “We’re seeing a huge uptake in them using our solution to go into files in a non-real-time sense and analyze the metadata, in particular dialnorm, which controls reproduced loudness. If it finds an error between the signaling provided by things like dialnorm and the actual loudness of the speech, it will go back into the pre-compressed bitstream and update dialnorm and any dynamic range control metadata based on it, then put the stream back together without having to do [a] re-encode.”
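The check Riedmiller describes, comparing the signaled dialnorm against the measured loudness of the speech, can be sketched in a few lines. This is an illustration only, not Dolby's algorithm: the function names and the 1 dB tolerance are hypothetical, and the sketch assumes the usual AC-3 convention that dialnorm signals whole-dB values from -1 to -31 and that a decoder attenuates by the difference between the signaled value and -31.

```python
# Illustrative sketch only; not the DP600's actual processing.
# Function names and the 1 dB tolerance are hypothetical.

def check_dialnorm(signaled_dialnorm_db, measured_speech_lkfs, tolerance_db=1.0):
    """Return (needs_update, corrected_dialnorm_db) for one program file."""
    # dialnorm can only signal whole-dB values between -1 and -31
    corrected = max(-31, min(-1, round(measured_speech_lkfs)))
    error_db = signaled_dialnorm_db - measured_speech_lkfs
    return abs(error_db) > tolerance_db, corrected


def decoder_attenuation_db(dialnorm_db):
    """Attenuation a decoder applies so that dialog lands at -31."""
    return 31 + dialnorm_db


# A spot whose speech measures -20 LKFS but is tagged with dialnorm -27
needs_update, corrected = check_dialnorm(-27, -20.0)
print(needs_update, corrected)             # True -20
print(decoder_attenuation_db(-27))         # 4  (speech plays back 7 dB hot)
print(decoder_attenuation_db(corrected))   # 11 (speech lands on target)
```

In this example the mislabeled spot would play 7 dB louder than correctly tagged material around it; rewriting only the metadata value removes that error without touching the coded audio.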
The DP600 is now deployed at eight or nine of the top 10 U.S. cable markets, he says. “These people can encode and ingest assets and make file-based assets–ads, VOD, programming–all day long. Operators don’t have to worry because once it’s in the AC3 domain, our process can go in, analyze and correct it without any operator intervention. There’s one facility that last year alone put over 100,000 pieces of content through our process.”
For those struggling with the correct use of dialnorm in loudness control, TC Electronic’s implementation of the ITU-R BS.1770 loudness metering standard can be beneficial. Rather than anchoring on dialog, it finds the “center of gravity” of content, which helps especially where there is no dialog present on which dialnorm might find an anchor. But Carroll points to the solution put in place by France’s equivalent to the FCC, the CSA, as one possible way forward for the U.S.
Rather than measuring the dialog levels, he reports, “They say, we’re going to give you a specific value. We don’t want you to measure each program separately; we want you to make sure your programming averages around, in their case, -25 dB LKFS. For dialog programs it can’t vary for more than +/-7 dB, and for non-dialog portions, no more than +/-12 dB.
“I think it’s a really interesting step forward. The great thing is, you’re left with a program that has an average that is interchangeable with the rest of the world and you’ve got dynamic range that makes sense for dialog; 14 dB of range is within the target area that is acceptable to viewers before they’ll raise or lower their volume [according to a study by Dolby Labs].”
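The tolerance windows Carroll quotes lend themselves to a simple check. The sketch below uses only the figures given above (a -25 LKFS target, +/-7 dB for dialog, +/-12 dB for non-dialog material); the segment data, names and structure are hypothetical, and it tests each segment against its window without attempting the long-term program average.

```python
# Illustrative sketch using the figures Carroll quotes; segment data and
# function names are hypothetical, and long-term averaging is not modeled.

TARGET_LKFS = -25.0
DIALOG_TOLERANCE_DB = 7.0
NON_DIALOG_TOLERANCE_DB = 12.0


def within_window(loudness_lkfs, has_dialog):
    """True if a segment's loudness falls inside its tolerance window."""
    tolerance = DIALOG_TOLERANCE_DB if has_dialog else NON_DIALOG_TOLERANCE_DB
    return abs(loudness_lkfs - TARGET_LKFS) <= tolerance


def out_of_window(segments):
    """Return labels of segments that fall outside their window.

    segments: iterable of (label, loudness_lkfs, has_dialog) tuples.
    """
    return [label for label, loudness, has_dialog in segments
            if not within_window(loudness, has_dialog)]


segments = [
    ("news dialog", -27.0, True),    # 2 dB under target: fine
    ("loud promo", -16.0, True),     # 9 dB over target: outside +/-7
    ("music bed", -36.0, False),     # 11 dB under target: inside +/-12
    ("explosion", -12.0, False),     # 13 dB over target: outside +/-12
]
print(out_of_window(segments))  # ['loud promo', 'explosion']
```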
But for those relying on processing to control loudness, Carroll advises careful management at each stage of the signal path rather than strapping a “band-aid” across it at the very end. “If you’re relying on a processor at the end of the chain to pull in a commercial that is 12 dB too loud, if you want that commercial to be pulled back in to within a dB, that’s 12:1 compression,” he points out.
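Carroll's arithmetic follows from how a compressor's ratio is defined: above the threshold, every `ratio` dB of input produces 1 dB of output. A minimal sketch, assuming a simple static (hard-knee) compressor with a hypothetical threshold of -24 dB; real broadcast processors add attack, release and multi-band behavior that is not modeled here.

```python
# Illustrative static compressor curve; the -24 dB threshold and the
# function names are hypothetical, and time constants are not modeled.

def required_ratio(excess_in_db, max_excess_out_db):
    """Ratio needed to squeeze an input overshoot into an output overshoot."""
    return excess_in_db / max_excess_out_db


def compress_db(level_db, threshold_db, ratio):
    """Output level of a hard-knee compressor for a steady input level."""
    if level_db <= threshold_db:
        return level_db  # below threshold the signal passes unchanged
    return threshold_db + (level_db - threshold_db) / ratio


print(required_ratio(12.0, 1.0))         # 12.0, i.e. 12:1
print(compress_db(-12.0, -24.0, 12.0))   # -23.0: 12 dB over becomes 1 dB over
print(compress_db(-12.0, -24.0, 2.0))    # -18.0: at 2:1 it is still 6 dB over
```

The last line shows why a gentle ratio at the end of the chain cannot rescue a badly mismatched commercial, which is the case for correcting levels earlier in the signal path.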
Instead, broadcasters should apply compression only where it’s necessary: “When you combine it, you end up with a program that is already pretty much controlled.”
For example, he offers, “On the commercial server, use a Dolby DP600 to realign average levels, because more of those commercials are already compressed. Leave the local news to 2:1 or 3:1. Network programming that comes in is probably already close enough, but if you want to do something light, like 1.5:1 to 2:1 to just pull in the outer edges, you now have an output on the air that is pleasing to listen to, it’s appropriately dynamic, and nobody has ‘doinked’ around with the producer’s content to the point where they’re going to complain too much. We have stations on major networks in top markets where they’re doing this successfully, so we know that it works.”