NORTHRIDGE, CA—It’s beginning to look like 2016 will be the year that VR—virtual reality—goes mainstream, as more and more headsets become commercially available. But while consumers can now get their hands on equipment to experience VR, there are still significant challenges facing producers and post producers, who are currently struggling to find efficient tools and workflows.
The AES Los Angeles Section recently addressed these issues with a presentation entitled “Virtual Reality for Consumer Media,” hosted at Harman International’s headquarters in Northridge, CA. The event was organized and coordinated by Linda A. Gedemer, adjunct professor at LMU-School of Film and Television’s Recording Arts Department. Presenters included Tim Gedemer, president/CEO of Source Sound, and Charles Deenen, director of Source Sound Digital, a Los Angeles-based sound design and mixing group that specializes in immersive, VR-specific audio services, together with Joel Susal, director of Virtual and Augmented Reality at Dolby Laboratories, and Adam Somers, senior software engineer at Jaunt VR.
Gedemer began with an announcement that the 2016 AES Convention will include an immersive conference. The market is blossoming: Oculus, Sony and HTC are launching headsets with PC and PlayStation connectivity, and Google and Samsung are making inexpensive smartphone-powered headsets available. “Smartphones have made VR accessible,” he observed.
“Headphones are the most important aspect of the consumer experience,” said Somers, since viewers are unlikely to have a speaker array capable of reproducing immersive sound. In addition to the 360-degree camera arrays coming to the market, he said, there are choices of single-point “3-D” microphones, including the Core Sound TetraMic, MH Acoustics Eigenmike, and devices from VisiSonics, Dysonics with Telefunken and 3Dio.
“The more capsules, the higher the spatial resolution,” he said. Jaunt favors first-order Ambisonics, which occupies four tracks and is easy to work with and stream, he added.
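For readers wondering what those four tracks carry, the following is a minimal sketch of how a single mono source could be encoded into first-order Ambisonic B-format. It assumes the traditional convention in which W is scaled by 1/√2 and azimuth is measured counter-clockwise from straight ahead; the function name is hypothetical, and nothing here is taken from Jaunt's own pipeline.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into four B-format channels (W, X, Y, Z).

    Assumed convention: W scaled by 1/sqrt(2); azimuth 0 = front,
    positive azimuth = to the listener's left; elevation positive = up.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                 # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)    # front/back
    y = sample * math.sin(az) * math.cos(el)    # left/right
    z = sample * math.sin(el)                   # up/down
    return (w, x, y, z)

# A source straight ahead lands entirely in W and X.
front = encode_first_order(1.0, 0.0)
# A source hard left lands in W and Y instead.
left = encode_first_order(1.0, 90.0)
```

Whatever the number of sources, the result is still four channels, which is one reason the format is convenient to store and stream.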
Susal commented that VR falls into two basic categories, interactive and experiential. The former essentially comprises games; the latter includes music concerts, sports, tourism and documentaries. Audio for VR needs to be the same quality as for movies, so Dolby is adapting Atmos for the market, he said, with software for creation and playback currently going through beta testing. Post production can help bring the better-than-real-life quality of movies to the format, he said, building upon the initial mic capture.
But as Gedemer noted from a recent experience, “It can be unexpectedly difficult.” An Ambisonic mic by the camera was rendered useless by unwanted noise. Plus, spill between the lavalier mics on the seven characters on his project—boom mics would have been visible—meant ADR was not an option, so Gedemer had to “remove the bleed forensically,” he said.
Source Sound has had plenty of experience with the VR post audio process, working out a series of “hacks,” according to Gedemer and Deenen. For example, there is currently no way to embed audio with picture, necessitating syncing by hand from two playback devices. Since it’s impractical—impossible, even—to perform a mix while wearing VR goggles, the 360-degree image has to be displayed two-dimensionally, with various degrees of the circle marked on the screen. When panning sounds, “You hear it behind you but see it in front,” said Deenen. “It takes some getting used to.”
The VR format is also very revealing, requiring pans to be within +/- 2 degrees to sound precise, he continued. In terms of level, tolerances are +/- 2 dB on the X/Y plane and +/- 5 dB in the Z (vertical) plane.
A new platform brings with it loudness issues, too. Source Sound mixes to -16 LKFS/LUFS, which is the target used by game companies for online streaming; YouTube, by comparison, uses -13 LUFS. Deenen recommended a reference monitoring level of 60 to 65 dB for speakers and 70 to 75 dB for headphones. “Many are listening in noisy spaces and mixing louder would produce too wide a dynamic range,” he said. He also cautioned, “If you mix on headphones, it tends to be too dense, because you can hear all the sounds.”
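The arithmetic behind hitting a loudness target is simple enough to sketch. The example below assumes the integrated loudness of a mix has already been measured elsewhere (the measurement itself is not shown) and that a single static gain is applied to the whole program; the function name and the -23 LUFS starting value are illustrative only.

```python
def gain_to_target(measured_lufs, target_lufs=-16.0):
    """Return (gain in dB, linear multiplier) needed to move a mix
    from its measured integrated loudness to the target loudness."""
    gain_db = target_lufs - measured_lufs
    linear = 10.0 ** (gain_db / 20.0)   # dB to amplitude ratio
    return gain_db, linear

# Hypothetical mix measured at -23 LUFS, raised to the -16 LUFS target.
gain_db, linear = gain_to_target(-23.0)

# Moving a -16 LUFS master to the -13 LUFS figure quoted for YouTube.
yt_gain_db, yt_linear = gain_to_target(-16.0, target_lufs=-13.0)
```

A static gain shifts integrated loudness by approximately the same number of dB, so the 3 dB gap between the two targets corresponds to an amplitude ratio of roughly 1.41.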
Audio post can be performed in a DAW then sent into a middleware game engine such as Unity or Wwise for manipulation in the 360-degree sound field. “But none of the engines handle rotational sound, or how sources change as they move” relative to the listener, in terms of filtering and phase, noted Deenen.
Plus, said Gedemer, there are currently no tools that can control fixed versus dynamic sources. On one project, in order to have the music remain fixed while head-tracking allowed everything else to rotate appropriately, he reported, he had to mono the music and place it in four quadrants of the 360-degree field.
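One way to see why a workaround of this kind can hold still under head-tracking is to model it in first-order Ambisonics. This is an assumption made purely for illustration: the format and tools used on that project were not specified, and the function names and the 1/√2 W-weighting below are hypothetical choices. In this model, the same mono signal placed at four evenly spaced azimuths cancels in the directional channels, leaving only the omnidirectional one, which a yaw rotation does not change.

```python
import math

def encode(sample, azimuth_deg):
    """Encode a horizontal mono source as (W, X, Y, Z); azimuth 0 = front,
    positive = listener's left. W weighting of 1/sqrt(2) is assumed."""
    az = math.radians(azimuth_deg)
    return (sample / math.sqrt(2.0),
            sample * math.cos(az),
            sample * math.sin(az),
            0.0)

def rotate_for_head_yaw(bformat, head_yaw_deg):
    """Counter-rotate the sound field for a listener who has turned
    head_yaw_deg to the left. W and Z are unaffected by yaw."""
    w, x, y, z = bformat
    t = math.radians(head_yaw_deg)
    return (w,
            x * math.cos(t) + y * math.sin(t),
            -x * math.sin(t) + y * math.cos(t),
            z)

def mix(*signals):
    """Sum several B-format signals channel by channel."""
    return tuple(sum(channel) for channel in zip(*signals))

# Mono music copied into four quadrants: X and Y cancel, only W remains.
music = mix(*(encode(1.0, az) for az in (0, 90, 180, 270)))

# A dialog source straight ahead, by contrast, is fully directional.
dialog = encode(1.0, 0)

# After the listener turns 90 degrees left, the dialog moves to the right
# (negative Y) while the music is numerically unchanged.
music_turned = rotate_for_head_yaw(music, 90)
dialog_turned = rotate_for_head_yaw(dialog, 90)
```

The trade-off in this model is that the music loses any directionality of its own, which is consistent with having to collapse it to mono first.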
There also needs to be new technology for consumers, commented Susal, enabling them to easily browse, stream and pause content. With an estimated 1.7 billion smartphones in use, Jaunt hopes to galvanize a large audience, said Somers. The early signs are hopeful: a recent VR news report from North Korea clocked up 20 million hits in three days.