
Mixing and Mastering with “Self-Driving” Audio

Craig Anderton
Here’s an interesting workforce statistic: Far more jobs are lost to robots than to companies sending jobs overseas. But of course, as recording engineers, we don’t have to worry about that; our expertise will always be required to mix and master great music…right?

We’re still in the early days of artificial intelligence. However, the ability to analyze audio is gaining sophistication—compare first-generation amp sims or vintage gear emulations to current versions. It seems likely that in the future, analysis will become good enough that computers can make not only technical judgments, but something approaching artistic decisions.

This all started years ago with “curve-stealing” software, like Steinberg’s Freefilter. You could tell your computer you liked a particular spectral response, and the program would adjust an EQ curve to re-create that response. Then Waves’ Vocal Rider (introduced in 2009) listened to a vocal track’s levels and raised or lowered the vocal volume in real time as appropriate (no off-line level analysis needed), thus fitting the vocal within a target level range relative to the mix. Vocal Rider also wrote automation, so you could tweak its decisions if needed.

There are also tools like Sound Radix’s Auto-Align, which analyzes signals from multi-mic setups and adjusts phase automatically—no more flipping polarity switches and listening critically while you fiddle with sample-accurate delays. Or consider NUGEN Audio’s LM-Correct 2, a Windows/Mac stand-alone program or Pro Tools plug-in that automates the compliance process needed to hit the target levels of various loudness standards. Then there’s VocALign, which analyzes overdubbed audio or looped dialog and syncs its timing to match the source audio.

These processes have mostly been robotic in nature—they automate or simplify tasks that used to require at least time and effort. But next-generation processors take this one step further: they make value judgments based on their own “experience,” as derived from their algorithms.

Sonible’s frei:raum has a “Smart EQ” mode that’s a good example of this new breed. The company isn’t forthcoming about how the process works, but I was pretty surprised when testing it. I took a raw track that needed EQ and made the edits I thought were optimum, then applied “Smart EQ” to the same raw track. frei:raum created a curve that was virtually identical to the curve I had dialed in. The “me in a box” vibe was spooky enough, but the biggest difference was that it took frei:raum about ten seconds to dial in the EQ settings—much less time than it took me. frei:raum doesn’t get it right every time, but if nothing else, it can get you really close.

iZotope has taken the same kind of concept even further with the Neutron plug-in. It’s like an intelligent version of the company’s Alloy channel strip—EQ, two multiband compressors, multiband exciter, transient shaper and output limiter—that analyzes a track to create starting points for each module. As expected, the EQ creates a curve (some nodes can even be dynamic as well as static), but Neutron also sets crossover points and settings for the multiband compression and exciter. iZotope doesn’t claim Neutron replaces manual tweaking, but rather, it creates “presets” that are a more customized starting point than the stock presets that ship with signal processors.

There’s also an “unmasking” EQ function that’s intended to fix issues where frequencies in different tracks “step on” each other (when you insert multiple instances, the Neutron plug-ins “talk” to each other). In a way, this is fairly simplistic; for example, if kick and bass conflict, it will adjust EQ in the bands that conflict, rather than do something like boost the bass’s upper mids to accentuate the pick attack and thus differentiate it from the kick. More interestingly, sidechaining allows the EQ changes to happen only while the interfering signal is actually present.

Neutron isn’t a panacea; I wouldn’t trust a complete mix to it. However, I would trust it to get me started in the right direction if I were pressed for time. And I’m sure that for those new to recording, it would be a fantastic tool—not just because of what it does, but because “reverse-engineering” its edits, and figuring out what’s needed to optimize them, would be highly educational.

We haven’t reached truly “self-driving” audio yet for mixing and mastering. But with computers getting more powerful, algorithms getting smarter, and time becoming ever more precious, it’s likely that this trend will continue to develop.
