The PAR headphone bench test published in April generated a response from none other than scientist, researcher, and AES Fellow Tom Holman, someone whose breadth of audio understanding I’ve learned from and learned to respect. Tom outlined his own measurements of headphones over the years. His methodology employs a KEMAR manikin and Zwislocki coupler that physically model the human ear, including such factors as ear-canal resonance and concha response. The response of that system was first evaluated with a flat airborne sound source to develop a frequency-response curve. That curve would later be inverted and combined with the measured headphone response, yielding an end measurement that closely approximates the way the headphones would interact with the human ear. Tom’s research formed the basis for an article written for the CAS Journal.
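The correction step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Tom’s actual procedure or data: when responses are expressed in decibels, inverting the ear-simulator curve means negating it, and combining it with the headphone measurement means adding the two, which reduces to a point-by-point subtraction. The function name `apply_ear_correction` and every number below are hypothetical, chosen only to show the idea.

```python
def apply_ear_correction(headphone_db, simulator_db):
    """Subtract the ear-simulator response (dB) from a headphone measurement (dB).

    Both lists must be sampled at the same frequencies. In dB terms,
    "invert and combine" is simply headphone + (-simulator).
    """
    if len(headphone_db) != len(simulator_db):
        raise ValueError("responses must be sampled at the same frequencies")
    return [round(h - s, 2) for h, s in zip(headphone_db, simulator_db)]


# Hypothetical values for illustration only, not measured data.
freqs_hz = [100, 1000, 3000, 10000]
simulator_db = [0.0, 1.0, 12.0, 4.0]   # ear simulator driven by a flat source
headphone_db = [2.0, 1.5, 9.0, -3.0]   # headphone measured on the same simulator

corrected_db = apply_ear_correction(headphone_db, simulator_db)
for f, c in zip(freqs_hz, corrected_db):
    print(f"{f:>6} Hz: {c:+.1f} dB")
```

In this made-up example, the 9 dB reading at 3 kHz becomes a 3 dB dip once the simulator’s own 12 dB ear-canal boost is removed, which shows why an uncorrected plot can look far more extreme than the headphones sound.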
Conversations with B&K, manufacturer of the Type 4153 Artificial Ear we used in our evaluations, revealed that there are indeed limitations to the methodology we employed, even though, “for standardized comparative purposes,” use of the Type 4153 approach “is widely accepted.” The advantages of this approach include simplicity and repeatability. There is a broad range of coupler types in use in various measurement applications, from testing hearing aids to evaluating telephone headsets. A common limitation of most of these devices is that they cannot measure flat above 10 kHz because of system resonances, which I’m told is at least in part a consequence of designs optimized for accurate ear and canal modeling at low frequencies.
B&K does manufacture an IEC 711 coupler (B&K Type 4157) that is used by some, as part of a head and torso simulator, for measurement of headphones and telephone headsets. It is representative of the more “realistic” measurement couplers, which would include the devices used in Tom’s testing. Even with such devices, unless what is essentially an HRTF (head-related transfer function) offset is applied, headphone measurement does not produce frequency-response plots that translate readily to, say, a comparison of a particular loudspeaker’s response curve with that of a particular set of headphones. Standard measurements like those we employed would thus look substantially different from a plot of a loudspeaker that might give the subjective impression of having a similar response.
What does all this mean with regard to our testing? Without some correction for the way the physical ear structure mechanically processes incoming sounds, the extremes in our tests are not likely to be perceived as being as extreme as the measurement plots indicate. That said, the peaks and valleys as measured, if one concedes some consistency to the methodology, were wildly dissimilar. The center frequencies of the various measured response deviations were markedly different from headphone set to headphone set, and the amplitudes of these deviations were also radically different from one set of phones to the next. Comparing these differences is something I still find quite interesting, and I look forward to learning more about ways to improve such tests in the future.