I was recently chatting with an industry colleague who’s reviewing an updated version of a wireless headphone model I reviewed several years back (for another publication). Without knowing the tonal quirks of this new version, I told him the old one had a reasonably flexible built-in EQ, and I’d be happy to share my custom profiles for that version to see how well they worked for the new release.
You’d have thought I offered to fricassee his children based on his reaction: “I review headphones as they sound, not how you can make them sound. I also don’t believe many people EQ their headphones.”
Let’s ignore that last bit, which is obviously wrong. My reply to him was: “Would you review a TV without at least mentioning the picture mode that best matched ISF standards or alluding to its calibration capabilities? I’m not saying you should only review the headphones with EQ applied. But if the headphones have an app with an EQ function, you could at least try it out and see if it has enough bands to get you closer to a more neutral frequency response.”
The response I received in return basically boiled down to, “Well, that’s different.”
But why? Why is it different? We have reference standards for audio and video alike. Why is it that so many audiophiles and A/V enthusiasts think the video standards are sacrosanct and the audio standards are outright heresy?
Artwork by Dennis Burger, made with Stable Diffusion
If you think my colleague is alone, by the way, consider the headphone review I referenced above. I began my evaluation by talking about the sound of the headphones out of the box, if you will, and my dissatisfaction with their tonal idiosyncrasies, as well as the shortcomings of their sound-profile presets. Then I talked about the custom EQ preset I created within the headphones’ app, and gave my before-and-after listening impressions.
And I got scolded in the comments section for doing so. Headphones should sound great from the giddy-up, one commenter told me. “I wonder how many other less-than-stellar reviews of headphones and speakers would have been turned around if the reviewer was allowed to EQ the sound,” he said.
Allowed? Did he say “allowed”? Was I in danger of being excommunicated?
At any rate, my response to that was and is: if headphones include built-in EQ as a function, that functionality contributes to the audio performance of those headphones. By reviewing the before-and-after sound of the headphones with EQ applied (back when I reviewed the things), I was simply giving what I thought was a more thorough evaluation of the product’s performance capabilities.
The thing is, I acknowledge that you might not find that commenter’s position as preposterous as I do. But, to return to this well again, imagine if we were talking about a video display instead. Imagine a commenter berating me for adjusting the contrast, brightness, and white-balance controls of a TV by saying, “I wonder how many other less-than-stellar reviews of TVs would have been turned around if the reviewer were allowed to calibrate the picture.”
In the domain of video, we simply assume that the reviewer is going to try to use the tools built into a display to get as close to the reference standards as possible, and judge the product by how well those tools enable or hinder such efforts. And to be fair, with audio—especially headphones—there are many people who follow the same principles. Yet there are a disturbing number of readers and writers alike who feel this is a violation of some ancient moral code, even if the product has those calibration or personalization tools built in.
But again, I ask: why?
First, a few caveats about the legitimate differences between sight and sound
Just to be clear, I’m not asking why hearing and vision are different physical phenomena. I understand that from the perspectives of both physics and neurology. Hearing is a mechanosensation—effectively a glorified form of touch at a distance—that generally operates in the range from 20Hz to 20,000Hz (20 to 20,000 cycles per second). Vision, meanwhile, is the process of phototransduction that lets our brains comprehend a small portion of the electromagnetic spectrum, from roughly 430 to 790THz (one THz is equal to 1,000,000,000,000Hz, or 10¹² cycles per second).
So, it should go without saying, audio and video are not the same. Duh. Our brains process them differently, our computer systems process them differently, we remember them differently, we encode them differently, we compress them differently (although not as differently as some people may think), and we have different words for people who lose sensitivity to one or the other: being blind and being deaf are wholly different experiences.
Still, though, as I’ve written before, our disparate senses do interact with each other in sometimes nonintuitive ways. Take a surround-sound A/V system and change the size of the screen—and nothing else—and your perception of the sound will change. Any good listening test attempting to study listener preferences will take into account the ways in which extra-sonic stimuli affect our perception of audio fidelity.
But that’s sort of a digression. When I ask why some people treat audio and video as wholly non-overlapping magisteria, I’m not suggesting that we should treat optics as exactly the same as acoustics. Hell, I’m not even advocating that we treat the entire audible spectrum exactly the same. I’m merely saying this: our visual and auditory systems (along with our vestibular, olfactory, gustatory, and somatosensory systems) evolved so that we could engage with (and survive in) objective reality.
Artwork by Dennis Burger, made with Midjourney
Yes, these biological sensory systems and our experience thereof have their quirks and shortcomings. Otherwise, we wouldn’t have phenomena like optical and auditory illusions. And granted, you don’t see harmonic distortion or hear the color red—unless you experience synesthesia or dabble in amateur experimental pharmacology. But the point remains: most of us seem keen to accept that there are objective standards for video reproduction. So why are so many of us hesitant to admit that similar standards exist for audio and/or that they’re a good thing?
Why do audio standards matter?
I’ve been talking about this notion with some industry friends lately, and one of the most valid criticisms I’ve received is that audio reproduction is often a matter of taste. And indeed, SoundStage! Solo editor Brent Butterworth and I discussed this in an early episode of the SoundStage! Audiophile Podcast: some people prefer a Harman curve-like sound profile with a little extra bass, and some prefer one with a little less.
Indeed, if you read my review of the NAD C 399 and its Dirac Live room-correction functionality, you’ll notice references to the differences between Dirac’s own target room curve and NAD’s in-house curve, as well as my own preferred in-room target curve somewhere in between. Where do they differ? Largely in the bass frequencies.
This doesn’t invalidate the whole notion of audio standards, mind you. It just means that different people have different notions of what those standards should be. More importantly, their disagreements are largely confined to the bottom four octaves of the audible spectrum.
But why do I care? I’ll give you another example that comes from my work on the podcast. At some point in the future, I’m planning on writing a whole article about things I’ve learned from (or notions that have been reinforced by) the technical work of mixing and mastering every other episode of that show. But here’s the salient point that’s relevant to this conversation.
After a few months of working on the podcast, I got a pair of RØDE NTH-100 over-ear headphones because in our earlier recording sessions, Brent’s voice kept bleeding out of my headphones and into my microphone, and given the delays caused by internet latency, it was making the editing a nightmare. I needed something with better isolation. (Most of the cans that I use for enjoyment are either wireless or open-back, and most of my wired closed-back cans are gaming headsets that don’t even pretend to be neutral.)
RØDE NTH-100 measurements courtesy of Brent Butterworth
The RØDE headphones worked great for sound isolation, and given that they track reasonably closely to the Harman curve, they made it a lot easier to EQ my and Brent’s voices. Still, until very recently, when I was done with my first-pass mix and unplugged my cans to finish mixing and mastering on my SVS Prime Wireless 2.1 system, I always had to go back and tweak the EQ of both channels. Then I’d arrive at something that sounded good on my RØDE cans and the SVS speakers, only to find that when I tested the mix on my hi-fi setup with full-range Paradigm towers, I had to make coarser adjustments between 200 and 500Hz and much finer adjustments between 7 and 10kHz.
But recently I purchased a new PEQ plug-in from a company called ToneBoosters, and it came with demo copies of the other plug-ins the brand produces. One of them, Morphit, allows you to correct the frequency response and phase characteristics of more than 500 different models of headphones. Not only can you effectively emulate any headphones in that database with any other, but you can also correct your headphones to the Harman curve, as long as they’ve been modeled by ToneBoosters.
On the first episode I mixed with this plug-in in my signal chain, I pulled off the cans after a first pass to listen through my SVS Prime Wireless speakers, and almost immediately realized I didn’t have to make any EQ adjustments at all. When I switched over to my Paradigm towers, my first-pass mix sounded great. In short, I’ve knocked an hour or more off the time it takes me to mix and master an episode just by improving the neutrality of the headphones on which I mix from the giddy-up.
OK, that’s all well and fine for mixing and mastering, but what about listening?
I think you’ll find that my experience with the RØDEs translates to listening just as well. We’ve all read headphone reviews (and some of us have written them) that effectively said, “They sounded great with yacht rock and rap but sounded like poo-doo with ’80s pop” or something to that effect.
And that’s great if all you listen to is yacht rock and rap (or whatever genre of music happens to accentuate the positives and downplay the negatives of a particular pair of cans, by design or by accident). But the beauty of neutral headphones is that, in my experience, they’re more likely to work for every genre of music. And a mix done on more neutral headphones or speakers is far likelier to translate well to other headphones and speakers.
So if you only listen to heavy metal or classical or jazz or self-made DAT recordings of Tuvan throat singers, then sure—get headphones whose distinctive sonic flavor works in harmony with those recordings. But if you have broad musical taste and you want everything to sound as good as it can through your headphones, you probably ought to either buy something tuned for a neutral tonal balance (whether that means the Harman curve or any of the other attempts at achieving neutrality from headphones), or, you know, use the EQ function built into the app of many wireless headphones these days to remove their tonal peculiarities.
Roon has PEQ built in. Spotify has a more primitive six-band EQ deep in the Settings section of its app. I’m hoping Qobuz adds something similar at some point for those times when I’m charging my Sony cans and instead have to rely on my gorgeous and dynamic but tonally eccentric B&W wireless over-ears.
But hey, if you’re 100 percent happy with the sound of your headphones or earphones right out of the box, and you don’t want to EQ them, then don’t. I’m not criticizing you. I’m all for anything that brings you sonic bliss and elevates your listening experience. Just don’t chastise those of us who use what tools we have at our disposal to calibrate our gear to legitimate reference standards.
. . . Dennis Burger