by Ethan Winer
This article first appeared in the December 2014 issue of Recording Magazine.
Everyone knows that dither is an important final step in the production process when preparing a 24-bit mixdown file for CD. According to conventional wisdom, if you don't use dither the frequency spectrum and sound stage will be compromised, with soft passages and reverb tails sounding grainy and distorted.
Likewise for jitter, another common audio bogeyman. Jitter is a periodic timing error similar to wow and flutter, but it occurs at a much higher frequency. Jitter due to inferior digital clocking is often accused of harming localization, and causing instruments to lose fullness and focus, resulting in generally bad sound.
The notion that jitter or dither affects sound quality per se is pure fiction. These myths have been repeated so often that they're accepted as fact by many. Jitter manifests as noise 100+ dB below the music. This is softer than the background "hiss" of a CD, which I've never heard at normal listening volume. The amount of jitter in modern digital converters is literally 1,000 times less than the flutter of even the finest analog tape recorder. Now, it's true that studios having multiple digital sources will benefit from an external master clock, but that's entirely about synchronization. Same for dither, whose effect is 90 dB below peak level. Nobody can hear artifacts that soft, especially when they're masked by the music that's playing. When these things are compared in a proper level-matched blind test, all of a sudden what had been an obvious difference that "even my mother can hear" becomes impossible to identify.
In practice, dither reduces quantization distortion from too soft to hear to even softer. So I never argue against using dither because the improvement is real and can be measured. Most audio editor programs include an option to dither, so there's no reason not to use it. Further, music passes through many pieces of gear between the studio microphones and the listener's loudspeakers or ear buds, so it's important to minimize quality loss at every step, even if the degradation added by each individual device is inaudible. But suggesting that people experiment with different dither types by ear to select the one that sounds best is like chasing unicorns. If you have to crank the playback volume unnaturally loud to hear the effect of dither on a reverb tail, what's the point? Further, claims that dither affects the frequency spectrum and sound stage of your mix are easy to disprove with standard measurements.
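The measurable effect of dither can be illustrated with a short experiment. The Python sketch below is only an illustration under assumed values that do not come from the article: a 48 kHz sample rate, a 1 kHz tone just 1.2 LSB in amplitude, and TPDF dither spanning 2 LSB peak to peak. It quantizes the same very soft tone to 16 bits with and without dither, then measures the third harmonic with a single-bin DFT.

```python
import math
import random

FS = 48000          # sample rate in Hz (assumed for this sketch)
F0 = 1000           # test tone frequency in Hz (assumed)
N = FS              # one second of audio
LSB = 1.0 / 32768   # one 16-bit step, with full scale at +/-1.0

def quantize(x, dither, rng):
    """Round x to the nearest 16-bit step, optionally adding TPDF dither first."""
    # The difference of two uniform values is triangular on (-1, 1) LSB.
    d = (rng.random() - rng.random()) * LSB if dither else 0.0
    return math.floor((x + d) / LSB + 0.5) * LSB

def harmonic_level_db(samples, freq):
    """Amplitude at `freq` in dB relative to full scale, via a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * n / FS) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * n / FS) for n, s in enumerate(samples))
    amp = 2 * math.hypot(re, im) / len(samples)
    return 20 * math.log10(max(amp, 1e-20))

rng = random.Random(1)  # fixed seed so the run is repeatable
tone = [1.2 * LSB * math.sin(2 * math.pi * F0 * n / FS) for n in range(N)]
plain = [quantize(x, False, rng) for x in tone]
dithered = [quantize(x, True, rng) for x in tone]

h3_plain = harmonic_level_db(plain, 3 * F0)
h3_dithered = harmonic_level_db(dithered, 3 * F0)
print(f"3rd harmonic without dither: {h3_plain:.1f} dBFS")
print(f"3rd harmonic with dither:    {h3_dithered:.1f} dBFS")
```

Without dither the rounding error is correlated with the tone and shows up as harmonic distortion. With dither the error should become a steady noise floor instead, so the third harmonic drops. Either way, both results sit far below full scale.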
Another frequent audio whipping boy is inadequate or "dirty" AC power, which is blamed for "blurred imaging" and "flabby bass," among other claims. AC power problems severe enough to have an audible effect are rare in most parts of the civilized world. Power problems are easily heard as clicks and pops when a refrigerator turns on or off, or as buzzing caused by solid state light dimmers. Both of these noises are caused when short duration impulses are sent back into the power line by the offending device.
When a power spike has a short duration, the corresponding fast rise and fall times contain high frequencies that can leak through power transformers. Such impulses can also radiate through the air as radio waves to be received by the pickups in an electric guitar or bass. But these voltage spikes are very large - generally hundreds of volts! Compare that to the usual amount of noise riding on AC power lines which is typically measured in millivolts, or thousandths of a volt. The power supplies in audio gear routinely filter out such low level noise.
One task of an AC power source is to supply a reasonably constant voltage that's then converted by the power supply inside your audio gear to the lower DC voltage required by solid state circuits used for microphone preamps, equalizers, and other devices. AC power must also provide enough current, especially for power amplifiers that will deliver several amperes to a loudspeaker load. A 100 watt power amp driving an 8 Ohm speaker provides about 3.5 amps, and that current comes from the AC mains.
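The 3.5 amp figure follows from the power formula P = I² × R. The short Python sketch below reproduces it, assuming a purely resistive 8 Ohm load and RMS values; the function names are made up for this illustration. It gives the current into the loudspeaker, not the current drawn from the wall outlet.

```python
import math

def rms_load_current(power_watts, load_ohms):
    """RMS current into a resistive load, from P = I^2 * R."""
    return math.sqrt(power_watts / load_ohms)

def rms_load_voltage(power_watts, load_ohms):
    """RMS voltage across the same load, from P = V^2 / R."""
    return math.sqrt(power_watts * load_ohms)

print(f"{rms_load_current(100, 8):.2f} A")  # 3.54 A
print(f"{rms_load_voltage(100, 8):.1f} V")  # 28.3 V
```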
If an AC power source is unable to provide sufficient voltage or current the result is distortion. But AC power rarely varies more than a few volts except during summer brown-outs, and all audio gear routinely handles such normal variations. A standard 16 gauge power cord can deliver sufficient current for any audio device in your rack. If a high-power amplifier needs more current, certainly the vendor will provide a suitable power cord. The notion that "power problems" can create a subtle lack of fullness and loss of musical detail or image width is, again, easily disproved with basic audio measurements and proper blind tests.
EARS ARE NOT PERFECT!
So why do people sometimes swear they hear a difference with low-jitter clocks, various flavors of dither, or after replacing one perfectly competent AC power cord with another? One factor is the frailty of human hearing perception. People often think they hear an improvement, but it's more likely that they simply became more familiar with the music after repeated playing and noticed more details. Is that delicate cymbal ping really clearer after replacing the capacitors in your power supply, or did you simply never notice it before? Psychoacoustic researchers are well aware that human hearing is fragile and short-term. If you play a piece of music, then switch to an outboard clock and listen again, it's very difficult to recall the earlier playback's tonality.
According to former DTS chief scientist James Johnston, hearing memory is valid for less than one second. James also explains that we cannot focus on everything in a piece of music all at once. On one playing we might notice the organ but ignore the bass, and so forth. This makes it very difficult to know if subtle differences are real or imagined. If you play the same section of music five times in a row, the sound reaching your ears won't change unless you move your head, but the way you perceive each playing will surely vary.
Hearing limitations extend beyond our inability to hear multiple details at once, or remember the exact timbre of an instrument for longer than a few seconds. Another important issue is the masking effect, which obscures content in one source when similar frequencies are present in another. When an electric bass track is played in isolation every note can be distinguished clearly. But that same clear bass sound can turn to mush after adding a chunky sounding rhythm guitar or piano. I'm convinced this is the real reason people wrongly blame jitter or "faulty summing" for a lack of clarity in their recordings. The clarity loss is in our ears and brain, not the audio device or its power supply.
YOUR ROOM IS LYING TO YOU
Another factor is the acoustics of the room you listen in. Unless you wear earphones, moving even an inch or two makes a real change in the frequency response arriving at your ears. While testing acoustic treatment I measured the frequency response at high resolution in an untreated room. This room is typical of the size you'll find in many homes - about 16 by 11 by 8 feet high. Besides measuring the response at the listening position, I also measured at another location four inches away. This is less distance than the space between an adult's ears. At the time I was testing bass traps so I considered only the low frequency response, which showed a surprising change over such a small distance.
|Figure 1: This graph shows the low frequency response in a room about 16 by 11 by 8 feet at two locations four inches apart. Even over such a small physical span, the response changes substantially at many frequencies.|
Conventional wisdom holds that the bass response in a room cannot change much over small distances because the wavelengths are very long. (A 40 Hz sound wave is longer than 28 feet.) Yet you can see in Figure 1 that the peak at 42 Hz varies by 3 dB for these two nearby locations, and there's still a difference even as low as 27 Hz. The reason the response changes so much even at low frequencies is that many reflections, each having different time and phase delays, combine in varying amounts at every point in the room. In small rooms the reflections are especially strong because the reflecting boundaries are all nearby, which further increases the contribution from each reflection. Also, nulls tend to occupy a relatively small physical space, which is why the nulls on either side of the 92 Hz marker have very different depths. Indeed, the null at 71 Hz in one location becomes a peak at the other. If you examine the same data over the entire audible range, shown in Figure 2, the two responses are so totally different you'd never guess this is the same room and loudspeaker!
|Figure 2: This shows the full range response from the same two measurements in Figure 1. At mid and high frequencies the response difference four inches apart is even more substantial than below 200 Hz.|
One cause of these large differences is comb filtering. Peaks and deep nulls occur at predictable quarter-wavelength distances, and at higher frequencies it takes very little distance to go from a peak to a null. For example, at 7 kHz a quarter wavelength is less than half an inch! At these higher frequencies, reflections from a nearby coffee table or even a leather seat back can be significant.
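The wavelength figures quoted here are easy to check. This small Python sketch assumes a speed of sound of about 1,130 feet per second, a typical room-temperature value that is not stated in the article; the function names are illustrative only.

```python
SPEED_OF_SOUND_FT_PER_S = 1130.0  # assumed, roughly room temperature

def wavelength_feet(freq_hz):
    """Wavelength in feet: speed of sound divided by frequency."""
    return SPEED_OF_SOUND_FT_PER_S / freq_hz

def quarter_wavelength_inches(freq_hz):
    """One quarter of the wavelength, converted from feet to inches."""
    return wavelength_feet(freq_hz) * 12.0 / 4.0

print(f"40 Hz wavelength: {wavelength_feet(40):.2f} feet")                       # 28.25 feet
print(f"7 kHz quarter wavelength: {quarter_wavelength_inches(7000):.2f} inches")  # 0.48 inches
```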
Because of comb filtering, moving even a tiny distance changes the response by a very large amount at mid and high frequencies, especially in small rooms having no acoustic treatment. The response at any given cubic inch location in a room is the sum of the direct sound from the speakers plus many reflections all arriving from different directions, at different strengths, and with different delay times. So unless you sit perfectly motionless, there's no way the sound will not change substantially when you move your head a few inches.
Even with acoustic treatment, loudspeaker beaming and lobing also cause the frequency response to change with position. Speaker designers aim for a flat response not only directly in front of the speaker, but also off-axis. But it's impossible to achieve the exact same response at all angles. Even if the response didn't change as you moved your head a few inches to one side, mono content coming from both speakers will arrive at your ears at different times. That also causes comb filtering, and is another reason the sound can seem to change even though it really didn't.
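To get a feel for the numbers, the Python sketch below models the simplest case: two equal-level copies of the same signal arriving with a small path-length difference. The equal levels, the 1,130 feet per second speed of sound, and the 4 inch example distance are all assumptions chosen for illustration. Real rooms add many more reflections at unequal levels, so actual nulls are shallower and less regular.

```python
import math

SPEED_OF_SOUND_IN_PER_S = 1130.0 * 12.0  # assumed speed of sound, in inches per second

def comb_null_frequencies(path_difference_inches, max_hz=20000.0):
    """Frequencies where two equal-level arrivals cancel (odd multiples of 1/(2*delay)).

    path_difference_inches must be greater than zero.
    """
    delay = path_difference_inches / SPEED_OF_SOUND_IN_PER_S
    nulls = []
    k = 0
    while (2 * k + 1) / (2.0 * delay) <= max_hz:
        nulls.append((2 * k + 1) / (2.0 * delay))
        k += 1
    return nulls

def comb_gain_db(freq_hz, path_difference_inches):
    """Level of the summed arrivals relative to a single arrival, in dB."""
    delay = path_difference_inches / SPEED_OF_SOUND_IN_PER_S
    magnitude = abs(2.0 * math.cos(math.pi * freq_hz * delay))
    return 20.0 * math.log10(max(magnitude, 1e-12))

nulls = comb_null_frequencies(4.0)
print([round(f) for f in nulls])
print(f"Gain at 100 Hz: {comb_gain_db(100.0, 4.0):.2f} dB")
```

With a 4 inch path difference the first cancellation in this idealized model lands near 1.7 kHz, and the nulls repeat at odd multiples of that frequency.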
We don't usually notice these changes when moving around because each ear receives a different response, so what we perceive is more of an average. A deep null in one ear is likely not present in the other ear, and vice versa. And since all rooms have this property, we're accustomed to hearing such changes and don't always notice them. However, the change in response over distance is very real, and it's definitely audible if you listen carefully. If you cover one ear it's even easier to notice because then the frequencies missing in one ear are not filled in by the other ear.
I'm convinced that comb filtering is at the root of people reporting a change in the sound of cables and clocks when measurements show that an audible change is unlikely. If someone listens to their system using one AC power wire, then gets up and switches cables and sits down again, the frequency response heard is sure to be different because it's impossible to sit down again in exactly the same place. So the sound really did change, but not because the cable made a difference!
With audio and music, some frequencies tend to be harsh sounding, such as the range around 2 to 3 kHz. Other frequencies are more full sounding (50 to 200 Hz), and yet others have a pleasant "open" quality (above 5 kHz). So if you listen in a location that happens to favor harsh sounding frequencies, then change a wire or clock and listen again in a place that suppresses the harshness, it's reasonable to believe the wire change is responsible for the difference. Likewise, applying dither might seem to affect fullness even though the low frequency response change was due entirely to positioning.
PROPER TEST METHODS
It's common to read in audio forums about a "shoot-out" comparison of preamps, sound cards, or sample rates. Often someone will record themselves singing or playing an instrument with one setup, then record themselves again for comparison. But recording different performances is not acceptable because the subtle details we listen for when comparing gear also change from one performance to another. For example, a bell-like attack of a guitar note, or a glassy sheen on a brushed cymbal. Nobody can play or sing exactly the same way twice. Nor can they remain perfectly stationary, which is needed to ensure the microphone captures the same sound, as explained above. Therefore, recording different performances is not valid for testing wires, preamps, clocks, or anything else, unless the difference really is large, such as a Shure SM57 microphone versus a Neumann U87.
One useful self-test is simply closing your eyes while switching between two sources with software. When I test myself blind using Wave files, I set up two parallel tracks in SONAR, then assign the Mute switches for those tracks to the same group while the switches are in opposite states. That is, one track plays while the other is muted, and vice versa. Each time either Mute button is clicked, the tracks are exchanged. This lets me switch from one track to the other seamlessly. I put the mouse cursor over either track's Mute button, close my eyes, then click a bunch of times at random without paying attention to how many times I clicked. That way I don't know which version will play first. Then I start playback, still with my eyes closed, and listen carefully to see if I can really tell which source is which as I switch back and forth.
I've read claims that blind tests are inherently flawed, but in my opinion that's just an excuse for not being able to pass the test! Blind testing is the gold standard for all branches of science, and it makes no sense that it's invalid for assessing audio fidelity. Some people claim the testing itself is stressful because they're put on the spot to identify subtle differences, often in front of strangers. But this is easily solved using ABX software. An ABX program lets people test themselves as often as they want, over a period as long as they want, in the comfort of their own listening environment. The original ABX tester was a hardware device that played one of two audio sources at random each time you pressed the button. The person being tested must identify whether the "X" currently playing is either Source A or Source B. After running the same test, say, ten times, you'd know with some certainty whether you really can reliably identify a difference. These days several ABX testers that keep track of your selections over time are available as freeware.
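How much certainty a given ABX score provides can be estimated with basic probability. The Python sketch below assumes each trial is an independent 50/50 guess when no difference is audible, and computes the chance of scoring at least that well by luck alone. The function name is hypothetical and is not taken from any ABX program.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials` by pure guessing."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials

for correct in (7, 8, 9, 10):
    print(f"{correct}/10 correct: p = {abx_p_value(correct, 10):.4f}")
```

Under these assumptions, 9 correct out of 10 would happen by chance about 1 percent of the time, while 7 out of 10 would happen about 17 percent of the time, which is why a handful of lucky guesses proves little.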
Whether using an informal blind test or ABX software, it's important to understand a few basic requirements. First, the volume of both sources must be matched exactly, to within 0.1 dB. When all else is equal, people generally pick the louder (or brighter) version as sounding better, unless, of course, it was already too loud or bright. Indeed, people sometimes report a difference even in an "A/A" test, where both sources are the same! And just because something sounds "better" it's not necessarily higher fidelity. Applying a "smiley face" EQ curve often makes music sound better, but that's certainly not more faithful to the original source material.
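Level matching can be checked numerically before listening. The following Python sketch compares the RMS level of two sample sequences and computes the gain needed to match them. It assumes both sequences hold the same non-silent passage as floating-point samples; the function names and the synthetic test tone are invented for the example.

```python
import math

def rms_db(samples):
    """RMS level in dB. Assumes the samples are not all zero."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)

def level_difference_db(a, b):
    """How many dB louder `a` is than `b` (negative if softer)."""
    return rms_db(a) - rms_db(b)

def gain_to_match(a, b):
    """Linear gain to apply to `b` so its RMS level matches `a`."""
    return 10.0 ** (level_difference_db(a, b) / 20.0)

# Synthetic example: the second source is the same tone, 0.5 dB louder.
tone = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
louder = [s * 10.0 ** (0.5 / 20.0) for s in tone]

difference = level_difference_db(louder, tone)
matched = [s * gain_to_match(tone, louder) for s in louder]
print(f"Before matching: {difference:.2f} dB")
print(f"Matched to within 0.1 dB: {abs(level_difference_db(tone, matched)) < 0.1}")
```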
THE NULL TEST
Some people believe there are aspects of audio fidelity that "science" doesn't yet know about, or understand how to measure properly. But this is easily disproved with a null test. The premise of a null test is to subtract two audio signals to assess what remains. Subtracting is done by reversing the polarity of one source, then mixing it with the other source at the same volume. This can be done in a DAW program using parallel Wave files, or live with electrical signals into a summing device. If nothing remains after subtraction, then the signals are by definition identical. And if a residual difference signal does remain, its level and spectrum show the extent and nature of the difference.
You can assess a residual signal either by ear or with spectrum analysis. For example, if one source has a slight low-frequency roll-off, the residual after nulling will contain only low frequencies. And if one source adds strong third harmonic distortion to a sine wave test tone, then the residual difference signal will contain only that added content. Nulling has been around since the 1940s, so if there were some as-yet unknown audio parameter, it surely would have been revealed years ago.
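The arithmetic of a null test is simple enough to sketch in a few lines of Python. This example assumes two sample-aligned sequences of equal length held as floating-point values, and reports the residual level relative to the first source. The function names and the 0.999 gain used in the demonstration are illustrative assumptions.

```python
import math

def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def null_residual_db(reference, other):
    """Level of what remains after nulling, in dB relative to `reference`."""
    # Reverse the polarity of one source, then mix it with the other.
    residual = [r + (-o) for r, o in zip(reference, other)]
    residual_rms = rms(residual)
    if residual_rms == 0.0:
        return float("-inf")  # complete null: the signals are identical
    return 20.0 * math.log10(residual_rms / rms(reference))

reference = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
quieter = [0.999 * s for s in reference]  # same signal with a tiny gain error

print(null_residual_db(reference, reference))           # -inf
print(f"{null_residual_db(reference, quieter):.1f} dB")  # -60.0 dB
```

Even a gain mismatch of one part in a thousand leaves a residual only 60 dB down, which shows how sensitive nulling is to small differences and why the two sources must be aligned in level and time first.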
|Figure 3: This graph shows the low frequency response of a bedroom size space before (red) and after (blue) adding a number of bass traps.|
If you want a wide stereo sound stage and rock solid imaging, simply adding absorber panels at the left and right sides of your listening position will be vastly more effective than using a low jitter clock or replacement power cord. And while the bass response varies with location in all home-size rooms, bass traps can greatly reduce the variation. The graph in Figure 3 shows the low frequency response in a small bedroom size space, with and without bass traps. The span between peaks and nulls exceeding 30 dB is reduced to less than 15 dB, with most areas falling within a 10 dB window. Indeed, addressing the acoustics of your room yields the largest benefit. The improvement is easily measured and readily heard. No blind test needed!
Ethan Winer has been an audio engineer and professional musician for more than 45 years, and is co-owner of RealTraps where he designs acoustic treatment products for recording studios and home listening rooms. Ethan's Cello Rondo music video has received more than 1.5 million views on YouTube and other web sites, and his new book The Audio Expert published by Focal Press is available at amazon.com and his own web site.
Entire contents of this web site Copyright © 1997- by Ethan Winer. All rights reserved.