Headphones Mixing? Speakers Mixing? Both?
By Roey Izhaki
Headphones vs. speakers – the theory
When listening on headphones, both the left and right ears are fed exclusively with the corresponding channel. This means that the left-channel signal reaches the left ear only, and the right-channel signal reaches the right ear only.
With speakers, however, this is not the case. Sound from each speaker reaches the nearest ear first and the farther ear soon after. Effectively, each ear gets the signal from the nearest speaker blended with the slightly delayed signal from the other speaker. This results in the following:
• Sounds from one speaker can mask sounds from the other.
• Overall smearing of the image for any sounds not panned to the extremes. Most severe smearing happens with sounds panned center.
• Curvature of the sound image, as center-panned sounds appear deeper due to the delayed arrival of sound from the far speaker at each ear.
None of this happens with headphones, but stereo was conceived with speakers in mind, and for many decades now music has been made assuming playback through speakers. Our equipment, notably the design of our pan pots (but also that of stereo effects such as reverbs), assumes the same. Mixing engineers mix using and for speakers. But how do these mixes translate onto headphones?
The key difference between listening to music through speakers and through headphones has to do with the way our brain localizes sounds. Our understanding of this is based on the findings of Helmut Haas and is exploited in Alan Blumlein’s invention of stereo. It is sufficient to say that, with speakers, if the signal sent to one speaker is roughly 15 dB softer than a similar signal sent to the other, the sound will appear to come entirely from the louder speaker. With headphones no such masking occurs, since the sound of each channel never arrives at the opposite ear; to make a sound appear to come entirely from one ear, roughly 60 dB of attenuation is required on the similar signal sent to the other ear.
Given the way pan pots are designed, when one pans from the center to the left, one should expect the location of the sound to correspond to the pot movement. This does not happen with headphones, where the sound seems to shift only slightly off the center at first, before quickly ‘escaping’ to the extreme just before the pan pot reaches its extreme position. On headphones it is hard to place sounds anywhere between the slightly off-center position and the very extreme; with standard pan pots, it is next to impossible. Positioning instruments on the sound stage is much easier when listening through speakers. Applications such as Logic offer ‘binaural’ pan pots, which tackle this exact problem and can achieve much better localization using headphones; but the penalty is that they do so by altering the frequency content of the signal sent to each ear, thereby altering the overall frequency spectrum of the instrument. Also, these types of ‘binaural’ mixes do not translate well on speakers. In addition to all that, the sound stage created by speakers is limited to the typical 60° arc the two speakers subtend at the listening position. With headphones, the sound stage spans 180°.
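The numbers above can be made concrete with a small sketch. This code is not from the book; it assumes the common constant-power (sin/cos) pan law, one of several laws in use. It shows how the inter-channel level difference grows as the pot moves off center: the roughly 15 dB difference that fully localizes a sound on speakers is reached at around 80% of the pot travel, while the roughly 60 dB needed on headphones only occurs at the very extreme.

```python
import math

def constant_power_pan(position):
    """Constant-power pan law (an assumed, common design).
    position: -1.0 (hard left) .. 0.0 (center) .. +1.0 (hard right).
    Returns (left_gain, right_gain) as linear amplitude factors."""
    theta = (position + 1.0) * math.pi / 4.0  # map position to 0..pi/2
    return math.cos(theta), math.sin(theta)

def channel_difference_db(position):
    """Level difference between the two channels in dB (right minus left)."""
    left, right = constant_power_pan(position)
    if left < 1e-12:  # hard right: the left channel is effectively silent
        return float('inf')
    return 20.0 * math.log10(right / left)

# The ~15 dB difference (full localization on speakers) arrives at
# roughly 80% of the pot travel; anything beyond that is wasted on
# speakers but still matters on headphones.
for pos in (0.0, 0.5, 0.8, 0.95, 1.0):
    print(f"pan {pos:+.2f}: {channel_difference_db(pos):6.1f} dB")
```

This illustrates why, on headphones, sounds cluster near the center and then ‘escape’ to the extreme: most of the pot’s range produces level differences far short of the ~60 dB headphones require for full localization.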
Mixing engineers work hard to create sound stages in mixes using speakers. When these mixes are played through headphones, these sound stages appear completely distorted. While this does not seem to bother most listeners, most serious music buffs insist that listening to music via speakers is far more pleasing, largely due to the lack of spatial sense when using headphones.
The dominance of speaker mixes was never questioned until recently, when portable MP3 players and their integration with cellular phones became so widespread. It is a valid question to ask why we still mix using (and for) speakers when so many people nowadays listen via headphones. There is an unexploited opportunity here for record labels to produce ‘speaker’ and ‘headphone’ versions. This would make sense not only from a mixing point of view but also from mastering, consumer and label revenue points of view.
Some recording and mixing engineers take their reference headphones to studios they are not familiar with. Headphones provide a solid reference, and their sound only alters when different headphone amplifiers are used. As previously explained, the room plays a dominant part in what we hear with a speaker setup – the sound headphones produce is not affected by room acoustics or modes.
This is very important for rooms with flawed acoustics, such as many bedrooms and project studios. In such rooms, a good pair of headphones, together with a good headphone amp, can be a real aid. Having room modes out of play means that the range between 80–500 Hz can be more evenly reproduced, although studio monitors still have an advantage over most headphones in reproducing very low frequencies. Other acoustic issues simply don’t exist when using headphones, for example the comb filtering caused by early reflections, the masking between the left and right speakers and even the directivity of the tweeters. It can be generalized that, as far as frequency reproduction is concerned, the results we get using good headphones are more accurate than those generated by speakers in a flawed room.
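To see what comb filtering does to the frequency response, here is a minimal sketch, not from the book, of the direct sound combined with a single early reflection arriving tau seconds later with relative gain g (both values chosen purely for illustration). The combined response is H(f) = 1 + g·e^(−j2πf·tau), which carves regularly spaced notches into the spectrum at odd multiples of 1/(2·tau).

```python
import cmath, math

def comb_response_db(f, tau, g):
    """Magnitude (in dB) of a direct sound plus one reflection
    delayed by tau seconds with linear gain g."""
    h = 1 + g * cmath.exp(-2j * math.pi * f * tau)
    return 20 * math.log10(abs(h))

tau = 0.001  # reflection arrives 1 ms late (~34 cm extra path) - illustrative
g = 0.7      # reflection roughly 3 dB quieter than the direct sound

# First notch at 1/(2*tau) = 500 Hz, first reinforcement peak at 1/tau = 1000 Hz
for f in (250, 500, 1000):
    print(f"{f} Hz: {comb_response_db(f, tau, g):+.1f} dB")
```

With these values the response dips by more than 10 dB at 500 Hz while gaining over 4 dB at 1 kHz, a level ripple no equalizer setting can sensibly compensate for. On headphones there is no reflection, so the term g·e^(−j2πf·tau) simply disappears.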
For many compression, gating and dynamic-range tasks, speakers do not provide a clear advantage over headphones; as long as the processing does not affect the perceived depth of the instrument, headphones can be useful.
While headphones can be great when dealing with frequencies and useful when treating certain dynamic aspects in a mix, there are also a few disadvantages in using them, and they are almost useless for some mixing tasks.
As discussed, the spatial image created by headphones is greatly distorted and conventional tools make it very hard to craft appropriate sound stages on headphones. Any sound stage decisions we make, whether left/right or front/back, are better made using speakers. As depth is often generated using reverbs, delays or other time-based effects, the configuration of these effects benefits from using speakers.
Earlier it was stated that ideal mixing rooms aim to have the reverb within them decaying at around 500 ms and that anechoic chambers, where no reverb exists, are far from ideal. As we lose the room response when using headphones, we also lose the reverb, so it is as if we mix in a space similar to an anechoic chamber. Most people find this less pleasant. Moreover, the lack of room reverb, together with the close proximity of the headphones’ diaphragms to our eardrums, means that ear fatigue is more likely to occur. At loud levels, headphones are also more likely than speakers to cause pain and eardrum damage, and more rapidly.
The above is an excerpt from Roey Izhaki’s book Mixing Audio, 2e. Roey Izhaki has been involved with mixing since 1992. He is an academic lecturer in the field of audio engineering and gives mixing seminars across Europe at various schools and exhibitions. He is currently lecturing in the Audio Engineering department at SAE Institute, London.