Human Hearing and Mixing
The way we hear is perhaps the least well understood part of this entire process. It is certainly the part where I have the least science knowledge. A lot of this post is as much an observation of my own hearing, and how to analyze your own hearing, as it is about the science of hearing. I believe it is important for any mixer to have a sense of their own ears, both good and bad. You need to learn, or unlearn, your own hearing to have a neutral baseline behind the console.
The first thing I have noticed about my own hearing is that my two ears are different. I have several more dB of clarity above 5kHz in my right ear than in my left. I am also much more sensitive to the 1-2kHz range in my left ear than in my right. My Sensaphonic custom ER-15 earplugs clearly tell me that my right ear canal is much smaller than my left, and my experience with in-ear monitors tells me that I have very small ears, and ear canals, relative to the general populace.
The next thing I have noticed about my hearing, and indeed about hearing in general, is that it is depressingly nonlinear. My perception of lows/mids/highs is highly dependent on the volume of the sounds, and the time of exposure to those sounds. Commercial CDs that are mastered to listening levels between about 80-90dBA are often depressingly shrill, bright, and "sizzly" at live sound concert volume levels.
Two major studies trying to characterize the nonlinearity of human hearing were Fletcher-Munson in 1933 and Robinson-Dadson in 1956. The most recent extension of this work I am aware of is the ISO 226 standard from 2003. The loudness contours of this standard are shown below:
The way to interpret this graph is that the y (vertical) axis shows the sound pressure level needed to produce an equivalent perceived volume. For the 80 phon equal-loudness curve, it takes about 80dB of sound at 3kHz to intersect the curve. At 100Hz, however, it takes about 95dB! This clearly shows our ears are much more sensitive in the 1kHz-7kHz regime than at lower frequencies, in terms of equal volume perception.
Also notice that as the total volume increases, the extra low-frequency level needed to match the perceived loudness of the midrange decreases. This is why a CD that sounds thin when played quietly gets fuller and thicker on the bottom end simply by turning up the volume. This phenomenon is very noticeable in the studio mixing setting, where turning up the nearfield monitors a few dB often makes them "come alive" and makes the mix much more impressive.
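To put the numbers read off the curve into perspective, here is a minimal sketch that converts a level difference in dB into sound pressure and power ratios. It uses the standard decibel definitions; the 80dB and 95dB figures are just the approximate graph readings quoted above, and the function names are my own.

```python
def pressure_ratio(delta_db: float) -> float:
    """Sound pressure ratio for a level difference in dB (20*log10 convention)."""
    return 10 ** (delta_db / 20)


def power_ratio(delta_db: float) -> float:
    """Acoustic power ratio for a level difference in dB (10*log10 convention)."""
    return 10 ** (delta_db / 10)


# Approximate readings from the 80 phon curve, as quoted in the text.
level_at_3khz_db = 80.0
level_at_100hz_db = 95.0

delta_db = level_at_100hz_db - level_at_3khz_db
print(f"Level difference: {delta_db:.0f} dB")
print(f"Pressure ratio:   {pressure_ratio(delta_db):.1f}x")
print(f"Power ratio:      {power_ratio(delta_db):.1f}x")
```

In other words, if those readings are right, the 100Hz tone needs several times the sound pressure, and tens of times the acoustic power, to sound as loud as the 3kHz tone.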
Practically all of live sound mixing lies above the 80 phon loudness curve. Unfortunately, this means there is a relative lack of hard data in this regime for live sound mixers. However, certain trends in perceived loudness should be quickly apparent from the above graph.
A very obvious question that arises is "why are my ears so sensitive from 1kHz to 7kHz?" The answer lies partly in the geometry of our ears, and partly in the nature of the perception of human speech. Your ear canal cavity forms a natural resonator, and the resonance frequency is approximately 2700Hz. This coincides well with the range in which human consonant sounds are formed.
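As a rough sanity check on that resonance figure, here is a sketch that treats the ear canal as a simple tube closed at one end (the eardrum), which resonates at a quarter wavelength. This tube model, the speed of sound, and the 2.5cm example length are my assumptions for illustration, not measurements of anyone's ear; a real ear canal is not a uniform tube.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def quarter_wave_resonance_hz(length_m: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """First resonance of an idealized tube closed at one end: f = c / (4 * L)."""
    return c / (4.0 * length_m)


def length_for_resonance_m(freq_hz: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Effective tube length that would resonate at freq_hz in the same model."""
    return c / (4.0 * freq_hz)


# Hypothetical 2.5 cm tube, chosen only as an illustrative length.
print(f"2.5 cm tube resonates near {quarter_wave_resonance_hz(0.025):.0f} Hz")

# Effective length implied by the ~2700 Hz figure quoted in the text.
print(f"2700 Hz implies about {length_for_resonance_m(2700.0) * 100:.1f} cm")
```

Even this crude model lands in the low kilohertz range, which is the point: a tube a few centimeters long naturally favors the frequencies where our hearing is most sensitive.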
Speech (and singing) can be roughly split into two types of sounds. The first are consonants, and the second are vowels. Consonants carry the information of speech, and vowels carry the power. Your ears' resonance is tuned to help pick up the information component of speech, so it makes sense that your ears would be most sensitive in this range.
Too much cutting equalization in this band can destroy the audience's ability to discern the meaning of the words being spoken or sung.
The vowel range spans about 150Hz to 1kHz for most singing. Somewhere in this range many singers, especially those with little formal training or a specific accent, will have a pronounced sinus cavity resonance. This is the "nasal" or "whiny" tone that is ascribed to singers, especially in rock music. It is a fairly safe bet that this nasal cavity resonance lies in the octave between 400Hz and 800Hz. This resonance is often made even more prominent by the close proximity of the vocal mic to the singer's face.
Cutting equalization in this octave can reduce the nasal quality of a singer's voice.
While I am not familiar with the Asian languages, the Latin-based languages of the world have a good degree of uniformity in the frequency content of speech. Keeping the above information in mind can help you mix effectively in a language you do not understand, simply by listening to the consonant/vowel balance.
Another thing I have noticed about my own hearing is that the more familiar I am with the words, the better I perceive them. This can result in mixing the vocals progressively lower over time as my familiarity with the songs increases. While the level remains sufficient for my personal vocal intelligibility, it can strand audience members who are less familiar with the material.
Now that we have spent a substantial amount of time discussing how our ears perceive level and speech, I turn to a discussion of tone. The classic thirty-one band equalizer spaces its frequencies at equal fractions of an octave (one-third of an octave apart). That means each upper slider influences a much larger swath of frequencies, measured in hertz, than each lower slider. This is also a fair analog for human hearing; at low frequencies we can generally readily distinguish between tones only a few hertz apart. People can also do the same at higher frequencies, but usually only under laboratory conditions. As a rule of thumb, as frequency increases, our ability to distinguish a specific tone from a tone at a nearby frequency decreases. A little shelving "air" may be enough in the last octave, but low and mid frequencies usually demand more discriminating equalization.
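To see how much wider the upper sliders are in hertz, here is a small sketch that lists the thirty-one one-third-octave bands from about 20Hz to 20kHz. It assumes exact base-2 spacing around a 1kHz reference and ideal band edges one-sixth of an octave either side of center; the rounded labels printed on a real equalizer, and the actual filter widths inside it, will differ somewhat.

```python
def third_octave_bands(ref_hz: float = 1000.0, n_low: int = -17, n_high: int = 13):
    """Return (center, lower_edge, upper_edge) in Hz for each one-third-octave band.

    Band n is centered at ref_hz * 2**(n/3); n = -17..13 gives 31 bands
    running from roughly 20 Hz to roughly 20 kHz.
    """
    bands = []
    for n in range(n_low, n_high + 1):
        center = ref_hz * 2 ** (n / 3)
        lower = center * 2 ** (-1 / 6)  # one-sixth octave below center
        upper = center * 2 ** (1 / 6)   # one-sixth octave above center
        bands.append((center, lower, upper))
    return bands


bands = third_octave_bands()
# Print the lowest, the 1 kHz, and the highest band for comparison.
for center, lower, upper in (bands[0], bands[17], bands[-1]):
    print(f"{center:9.1f} Hz center, band about {upper - lower:7.1f} Hz wide")
```

Under these assumptions the lowest band is only a few hertz wide (roughly 4.6Hz), while the highest spans several kilohertz (roughly 4.7kHz), a ratio of about a thousand to one for the same slider width on the front panel.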
This is compounded by the reality that the fundamental tones of almost all instruments lie between 100Hz and 5kHz. Now this is not a universal rule, but it is often reflected practically when behind the mixing board. The typical high-end analog mixing console will have an adjustable highpass filter, a low shelf/parametric, two mid parametric EQs, and a high shelf/parametric.
Three or four of those five EQ implements are commonly targeted at the midrange band between 100Hz and 5kHz! Now, obviously many signals may need shaping above 5kHz, but this shaping is of overtones, and not the raw notes/chords/tones.
I find a common mistake of starting mixers, and one that I made, is to assume that the frequency of a tone is much HIGHER than it actually is. A musician may consider A440 on a piano a fairly high tone, but in reality 440Hz is squarely in the midrange from a mixing and equalization perspective. If you recall the previous post, it is also in a range where most speakers have only moderate directivity control. Taking this reminder into the mixing environment can improve your ability to quickly identify the problem range of frequencies.
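A quick way to recalibrate your sense of "high" is to compute the fundamental frequencies of piano keys. The sketch below assumes twelve-tone equal temperament, A4 tuned to 440Hz, and the usual 1-88 key numbering in which A4 is key 49; the function name is my own.

```python
A4_KEY = 49       # assumed: standard 88-key numbering, lowest A is key 1
A4_FREQ_HZ = 440.0


def key_frequency_hz(key_number: int) -> float:
    """Fundamental of a piano key in twelve-tone equal temperament."""
    return A4_FREQ_HZ * 2 ** ((key_number - A4_KEY) / 12)


for name, key in (("A0, lowest key", 1), ("C4, middle C", 40),
                  ("A4, A440", 49), ("C8, highest key", 88)):
    print(f"{name:16s} key {key:2d}: {key_frequency_hz(key):8.1f} Hz")
```

The lowest key comes out at 27.5Hz and the highest at a little under 4.2kHz. Even the very top note of the piano has its fundamental below 5kHz, and A440 sits well down in the midrange.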
A final point that needs to be covered on the nature of human hearing is what I call "threshold shift." This is the activation of the muscles in the middle ear as a built-in compressor to protect our hearing from continuous loud sounds. If you have ever been to a loud concert that started out unbearable, then became "glassy" and OK volume-wise, and then noticed that the world was REALLY quiet after the show, you have experienced threshold shift.
One thing that, unfortunately, often accompanies the "really quiet" phase is ringing in the ears. Ringing in the ears comes from leaking calcium ducts in the ears, ducts that have been damaged by excessive vibration of the inner ear hairs. When these ducts leak, the brain falsely perceives that the tone range of that particular hair is happening continuously.
Ringing in the ears is a clear indication that you have caused acute trauma to your inner ear! Whether or not this trauma repairs itself is a matter of debate, but the damage has been done at least temporarily.
Threshold shift typically takes a matter of minutes to set in. For me it is about 5-10 minutes, and it typically releases between fifteen and thirty minutes after the exposure. I find that the louder the sound and the longer the exposure, the longer the release time.
Threshold shift is the death of good mixing for me. My ability to judge mix balance and quickly pick up problem frequencies greatly diminishes. Today I threshold shift at about 97dBA slow. If I am asked to mix above this level I typically have to alternate songs with my earplugs in, and then out, just to keep my ears out of threshold shift. I suspect that I threshold shift at a lower level than most people, as I am relatively young and have taken very good care of my ears. Threshold shift is a big motivating factor for me to try to mix at moderate levels.
I also suspect that threshold shift is one of the biggest problems between musicians and monitor engineers. A brief soundcheck, short enough to make the monitors seem loud and clear, and not long enough to cause threshold shift, goes well for both band and monitor engineer. However, once the show is going, and the band has been subjected to high levels for a more extended period, threshold shift sets in, the sound from the wedges goes to mush in the muso's ears, and all they know to do is ask for more level! This is the crux of so many soundcheck vs. show moments that I feel it must be a major part of the underlying cause of the problems musicians experience during the show. Food for thought as we head to part six...
Part six will be much shorter, and less theoretical in nature. In part six I discuss the need to be able to evaluate both the full mix and the individual instruments, and methods of practice for learning this art.