Sounds in Digital Media
- Last Updated on Friday, 29 April 2011 06:33
- Published on Friday, 29 April 2011 06:28
Sound is extremely important in today's world of fast-paced first-person shooter games. Especially in a competitive environment, sound is one of several tools that gamers use to gain a slight edge over their opponents. Even a fraction of a second's notice that someone is creeping up on you may be enough to avoid being killed from behind.
A good headset is therefore crucial if you want to take maximum advantage of these auditory cues. Having some understanding of the way your body processes sound and how headsets create sound might guide you to a better headset choice.
The Science of Sound
So, what is sound? Well, the ever reliable Wikipedia states that "sound is a mechanical wave that is an oscillation of pressure transmitted through a solid, liquid, or gas, composed of frequencies within the range of hearing and of a level sufficiently strong to be heard, or the sensation stimulated in organs of hearing by such vibrations." That's a fairly complex definition. For our purposes, we're just going to say it's a wave. Now, if you'll notice, Wikipedia's definition says it's a mechanical wave. That means it's physical. Sound is caused by vibration. When you speak, your vocal cords are vibrating. If you take a bottle and blow across the top of it, the air inside is vibrating and makes that annoying sound that children (and my roommates) find so entertaining. If you were in band in high school and played a brass instrument, your lips "buzz" or vibrate. Piano strings vibrate when the hammer hits the string.
So, this wave, created by whatever made the sound, radiates away from the source. It doesn't pass through most solid objects that well. And when it does, the sound is altered slightly. That's partially why music heard from the other side of a wall sounds different than it would if you were in the room with it. Objects in the room that get in the way of the traveling wave also affect the sound. Smaller rooms tend to be "louder" because the sound travels towards the wall and bounces back. Although this is louder, it does tend to alter the characteristics of the sound wave.
So now, the wave has bounced around from its source, and it has nearly reached your ear. Let's take a look at your ear.
http://scienceblogs....ar%20cutout.bmp
If you look at the image above, you'll notice that your ear isn't just what's externally visible; there are a lot of little parts inside of it. But let's start on the outside. Now that sound wave is starting to hit your outer ear. Everyone's ear shape is fairly unique (http://www.medicalda...ints-for-id.htm), and this will play an important role later on. As the sound hits your outer ear, the sound wave is modified by the shape of your ear. Sounds coming from in front of you hit different parts of the ear than sounds coming from behind you. This is one of the ways you "localize" a sound, which means determining what direction it came from. The other way is by measuring (subconsciously) the time delay before the signal reaches your other ear.
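To put a rough number on that time delay, here is a minimal Python sketch. It assumes a very simplified model that isn't from this article: the ears are two points about 0.18 m apart, sound travels at about 343 m/s, and the extra distance to the far ear is the ear spacing times the sine of the angle. The function name and the default values are only for illustration.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18            # assumed: rough distance between the ears


def interaural_delay_seconds(angle_degrees, ear_spacing=EAR_SPACING_M):
    """Approximate extra travel time to the far ear.

    angle_degrees: 0 means straight ahead, 90 means directly to one side.
    """
    extra_distance = ear_spacing * math.sin(math.radians(angle_degrees))
    return extra_distance / SPEED_OF_SOUND_M_PER_S


for angle in (0, 30, 90):
    delay_us = interaural_delay_seconds(angle) * 1_000_000
    print(f"{angle:>2} degrees: about {delay_us:.0f} microseconds")
```

Even for a sound directly to one side, the delay in this model is only around half a millisecond, which gives you an idea of how fine the timing is that your brain works with.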
Now, let's pretend the signal has hit your outer ear, and is now traveling down the ear canal (the hole in your ear) towards the center of your head. You'll notice that it hits the eardrum. Now, you know a drum makes noise because you hit it with a mallet, and the drum's surface vibrates, causing the air underneath it to vibrate (this is why drums aren't flat, but have depth to them). The eardrum behaves similarly: the sound wave hits the eardrum and makes it vibrate, and three tiny bones behind the eardrum pass that vibration along into your inner ear, where there is liquid with tiny sensor hairs sticking out into it. When these hairs detect enough vibration in the liquid, they send nerve impulses to your brain, which interprets these as sounds. And you know from experience what each particular sound means (whether it's a meaning or a word).
Recording
Alright, so that's a basic introduction to how you hear and interpret sounds. But that still doesn't answer the question of how we artificially make sounds. Well, a microphone is very similar to your ear. Sound reaches it and vibrates a disc (similar to the eardrum), and this vibration is converted into electrical impulses that are "encoded" into some sort of format that corresponds to the sound that was recorded. This format could be the grooves on an old-fashioned record, or the 1's and 0's of binary (I may do an article on binary later). Binary is a digital format. Digital means that only certain values can be represented, and here we use 0's and 1's. In analog, the possible values are infinite. These 0's and 1's are usually in some sort of agreed-upon format so that they represent the amplitude and frequency of the signal over time.
To play back music from an analog or digital source, electrical signals that match what was recorded are sent to a driver, which vibrates a disc and thus replays the sound, hopefully as it was originally recorded.
Now, analog sources can take on an infinite number of values, so they have the capability to result in the best audio quality. However, analog recording and playback are mechanical. On a record, for example, a physical needle carves the signal into a groove. If this needle becomes worn, or the path that it carves becomes worn, then the signal degrades.
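As a rough illustration of turning a smooth wave into 0's and 1's, here is a minimal Python sketch. It measures a sine wave at regular moments in time and rounds each measurement to the nearest of a limited set of levels. The tone (440 Hz), the sample rate (8000 samples per second), and the 8 bits per sample are assumptions picked only for this example, not values from any particular audio format.

```python
import math


def sample_and_quantize(frequency_hz, sample_rate_hz, num_samples, bits):
    """Sample a sine wave and round each sample to one of 2**bits levels."""
    max_level = 2 ** bits - 1
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz                                 # time of this sample in seconds
        amplitude = math.sin(2 * math.pi * frequency_hz * t)   # "analog" value between -1 and 1
        level = round((amplitude + 1) / 2 * max_level)         # nearest whole-number level
        samples.append(level)
    return samples


levels = sample_and_quantize(frequency_hz=440, sample_rate_hz=8000, num_samples=8, bits=8)
print(levels)
print([format(level, "08b") for level in levels])  # the same levels written as 0's and 1's
```

The more bits you allow per sample, the more levels there are to round to, and the closer the digital version stays to the original wave.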
Digital sources, on the other hand, don't suffer from this mechanical failure. The 1's and 0's of files on your hard drive, for the most part, don't change. And even if they do, there are sometimes ways of telling (now I really feel like writing an article about binary). So even if you make copies of a file, it's probably going to be exactly the same as the original. This means that in practice, digital recordings can sound better than analog. Also, digital recordings take up a lot less room. Some of those big records your grandparents may have used could hold a three minute song. A CD, at about 700 megabytes, can hold 233 three megabyte songs (that's a compressed size; more about compression in a moment). A DVD can hold about 1500 songs. A Blu-ray can hold over 8000. So for digital media (music, games, videos, and computer sounds) we use digital methods of storing the sound.
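The song counts above are just division, which this short Python sketch reproduces. The disc capacities are assumptions: nominal sizes for a CD, a single-layer DVD, and a single-layer Blu-ray, counting 1 GB as 1000 MB and ignoring any file system overhead.

```python
SONG_SIZE_MB = 3  # the article's three megabyte song

# Assumed nominal capacities in megabytes (1 GB counted as 1000 MB).
DISC_CAPACITY_MB = {
    "CD": 700,
    "DVD": 4_700,
    "Blu-ray": 25_000,
}


def songs_per_disc(capacity_mb, song_size_mb=SONG_SIZE_MB):
    """Whole number of songs that fit on a disc of the given capacity."""
    return capacity_mb // song_size_mb


for disc, capacity in DISC_CAPACITY_MB.items():
    print(f"{disc}: {songs_per_disc(capacity)} songs")
```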
Now, I mentioned uncompressed music. Here's a very basic introduction to compression. Imagine I have a string of numbers that is 000001110000011. That's a lot of numbers, right? Well, what if I said it this way: "5 zeroes, then 3 ones, then 5 zeroes, then 2 ones". That could potentially take up a lot less room, yet still hold the same amount of data. That's lossless compression. However, there's also something called lossy compression. Imagine that the string of 0's and 1's represented frequency 15.762, then frequency 14.5, then frequency 15.762 again, and then frequency 12. In order to conserve space, we might just round off that 15.762 to, say, 15.8 or 15.5. Or we might cut out the 14.5 from between them. So the signal has been changed slightly, but you may or may not notice it depending on how closely you listen. Doing lossy compression allows us to conserve space. MP3, a file format you're probably familiar with, is a lossy compression format. A three minute song in .mp3 format is about 3.5 megabytes (a megabyte means a million bytes, and a byte is 8 bits, so that's a lot of 1's and 0's). However, the same song in a lossless format might be 30 to 40 megabytes in size. So you can see, the .mp3 is a lot smaller. And some of you might not even be able to tell the difference, depending on what you're listening to it with and how good your ears are. But just know that lossy formats like .mp3 are far more common than lossless.
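The "5 zeroes, then 3 ones" idea is usually called run-length encoding, and it can be sketched in a few lines of Python. This is only a toy to show the difference between lossless and lossy; real audio formats like MP3 work very differently, and the function names here are made up for the example.

```python
from itertools import groupby


def run_length_encode(bits):
    """Turn a string like '0001' into [('0', 3), ('1', 1)]."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


def run_length_decode(runs):
    """Rebuild the original string from (symbol, count) pairs."""
    return "".join(symbol * count for symbol, count in runs)


original = "000001110000011"
encoded = run_length_encode(original)
print(encoded)
assert run_length_decode(encoded) == original  # lossless: nothing was thrown away

# Lossy step: round the example values to one decimal place.
values = [15.762, 14.5, 15.762, 12]
rounded = [round(v, 1) for v in values]
print(rounded)  # the original 15.762 can no longer be recovered from this
```

The first half gets the exact original back every time. The second half does not, and that is the trade you make with lossy compression.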
Sound Channels
Now, most music that we listen to is in stereo. This basically means that there is one recording that represents the sound coming from the left, and another recording that represents sound coming from the right. We call these recordings "channels", since we typically call a single song a recording, not two recordings. The left channel is played through the left speaker, and the right channel is played through the right speaker. A single channel of audio is called "mono". Now, keep in mind that it's possible to play mono sound through two speakers. The sound itself isn't stereo, even though two speakers are typically associated with stereo sound. This becomes important later on as we add more channels.
Let's pretend you're watching TV: Law and Order: Special Victims Unit, one of my favorites. Let's say our cops are standing in the middle of the screen. This show is normally from the point of view of the cops, because the creators want you to project yourself onto the cops in order to immerse you in it. So when a gunman fires at the cops from the left hand side of the screen, they play this sound through the left speaker. This is intended to immerse you in the show more. However, we sit several feet away from the TV, with the TV's built-in speakers in front of us, not off to the left. So we're kind of getting mixed signals in our heads. The sound wave is coming at us from in front and slightly to the left, but the director of the show wants you to interpret this as coming from your left. Now, we've grown up with TVs, so we can kind of figure this conundrum out. But it would be much easier if the speaker was actually on our left. So, that's where headphones come in. A headphone is a speaker that sits directly to your left, a little way from your ear. Now, that gunshot really does sound like it's coming from the left.
But what happens when our cops on the TV talk? The director wants that to sound like it's coming from ourselves, so he plays it through both speakers. Not too hard to imagine. But what if the gunman has now come out, and is standing in front of the cops talking to them (with the cops' backs to you, the viewer)? The director wants that to sound like it's coming from in front of you. Well, there's not an easy way to do that with two speakers. You can't really get the best of both worlds. He can't play it through just the left, because that would imply that the gunman is standing to the left. He can't play it through just the right, because that would imply that the gunman is standing to the right. So he compromises and plays it through both left and right, which we had already determined is where the cops' voices are coming from. It's just too hard to overcome this with two speakers.
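Here is a small Python sketch of that two-speaker problem. It assumes the simplest possible way of placing a sound in stereo: a single "pan" number from -1.0 (all left) to 1.0 (all right) that sets how loud each speaker plays. The function name, the linear formula, and the positions are illustrative assumptions, not how any particular show is mixed.

```python
def stereo_gains(pan):
    """Return (left_gain, right_gain) for a pan from -1.0 (left) to 1.0 (right).

    Simple linear panning, assumed here for illustration only.
    """
    right = (pan + 1) / 2
    left = 1 - right
    return left, right


# Hypothetical positions for the scene described above.
sources = {
    "gunshot from the left": -1.0,
    "cops (where you are)": 0.0,
    "gunman in front of the cops": 0.0,
}

for name, pan in sources.items():
    left, right = stereo_gains(pan)
    print(f"{name}: left={left:.2f} right={right:.2f}")
```

Notice that the cops and the gunman in front of them end up with exactly the same numbers. With only left and right to work with, "where you are" and "in front of you" can't be told apart.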
However, by adding speakers and audio channels to this setup, we can provide a much more accurate "auditory image". So, you the viewer have now set up a home theater in your house. You bought a surround sound speaker set. This speaker set contains six speakers: front left, front right, center, rear left, rear right, and subwoofer. You're also watching Law and Order on DVD now, which includes a surround sound track encoded with six channels of audio: front left, front right, center, rear left, rear right, and low frequency effects. This combination of channels is referred to as "5.1". There are five primary channels, and the low frequency effects (LFE) channel is the ".1". These match up pretty nicely with your speaker set, which is a 5.1 speaker set. Notice that the LFE channel may be played through any of the speakers, not just the subwoofer. This is because the low frequency effects channel contains the lower frequencies or "bass", and bass is a non-directional sound. You can't really tell which direction low noises come from. Some of the low frequency effects are so low that they're out of your hearing range, but you can still physically feel them. This is what the subwoofer excels at: playing these really low sounds like explosions.

So, with this new 5.1 surround sound speaker setup and a DVD that has 6 channels of audio, let's do a new scene. Gunman A is to the left of the cops, off the screen. The cops are in the center of the screen with their backs turned to you, and Gunman B is standing in front of the cops. Now, Gunman A's gunshots come through the front left and rear left. Gunman B's gunshots come through front left, center (the center speaker is placed in front of you near your TV, between the front left and front right speakers), and front right. And our cops come through all the speakers, but each one at a fairly low volume so that we get the impression we're in the center. You can probably see how this creates a much more accurate auditory image. This helps immerse you in the movie more, and you can easily tell where each sound is coming from. Remember from the more science-oriented beginning of this article that sounds are modified by your ears according to the direction they originated from, allowing you to tell which direction the sound is coming from.
However, a set of six speakers (especially the big subwoofer) tends to disturb people around you. That could be your wife, parents, neighbors, etc. So manufacturers came out with "5.1 headsets". Now, like the stereo headphones that I described above, these contain drivers inside an earcup out to the side of your head. However, stereo headsets only have one driver per earcup (remember, the driver is the part of a speaker that makes the noise). True 5.1 headsets have four drivers per earcup. Also recall that I said the number of channels isn't directly related to the number of drivers. Here's the layout for a 5.1 headset:
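The scene above can be written down as a table of which channels each sound goes to. This Python sketch does that with made-up volume numbers (1.0 is full volume, 0.0 is silent). The channel names follow the 5.1 layout described earlier; the specific gains are assumptions chosen just to show the idea.

```python
CHANNELS = ["front_left", "front_right", "center", "rear_left", "rear_right", "lfe"]

# Hypothetical gain per channel for each sound in the scene (0.0 = silent).
SCENE = {
    "gunman_a_shot": {"front_left": 1.0, "rear_left": 1.0},
    "gunman_b_shot": {"front_left": 0.7, "center": 1.0, "front_right": 0.7},
    "cops_talking": {name: 0.3 for name in CHANNELS if name != "lfe"},
}


def mix(scene):
    """Add up the gain each channel receives from every sound in the scene."""
    totals = {name: 0.0 for name in CHANNELS}
    for gains in scene.values():
        for channel, gain in gains.items():
            totals[channel] += gain
    return totals


for channel, level in mix(SCENE).items():
    print(f"{channel:>11}: {level:.1f}")
```

Unlike the stereo case, each of the three sounds now uses a different set of speakers, so your ears get a different picture for each one.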

There's a "front" speaker that's placed towards the front of the earcup, a rear towards the rear of the earcup, a center in the middle, and a (non-directional) subwoofer. Now, this layout isn't exactly the same as the 5.1 speaker setup in your home theater, due to the obvious limitation that your ears are isolated from each other and there's no way to stick a center speaker right in front of your face. So the front left and front right play together to simulate the center channel. The center driver in the left earcup handles noises that occur to your left, the one in the right earcup handles noises to your right, and the rear ones handle rear sounds. Each driver is angled slightly so that it points towards your ear. This allows your outer ears to still come into play and modify the sound wave, so you can tell which direction the sound originated from.
However, most 5.1 headsets suffer from some limitations. The limited space inside the earcup means that they must use smaller drivers. Smaller drivers typically result in lower sound quality. Also, some manufacturers don't really angle them very well, which doesn't allow your outer ear to modify the sound wave properly. So there are some headsets that do a good job of this, but most of the early ones were kind of failures, and that left a bit of a stigma about them.
Virtual Surround Sound
Now, headset manufacturers want to make money. If their headset has this negative stigma about it, you can bet they're going to find something else. And so Dolby Laboratories figured out a way to simulate surround sound from stereo headphones. They called this technology Dolby Headphone (there are some other similar technologies out there, but this is the main one). The way it works is that Dolby figured out in what way your ear modifies the signal, and created formulas or algorithms for modifying a 5.1 source for playing through stereo headphones. Say you have a signal that's intended to be played through the rear left speaker. Let's simplify it and say this signal has an original value of 18, and a value of 35 when heard by your ear from a real rear left speaker. Well, if you just listened to the 5.1 source through stereo headphones, your ears might hear a value of 19, since the sound shoots nearly straight into your ear canal and doesn't have a chance to get modified all that much. But if you plug that value of 18 into a formula where x=18 and the formula is 2x-1, and then play the result through the stereo headphones, it gives you the impression that the sound originated from behind you, since the formula did the conversion for you instead of your ears. Well, Dolby came out with computer chips that do the Dolby Headphone conversions (it can also be done in software). They take a 5.1 source, apply the formulas, and put out all six channels of audio through two channels, for playback by stereo headphones. So now marketing companies can market these Dolby Headphone products to you and you'll be happy that you're receiving virtual surround sound, right?
Well, if you recall, near the beginning of this article I pointed out how everyone's ears are different. So, there's no one set of formulas that is going to work for everyone. So what do you do if you're Dolby? You pick the formula that is going to result in the most accurate sound reproduction for the most people possible. Unfortunately, that leaves some of us hung out to dry if their formulas don't modify the sound wave in a way that's similar enough to how our own ears would do it. Their artificial methods don't work for everyone. Also, there is a minuscule delay associated with applying these formulas to alter the sound wave, but you probably won't notice it unless it's being done in software.
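The made-up numbers above can be checked with a couple of lines of Python. To be clear, this is only the toy formula from this article (2x-1) standing in for the idea of "do the ear's modification ahead of time". It is not Dolby's actual processing, and the function name is invented for the example.

```python
def toy_rear_left_filter(original_value):
    """The article's made-up formula, 2x - 1, standing in for a real filter."""
    return 2 * original_value - 1


original = 18
print(toy_rear_left_filter(original))  # 35, the value a real rear left speaker would have produced at your ear
```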
Again, we have a conundrum. TV speakers aren't the most accurate. Stereo headphones are decent. 5.1 speakers are nice, but aren't the most convenient due to price and how loud they can be. 5.1 headsets are accurate, but some of them can have low sound quality. Virtual surround sound products are highly accurate for a small portion of the population, somewhat accurate for a decent sized population, and highly inaccurate for the rest.
What's my solution if you want good positioning from headphones? Try out a Dolby Headphone product, and try out a set of discrete 5.1 headphones. Keep the one that works best.
Now, there are a number of Dolby Headphone products. For console gamers, we're a bit more limited. The Turtle Beach DSS is well suited to Turtle Beach's line of wired headphones that have a way to connect to the Xbox 360 and have chat volume controls. The Astro MixAmp is well suited for PC style headphones that use dual 3.5mm jacks or a single 4-pole 3.5mm jack.
On the 5.1 front, the Tritton AX Pro and Razer Barracuda HP-1 have decent sound quality. But the Barracuda doesn't have a way to conveniently connect to the Xbox 360 or PS3. The AX Pro has convenient connections for both consoles. But the discrete 5.1 headset with the best sound quality is the Turtle Beach HPA2. It's a PC headset (like the Barracuda), so it needs special equipment in order to be used with consoles, and there are a few other small drawbacks.
Now that I've given you a fairly decent overview of how sound works and the current technologies employed for 3D sound, let's talk about a few special cases.
Special Case- Turtle Beach HPX Series
First up is the Turtle Beach HPX Series (I'll refer to the series as HPX), which I consider the HPA2 to be a part of. The HPX is a curious headset. It can operate in stereo, Dolby Headphone, or true discrete 5.1 sound depending on what source you feed it. And it connects to these different sources through different adapter cables. For simplicity, the combination of the headset with an adapter cable is given an extra designation. So we have the HPX-1, HPX-2, HPX-SS, HPA2, and AK-R8. A bit of background info: the HPX series has its origins in the now obsolete Turtle Beach HPA headset.
The HPX series headsets all have four drivers per earcup, like in a true 5.1 setup. However, when used in the HPX-1 or HPX-2 configuration, the headset can only receive two channels of audio (those two channels can be plain stereo, or they can be Dolby Headphone virtual surround sound, depending on the source). The left channel is played through two of the drivers in the left earcup (front left and rear left) and the right channel is played through front right and rear right. This doesn't give you any additional directional benefit compared to just playing it through one driver per earcup. Remember that towards the beginning of this article, I stated that the number of channels isn't necessarily locked into the number of drivers. So the effect is just that you have a bit more powerful sound compared to a single driver playing the same source.
The HPX-SS, HPA2, and AK-R8 all utilize all four speakers per earcup. And if the source is 5.1, then they act as a true 5.1 headset. The HPX-SS is the basic version; it isn't amplified. The HPA2 is exactly like the HPX-SS but includes an inline amplifier that helps power all 8 drivers (it requires a lot of electricity to adequately power that many drivers, and most devices don't adequately power the HPX-SS). The AK-R8, which isn't manufactured anymore, combined the HPX series headset with a Turtle Beach Audio Advantage SRM sound card that feeds it the true 5.1 source, but it must be connected to a computer.
So, what that all boils down to is that if you buy the right adapter, you can convert any HPX Series headset to any other in that series. I can buy an HPX-1, and buy the inline amplifier that normally comes with the HPA2, and this converts my HPX-1 to an HPA2. It behaves 100% as an HPA2 when used this way. This makes the HPX series extremely versatile.
Special Case- Psyko 5.1
The other headset I wanted to talk about is the Psyko Audio 5.1 headset. It's different from any headset that came before it, and I'd like to congratulate Psyko Labs for their creativity. It's currently PC-only, and it costs a lot of money. But they've demonstrated an Xbox version in the past, so this may be coming to consoles in the future. It uses a new system for creating direction in sound, and it's possible that this is the future.
If you'll recall, I talked about how the delay between when a sound reaches your right ear and when it reaches your left (and vice versa) can help your brain determine the direction that the sound originated from, in addition to how your ear modifies the sound wave. Well, Psyko's solution is to place a row of drivers in the top headband of their headset, and create "tunnels" for the sound wave to travel through in order to reach your ear. The front left driver is placed towards the left side of the headband, and its sound travels through these tubes (they call them waveguides) to each ear. However, since the driver is placed towards the left, its sound travels through the left waveguide and reaches the left ear slightly sooner than the right ear, so the right side is slightly delayed. This simulates how sound works in the real world. And that's how the Psyko 5.1 creates its directional sound: each driver's sound waves travel through its two waveguides to your ears. It's a clever idea. I haven't tried this headset myself due to the high price. But my hope is that this method of generating sound becomes more common and more refined (and less expensive!) and we get a new wave of high quality surround sound headsets.
If you have any questions or comments, please give me feedback. I want this to be easy to understand, yet still give you an advanced understanding of how sound works.

