What Is Digital Sound?
Updated: Feb 11
To understand digital audio, we have to compare it to its analog counterpart. Analog sound is represented in time by a continuous wave of energy. It is the variation of this energy that moves a speaker forward and backward in its place, creating the air molecule displacement once again. As we mentioned earlier, sound is the continuous change of amplitude (or energy) through time. In digital audio, there is no such thing as continuous—only the illusion of a continuum.
In 1928, mathematician Harry Nyquist developed a theory based on his finding that he could reproduce a waveform if he could sample the variation of sound at least twice in every period of that waveform. A period is one full cycle of the sound (see Figure 1.5), and the number of cycles per second is the frequency, measured in Hertz (named in honor of Heinrich Hertz, the physicist who demonstrated the existence of electromagnetic waves in the late 1880s). So, if you have a sound at 20 Hz, you need at least 40 samples per second to reproduce it. The value kept in each sample is the voltage of that sound at a specific point in time. Obviously, in the ’20s, computers were not around to store the large number of values needed to put this theory into practice, but as you probably guessed, we do have this technology available now.
Figure 1.5 The bits in a digital recording store a discrete amplitude value, and the frequency at which these amplitude values are stored in memory as they fluctuate through time is called the sampling frequency.
How Sampling Works
In computer terms, the amplitude (or voltage) is measured and its value is stored as a number. The number of bits used to store this voltage value determines the size of this number and the precision of this value. In other words, every sample keeps the value of the amplitude (or voltage) as a binary number. The more bits you have, the more values you can represent. You may compare this with color depth. When you have 8 bits of color, you have a 256-color palette. A 16-bit resolution yields 65,536 colors, and so on. In sound, colors are replaced by voltage values. The higher the resolution in bit depth, the smaller the increments are between these voltage values. This also means that the more increments you have, the less noise is introduced as the signal steps from one value to the next.
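To make the bit-depth arithmetic concrete, here is a minimal Python sketch. The function names and the 2-volt full-scale range are assumptions chosen for illustration, not values from any particular converter.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct values that `bit_depth` bits can represent."""
    return 2 ** bit_depth


def step_size(bit_depth: int, full_scale_volts: float = 2.0) -> float:
    """Increment between adjacent levels.

    Assumes the levels are spread evenly over `full_scale_volts`
    (a hypothetical range, picked only for this example).
    """
    return full_scale_volts / quantization_levels(bit_depth)


for bits in (8, 16, 24):
    print(bits, quantization_levels(bits), step_size(bits))
```

Each extra bit doubles the number of levels and halves the increment between them, which is why a 16-bit recording can follow the original voltage much more closely than an 8-bit one.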
Because the computer cannot make the in-between values, it jumps from one value to the next, creating noise-like artifacts, also called digital distortion. Obviously, this is not something you want in your sound. So, the more values you have to represent different amplitudes, the more closely your sound resembles the original analog signal in terms of amplitude variation.
The other dimension is time, that is, how often you capture and store these voltage values. Like bit depth, this rate greatly affects the quality of your sound. As mentioned earlier, Nyquist said that you need two samples per period of the waveform to be able to reproduce it. This means that if you want to reproduce a sound of 100 Hz, or 100 vibrations per second, you need at least 200 samples per second. This rate is called your sampling frequency, and like the frequency of your sound, it is also measured in Hertz. Because real sounds are complex and contain high frequencies, you need much higher sampling frequencies than the one mentioned previously. Because most audio components, such as amplifiers and speakers, can reproduce sounds ranging from 20 Hz to 20 kHz, the sampling frequency standard for compact disc digital audio was fixed at 44.1 kHz, a little more than twice the highest frequency produced by your monitoring system.
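The two-samples-per-period rule can be written as a pair of one-line functions. This is only a sketch of the arithmetic, with hypothetical function names; in practice you sample somewhat above the minimum, as the 44.1 kHz standard does.

```python
def min_sampling_rate(highest_frequency_hz: float) -> float:
    """Lowest sampling rate (in Hz) that gives two samples per period."""
    return 2.0 * highest_frequency_hz


def highest_representable_frequency(sampling_rate_hz: float) -> float:
    """Highest frequency (in Hz) that still gets two samples per period."""
    return sampling_rate_hz / 2.0


print(min_sampling_rate(100))                  # the 100 Hz example from the text
print(min_sampling_rate(20_000))               # top of the 20 Hz to 20 kHz range
print(highest_representable_frequency(44_100)) # the compact disc rate
```

Running it shows that 20 kHz calls for at least 40,000 samples per second, and that 44.1 kHz leaves a small margin above that.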
The first thing you notice when you change the sampling rate of a sound is that the more samples you have, the sharper and crisper the sound. The fewer samples you have, the duller and mushier it gets. Why is this? Well, because the sampling rate must be at least twice the highest frequency you want to keep, the more samples per second you have in your recording, the more high harmonics you capture in a sound. When you lose high harmonics, the sound appears duller to your ears. It is those harmonics that add definition to the sound. If your sampling rate is too low, you not only lose harmonics, but also fundamentals. And this changes the tonal quality of the sound altogether.
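As a rough illustration of how harmonics drop out, the sketch below lists which harmonics of a note fit under half the sampling rate. The function name, the 1 kHz and 5 kHz fundamentals, and the cap of 20 harmonics are all assumptions made for the example.

```python
def captured_harmonics(fundamental_hz, sampling_rate_hz, max_harmonic=20):
    """Return the harmonic numbers whose frequency lies below half the sampling rate.

    Harmonic n has a frequency of n * fundamental_hz; harmonic 1 is the
    fundamental itself. Only the first `max_harmonic` harmonics are checked.
    """
    limit_hz = sampling_rate_hz / 2.0
    return [n for n in range(1, max_harmonic + 1) if n * fundamental_hz < limit_hz]


print(captured_harmonics(1000, 44_100))  # high sampling rate
print(captured_harmonics(1000, 8_000))   # low sampling rate
print(captured_harmonics(5000, 8_000))   # too low even for the fundamental
```

At 8 kHz only the first three harmonics of a 1 kHz note fit, which is the duller sound described above, and a 5 kHz fundamental does not fit at all.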
Figure 1.6 shows two sampling formats. The one on the left uses less memory because it samples the sound less often than the one on the right and has fewer bits representing amplitude values. As a result, there are fewer samples to store, and each sample takes up less space in memory. But consequently, it does not represent the original file very well and will probably create artifacts that will render it unrecognizable. In the first set of two images on the top, you can see the analog sound displayed as a single line.
Figure 1.6 Low resolution/low sampling rate vs. high resolution/high sampling rate.
The center set of images demonstrates how the amplitude value of the sample is kept and held until the next sampled amplitude value is taken. As you can see in the right column, a more frequent sampling of amplitude values renders a much more accurate reproduction of the original waveform. This becomes even more obvious in the lower set of images, where a line traces the outline of the resulting waveform. The waveform on the right is closer to the original analog signal than the one on the left.
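The memory trade-off between the two formats in Figure 1.6 comes down to simple multiplication. The sketch below assumes uncompressed audio with no file header, and the two formats compared are hypothetical examples rather than the exact ones drawn in the figure.

```python
def pcm_size_bytes(sampling_rate_hz: int, bit_depth: int,
                   seconds: float, channels: int = 1) -> int:
    """Bytes needed to store uncompressed samples (no header, no compression)."""
    total_samples = int(sampling_rate_hz * seconds) * channels
    return total_samples * bit_depth // 8


low = pcm_size_bytes(8_000, 8, seconds=1)                # low rate, low resolution
high = pcm_size_bytes(44_100, 16, seconds=1, channels=2) # CD-style stereo
print(low, high)
```

One second at the lower settings needs far fewer bytes, but it also stores far fewer and far coarser snapshots of the sound.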
Sampling is simply the process of taking a snapshot of your sound through time. Every snapshot of your sound is kept and held until the next snapshot is taken. This process is called “Sample and Hold.” As mentioned earlier, the snapshot keeps the voltage value of the sound at a particular point in time. When playing back digital audio, the recorded voltage value is held at its level until the next sample. Before the sound is finally sent to the output, a certain amount of low-level noise is sometimes added to the process to hide the large gaps that may occur between voltage values, especially if you are using a low bit depth and low sampling rate for digital recording. This process is called dithering. Usually, this makes your sound smoother, but in low-resolution recordings (such as an 8-bit recording), it adds a certain amount of audible noise to your sound. Again, if this dithering weren’t there, you might not hear noise, but your sound might be a little distorted.
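Here is a small sketch of sample and hold with optional dithering. It is a simplified illustration under stated assumptions, not how any particular converter works: amplitudes are scaled to the range -1.0 to 1.0, the dither is plain uniform noise of up to half a step added before rounding, and all function names and settings are hypothetical.

```python
import math
import random


def sample_sine(frequency_hz, sampling_rate_hz, num_samples):
    """Take `num_samples` snapshots of a unit-amplitude sine wave."""
    return [
        math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(num_samples)
    ]


def quantize(value, bit_depth, dither=False, rng=random):
    """Round a value to the nearest of 2**bit_depth evenly spaced levels.

    With dither=True, uniform noise of up to half a step is added before
    rounding (an assumed, simplified form of dithering).
    """
    step = 2.0 / (2 ** bit_depth)
    if dither:
        value += rng.uniform(-step / 2, step / 2)
    quantized = round(value / step) * step
    # Clamp to the range this bit depth can represent.
    return max(-1.0, min(1.0 - step, quantized))


def hold(samples, repeat):
    """Keep each sample for `repeat` output steps, producing a staircase."""
    return [s for s in samples for _ in range(repeat)]


snapshots = sample_sine(frequency_hz=100, sampling_rate_hz=800, num_samples=8)
staircase = hold([quantize(s, bit_depth=3) for s in snapshots], repeat=4)
dithered = hold([quantize(s, bit_depth=3, dither=True) for s in snapshots], repeat=4)
print(staircase)
print(dithered)
```

With only 3 bits the staircase has large gaps between levels. The dithered version lands on the same set of levels, but the added noise makes the choice between neighboring levels vary from sample to sample instead of always rounding the same way.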