The real world is analog, a continuous stream of varying data.
Just look around you, beyond your computer or phone. There's an infinite amount of visual information. If you zoom into one part of your visual field, you can notice more and more details.
Now sing a little song to yourself. That's an infinite stream of audio information. Your voice is constantly changing in big and little ways, microsecond by microsecond.
Analog data is infinitely detailed. Computers can only store digital data, finite data in a binary representation.
So how can we capture the wondrous analog world of our senses and convert it into digital data? We can use a process of sampling, quantization, and binary encoding.
An analog signal
Let's start with a simple analog signal, a waveform representing a sound:
A graph with an x-axis labeled "t" that goes from 0 to 330 and a y-axis labeled "v" that goes from -100 to 100. A curvy line goes up and down across the graph.
All analog signals are continuous both in the time domain (x-axis) and in the amplitude domain (y-axis). That means that there is a precise value for every possible value of time, even as specific as "1.2345 seconds", and that value may be as precise as "47.8291824806423964 volts".
The first step is sampling, where we take a sample at regular time intervals. This step reduces the continuous time domain into a series of discrete intervals.
In this signal, where time varies from 0 to 330 milliseconds, we could take a sample every 30 milliseconds:
A graph with an x-axis labeled "t" that goes from 0 to 330 and a y-axis labeled "v" that goes from -100 to 100. A curvy line goes up and down across the graph. A series of straight lines intercept the curvy line every 30 units on the x-axis.
That gives us 12 samples of the signal between 0 and 330 milliseconds.
Now we can express the signal as a series of sampled points:
(0, 7) (30, 95.98676803710936) (60, -71.43289186523432) (90, -106.55949554687498) (120, -97.21617085937501) (150, -70) (180, -29.045472375000003) (210, 6.171340345703143) (240, 24.439022283203116) (270, -74.45763529492186) (300, -31.31245312500002) (330, 24)
The y-values are only as precise as our computer can store; numbers stored in computers aren't infinitely precise and may be rounded off.
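The sampling step can be sketched in a few lines of Python. This is only an illustration: `example_signal` is a made-up sine wave standing in for the waveform in the graph (the article does not give a formula for its curve), and the `sample` helper is a hypothetical name.

```python
import math

def example_signal(t_ms: float) -> float:
    """Hypothetical stand-in for the analog waveform (NOT the article's curve)."""
    return 100 * math.sin(2 * math.pi * t_ms / 165)

def sample(signal, start_ms: int, end_ms: int, interval_ms: int):
    """Return (t, v) pairs taken every interval_ms from start_ms to end_ms inclusive."""
    count = (end_ms - start_ms) // interval_ms + 1
    times = [start_ms + i * interval_ms for i in range(count)]
    return [(t, signal(t)) for t in times]

samples = sample(example_signal, 0, 330, 30)
print(len(samples))  # 12 samples, at t = 0, 30, 60, ..., 330
```

The continuous function can be evaluated at any time, but after sampling only the 12 chosen points remain.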
🔎 Is 12 samples enough? Play around with the sampling interval in the interactive below and observe the effect of choosing different intervals:
The inverse of the sampling interval is the sampling rate: the number of samples in a second (or other unit of time). For example, a sampling interval of 30 milliseconds corresponds to a sampling rate of 33.33 samples per second.
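As a quick check of that arithmetic, here is a small sketch that converts an interval in milliseconds to a rate in samples per second (the function name is just an illustrative choice):

```python
def sampling_rate_hz(interval_ms: float) -> float:
    """Samples per second for a sampling interval given in milliseconds."""
    return 1000.0 / interval_ms

print(round(sampling_rate_hz(30), 2))  # 33.33
```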
According to the Nyquist-Shannon sampling theorem, a sufficient sampling rate is anything larger than twice the highest frequency in the signal. The frequency is the number of cycles per second and is measured in Hz (hertz). If a signal has a maximum frequency of 500 Hz, a sufficient sampling rate is anything greater than 1000 Hz.
A typical sampling rate for music recordings is 48 kHz (48,000 samples per second). That's a little over double the highest frequency that humans can hear, 20 kHz. If the audio only contains human speech, as is often the case for phone calls, a much smaller sampling rate of 8 kHz can be used, since 4 kHz is the highest frequency in most speech.
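The rule of thumb from the theorem can be written as a simple comparison. This is a minimal sketch with hypothetical function names; it only checks the "greater than twice the highest frequency" condition and says nothing about how the highest frequency is found.

```python
def nyquist_rate_hz(max_frequency_hz: float) -> float:
    """Twice the highest frequency present in the signal."""
    return 2 * max_frequency_hz

def is_sufficient(sampling_rate_hz: float, max_frequency_hz: float) -> bool:
    """True when the sampling rate is strictly greater than the Nyquist rate."""
    return sampling_rate_hz > nyquist_rate_hz(max_frequency_hz)

print(nyquist_rate_hz(500))      # 1000
print(is_sufficient(1200, 500))  # True
print(is_sufficient(900, 500))   # False
```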
After sampling, we are still left with a wide range in the amplitude domain, the y values. The next step of quantization reduces that continuous amplitude domain into discrete levels.
For our simple signal, where amplitude varies from -100 to 100 volts, we can apply a quantization interval of 25 volts:
A graph with an x-axis labeled "t" that goes from 0 to 330 and a y-axis labeled "v" that goes from -100 to 100. Sampled points are shown as orange circles. Lines go from the x-axis to near each of the sampled points, at an intersection with horizontal grid lines.
Now the 12 points all have y values that are multiples of 25:
(0, 0) (30, 100) (60, -75) (90, -100) (120, -100) (150, -75) (180, -25) (210, 0) (240, 25) (270, -75) (300, -25) (330, 25)
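One way to reproduce those quantized values is to round each sampled value to the nearest multiple of 25. The sketch below assumes two things the text does not spell out: ties are rounded upward, and results are clamped to the range -100 to 100.

```python
import math

# Sampled y-values from the list of sampled points above.
SAMPLES = [7, 95.98676803710936, -71.43289186523432, -106.55949554687498,
           -97.21617085937501, -70, -29.045472375000003, 6.171340345703143,
           24.439022283203116, -74.45763529492186, -31.31245312500002, 24]

def quantize(value: float, interval: int, low: int, high: int) -> int:
    """Round value to the nearest multiple of interval, then clamp to [low, high]."""
    level = math.floor(value / interval + 0.5) * interval  # assumption: ties round up
    return max(low, min(high, level))                      # assumption: clamp to range

quantized = [quantize(v, 25, -100, 100) for v in SAMPLES]
print(quantized)
# [0, 100, -75, -100, -100, -75, -25, 0, 25, -75, -25, 25]
```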
🔎 What's the best quantization interval? Play around with different quantization intervals below and observe how far the quantized points are from the sampled points:
The ideal quantization interval depends on our use case and physical constraints. If there is enough space to represent thousands of different y values, then we can use a very small quantization interval. If there is limited space, then we can use a large interval.
The quantizing step always introduces some amount of quantization error, which is measured by comparing the actual signal value with the quantized value at each sampled point. However, some level of quantization is always necessary for storing analog data in digital form, due to the finite nature of a computer's memory and its numeric precision.
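To make the error concrete, here is a small sketch that compares the sampled values with their quantized versions from the example above. Using the absolute difference as the error measure is one simple choice among several.

```python
SAMPLES = [7, 95.98676803710936, -71.43289186523432, -106.55949554687498,
           -97.21617085937501, -70, -29.045472375000003, 6.171340345703143,
           24.439022283203116, -74.45763529492186, -31.31245312500002, 24]
QUANTIZED = [0, 100, -75, -100, -100, -75, -25, 0, 25, -75, -25, 25]
INTERVAL = 25

# Absolute quantization error at each sampled point.
errors = [abs(s - q) for s, q in zip(SAMPLES, QUANTIZED)]

print(max(errors))                  # 7
print(max(errors) <= INTERVAL / 2)  # True
```

For values inside the quantized range, rounding to the nearest level can never be off by more than half the quantization interval, which is why a smaller interval lowers the error.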
That brings us to the final step: binary encoding. If there is a limited set of quantized y values, the computer does not need to store the actual value. Instead, it can store a much smaller value that represents the quantized y value.
For this signal, a quantization interval of 25 resulted in 9 possible y values. We can map the 9 values to the 4-bit binary numbers 0000 through 1000:
A graph with an x-axis labeled "t" that goes from 0 to 330 milliseconds and a y-axis labeled "v" that goes from -100 to 100. A series of lines are shown every 30 milliseconds, with each line ending at a circle that intersects a horizontal grid line.
We can then encode the signal into this binary sequence:
0100 1000 0001 0000 0000 0001 0011 0100 0101 0001 0011 0101
For a computer to understand that sequence, our digitized version would also need to include a description of how the sequence was sampled and encoded.
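The encoding can be sketched as follows, assuming the lowest level (-100) maps to 0000 and each step of 25 adds one, which is the mapping that reproduces the sequence above. The function names are illustrative. Note that `decode` needs the minimum level and the step size, which is exactly the kind of description that has to travel with the bits.

```python
QUANTIZED = [0, 100, -75, -100, -100, -75, -25, 0, 25, -75, -25, 25]
MIN_LEVEL = -100  # assumption: the lowest level is encoded as 0000
STEP = 25         # quantization interval
BITS = 4          # bits per sample

def encode(values, min_level: int, step: int, bits: int) -> str:
    """Map each quantized value to its level index, written as fixed-width binary."""
    codes = [format((v - min_level) // step, f"0{bits}b") for v in values]
    return " ".join(codes)

def decode(bitstring: str, min_level: int, step: int):
    """Reverse the mapping: binary code -> level index -> quantized value."""
    return [min_level + int(code, 2) * step for code in bitstring.split()]

encoded = encode(QUANTIZED, MIN_LEVEL, STEP, BITS)
print(encoded)
# 0100 1000 0001 0000 0000 0001 0011 0100 0101 0001 0011 0101
print(decode(encoded, MIN_LEVEL, STEP) == QUANTIZED)  # True
```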
This encoding uses 4 bits per sample. The number of bits per sample is also known as the bit depth. The lowest bit depth is 1, which can only describe 2 values (0 or 1). The standard bit depth for telephone calls is 8 bits (256 values) and the recommended bit depth for YouTube music videos is 24 bits (over 16 million values).
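The relationship between the number of levels and the bit depth can be checked with a short sketch. The helper names are hypothetical, and `level_count` assumes the interval divides the amplitude range evenly.

```python
def level_count(min_value: int, max_value: int, interval: int) -> int:
    """Number of quantization levels, assuming interval divides the range evenly."""
    return (max_value - min_value) // interval + 1

def bits_needed(levels: int) -> int:
    """Smallest bit depth that can give every level its own code."""
    return max(1, (levels - 1).bit_length())

levels = level_count(-100, 100, 25)
print(levels)               # 9
print(bits_needed(levels))  # 4
print(2 ** 8)               # 256 values at a bit depth of 8
print(2 ** 24)              # 16777216 values at a bit depth of 24
```

Since 3 bits can only describe 8 values, the 9 levels in this example need a fourth bit.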
🔎 Play around again with the quantization interval and observe how the bit depth changes. What intervals only need 2 bits? 4 bits? 6 bits?
We often store analog signals in digital storage so that we can reproduce them later, like playing back an audio file or displaying an image. When a device wants to convert a digitized signal back into an analog signal, it will attempt to reconstruct the original continuous signal.
For this signal, a simple reconstruction strategy could interpolate a smooth curve through the quantized points:
A graph with an x-axis labeled "t" that goes from 0 to 330 milliseconds and a y-axis labeled "v" that goes from -100 to 100. A series of lines are shown every 30 milliseconds, with each line ending at a circle that intersects a horizontal grid line. A curve is overlaid on top that joins those circles.
How well does that match the original? We can overlay the curves to see the difference visually:
A graph with an x-axis labeled "t" that goes from 0 to 330 milliseconds and a y-axis labeled "v" that goes from -100 to 100. A curvy line goes across the graph and is overlaid with another similar curvy line.
The reconstructed signal looks very close to the original, but misses a few details. If we can decrease the sampling interval and lower the quantization error, we can bring the reconstructed curve closer to the original signal. We could also use different strategies for reconstructing the signal.
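As one example of a reconstruction strategy, the sketch below uses straight-line (linear) interpolation between neighboring quantized points. This is deliberately simpler than the smooth curve shown in the graph, and it is not claimed to be the method used there; the function name is illustrative.

```python
from bisect import bisect_right

TIMES = list(range(0, 331, 30))  # sample times in milliseconds
QUANTIZED = [0, 100, -75, -100, -100, -75, -25, 0, 25, -75, -25, 25]

def reconstruct(t: float, times, values) -> float:
    """Estimate the signal at time t by linear interpolation between samples."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the sampled range")
    # Index of the sample at the right end of the segment containing t.
    i = min(bisect_right(times, t), len(times) - 1)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

print(reconstruct(15, TIMES, QUANTIZED))  # 50.0, halfway between 0 and 100
print(reconstruct(45, TIMES, QUANTIZED))  # 12.5, halfway between 100 and -75
```

With a shorter sampling interval the straight segments get shorter, so even this simple strategy follows the original curve more closely.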
🔎 Play around below with different sampling rates and quantization intervals. How close can you get to the original curve?
The first step of sampling converted an infinite stream to a finite sequence. In quantization, the values in that sequence were approximated. Finally, the values were encoded into bits for storage on a computing device. At some later point, a device could interpret those bits to attempt a reconstruction of the original infinite stream of continuous values.
Whenever we convert analog data to digital data, whether it's audio or visual, our goal is to sample data with enough precision so that we can reconstruct it later at the desired quality level but not exceed our data storage capacity.
Land-line telephones use relatively low sampling rates and bit depths, since the data must travel over telephone lines, whereas movie directors record film at very high sampling rates and bit depths, so that they may replay it on giant screens later.
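To get a feel for the trade-off, here is a rough sketch of how much raw data one second of audio needs at the telephone and music settings mentioned earlier. It assumes uncompressed data and a single channel by default; the `channels` parameter is an addition for illustration.

```python
def bits_per_second(sampling_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Raw (uncompressed) data rate for a digitized audio signal."""
    return sampling_rate_hz * bit_depth * channels

phone = bits_per_second(8_000, 8)    # 8 kHz sampling, 8 bits per sample
music = bits_per_second(48_000, 24)  # 48 kHz sampling, 24 bits per sample

print(phone)           # 64000
print(music)           # 1152000
print(music // phone)  # 18
```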
🤔 Find a device near you that converts analog data into digital data. What sort of space constraints does it have for storing or transferring the data? What sort of detail is lost in the digitized version?
Want to join the conversation?
- Why are there so few spoken lectures for this subject? I do not learn well from notes.(26 votes)
- Could someone please give me a basic understanding of this lesson? I think a video for this lesson would really help a lot of people.
I am really confused.(8 votes)
- A big task when trying to translate information from the daily world into the digital is reduction.
Reduction means you take a problem and reduce it to a simpler one that we, or rather the computer, can manage.
In the article the problem is sound. Reproducing sound one to one is an impossible task, because of the tiny, tiny differences that can't all be quantified; there are just too many.
So we reduce the problem. We create buckets in which we put sounds depending on their pitch. Because we choose the buckets, they're finite, and so we can create computer programs that do a pretty good job at reproducing sounds.
Another way to imagine it: a bad artist wants to draw all the continents (I'm a really bad one :^) ). Just outright drawing the landmasses is out of the question; the artist is too bad for that. So what we do is use a ruler to draw boxes (boxes are easy with a ruler) that represent the continents, and then we draw smaller boxes inside the boxes and use an eraser to give the landmasses their general shape.
So by using smaller and smaller boxes we create a picture of the continents despite the limitation of being a bad artist.
Does that help?(37 votes)
- Hi, can you define quantization interval? Like I get what it means, I just need it explained for notes, I can't word it out.(2 votes)
- Discrete data is limited to certain values, which makes it countable. The number of values might be really large, but a data point is either in one category or it isn't (e.g. red, green, blue, or older than 21, younger than 21). Continuous data has no strict boundaries; it can take any value on a given interval (e.g. temperature).
Quantization transforms continuous data into discrete data. That means we take data that could take on any value on a given interval and make it discrete, i.e. we create specific categories to put it in (we create buckets which we use to sort the data).
The quantization interval is the step size we use in our discrete set. So if we think of age as continuous data and make it discrete, we have to pick groups (or buckets) which we use to categorize the data. So we could pick the quantization interval 5 and group all people in the data set into the ages 0, 5, 10, 15, 20, 25 ..., using a rounding scheme to put everyone in the right bucket.(20 votes)
- I am really confused by this topic. Is there a video or something else? I have read it a million times but can't understand anything.(4 votes)
- Converting analog (continuous) data to binary is done in 3 steps.
Note: Let's assume that we have a coordinate plane where x represents time and y represents the output, let's just say sound.
1. Sampling. First we take samples of "x" values (where "x" is a variable that represents time). We choose a sampling interval and we take samples of x-y pairs at these points. Right now we have collected (stored) x-y pairs where x is an integer (based on the sampling interval) and y could be some crazy decimal.
2. Quantization. In quantization we essentially round the "y" values to the nearest multiple of a number we choose. This number is called the quantization interval. Now we have x-y pairs where x and y are both integers.
3. Binary Encoding. Now we need to write down these x-y pairs to be stored. To do this we count the number of unique y values we have. If we have only 4 unique y values, for example, that means we could represent all of these with only 2 bits.
00 --> y1
01 --> y2
10 --> y3
11 --> y4
So now we can store the numbers representing those y-values. But note that the computer storing this will need to include instructions, for other computers to read, on how it sampled and stored the data. This way, the computer reading the file can essentially do the reverse of what the first one did!
Hope this helps,
- Convenient Colleague(12 votes)
- In the below quoted section of this article, not sure if the binary sequence is full of typos or I'm misunderstanding the meaning here. The binary sequence doesn't match the image referenced.
"We can then encode the signal into this binary sequence:
0110 1001 0001 0000 0001 0011 0101 0110 0111 0000 0001 0111"
The image shows points at:
0100 1000 0001 0000 0000 0001 0011 0100 0101 0001 0011 0101(9 votes)
- From the author: Great catch! I must have changed the interval when I first wrote that binary code. I'll update it, thanks for the feedback!(2 votes)
- “If we can increase the sampling interval and lower the quantization error, we can bring the reconstructed curve closer to the original signal.”
Might there be a typo? Won't "increase the sampling interval" move the reconstructed curve further from the original?(4 votes)
- From the author: I agree, that's a typo. We can either decrease the sampling interval or increase the sampling rate. I'll fix, thank you Jerry!(6 votes)
- Thanks for the article- I just want to clarify the meaning of the sound wave example.
Is it true that the frequency of a sound wave corresponds to the pitch of the sound (meanwhile the amplitude of the wave corresponds to the volume of the sound)? If so, is the process of converting analog to binary data just a process that stores information to recreate the varying pitches and volumes of real-life sounds to the most accurate extent possible?(4 votes)
- Yes, that's correct and yes that's the goal of the conversion. The article explains the steps necessary to achieve that target.(3 votes)
- Wait, what are cycles? What is frequency, exactly?(2 votes)
- "The frequency is the number of cycles per second and measured in Hz (hertz)."
Sound can be described as an oscillating wave, and the frequency of that oscillation (how many cycles it completes each second) is what we measure.
I'm not really satisfied with this explanation, so I would suggest you search for
"20Hz to 20kHz (Human Audio Spectrum)"
on YouTube. That should give you a better idea about why frequency and cycles are used when talking about sound.(4 votes)
- Are the "play around" boxes made using JavaScript?(2 votes)
- You can check out the source code on Pamela's project page.
- Hey, whenever I interact in any way with the displays (change the sampling rate or quantization interval, change which values are seen, etc.) the graph simply reverts to a blank gray square. Changing the settings back doesn't help. The values graphed on the side don't change. Is this my fault?(2 votes)
- A similar thing happened to me where the grey square was showing. You just have to wait for it to load; it takes a while. If it's not loading, refresh the page. If that's not working, then it might be glitching or something.(1 vote)