Main content

### Course: AP®︎/College Computer Science Principles > Unit 1

Lesson 3: Limitations of storing numbers# Number limits, overflow, and roundoff

When computer programs store numbers in variables, the computer needs to find a way to represent that number in computer memory. Computers use different strategies based on whether a number is an integer or not. Due to limitations in computer memory, programs sometimes encounter issues with

**roundoff**,**overflow**, or**precision**of numeric variables.### Integer representation

An

**integer**is any number that can be written without a fractional component. The same term is used in both programming and in math, so hopefully it's familiar to you.All of these numbers are integers: $120$ , $10$ , $0$ , $-20$ .

How can a programming language represent those integers in computer memory? Well, computers represent

*all data*with bits, so we know that ultimately, each of those numbers is a sequence of 0s and 1s.To start simple, let's imagine a computer that uses only 4 bits to represent integers. It can use the first bit to represent the sign of the integer, positive or negative, and the other 3 bits for the absolute value.

In that system, the number 1 would be represented like this:

0 | 0 | 0 | 1 |
---|---|---|---|

+/- | |||

sign |

The $0$ in the sign bit represents a positive number, and the $1$ in the right most bit represents the ${2}^{0}$ ($1$ ) place of the value.

What's the largest number this system could represent? Let's fill all the value bits with $1$ and see:

0 | 1 | 1 | 1 |
---|---|---|---|

+/- | |||

sign |

That's the positive number $7$ , since ${2}^{2}+{2}^{1}+{2}^{0}=(4+2+1)=7$ .

### Overflow

What would happen if we ran a program like this on the 4-bit computer, where the largest positive integer is 7?

```
var x = 7;
var y = x + 1;
```

The computer can store the variable

`x`

just fine, but `y`

is one greater than the largest integer it can represent with the 4 bits. In a case like this, the computer might report an "overflow error" or display a message like "number is too large". It might also truncate the number (capping all results to 7) or wrap the number around (so that 8 becomes 1). We don't want to end up in any of those situations, so it's important we know the limitations of our language and environment when writing programs.

Fortunately, most modern computers use 64-bit architectures which can store incredibly large integers. In JavaScript, the largest safe integer is 9,007,199,254,740,992, equivalent to ${2}^{53}-1$ . Beyond that, and we're in the danger zone.

### Floating-point representation

We've seen there are limitations to storing integers in a computer. Numbers that aren't integers, like fractions and irrational numbers, are even trickier to represent in computer memory.

Consider numbers like $2/5$ , $1.234$ , $9.999999$ , or the famously never-ending $\pi $ .

Computer languages typically use

**floating-point representation**for non-integers (and sometimes integers, too). It's similar to "scientific notation", a representation you might know from other studies.In floating-point representation, a number is multiplied by a base that's raised to an exponent:

Since computers use the binary system instead of the decimal system, the base for floating-point numbers is $2$ instead of $10$ . Because of that, numbers that are exactly powers of $2$ are the simplest to represent:

Numbers between powers of 2 look like this:

What about non-integers? Once again, powers of $2$ are the simplest to represent.

Floating-point can also represent fractions between powers of 2:

Once the computer determines the floating point representation for a number, it stores that in bits. Modern computers use a 64-bit system that uses 1 bit for the sign, 11 bits for the exponent, and 52 bits for the number in front.

Here's $0.375$ in that binary floating-point representation:

`0011111111011000000000000000000000000000000000000000000000000000`

The exact translation to bits is more complicated than we can go into here, but is a great topic for those of you who want to dive deeper.

### Roundoff errors

Floating-point representation still can't fully represent all numbers, however. Consider the fraction $1/3$ and its floating point representation:

In binary, $.\stackrel{\u2015}{3}$ is still an infinitely repeating sequence:

`0101010101…`

We can't store an infinite sequence in a computer! At some point, the computer has to end the number somehow, either by chopping it off or rounding to the nearest floating point number. Computers have to do that fairly often, as even fractions like $1/10$ (which is a short $0.1$ in decimal) end up as infinitely repeating sequences once converted to binary.

We often don't notice the lower precision of a number's representation until we use it in calculations. That's when we can experience a

**roundoff error**in the results.✏️ Here's a program that attempts to add $0.1+0.1+0.1$ . In the non-computer world, we know that's $0.3$ . But in the computer, each of the $0.1$ values is stored as a rounded-off binary fraction, and when they're added together, they don't quite equal what we expect…

The more bits we can use, the more precise our numbers and calculations will be. Modern 64-bit systems offer a high enough precision for low-stakes calculations.

Perhaps at some point in your life, you'll find yourself writing programs that calculate voting outcomes, power a self-driving car, or even launch a rocket. When the stakes are high, precision matters.

🙋🏽🙋🏻♀️🙋🏿♂️Do you have any questions about this topic? We'd love to answer— just ask in the questions area below!

## Want to join the conversation?

- 'Floating-point can also represent fractions between powers of 2:

0.750=1.5×2−1'

But wait a minute. 1.5 is also a floating-point. So how you represent that? This example does not explain much.(47 votes)- From the author:Apologies for the confusing explanation. Let's step through how the computer actually represents that number in floating point representation and see if that helps.

Let's look at 0.750:

0.750=1.5×2^−1

The number 1.5 is called either the "mantissa" or the "significand". The number -1 is called the "exponent" (as per normal math term).

The floating point representation of 0.750 in binary needs to include the sign (positive/negative), the mantissa, and the exponent.

Here's the binary:

001111111110

The first bit represents the sign, where 0 is positive.

The next 11 bits represents the exponent -1:

01111111110

That's the decimal number 1022. According to the floating point standard, the exponent is calculated by subtracting 1023 from that value. 1022-1023 is -1, which is indeed the exponent.

The final 52 bits represent the mantissa 1.5:

1000000000000000000000000000000000000000000000000000

That's a 1 followed by 51 zeros. According to the floating point standard, the first bit represents 1/2 (0.5), the second bit represents 1/4, the third bit represents 1/8, etc. The goal is for those bits to be able to represent values between 0 and 1, which then is considered a mantissa between 1 and 2. In this case, there is a 1 in the first bit, so this mantissa is 1.5.

The key thing here is that the mantissa only ever needs to be between 1.0 and 2.0 (excluding 2). If it goes above that, then the exponent can be increased instead.

I find it helpful to see what happens when you change bits in a representation. You can do that at this tool:

https://float.exposed/0x3fe8000000000000

It currently represents 0.75 in 64-bit floating point. Try clicking the 1s or 0s to see what happens when they take on different values.

I also recommend this explanation of floating point representation:

http://fabiensanglard.net/floating_point_visually_explained/(105 votes)

- Much harder to understand these without videos :/(34 votes)
- Why is a floating point called that way? What is floating about it?(14 votes)
- From the author:Great question. The floating part is the decimal (between the whole part and the fractional part), as floating point representation can both represent very large numbers with a lot of digits before the decimal (like 1292929.1) and very small numbers with a lot of digits after the decimal (like 1.29292929). Floating point representation can use its 52 bits to represent both the digits in the whole part and the digits in the fractional part. A contrasting type of representation is "fixed point", which always uses a certain number of bits to represent the whole part and a certain number of bits to represent the fractional part.

Here's a nice longer explanation: https://stackoverflow.com/questions/7524838/fixed-point-vs-floating-point-number(33 votes)

- I didn't understand how floating point representation number like 0.375 in wrtten in binary.It's getting very confusing for me!

Somebody help me out..(8 votes)- There are 3 parts of a floating-point representation (using the IEEE-754 standard). The first bit is used to determine the sign of the number, 0 is positive and 1 is negative. The next section is the exponent of the number represented in scientific form with an added bias. The final section of the representation is the mantissa of the number in scientific form after dropping the leading 1.

Suppose we want to convert 0.375 to its floating-point representation.

0.375 = 0.011 = 1.1 * 2^(-2)

sign bit = 0 since 0.375 is positive

exponent = bias + original exponent = 1023 + (-2) = 1021 = 01111111101

mantissa = number after the leading 1 = 1000000000000000000000000000000000000000000000000000

floating-point representation = sign bit, exponent, mantissa = 0011111111011000000000000000000000000000000000000000000000000000

(In this example, the floating-point representation is in double precision format.)(14 votes)

- are there some other programming scenarios when 2^1024 = infinity?(2 votes)
- In JavaScript, Math.pow(2, 1024) is Infinity due to the limitations of how numbers are stored in JavaScript. However, those limitations vary by language and environment. I just ran the same calculation in a Java environment and got a result of Infinity, but when I ran it in my local Python environment, I got a numeric result.(21 votes)

- Why do modern computers only use 64 bits? Can't they use as many bits as they want? Since more bits give more precise calculations.(3 votes)
- Hello!

This subject gets pretty deep, but it boils down to bit and computer architecture. 64 bits is 8^2 so 8 bytes, and as we learned a few lessons ago computers these days only work in chunks of bytes. Then most computers don't need more then 64 bits because it would be overkill unless we were in a professional setting where calculations needed to be exact.

Thanks for your question!(16 votes)

- I realized there is a pattern in binary numbers

00000

00001

00010

00011

00100

00101

00110

00111

01000

.....

the last row goes 010101010101010......

the second last row goes 00110011001100110011........

the third last row goes 000011110000111100001111.......

the fourth last row goes 000000001111111100000000111111110000000011111111.....

the fifth last row goes 00000000000000001111111111111111000000000000000011111111111111110000000000000000111111111111111100000000000000001111111111111111......

and so on(8 votes)- Nice observation!

If you look at the period between changing phases (1 -> 0 or 0 -> 1), you'll see you get exactly 2^i which corresponds to the ith row's contribution to the binary representation.

For instance, the third row changes every 2^2 = 4 cycles. Hence, the 3rd position (starting from 0) contributes 4 to the binary representation.(7 votes)

- I was experimenting with this in Swift using doubles and floats. I wrote:

var result2: Double = 0.1 + 0.1 + 0.1

var result3: Float = 0.1 + 0.1 + 0.1

The double returned 0.30000000000000004 and the float returned 0.3. In Swift Double represents a 64-bit floating point number and Float represents a 32-bit floating point number. So it did that because a float is... i guess... Less specific?(5 votes)- Funnily enough, yes.

A number like 0.1 isn't easy store, because it's difficult to turn into a power of two. You see when you tell your computer to work with the number 0.1, the number has to be converted.

Conversion (as a human mind you!) would work something like (we multiply by two, cut off the 1 if it pops up and don't stop until we hit 0 or we're tired of the algorithm)

0.1 * 2 = 0.2 | 0

0.2 * 2 = 0.4 | 0

0.4 * 2 = 0.8 | 0

0.8 * 2 = 1.6 | 1

0.6 * 2 = 1.2 | 1

0.2 * 2 = 0.4 | 0

at this point we're starting a loop :(

so the decimal 0.1 converts to the binary 0.00011001100110011 .... (the string 0011 will just keep repeating making the result more precise but never actually reaching 0.1)

A double can store twice the amount of information of a floating point, so it can happen that this extra space is used to "fix" rounding issues, by being more precise.(11 votes)

- then how are modern computers able to give accurate computations when some numbers require infinite bits? Also how was the computer able to display 2^1023 (8.98846567431158e+307) when it wasn't able to display 9007199254740993?(3 votes)
- I cound not answer ur problem and look around for a bit and found this:

Modern computers use finite-precision arithmetic to perform computations, which means that they can only represent numbers with a limited number of bits. This means that for numbers that require an infinite number of bits to represent exactly, such as irrational numbers like pi, the computer can only represent an approximation of the number. However, in practice, these approximations are usually accurate enough for most applications.

Regarding your second question, the reason why a computer can display 2^1023 (8.98846567431158e+307) but not 9007199254740993 is due to the way that numbers are represented in the computer's memory. In most modern computers, numbers are represented using a fixed number of bits, typically 32 or 64 bits. This means that the largest number that can be represented using 64 bits is 2^64 - 1, which is approximately 1.8 x 10^19.

In the case of 9007199254740993, which is larger than the largest number that can be represented using 64 bits, the computer cannot represent it exactly and therefore it is rounded to the nearest representable number. On the other hand, 2^1023 is well within the representable range for a 64-bit number and can therefore be displayed accurately.(11 votes)

- why does the computer perceive 0.1 as infinitely repeating?(5 votes)
- The binary number system cannot accurately represent 0.1 with finite digits. This is the same way that base-10 uses an infinite decimal expansion (0.33333...) for 1/3.(7 votes)