Lesson 31: Logarithmic scale (Algebra 2 level)
Benford's law (with Vi Hart, 2 of 2)
Vi Hart visits Khan Academy and talks about the mysteries of Benford's Law with Sal. Created by Sal Khan.
Want to join the conversation?
- I heard about an indigenous culture somewhere (S. America?) in which the counting was logarithmic. Specifically, when asked what number was half-way between "1" and "9", the answer given was "3." Sorry I do not have a source, but it may have been Radiolab or something like that. Very interesting.(38 votes)
- Thanks for that - I looked it up! They are the Mundurucu culture in the Amazon(27 votes)
- So powers of two follow Benford's distribution when expressed in decimal, but if I were to express them in binary (1, 10, 100, 1000, ...) everything falls in the '1' bucket. Question: do the "natural" data (like populations, financial data, physical constants) follow Benford's law under every number base?(22 votes)
- Well, in base 2, powers of 2 become the special case, while powers of 10 would now follow Benford's Law, I think (but expressed in a different way). According to Wikipedia there's a more general version of the rule that applies to any/all bases.
It also has examples where the same data still follows the law even if you use different units.(18 votes)
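The more general version of the rule mentioned in this reply can be sketched in a few lines of Python. This is only an illustration, assuming the commonly stated base-b form of Benford's law, P(d) = log_b(1 + 1/d) for leading digits d = 1, ..., b - 1. The function name `benford_probability` is made up for this example.

```python
import math

def benford_probability(digit, base=10):
    """Probability that the leading digit equals `digit` in the given base,
    assuming the generalized Benford formula log_base(1 + 1/digit)."""
    if base < 2 or not 1 <= digit < base:
        raise ValueError("need base >= 2 and 1 <= digit < base")
    return math.log(1 + 1 / digit, base)

# Print the predicted leading-digit probabilities for a few bases.
for base in (10, 8, 2):
    probabilities = [benford_probability(d, base) for d in range(1, base)]
    print(base, [round(p, 3) for p in probabilities])
```

In base 2 the only possible leading digit is 1, so the formula gives probability 1. That agrees with the observation in the question that every binary number falls in the '1' bucket.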
- Usually, Sal is an excellent teacher for me, but I still have a few questions. I understand that on a logarithmic scale this makes sense, but it doesn't work on a linear scale. There are still the same number of numbers between 1 and 2 as there are between 2 and 3. So, why is there a greater probability for a number to land between 1 and 2 than there is a chance to land between 2 and 3? Also, would Benford's distribution also apply to non-exponential sets?(10 votes)
- You're on the right train of thought.
When you ask '[W]hy is there a greater probability for a number to land between 1 and 2 than there [is] a chance to land between 2 and 3?', consider that (assuming we accept any arbitrary set of numbers obtained from real-world observations) the sub-range you call 'between 1 and 2' refers just as much to the numbers between 10 and 20, between 100 and 200, and so on, which is why we work with logarithmic scales. Within any single scale, the sub-ranges (numbers starting with the same digit) may each be considered equally likely to 'be picked', but only when considering that individual scale. In this way you are correct, but your confusion stems from realising this truth whilst not yet seeing the whole picture.
The next step involves thinking about how these probabilities vary with the scale being considered. As we might be talking about any set of numbers obtained from some real-world observations, the scale (think also about the range or upper limit) that such numbers span is, hopefully intuitively, not at all biased to a neat, round number like 1000. By this I mean that in the natural world there is no reason for, for example, the population of a town to tend more towards that tidy number of 1000 or 10,000; for it to do so would be as strange as saying that populations prefer to stop growing at nice round numbers, for the sake of making equal those probabilities of picking each of our sub-ranges (10 to 20, 200 to 300, etc.). Hopefully this is not too convoluted to follow. It is then this complete lack of bias of 'real-world data scales' toward nice round numbers that is responsible for changing up the probabilities of picking those sets of sub-ranges (I believe this is a direct answer to your question quoted above, which now, hopefully, in sufficient context should make more sense to you. Also, this is my first KA comment, so sorry if it's not so concise.)(24 votes)
- If we look at the behavior of the universe we will see a lot of exponential and logarithmic behavior in it. Physical constants are what determine the 'universe', so does there exist a relation between physical constants and exponentials that would make them follow Benford's Law?(17 votes)
- Probably. Exponents and logarithms are related to Benford's law; constants are also related to the law (see previous Benford video); so there must be a relation between exponents and constants. Which isn't too surprising -- constants with exponents form some important laws of nature, themselves.(5 votes)
- Basically the reason seems obvious: to get to 2 you must pass through 1, to get to 3 you pass through 1 and 2, etc. Therefore the chances of a random stop being on 1 are greater than on 2, because everyone passes through or stops on 1, while the remainder (slightly fewer) all pass through or stop on 2. Cute relationship with Fibonacci but hardly surprising methinks...(4 votes)
- Uh. That reasoning is a bit sketchy. The very first 1 at the beginning of a logarithmic scale isn't the one you must magically pass through to get to everything else. In fact, if you go left on the scale the first thing you hit is 9, which you must "pass through" to get down to 8, etc. etc. all the way down to the first time you hit a one at 1/10th.(8 votes)
- All natural increments follow a logarithmic progression, so a random series based on such a progression is likely to follow Benford's law. An increase in population is a natural increment, just like the Fibonacci sequence or stock market data. However, I am still confused about why physical constants follow this law when the choice of scale lies entirely in our hands. Would they follow this law for any choice of scale, or is the presently chosen scale something special?(2 votes)
- I just can't help but feel this concept is being made more complicated than it needs to be...
All of these examples are in some sense sequential. Population growth, stock market changes, etc. For example: before a country can have a population of 20, it has to have a population of 19, 18, 17, so on and so forth.
And so even on that tiny scale (20), you can immediately see that the chances of starting with a 1 are 11 out of 20. And this persists all the way up to 99 - you have to roll against the 11/20 odds for every number before 90 before you even get a chance at a 9. And once you move into the 100s, 1000s, etc. these odds become even more drastic.
Basic odds (20 marbles, 11 are blue) means that a 1 is more likely. And probability repeated to absurdly high numbers just reinforces that trend.(3 votes)
- Does the logarithmic scale have to be based on log_10? I suspect that plotting powers of eight on an octal logarithmic scale would break Benford's law.
Is there some non-decimal numeric system that doesn't follow Benford's law for any statistics?(3 votes)
- What is the Fibonacci sequence? I have no clue about it...
Could someone help me with an easy explanation.....
Tnx in advance...
Captain Leo
#PrayForBarcelona(1 vote)
- The Fibonacci sequence is the one that starts 1,1 and then proceeds with the recursive definition Fₙ = Fₙ₋₁ + Fₙ₋₂
So we have:
F₁ = 1
F₂ = 1
F₃ = F₂ + F₁ = 1 + 1 = 2
F₄ = F₃ + F₂ = 2 + 1 = 3
F₅ = F₄ + F₃ = 3 + 2 = 5
etc
Giving 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ...
The sequence has all sorts of interesting mathematical properties. Not all of them easy to understand. Use your favourite search engine, and you'll be able to find numerous articles on the Fibonacci numbers.(4 votes)
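The recursive definition in this answer translates almost directly into code. The following is a minimal Python sketch, not a definitive implementation. The function name `fibonacci` and the helper seed value 0 are assumptions made for the example.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 1, 1."""
    numbers = []
    # previous = 0 is a helper seed (an assumption of this sketch) so that
    # the first two values produced are F1 = 1 and F2 = 1.
    previous, current = 0, 1
    for _ in range(count):
        numbers.append(current)
        # Each new term is the sum of the two terms before it.
        previous, current = current, previous + current
    return numbers

print(fibonacci(10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```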
- What do they mean by "most significant digit"?(1 vote)
- The first digit from left-to-right that is not a zero. For example for 879 the most significant digit is 8. For 0.00365 the most significant digit is 3.(4 votes)
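The definition in this answer can be checked with a short Python sketch. It assumes the number is passed as an int, a float, or a numeric string, and it uses the standard `decimal` module to read off the digits. The function name `most_significant_digit` is hypothetical.

```python
from decimal import Decimal

def most_significant_digit(value):
    """Return the first non-zero digit of `value`, reading left to right."""
    # Decimal stores the digits without leading zeros, so the first entry
    # is the most significant digit unless the value itself is zero.
    digits = Decimal(str(value)).as_tuple().digits
    if not any(digits):
        raise ValueError("zero has no most significant digit")
    return digits[0]

print(most_significant_digit(879))      # 8
print(most_significant_digit(0.00365))  # 3
```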
Video transcript
SAL: So where we left off in the last video, Vi and myself had posed a mystery to you. We had talked about Benford's law.
VI: And we asked, what is up with Benford's law?
SAL: This idea that, if you took just random countries and took their population and took the most significant digit in their population and plotted the numbers of countries that their most significant digit is a 1, versus a 2, versus a 3, you just had it was much more likely that it would be a 1. Or that, if you took physical constants of the universe, that they're most likely to have 1 as their most significant digit.
VI: I wish we had more graphs, because graphs are fun.
SAL: Yes.
VI: But if you look at information from the stock market or anything, what's up?
SAL: Yes. And it seems to all follow this curve. And what was extremely mysterious-- and this is where we finished off the last video-- was if you look at pure, I would say, compounding phenomenon, like for example, the Fibonacci sequence, or powers of 2, that exactly fits the Benford distribution. It exactly fits this. If you take all the powers of 2, a little bit over 30% of those powers of 2, all of the powers of 2 have 1 as their most significant digit. What is this? 17? Roughly 17% of all of them have 2 as their most significant digit.
VI: Yeah. Although in this case, there's an infinite number in every set, so it's harder to graph.
SAL: But if you wanted to try it out, you could take the first million powers of 2 and then find the percentage. And that will probably give you a pretty good approximation of things.
VI: Yeah. So to me, that's like less mysterious. On the one hand, wow, this fits exactly with mathematics. But that also gives you a really good handle, because you realize, alright, there's something here I can actually take a look at.
SAL: You could take a look at it and it starts to become something you can dig deeper in. And we said, in the last video, we wanted you to pause it and think about why this is happening because, frankly, we had to do the same thing. And a big clue for us was when we looked at a logarithmic scale. And we're looking at one right over here. And just to be clear, what's going on in this logarithmic scale is you see equal spaces on this scale are powers of 10. So on a linear scale, this would be a 1. And maybe this would be a 2 and then a 3. Or if we wanted to say that this is a 2, you would say this is a 1, this is a 10, this would be a 20, then this would be a 30, so on and so forth. But in a logarithmic scale, equal distances are times 10 or, in this case, if we're taking powers of 10. So this is 1:10, then 10:100, then 100:1,000. And you see how the numbers in between fall out, that the space between 1 and 2 is pretty big. And then 2 and 3 is still pretty big, but a little bit smaller. And then 3 and 4 gets smaller and smaller and smaller, until you get to 10. And that's a pretty big clue about what's going on with Benford's law.
VI: Yeah. It seems to match up somehow. So there's a connection here.
SAL: And it actually turns out-- and this is actually a very big clue-- that this, if you take this area right here as a percentage of this entire area, it's exactly this percentage. It's exactly that percentage there. And if you take this area as a percentage of that entire area, it's exactly this percentage, that roughly 17%, or whatever that number is right over there. So that's a huge clue.
VI: Yeah, or at least for powers of 2 or a Fibonacci sequence thing-- for powers, it definitely makes sense.
SAL: Yes, for any powers. And so the logic is-- and this is now our biggest clue-- is to actually plot the powers of 2 on a logarithmic scale like this.
VI: All right, let's see where they fall.
SAL: All right, let's try it out. So 2 to the zeroth power is 1. 2 to the first power is 2. Then you get to 4. Then you get to 8. Then you get to 16, which is going to be someplace around here. Then you want to go to 32, which is going to be someplace around there. That's 30, so this is 32. Then you want to go to 64. And so this is 40, 50, 60. 64 is going to be right over there. And so what you see is, when you plot the powers of 2 on this logarithmic scale, they're equal distance apart. So you keep stepping along. If you were to plot on a linear scale, they'd get farther and farther apart.
VI: Yeah.
SAL: Actually, twice as far apart every time. But on this scale right over here they are equally spaced. So what's happening is you have something that's just equally stepping along here. You can imagine even just like walking along this. And if your sidewalk is shaped like this logarithmic scale, the probability on any given step, as you do many, many steps, or as you count all the steps, you're going to have many, many more steps that fall into the block that's between 1 and 2, or between 10 and 20, than you will, for example, the block that's between 9 and 10.
VI: Yeah, if you just take a random point along here, you're more likely to fall in an area starting with 1.
SAL: Right, one of these areas. Exactly, starting with 1, so between 1 and 2, or 10 and 20, or 100 and-- and that's exactly--
VI: So taking equal steps is going to give you that distribution, unless your steps happen to-- because there's special cases, right? So if you're getting--
SAL: Or people walk logarithmically. [LAUGHS]
VI: Yeah, if you walk from 1 to 10, if your steps are 10 long--
SAL: Yeah. In special cases, yes.
VI: So that's what happens there.
SAL: If your steps are 10 long--
VI: You happen to exactly on [INAUDIBLE].
SAL: Right. But if you're anything-- any slight variation away from that exact thing, and then you will get the distribution.
VI: Yeah. You're going to end up stepping all over the place.
SAL: The Benford's distribution.
VI: Benford's distribution.
SAL: Even though I think we now understand why, it's still fascinating.
VI: Yeah. Well, this explains it for these number series.
SAL: Yes.
VI: So now we have to somehow figure out how to connect that to the real world information.
SAL: The general idea, well, so for populations. And we read up a little about it. And Benford's distribution tends to work for things that grow exponentially.
VI: Yes.
SAL: Like powers of 2.
VI: Like powers of 2.
SAL: Like powers of 2. And populations grow exponentially.
VI: Yeah. And in finance, a lot of things also grow exponentially.
SAL: Yes. Or decline exponentially. Either way. [LAUGHTER]
SAL: But it tends to operate exponentially. You keep growing by 10% every year. That's an exponential path. What's fascinating is physical constants. And we actually aren't 100% sure why this is happening.
VI: No. This is still crazy to me.
SAL: We only have theories here. And the general idea-- because, you know, physical constants is sort of dependent on the units you're dealing with. They're depending on a whole bunch of things. Actually, I have a few very loose theories. But I'll let you all think about that more.
VI: OK.
SAL: All right? And so, hopefully, you all enjoyed this.
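Sal's suggestion in the transcript, to take many powers of 2 and find the percentage with each leading digit, can be tried with a short Python sketch. Two assumptions are made here for illustration: the code uses the first 1,000 powers rather than a million to keep it fast, and it compares the result with the share of a logarithmic-scale decade taken up by each digit's block, log10(d + 1) - log10(d). The name `leading_digit_frequencies` is hypothetical.

```python
import math
from collections import Counter

def leading_digit_frequencies(values):
    """Fraction of the positive integers in `values` starting with each digit 1-9."""
    counts = Counter(int(str(v)[0]) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Assumption: 1,000 powers (2**0 through 2**999) instead of a million.
powers_of_two = [2 ** n for n in range(1000)]
observed = leading_digit_frequencies(powers_of_two)

for d in range(1, 10):
    # Width of the block from d to d + 1 as a share of one decade
    # on a base-10 logarithmic scale.
    block_share = math.log10(d + 1) - math.log10(d)
    print(d, round(observed[d], 3), round(block_share, 3))
```

The two printed columns should track each other closely, with a little over 30% of the powers starting with 1 and roughly 17% to 18% starting with 2.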