A proof of work protocol
is a vehicle really by which somebody
can effectively prove to you that
they've engaged in a significant amount
of computational effort. Proof of work protocols
often amount to puzzles. And these puzzles that
can, on the one hand, be challenging to
solve-- and by that I mean they require some
serious computational effort and really can't be
short circuited-- but on the other hand,
that effort can actually be easily verified, and it can
be verified in far less time than it took to conduct that
effort in the first place. OK. And there are a number of
applications of such protocols. So for example, if you've heard
of the Bitcoin, the Bitcoin electronic payment system,
that system actually leverages a proof of work
scheme within the context of creating what are known
as transaction block chains. Now aside from Bitcoin,
which is a very recent user of these types of
proof of work schemes, these schemes have been
proposed in the past for other applications. For example, proof
of work schemes have been proposed
for doing things like deterring denial-of-service
attacks, or DoS attacks. And here the idea is
that the requester of a particular
service would have to solve a very specific
computational problem, a proof of work puzzle, before being
allowed to use a service. And the idea here is that the
computational effort exerted is effectively a way to
throttle the requester. The responder can,
in turn, easily check if the requester carried
out the requisite work, and only if that work was
carried out will the responder respond to that
request for service. Now the original
application for these types of proof of work
protocols, the first place that I've seen it
proposed, is in the context of being able to
deter spam email. And then obviously, we all know
what spam email is hopefully. These are messages that you
don't want in your inbox that maybe come to you in
an unsolicited fashion. And the idea here is that
a proof of work protocol, it turns out it can be tied
to a particular email message. And this is conceptually
like affixing a postage stamp to a message, but rather than
paying for that stamp using money, you're basically paying
for that stamp via CPU cycles. So for a legitimate sender
who is only sending out a small number of messages, this
type of proof of work protocol will not amount to very much. It's going to be a minor
deterrent since it's only executed a very small
number of times. It's kind of an
impediment, but it's not something that's
so unreasonable. Now for a spammer, who might be
sending out a lot of messages, maybe hundreds of thousands,
or millions of messages, it might be prohibitively
expensive to repeatedly expend so many CPU cycles for each
message and each sender to whom that message
is being sent. So hopefully this
gives you a flavor for some of the applications of
these proof of work protocols. Let me actually dive in to
how they work in practice. So first of all,
the way that I like to think of these protocols
is that typically, they work relative to a
given challenge string. And I'm going to
call this challenge string-- we'll label
it with the letter c. So c's going to be kind
of a challenge string. And what the person trying to
engage in the protocol will do, the prover of the
work, will basically try to come up with a
corresponding proof that is tied to this
challenge string. It's going to be kind
of a response associated with this challenge. It has a very specific
mathematical property in relation to this challenge. And as you point out, maybe when
I talk about a challenge string here, for example in
the context of spam, this challenge
string might actually represent an email message. So it's going to
be something very specific to the task at hand. Now what the prover will do is
come up with a response string, and let's call the
response string r. Actually, let's use
the term p for it, since maybe we can think
of it as a proof, a proof or a response. And the idea is that the prover
will come up with this proof or response string, and he has
to come up with a string such that, when you concatenate the
challenge and the response, and you take the two
together, and you apply a cryptographic hash
function-- so let's say I come up with
a cryptographic hash function, like SHA-256, or
anything of that nature. If I take the challenge
string and the proof string and concatenate together and
apply the cryptographic hash function, apply these
mathematical transformations that represent the
cryptographic hash function, I want to come up with
a proof string such that the output under
this hash function will have a very
specific property. The prefix of the output, the
first large number of bits will all be 0. So let's say the first 40
bits, or first 30 bits, or some number of
bits will be 0. And then the other bits can be
whatever they would normally be. So obviously, what
you're trying to do here is come up with a
proof string that has a relationship with
the challenge string. And that relationship
happens to be one that is taken with respect
to a particular hash function, and really incorporates
or considers what the output of
the hash function will be when the proof string is
concatenated with the challenge string. And if you, let's say, have
a good cryptographic hash function, then
the only known way to find this type
of a proof string is to effectively try a lot
of different possibilities, effectively doing
brute force, by trying a lot of different proof strings
until you find one that works. Now if you, let's
say, needed to find an output that contained about
40 consecutive 0's in it, that would require you to
perform about 2 to the power 40 steps, 2 to the power
40 different hash function indications. You'd have to try 2 to the
40 different strings, and one of them what would likely
work if you tried 2 to the 40 such strings. That actually requires you to
try about, and 2 to the 40 just to give you a sense, is
approximately 1 trillion. So if you tried a trillion
different strings out, and you hashed them
each, you would likely come up with one string that
had the first 40 bits being 0. Now sometimes it might
take you a lot less than a trillion steps. Sometimes it might take
you a little bit more. You may get very lucky. You might get very unlucky. But on average, it will take
you about 1 trillion steps to find a string where the
first 40 bits are equal to 0. So this is something
that's not easy, but it's also not outside
the realm of possibility. Now to understand
why it's really hard to solve these types of
proof of work schemes more efficiently than maybe
simply doing brute force, I think it's helpful to
recall that the output of a cryptographic hash function
looks more or less random. In fact, each output bit looks
like a series of coin flips. So it's kind of like
flipping the coin, and if it comes up heads,
you would have a 0, and if it comes up tails,
you can think of it as a 1. And so what you're
really doing is saying, if I flipped 40 coins,
what are the odds that you would have 40 consecutive
heads on those 40 coin flips? Now obviously that
likelihood is very small, but it's not outside the
realm of possibility. If you flipped 40 coins and
you flipped those 40 coins about a trillion times,
you would actually expect to see one instance
in which all 40 coins came up as heads out of
a trillion tries. Now one interesting thing with
these proof of work schemes, is they can be ratcheted
up or ratcheted down. So for example,
let's say you want to require even more
computational heavy lifting to come up with a
correct proof string. Let's say you want to
increase the work that's going to be proved here. What you can effectively
do, in that case, is you could just increase
the requirement on a number of leading 0's. So every time you
add an additional 0, you effectively double the
computational horsepower needed on average. And that's because
you effectively requiring one more coin
flip to come up heads, and that entails doubling
the number of coin flips. So if I had 41 coin flips and
I required 41 straight heads, that would require about
twice as much effort as just requiring
40 straight heads. And likewise, every time you
remove a 0 from consideration, or the requirement,
that will reduce the computational horsepower
needed to about half of what it was previously. So for example, if I only
required the first 39 bits to be 0, that would require
about half as many coin flips as requiring the
first 40 bits to be 0. Now the neat thing is that once
you come up with a solution-- let's say that somebody
tries a trillion times and they finally come up
with a proof string that works-- it's very easy to
validate that this proof string is in fact a
correct proof of work. All you have to do is,
you take the challenge and you take the proof string
and you hash them together. So for example, if somebody
proposes this one string, let's call it p prime, all you
do is you take the challenge and you take p prime,
and you input them into a hash
function, and you see if the first 40 bits are all 0. So all this requires you to do
is apply a hash function once to the concatenation
of c and p prime, and you can verify
that the output indeed has the requisite number
of 0's in front of it. And if you see that the output
has the requisite number of 0's, then you can consider
the proof of work valid, because you know it must
have taken somebody a lot of time, a lot of tries
really, to provide or come up with the string p prime, such
that the concatenation of c and p prime gives
you a number of 0's under the application of this
cryptographic hash function. So as you can see, these
schemes are quite simple, but quite clever
at the same time. They really amount to coming
up with a proof string that has a very specific,
mathematical relationship with the original
challenge string. So hopefully this
video gave you a flavor for the mechanics of how these
proof of work protocols work.