Current time:0:00Total duration:10:32
Bitcoin: Proof of work
A proof of work protocol is a vehicle really by which somebody can effectively prove to you that they've engaged in a significant amount of computational effort. Proof of work protocols often amount to puzzles. And these puzzles that can, on the one hand, be challenging to solve-- and by that I mean they require some serious computational effort and really can't be short circuited-- but on the other hand, that effort can actually be easily verified, and it can be verified in far less time than it took to conduct that effort in the first place. OK. And there are a number of applications of such protocols. So for example, if you've heard of the Bitcoin, the Bitcoin electronic payment system, that system actually leverages a proof of work scheme within the context of creating what are known as transaction block chains. Now aside from Bitcoin, which is a very recent user of these types of proof of work schemes, these schemes have been proposed in the past for other applications. For example, proof of work schemes have been proposed for doing things like deterring denial-of-service attacks, or DoS attacks. And here the idea is that the requester of a particular service would have to solve a very specific computational problem, a proof of work puzzle, before being allowed to use a service. And the idea here is that the computational effort exerted is effectively a way to throttle the requester. The responder can, in turn, easily check if the requester carried out the requisite work, and only if that work was carried out will the responder respond to that request for service. Now the original application for these types of proof of work protocols, the first place that I've seen it proposed, is in the context of being able to deter spam email. And then obviously, we all know what spam email is hopefully. These are messages that you don't want in your inbox that maybe come to you in an unsolicited fashion. And the idea here is that a proof of work protocol, it turns out it can be tied to a particular email message. And this is conceptually like affixing a postage stamp to a message, but rather than paying for that stamp using money, you're basically paying for that stamp via CPU cycles. So for a legitimate sender who is only sending out a small number of messages, this type of proof of work protocol will not amount to very much. It's going to be a minor deterrent since it's only executed a very small number of times. It's kind of an impediment, but it's not something that's so unreasonable. Now for a spammer, who might be sending out a lot of messages, maybe hundreds of thousands, or millions of messages, it might be prohibitively expensive to repeatedly expend so many CPU cycles for each message and each sender to whom that message is being sent. So hopefully this gives you a flavor for some of the applications of these proof of work protocols. Let me actually dive in to how they work in practice. So first of all, the way that I like to think of these protocols is that typically, they work relative to a given challenge string. And I'm going to call this challenge string-- we'll label it with the letter c. So c's going to be kind of a challenge string. And what the person trying to engage in the protocol will do, the prover of the work, will basically try to come up with a corresponding proof that is tied to this challenge string. It's going to be kind of a response associated with this challenge. It has a very specific mathematical property in relation to this challenge. And as you point out, maybe when I talk about a challenge string here, for example in the context of spam, this challenge string might actually represent an email message. So it's going to be something very specific to the task at hand. Now what the prover will do is come up with a response string, and let's call the response string r. Actually, let's use the term p for it, since maybe we can think of it as a proof, a proof or a response. And the idea is that the prover will come up with this proof or response string, and he has to come up with a string such that, when you concatenate the challenge and the response, and you take the two together, and you apply a cryptographic hash function-- so let's say I come up with a cryptographic hash function, like SHA-256, or anything of that nature. If I take the challenge string and the proof string and concatenate together and apply the cryptographic hash function, apply these mathematical transformations that represent the cryptographic hash function, I want to come up with a proof string such that the output under this hash function will have a very specific property. The prefix of the output, the first large number of bits will all be 0. So let's say the first 40 bits, or first 30 bits, or some number of bits will be 0. And then the other bits can be whatever they would normally be. So obviously, what you're trying to do here is come up with a proof string that has a relationship with the challenge string. And that relationship happens to be one that is taken with respect to a particular hash function, and really incorporates or considers what the output of the hash function will be when the proof string is concatenated with the challenge string. And if you, let's say, have a good cryptographic hash function, then the only known way to find this type of a proof string is to effectively try a lot of different possibilities, effectively doing brute force, by trying a lot of different proof strings until you find one that works. Now if you, let's say, needed to find an output that contained about 40 consecutive 0's in it, that would require you to perform about 2 to the power 40 steps, 2 to the power 40 different hash function indications. You'd have to try 2 to the 40 different strings, and one of them what would likely work if you tried 2 to the 40 such strings. That actually requires you to try about, and 2 to the 40 just to give you a sense, is approximately 1 trillion. So if you tried a trillion different strings out, and you hashed them each, you would likely come up with one string that had the first 40 bits being 0. Now sometimes it might take you a lot less than a trillion steps. Sometimes it might take you a little bit more. You may get very lucky. You might get very unlucky. But on average, it will take you about 1 trillion steps to find a string where the first 40 bits are equal to 0. So this is something that's not easy, but it's also not outside the realm of possibility. Now to understand why it's really hard to solve these types of proof of work schemes more efficiently than maybe simply doing brute force, I think it's helpful to recall that the output of a cryptographic hash function looks more or less random. In fact, each output bit looks like a series of coin flips. So it's kind of like flipping the coin, and if it comes up heads, you would have a 0, and if it comes up tails, you can think of it as a 1. And so what you're really doing is saying, if I flipped 40 coins, what are the odds that you would have 40 consecutive heads on those 40 coin flips? Now obviously that likelihood is very small, but it's not outside the realm of possibility. If you flipped 40 coins and you flipped those 40 coins about a trillion times, you would actually expect to see one instance in which all 40 coins came up as heads out of a trillion tries. Now one interesting thing with these proof of work schemes, is they can be ratcheted up or ratcheted down. So for example, let's say you want to require even more computational heavy lifting to come up with a correct proof string. Let's say you want to increase the work that's going to be proved here. What you can effectively do, in that case, is you could just increase the requirement on a number of leading 0's. So every time you add an additional 0, you effectively double the computational horsepower needed on average. And that's because you effectively requiring one more coin flip to come up heads, and that entails doubling the number of coin flips. So if I had 41 coin flips and I required 41 straight heads, that would require about twice as much effort as just requiring 40 straight heads. And likewise, every time you remove a 0 from consideration, or the requirement, that will reduce the computational horsepower needed to about half of what it was previously. So for example, if I only required the first 39 bits to be 0, that would require about half as many coin flips as requiring the first 40 bits to be 0. Now the neat thing is that once you come up with a solution-- let's say that somebody tries a trillion times and they finally come up with a proof string that works-- it's very easy to validate that this proof string is in fact a correct proof of work. All you have to do is, you take the challenge and you take the proof string and you hash them together. So for example, if somebody proposes this one string, let's call it p prime, all you do is you take the challenge and you take p prime, and you input them into a hash function, and you see if the first 40 bits are all 0. So all this requires you to do is apply a hash function once to the concatenation of c and p prime, and you can verify that the output indeed has the requisite number of 0's in front of it. And if you see that the output has the requisite number of 0's, then you can consider the proof of work valid, because you know it must have taken somebody a lot of time, a lot of tries really, to provide or come up with the string p prime, such that the concatenation of c and p prime gives you a number of 0's under the application of this cryptographic hash function. So as you can see, these schemes are quite simple, but quite clever at the same time. They really amount to coming up with a proof string that has a very specific, mathematical relationship with the original challenge string. So hopefully this video gave you a flavor for the mechanics of how these proof of work protocols work.