Computer scientist Amit Sahai, PhD, is asked to explain the concept of zero-knowledge proofs to 5 different people; a child, a teen, a college student, a grad student, and an expert. Using a variety of techniques, Amit breaks down what zero-knowledge proofs are and why it's so exciting in the world of cryptography.
Amit Sahai, PhD, is a professor of computer science at UCLA Samueli School of Engineering.
From S1 E16 of Computer Scientist Explains One Concept in 5 Levels of Difficulty | WIRED
Jacques Patarin[1], my cryptography teacher at university introduced us with zero-knowledge proofs with a simple real-life zero knowledge scheme. All you need is 5 cards, 3 identical red and 2 identical blacks.
You give one black and one red to each person (Alice and Bob), and keep the last red.
The scheme is the following:
Alice will puts her two cards ON TOP of the remaining red card: to say yes, she puts her black card on top of her red card, to say no she does the opposite (red on top of black).
Bob will put his pair of cards BELOW the remaining red card, and to say yes he puts his black card at the bottom, with his red card in between, and to say no he does the opposite.
Then you cut the deck enough times to obfuscate who've done what, and you know that they've both day yes if you have the two blacks cards next to each other (or both at each ends of the deck). If anyone (or both of them) said no, you'd have black cards separated by one red card.
I have designed one of such card decks and released it under an open source license at DKC - Dining Kid-Cryptographers [0]. This was done as part of an EU Hackathon at Google, Belgium.
Hi. Could you add a section that explains what this is about? The linked repo does not explain what the objective of the game is supposed to be.
I understand that the game is related to "the dining cryptographers", but I haven't memorized that piece. I read it once? I can probably find it on Wikipedia?
I recommend making your game a self-contained work that does not rely on at least one of the players remembering some details about a very specific thought experiment.
Hi! Thanks for your feedback, this is appreciated.
While there is no background information about the game in the DKC repository at the moment, the game comes with a set of cards where the equipment, rules, and finishing criteria are described [1]. If there is interested for that, I will add more context on the game and its contribution towards understanding secure multi-party computations for the youngest. I may still have the slide deck I used to present this work at the end of the hackathon, which may provide some further introductory material.
You divide the cards stack in two and put the top part at the bottom of the other part (which was the bottom part and will be now the top). The relative order of the cards is maintained: you don’t shuffle them. But if you cut them in this way several times, it obfuscates the original location of the cards (as it’s in practice impossible to follow their position changes) so you can’t tell who put which card where. Therefore the final arrangement, when revealed, is anonymized.
Okay gotcha, what was confusing me was how we could randomize it and preserve the relationships of the cards at the same time, but I see now this shuffle is more like "secure by magic trick" than "secure by cryptography", that makes sense. Or equivalently, we put it into a black box, that flips a secure coin, and either does or doesn't swap the cards. Thank you.
The only thing that makes sense is keeping the middle red card in place and swapping the top two for the bottom two repeatedly? Anything else could change the answers?
I personally prefer the red ball - green ball version, where a colourblind person can find out if both balls are the same color or not. Simpler, with fewer variables and still very clean in its explanation.
I don't really understand this. Who is hiding what from whom? Is the idea that the outside observers can discover that either Alice or Bob said no, but not know which one?
Why do that with five cards? Why can Alice and Bob not just hand a single card to you, saying "Yes" or "No". You then mix the cards and reveal them. Now you know if there's a "No" but not who said it.
Just using single cards will let bystanders know when both Alice and Bob say "No". The 5 card solution makes it impossible to differentiate between when both say "No" and when only one says "No".
If you can verify half of a set of bits, sometimes you can use an error-correcting code to verify a full set of bits. Or sometimes to verify a property of those bits.
Think of RAID storage. If some drives can't be read, but not too many, you can still read all the data.
These days I'm working on "proof of execution" and "proof of data availability", which are an interesting application of zero-knowledge proofs to very large sequences of values. The steps aren't revealed, but the rule which the steps must follow is revealed.
It's almost magical how it's possible to prove that N billion steps of a simulated virtual computer, run exactly according to specification, complete with CPU and memory, given a particular set of input files (or even a hash of an input) and program (or hash of a program), produces a particular set of output files.
All summarised in a short proof that a verifier can easily and quickly check, so it doesn't have to redo the computation itself - it can trust the outputs it receives, due to the proof.
Does this mean we could have cryptocurrencies with proof-of-work mechanisms that perform useful work? You could have a decentralised crowd sourced general supercomputer which could perform folding-at-home, neural network training etc. and you could get paid for it with no middleman?
Sort of, yes, but doing that would be a waste of energy, compared with altruistically donating your cycles like in folding@home.
The cost of proving execution is high compared with the cost of execution without proof.
(Even in a dedicated proof-of-execution hardware CPU, which by the way I'm also working on a little, in case someone out there finds themselves looking to hire someone doing this sort of thing ;-)
On the other hand, for proof-of-stake and similar systems, these kinds of proofs completely change the dynamics because there is no need for every node to duplicate all the state machine updates. Current blockchains have hundreds or thousands of full nodes, each of them executing the same code to verify the behaviour is the same. They can accept proofs instead, so that executions are done by only a few nodes. That greatly changes the scaling cost.
With a few more tricks it starts to look more like a giant CRDT with lightweight nodes maintaining just a "view" of the part they are interested in, and some more tricks on top of that for latency hiding, it starts to look more real-time, as in MMO levels of interactive performance, not the blockchain you are used to.
Roughly, closer to 100x-10000x than +10%. GPU vs CPU vs FPGA vs ASIC makes a lot of difference. New algorithms are being discovered or developed that improve the speed. I think we may see roughly 10x in the next few years on ideal hardware for the task, but the economics might not be there to pay for building it.
Can you achieve something similar by having multiple (e.g. 5) randomly chosen nodes perform the same calculation and compare their results? Or is there always a way around that?
That's a different sort of proof, the computer equivalent of social proof.
It's not as cryptographically strong as zero-knowledge proofs can be. Those 5 nodes could all be dishonest, unless you know they are not.
Whereas you don't have to trust a zero-knowledge prover at all. A dishonest prover will "always" be detected (for a definition of "always" similar to general cryptography, i.e. extreme probabilities). Zk proofs (of the type I'm using) are more like fancy hash functions, where there's one correct answer and no node-specific key. Unlike, say, TLS PKI where you have to obtain certificates and trust particular nodes. (There are sometimes keys in zk, called "trusted setup", but those keys are computed in distributed ceremonies and shared with everyone.)
That said, there are some interesting tricks, similar to your idea, for distributing a zk computation among nodes who cross-verify-and-prove each other's work without needing to trust each other, reducing the amount of work per node and perhaps latency.
You do need the social proof type as well in practice. The zk proofs assert some fact which may accompany data, and some clever networking protocols can build on that, like proofs of (recent) data availability. But for consensus protocols, the zk proofs don't tell you which of those facts everyone is collectively agreeing to. For that, comparing results is required.
In blockchains where the p2p networking is automated, typically there's an assumption that your node will find at least one honest node with respect to the consensus, or some small number, and their information will outweigh bad information from other nodes. While bootstrapping more honest nodes are required than later, when following a networking in its live state.
But there's always a possibility that a node is only able to connect to a coordinated group of dishonest nodes, especially if someone is hacking the network aroud the target node, e.g. by affecting the routers. That's called an eclipse attack.
Note: I keep talking about blockchains because you asked about PoW, and that's where a lot of the zk proof of execution and proof of data availability stuff is focused these days, and it's a natural fit when multiple nodes and networking get discussed. But there are many non-blockchain uses as well!
IMO the strongest applications are blockchain-related, but here are some other ideas which have been suggested -
- One can selectively disclose certain aspects of their identity. E.g. given a commitment to my DNA, I can prove that it doesn't match that of a suspect. (Though the police need some way to trust the commitment, perhaps a signature from the firm that sequenced it.)
- We can build trustless games, like an implementation of battleship where cheating is cryptographically hard.
- We could have trustless computation services, like ec2, or like SETI@home, etc. Though it's difficult to make this practical given the computational overhead of ZKPs.
- ZKPs can turn any delay function (like an iterated hash) into a verifiable delay function (VDF). VDFs could enable trustless lotteries/sortition.
- Given a general-purpose ZKP, it's trivial to build a signature scheme, e.g. by proving knowledge of a hash pre-image. This can sometimes lead to very practical signature schemes, such as Picnic, one of the leading hash-based signature proposals. Ring signatures are another generally-useful application.
- I think ZKPs have also been used as a building block in some FHE and MPC systems, though I don't know much about those areas.
- One can prove the existence of an exploit before divulging the details. This might be useful for e.g. convincing everyone to disable a certain TLS primitive which was just broken.
Yes. Also networks that perform simple versions of useful work exist today. I work on a network which uses zk proofs to check storage of user input data and pays out blockchain consensus rewards accordingly.
Interesting! Sounds somewhat like what the Cairo programming language & VM does https://www.cairo-lang.org; Turing-complete provable computation using a fairly funky "arithmetic" CPU architecture.
Yes, like Cairo. In fact I almost applied to a job working with Cairo recently, but I was told someone else had just started it.
It's possible to build more standard CPUs as well which use ordinary arithmetic, logic and memory, at some cost to efficiency. But the cost is almost entirely borne by the prover.
As a start I suggest you look at whatever is written about SNARK and STARK in the Ethereum world. Vitalik Buterin's blog has some good technical articles (literally look for those words in the titles). They don't cover proofs of CPU-like systems, though. There are a lot of other articles, papers, etc. I'm just suggeting those search terms as a starting point.
The Cairo system from Starkware is a pretty good, modern CPU-like system (albeit using finite fields and an unusual CPU model, instead of normal numbers), which is simpler than its predecessors in some ways because the techniques have matured and they have been able to use techniques devised by many different R&D groups before that iteration.
As far as I'm aware, however, performance at the billions of steps level is more on the frontier.
I'd love to share details of what I've been working on, and I hope I will be able to one day. One problem is I'm looking for work, and there's an interesting company I'm negotiating with but their view of IP may prevent me from publishing my own work in this area after I've started. Meanwhile, I haven't published yet, even though I've done a lot of work on it. I do not like this, I always want to share, but...
The explanation I read used Finding Wally as an example:
I can put a piece of paper with Wally's silhouette cut off in front of the book.
Align Wally to fit into the silhouette and you're done!
The paper just needs to be considerably bigger than the actual book, so other people can tell I know where Wally is without actually knowing his location.
The network demonstration of knowledge of the secret passage by 40 coin flips appears unnecessarily complicated.
Why not simply send Mick down the left path and show him returning from the right? That directly demonstrates Mick knows a passage from left to right (at least to all observers at the fork. home viewers still worry about video editing).
To be a proper zero knowledge proof, it mustn't prove to bystanders that Mick knows the secret.
With your simple proposal, everyone watching is convinced. In the more complicated version, only the person calling left/right knows for sure. As far as everyone else knows, Mick doesn't actually know the secret and the reporter colluded with Mick to pre-arrange a sequence of left/right calls.
This is the second time I've read this Ali Baba paper and I'm pretty sure I got hung up on this, too. And I don't see what that property is good for (though I am aware that other people do).
Maybe the paper should have a preface. It might work for entertaining someone who already knows all about zero knowledge proofs, but it doesn't seem to do much to educate those who do not.
Try any cryptography textbook! This is the whole point of zero knowledge proofs: that you can convince someone, yet they have no evidence (i.e, "zero knowledge") that they can take to another person and expect to convince them.
No, the zero knowledge part is that they only learn that you know a secret, but they don't learn anything about the secret. It's not about convincing others.
Yeah, I understand why the author wants to turn it into a probabilistic argument, but it doesn't quite work with the circular cave analogy. You need something where the entry and the exits are different, and Mick Ali were able to use his secret to affect the outcome at the exit.
What if Mick Ali were going down one of those twisty waterslide tubes, and could say his secret phrase to shunt himself off to a different fork, so he could come out at a different tube exit at the bottom?
if you just sent Mick down the left path and he came out the right, then a video would conclusively show he knew the password.
And this is why "How to explain zero-knowledge protocols to your children" is probably the worst way to explain zero-knowledge protocols to anyone. Its not explaining what a zero-knowledge proof is or how it works. It's explaining what a simulator is when proving a protocol is zero-knowledge . Oh, and the explanation only works for interactive protocols.
I think the point is, that a zero knowledge proof only works between two parties, the proofer and the verifier.
If I want to proof something to you I cant just say "Hey I showed it to Bob over there. Ask him if I'm telling the truth."
But that doesnt really come through in the analogy.
> I think the point is, that a zero knowledge proof only works between two parties
Ah! I did not get that part of it. Thanks for clarifying.
I don't think the story made that part clear enough. For example, in the epilogue it says "Countless people saw Mick Ali on television, and he became famous."
But, as you say, one of the main points was that watching it on television shouldn't have been enough to convince the viewers at home. They could have been watching the faked version.
It would have been clearer if the epilogue had said "Countless viewers saw Mick Ali on television and remained skeptical, but the reporter himself was convinced Ali must have been telling the truth."
Yes, you must have missed something. If you want someone to help clarify it for you then you'll need to be more specific. Who do you think is cheating? How do you think they are cheating? Why do you think they are cheating?
Computer scientist Amit Sahai, PhD, is asked to explain the concept of zero-knowledge proofs to 5 different people; a child, a teen, a college student, a grad student, and an expert. Using a variety of techniques, Amit breaks down what zero-knowledge proofs are and why it's so exciting in the world of cryptography.
Amit Sahai, PhD, is a professor of computer science at UCLA Samueli School of Engineering.
From S1 E16 of Computer Scientist Explains One Concept in 5 Levels of Difficulty | WIRED
https://youtu.be/fOGdb1CTu5c