rosenblog
https://mrosenberg.pub
Zola · en · Wed, 23 Sep 2020 01:17:00 +0000

A Bookmarklet to Download Zoom Recordings
Tue, 15 Sep 2020 15:41:00 +0000 · Michael Rosenberg
https://mrosenberg.pub/blog/zoom/
<p><strong>Update (Sep. 23, 2020):</strong> Zoom disabled right-click on their video sites. I've updated the script to re-enable right-click :)</p>
<h1 id="overview"><a class="zola-anchor" href="#overview" aria-label="Anchor link for: overview">§</a>
Overview</h1>
<p>This is a <em>bookmarklet</em> (that is, a bookmark with script inside) that lets you download recorded Zoom calls from the Zoom cloud, regardless of whether the owner has downloads enabled. If it's visible to you, you can download it.</p>
<h1 id="why"><a class="zola-anchor" href="#why" aria-label="Anchor link for: why">§</a>
Why</h1>
<p>I like to watch lectures in the park, where there is no WiFi. So now I download the lectures in advance and take them wherever I go.</p>
<h1 id="installation"><a class="zola-anchor" href="#installation" aria-label="Anchor link for: installation">§</a>
Installation</h1>
<p>Right click the link below and select "Bookmark This Link". Put the bookmark wherever you want.<br /><br /></p>
<p><a href='
javascript:(function(){
    const download_link_id = "__zoomdl_link";
    /* Only run on Zoom pages, and only insert the link once */
    const hostname = window.location.hostname;
    if (!(hostname === "zoom.us" || hostname.endsWith(".zoom.us"))
            || document.getElementById(download_link_id)) {
        return;
    }
    /* The <video> element that Zoom's player uses for the recording */
    const video_elem = document.getElementById("vjs_video_3_html5_api");
    if (!video_elem) {
        return;
    }
    const video_url = video_elem.src;
    const download_link = document.createElement("a");
    download_link.id = download_link_id;
    download_link.href = video_url;
    download_link.innerHTML = "Download Video (right click → Save Link As)";
    download_link.style = "font-weight: bold";
    download_link.onclick = function() { return false; };
    const container = document.getElementsByClassName("main")[0];
    container.prepend(download_link);
    /* Zoom blocks right-click; clear its handlers so "Save Link As" works */
    function enableContextMenu(aggressive = false) {
        document.ondragstart = null;
        document.onselectstart = null;
        document.onclick = null;
        document.onmousedown = null;
        document.onmouseup = null;
        document.body.oncontextmenu = null;
        enableRightClickLight(document);
        if (aggressive) {
            enableRightClick(document);
            removeContextMenuOnAll("body");
            removeContextMenuOnAll("img");
            removeContextMenuOnAll("td");
        }
    }
    function removeContextMenuOnAll(tagName) {
        const elements = document.getElementsByTagName(tagName);
        for (let i = 0; i < elements.length; i++) {
            enableRightClick(elements[i]);
        }
    }
    function enableRightClickLight(el) {
        el || (el = document);
        el.addEventListener("contextmenu", bringBackDefault, true);
    }
    function enableRightClick(el) {
        el || (el = document);
        el.addEventListener("contextmenu", bringBackDefault, true);
        el.addEventListener("dragstart", bringBackDefault, true);
        el.addEventListener("selectstart", bringBackDefault, true);
        el.addEventListener("click", bringBackDefault, true);
        el.addEventListener("mousedown", bringBackDefault, true);
        el.addEventListener("mouseup", bringBackDefault, true);
    }
    /* Not called by the bookmarklet, but handy if you want to undo the above */
    function restoreRightClick(el) {
        el || (el = document);
        el.removeEventListener("contextmenu", bringBackDefault, true);
        el.removeEventListener("dragstart", bringBackDefault, true);
        el.removeEventListener("selectstart", bringBackDefault, true);
        el.removeEventListener("click", bringBackDefault, true);
        el.removeEventListener("mousedown", bringBackDefault, true);
        el.removeEventListener("mouseup", bringBackDefault, true);
    }
    /* Re-allow the default action and stop the page's blocking handlers
       from running on this event */
    function bringBackDefault(event) {
        event.returnValue = true;
        if (typeof event.stopPropagation === "function") {
            event.stopPropagation();
        }
    }
    enableContextMenu();
})();
'>
Zoom Rec. ⬇️
</a></p>
<h1 id="how-to-use"><a class="zola-anchor" href="#how-to-use" aria-label="Anchor link for: how-to-use">§</a>
How to Use</h1>
<ol>
<li>When you're on the Zoom website, watching a recorded Zoom call, click the Zoom Rec. ⬇️ bookmarklet</li>
<li>A link will appear at the top of the page telling you to right click it</li>
<li>Right click the link, select "Save Link As", and save the video file onto your computer</li>
</ol>
<h1 id="source-code-and-license"><a class="zola-anchor" href="#source-code-and-license" aria-label="Anchor link for: source-code-and-license">§</a>
Source Code and License</h1>
<p>Source code and license are available on the <a href="https://github.com/rozbb/zoomdl-bookmarklet">Github page</a>.</p>
Hash Functions are not (Quantum) Random Oracles, but only Technically
Sun, 01 Mar 2020 00:00:00 +0000 · Michael Rosenberg
https://mrosenberg.pub/blog/qrom/
<p><em>This post is part of a <a href="/assets/pdfs/qrom_survey.pdf">survey</a> I co-wrote with <a href="https://www.cs.umd.edu/~erblum/">Erica Blum</a> and <a href="https://llamakana.github.io/">Makana Castillo-Martin</a> for a quantum computation class at the University of Maryland in Fall 2019.</em></p>
<p>In this post, we define the Random Oracle Model (ROM) and the Quantum ROM (QROM), and give some examples of their uses and their flaws. We show that the QROM is unsound in the same way that the ROM is, and we conclude that that's OK.</p>
<h1 id="definitions-and-notation"><a class="zola-anchor" href="#definitions-and-notation" aria-label="Anchor link for: definitions-and-notation">§</a>
Definitions and Notation</h1>
<p>I really don't know what background to assume, but to balance comprehensibility and length of post, I've just chosen the things that I had to remind myself about when writing this.</p>
<h2 id="turing-machines"><a class="zola-anchor" href="#turing-machines" aria-label="Anchor link for: turing-machines">§</a>
Turing Machines</h2>
<p>You can imagine Turing Machines as abstract machines that run programs specified by some programming language. Like your computer!</p>
<h2 id="random-oracles"><a class="zola-anchor" href="#random-oracles" aria-label="Anchor link for: random-oracles">§</a>
Random Oracles</h2>
<p>An <strong>oracle</strong> is an abstraction—a black box—which Turing Machines can query and get a response from. I say "black box" and not "Turing Machine program," for example, because an oracle is not necessarily a program that can be written down. Rather, it is an abstraction whose inner workings we have no information about. So when we talk about an oracle behaving a certain way, we are describing what kinds of outputs it gives in response to certain inputs, not how the oracle functions internally. When we want to express that a Turing Machine $\mathcal{A}$ has access to an oracle $\mathcal{O}$, we write $\mathcal{A}^\mathcal{O}$. This means that $\mathcal{A}$ can call $\mathcal{O}(x)$ on any input $x$ it pleases.</p>
<div class="math-def">
<p><strong>Definition (informal):</strong> A <strong>random oracle</strong> in a cryptosystem is a publicly accessible oracle (i.e., all parties can query it) which produces random and consistent responses. "Random" means that, until the oracle is queried for the first time on a given input, every possible response is equally likely. "Consistent" means that $\mathcal{O}$ always returns the same value on a given input.</p>
</div>
<p>Notice that implementing a random oracle is hard. You would have to construct a table that contains the output for every possible input. This table would be massive, so when we have to simulate a random oracle in practice, we normally do all this on the fly. That is, for each new input, we generate a new random value and save it in a table along with the input. This way, the oracle can respond consistently to previously seen queries.</p>
<h2 id="turing-reductions"><a class="zola-anchor" href="#turing-reductions" aria-label="Anchor link for: turing-reductions">§</a>
Turing Reductions</h2>
<p>It's worth going over the definition of a Turing reduction here, because the purpose of the random oracle model is to permit reductions that were (as far as we're aware) not previously possible.</p>
<div class="math-def">
<p><strong>Definition:</strong> A <strong>Turing Reduction</strong> from problem $X$ to problem $Y$ is a Turing Machine $\mathcal{A}$ such that, if $\mathcal{A}$ is given access to a Turing Machine $\mathcal{Y}$ which solves $Y$, then $\mathcal{A}^\mathcal{Y}$ can efficiently<sup class="footnote-reference"><a href="#efficient">1</a></sup> solve $X$. We say "$X$ <strong>reduces to</strong> $Y$", or "$X \leq Y$" if and only if such a machine $\mathcal{A}$ exists.</p>
</div>
<p>In this definition, we take "problem" to mean a question of the form "does this thing satisfy relation $R$" or "what is a thing that makes this satisfy relation $R$". Examples are</p>
<ul>
<li>Given an integer $N$, what is a prime number that divides $N$? The relation here is
$$
R = \{(N, p) : p \textrm{ prime} \wedge p \mid N \}
$$</li>
<li>Given a polynomial $P$ and a value $x$, is $x$ a root of $P$? The relation here is
$$
R = \{(P, x) : P(x) = 0\}
$$</li>
</ul>
<h1 id="the-random-oracle-model"><a class="zola-anchor" href="#the-random-oracle-model" aria-label="Anchor link for: the-random-oracle-model">§</a>
The Random Oracle Model</h1>
<p>The Random Oracle Model (introduced in 1993 by <a href="https://cseweb.ucsd.edu/~mihir/papers/ro.pdf">Bellare and Rogaway</a>) was invented as a response to a schism in the cryptographic community in the 1990s: practitioners were implementing all kinds of cryptographic protocols which used hash functions, while theoreticians had no way of proving that these schemes were secure. The aim of the ROM was to bridge this gap by formalizing a small assumption that many were making anyway: hash functions appear to behave randomly. Actually describing what "appear to behave randomly" means is more of an exercise in philosophy than math, so we'll leave that out.<sup class="footnote-reference"><a href="#randomly">2</a></sup> But the gist of the formalization is captured in the following definition:</p>
<div class="math-thm">
<p><strong>Definition (informal):</strong> We say that <strong>$X$ reduces to $Y$ in the ROM</strong> if and only if there is a reduction from $X$ to $Y'$, where $Y'$ is the same problem as $Y$ but with every hash function replaced with a random oracle.</p>
</div>
<p>Take note of what is not being said: <strong>no claim is ever made that a reduction which holds in the ROM necessarily holds when hash functions are used instead of random oracles</strong>. Accordingly, and appropriately, <strong>the ROM is considered a heuristic</strong> for cryptosystems which use hash functions; it does not necessarily prove anything about the scheme with hashes. Bellare and Rogaway go out of their way to state precisely this in their introduction of the ROM:</p>
<blockquote>
<p>In order to bring to practice some of the benefits of provable security, it makes sense to incorporate into our models objects which capture the properties that practical primitives really seem to possess, and view these objects as basic even if the assumptions about them are, from a theoretical point of view, very strong...We stress that the proof is in the random oracle model and the last step is heuristic in nature. It is a thesis of this paper that significant assurance benefits nonetheless remain.</p>
</blockquote>
<p>Innumerably many cryptosystems have been proven secure in the ROM (where "proven secure" means "reduced from a problem believed to be hard"). The <a href="https://bristolcrypto.blogspot.com/2015/08/52-things-number-47-what-is-fiat-shamir.html">Fiat-Shamir Transform</a> (which is most often used in the ROM) has, alone, probably been used to construct hundreds of cryptosystems. The argument for the ROM goes deeper, though. <a href="https://eprint.iacr.org/2015/140">Koblitz and Menezes</a> make a strong argument that, not only does the ROM allow us to prove schemes secure that are otherwise unprovably secure (e.g., the Full-Domain Hash signature scheme), but it also gives us constructions that are less brittle to misuse (avoiding things like the duplicate signature key selection attack in the GHR signature scheme).</p>
<p>All this is to say that, while the ROM is just a heuristic, it's a really really useful one.</p>
<h2 id="hash-functions-are-not-random-oracles"><a class="zola-anchor" href="#hash-functions-are-not-random-oracles" aria-label="Anchor link for: hash-functions-are-not-random-oracles">§</a>
Hash Functions are not Random Oracles</h2>
<p>Earlier I said that no claim is made that hash functions can be modeled as random oracles. And that's good, because it turns out that it's false. A neat way of proving this is to construct a digital signature algorithm which is secure in the ROM, but insecure under any choice of hash function. I'll repeat that because this is really surprising: <strong>there is a signature scheme which is secure in the ROM, but completely insecure under ANY choice of hash function</strong>. As far as I can tell, the first example of such a scheme was given<sup class="footnote-reference"><a href="#revisitedsig">3</a></sup> in 1998 by CGH, but I'd like to use a different example<sup class="footnote-reference"><a href="#maurer">4</a></sup> by HMR, which I think is a lot cleaner and which I'll paraphrase and prove in this section.</p>
<p>Suppose there's a secure signature scheme<sup class="footnote-reference"><a href="#euf">5</a></sup> $\mathcal{S}$ with signature algorithm $\mathsf{Sign}_k(m)$ where $k$ is the signing key. We define a new signature scheme $\mathcal{Evil}$ with signing algorithm $\mathsf{EvilSign}^\mathsf{H}_k(m)$ which has access to some oracle $\mathsf{H}$ (we'll compare the case where $\mathsf{H}$ is an oracle $\mathcal{O}$ versus when it's a hash function $f$). Denote the security parameter by $\lambda$. On input $m$, $\mathsf{EvilSign}^\mathsf{H}_k$ will calculate $b := \mathsf{D^H}(m)$, where $\mathsf{D}$ is an algorithm we'll define in a second. If $b = 0$, then the algorithm returns $\mathsf{Sign}_k(m)$; otherwise, it does the completely insecure thing and returns $k$.</p>
<p>$\mathsf{D^H}(m)$ is the algorithm that we use to distinguish between random oracles and hash functions. It exploits the idea that a hash function has a short representation as a program, whereas a random oracle needs a massive truth table to describe its behavior. $\mathsf{D^H}$ will interpret its input $m$ as the description $\pi$ of a program for a Universal Turing Machine. It then checks whether $\pi(i) = \mathsf{H}(i)$ for all $0 \leq i < 2|\pi|+\lambda$. If this equality fails for any $i$, then $\mathsf{D}$ outputs 0. Otherwise, it outputs 1. We claim two things:</p>
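<p>As a toy illustration of $\mathsf{D}$ (my own sketch, not the construction from the paper: here a "program" is just the source of a JavaScript function body, and $\mathsf{H}$ is an ordinary function):</p>

```javascript
// Toy D^H: interpret the message m as the body of a JS function pi(i),
// and check that pi agrees with the oracle H on inputs
// 0, 1, ..., 2|m| + lambda - 1. Output 1 iff it always agrees.
function D(m, H, lambda) {
    let pi;
    try {
        pi = new Function("i", m); // decode m as a program
    } catch (e) {
        return 0; // m is not a valid program
    }
    const bound = 2 * m.length + lambda;
    for (let i = 0; i < bound; i++) {
        let out;
        try { out = pi(i); } catch (e) { return 0; }
        if (out !== H(i)) {
            return 0;
        }
    }
    return 1;
}

// Claim (1) in action: if H is a known function f, the adversary just
// submits a program computing f, and D outputs 1.
const f = (i) => i % 2;
console.log(D("return i % 2;", f, 16)); // 1
```

If instead <code>H</code> answered with freshly sampled random bits, the loop would almost surely hit a disagreement and <code>D</code> would output 0, which is claim (2).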
<div class="math-thm">
<p><strong>Claims:</strong></p>
<ol>
<li>If $\mathsf{H}$ is a hash function $f$, there exists an adversary that can always make $\mathsf{D}$ output 1.</li>
<li>If $\mathsf{H}$ is a random oracle $\mathcal{O}$, $\mathsf{D}$ outputs 0 with high probability (where the probability is taken over random choice of $\mathcal{O}$).</li>
</ol>
<p><strong>Proof of (1):</strong> This one's easy: the adversary simply sends the encoding of $f$ itself. Then $\pi(i) = \mathsf{H}(i) = f(i)$ for all $i$.</p>
<p><strong>Proof of (2):</strong> This can be proven by bounding the likelihood that there exists a program that can represent the truth table of $\mathcal{O}$. For a moment, let's consider just programs of length at most $\ell$. Let $q_\ell = 2\ell+\lambda$.</p>
<p>The set of all outputs up to $q_\ell$ of all random oracles, pessimistically assuming that the oracles are binary-valued, is $\{(\mathcal{O}(1), \mathcal{O}(2), \ldots, \mathcal{O}(q_\ell)) : \mathcal{O} : \mathbb{N} \to \{0,1\}\}$. This set has size $2^{q_\ell}$. In comparison, consider the set of program outputs of programs of length at most $\ell$. Again, assume pessimistically that all programs of length at most $\ell$ halt and return binary values. Then the size of the set, $\{(\pi(1), \pi(2), \ldots, \pi(q_\ell)) : |\pi| \leq \ell\}$, is at most $2^{\ell+1}$. This is a much smaller set than that of the oracle outputs. So if $\mathcal{O}$ is chosen randomly, the likelihood that there exists a $\pi$ of length at most $\ell$ that describes $\mathcal{O}$ out to $q_\ell$ many places is</p>
<p>$$
p_\ell
= \Pr_{\mathcal{O}}\left[
\exists \pi :
|\pi| \leq \ell
\wedge \mathcal{O}(1) = \pi(1)
\wedge \cdots \wedge \mathcal{O}(q_\ell) = \pi(q_\ell)
\right]
\leq \sum_{|\pi| \leq \ell} \Pr_\mathcal{O}\left[
\mathcal{O}(1) = \pi(1)
\wedge \cdots \wedge \mathcal{O}(q_\ell) = \pi(q_\ell)
\right]
\leq \frac{2^{\ell+1}}{2^{q_\ell}}
= 2^{-\ell-\lambda+1}
$$</p>
<p>Then by the union bound, the probability $p$ that a random oracle has any program $\pi$ that agrees with it at the first $2|\pi| + \lambda$ values is</p>
<p>$$
p \leq \sum_{\ell = 0}^\infty p_\ell \leq \sum_{\ell = 0}^\infty 2^{-\ell-\lambda+1} = 2^{-\lambda+2}
$$</p>
<p>which is inverse-exponential in the security parameter.</p>
</div>
<p>If we are in the ROM, then it follows from claim (2) that the only way to construct a forgery in scheme $\mathcal{Evil}$ is to construct a forgery in $\mathcal{S}$. Since $\mathcal{S}$ is assumed to be secure against forgery, this is not possible. Thus, this scheme is secure in the ROM.</p>
<p>Further, it follows from claim (1) that for any choice of hash function $f$, $\mathsf{EvilSign}$ can be completely broken by an efficient adversary. Thus, we have constructed a scheme that is secure in the ROM and insecure under <em>any</em> choice of hash function!</p>
<h2 id="is-the-rom-broken"><a class="zola-anchor" href="#is-the-rom-broken" aria-label="Anchor link for: is-the-rom-broken">§</a>
Is the ROM Broken?</h2>
<p>Given this ridiculous mismatch of security expectations, one may ask: is the ROM a reasonable heuristic to use if it so clearly fails to reflect reality in at least one cryptosystem? This is a matter of opinion.</p>
<p>The $\mathcal{Evil}$ cryptosystem above is a pathological example by any standard. The signature function is literally programmed to reveal the secret key some of the time. It turns out that weird examples like these are actually the only examples we have of the ROM failing to hold up in the real world. Further, there are a handful of examples where avoiding the ROM has actually introduced exploitable vulnerabilities into a cryptosystem<sup class="footnote-reference"><a href="#standardvulns">6</a></sup>. These vulnerabilities normally arise from, in a broad sense, things having too much algebraic structure.</p>
<p>On the other hand, an unrealistic model is an unrealistic model. If there is a cryptosystem, albeit a contrived one, which serves as a counterexample to the claim that hash functions are interchangeable with random oracles, then why should we believe that the claim holds for the cryptosystems we actually use?<sup class="footnote-reference"><a href="#revisited">7</a></sup> So far we have no proof that a hash function can soundly stand in for a random oracle in any particular cryptosystem, only counterexamples.</p>
<p>I'll hold off weighing in until the conclusion, but I think this is a reasonable question to ask, and not one that science or math can really answer. This is one of those cases where philosophy intersects cryptography in a way that can affect what people choose to research and how they construct their solutions.</p>
<h1 id="the-quantum-random-oracle-model"><a class="zola-anchor" href="#the-quantum-random-oracle-model" aria-label="Anchor link for: the-quantum-random-oracle-model">§</a>
The Quantum Random Oracle Model</h1>
<p>It turns out that we have to go through this same moral conundrum when we talk about the QROM, because $\mathcal{Evil}$ is secure in the QROM too! Again, to be concrete, <strong>there is a signature scheme that is secure against quantum-capable adversaries in the QROM, but completely insecure for ANY choice of hash function</strong>. To get to this claim, we'll need some of the basics of quantum computation theory.</p>
<h2 id="a-quick-rundown-of-quantum-computation"><a class="zola-anchor" href="#a-quick-rundown-of-quantum-computation" aria-label="Anchor link for: a-quick-rundown-of-quantum-computation">§</a>
A Quick Rundown of Quantum Computation</h2>
<p>"Qubit" is a fancy term you might have heard before. A qubit is the most basic unit of data in Quantum Land, so it'll be worth it to give a rigorous definition. First, we often fix the number of qubits in a system by fixing the dimension of the system. In particular, we say that a <strong>$b$-qubit system</strong> has state space $\mathbb{C}^{2^b} \cong \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2$ (tensored $b$ many times). A <strong>qubit</strong> is an element of the unit sphere $B = \{z \in \mathbb{C}^{2^b} : \|z\| = 1\}$ (where the norm is the $L_2$ norm). The operations on qubits we care about are <strong>unitary operators</strong>, i.e., linear functions that preserve length, i.e., linear functions from $\mathbb{C}^{2^b}$ to $\mathbb{C}^{2^b}$ which map $B$ to $B$. We represent the $i$-th basis vector (0-indexed) of $\mathbb{C}^{2^b}$ by writing $\langle \mathsf{bin}(i)\rangle$, where $\mathsf{bin}(i)$ denotes the binary expansion of $i$.<sup class="footnote-reference"><a href="#braket">8</a></sup> So for example, $\langle 000 \rangle$ denotes the 0-th basis vector of $\mathbb{C}^8$ (<em>not</em> the zero vector!), and $\langle 011 \rangle$ denotes the 3rd basis vector. Sometimes we will want to split up the basis vector label into two parts. To denote this, we say $\langle x,y \rangle = \langle x \| y\rangle $, i.e., the concatenation of the binary strings $x$ and $y$. For example, $\langle 001, 011 \rangle = \langle 001011 \rangle$.</p>
<p>Notice that the set of bitstrings of length $b$ is in bijection with the basis vectors of $\mathbb{C}^{2^b}$, so we can talk about "linear combinations of bitstrings" in a way that makes sense. For example, $v = (1/\sqrt{2})\cdot \langle 110 \rangle - (i/\sqrt{2})\cdot \langle 010\rangle$ is a vector in $B$. We say that this qubit $v$ represents a <strong>superposition</strong> of the bitstrings $\langle 110\rangle$ and $\langle 010\rangle$. This representation as the linear combination of bitstrings is unique for the same reason that the representation of a vector in terms of the standard basis is unique.</p>
<p>We can define linear operators that act on these bitstrings by defining mappings from basis elements to basis elements. For example, if $f$ is a function from $\{0,1\}^3$ to $\{0,1\}^3$ (i.e., bitstrings of length 3 to bitstrings of length 3), and $f(010) = 111$, then we can define a linear operator $F$ that takes $\langle 010\rangle$ to $\langle 111\rangle$. This gives a unitary operator whenever $f$ is a bijection on the bitstrings. But what if $f$ isn't a bijection? What if $f(001) = f(010) = 111$? Then you can use a neat trick: define $F$ such that it takes the basis vector $\langle x, y \rangle$ to $\langle x, y \oplus f(x)\rangle$. This trick works because the output "remembers" the input, so the function is always invertible. For example,</p>
<p>$$
F(\langle 010, 000 \rangle) = \langle 010, 111 \rangle
\quad\textrm{and}\quad
F(\langle 001, 000 \rangle) = \langle 001, 111 \rangle
$$</p>
<p>Operators that are defined like this are almost always applied with $y = 0$. Finally, notice that we had to make the dimension of the space larger, and thus increase the number of qubits in our system, in order to accommodate more bits. Indeed, doubling the number of qubits to account for the non-injectivity of bit-valued functions is common practice.</p>
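<p>The XOR trick is easy to check on classical bitstrings. A quick sketch in JavaScript (the values of $f$ outside the example from the text are my arbitrary choice):</p>

```javascript
// The non-injective f from the text: f(001) = f(010) = 111,
// identity elsewhere (the other values are an arbitrary choice)
const f = (x) => (x === 0b001 || x === 0b010) ? 0b111 : x;

// The XOR trick: F takes the pair (x, y) to (x, y XOR f(x)). The
// output "remembers" the input x, so F is invertible even though
// f itself is not: applying F twice XORs f(x) in and back out.
const F = ([x, y]) => [x, y ^ f(x)];

console.log(F([0b010, 0b000])); // [2, 7], i.e. <010, 111>
console.log(F([0b001, 0b000])); // [1, 7], i.e. <001, 111>
```

Note that the two colliding inputs map to <em>distinct</em> pairs, which is exactly why the map on pairs is a bijection.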
<p>Alright, that's about all the background necessary to present the QROM and the final result.</p>
<h2 id="quantum-random-oracles"><a class="zola-anchor" href="#quantum-random-oracles" aria-label="Anchor link for: quantum-random-oracles">§</a>
Quantum Random Oracles</h2>
<p>The QROM (introduced in 2010 by <a href="https://eprint.iacr.org/2010/428.pdf">Boneh et al.</a>) comes from a simple insight: if we believe that quantum computers could run hash function circuits in superposition (and we do), then why don't we model hash functions as random oracles, but more quantum-y? This is, in essence, the same idea as the ROM, except the random oracles are allowed to be queried in quantum superposition. More specifically,</p>
<div class="math-def">
<p><strong>Definition:</strong> Given a random oracle $\mathcal{O}$, the associated <strong>quantum random oracle</strong> $\mathcal{O}_\mathrm{quant}$ is the unitary operator that maps a superposition of bitstrings to the superposition of its values when evaluated with $\mathcal{O}$. Concretely, for every basis vector $\langle x, y \rangle$, define</p>
<p>$$
\mathcal{O}_\mathrm{quant}(\langle x, y \rangle) := \langle x, y \oplus \mathcal{O}(x)\rangle
$$</p>
</div>
<p>By this definition, $\mathcal{O}_\mathrm{quant}$ is a unitary operator, so it makes sense to say</p>
<p>$$
\mathcal{O}_\mathrm{quant}\left(
\frac{1}{\sqrt 3} \langle 010, 000 \rangle
+ \frac{i}{\sqrt 3} \langle 011, 000 \rangle
- \frac{i}{\sqrt 3} \langle 110, 000 \rangle
\right)
=
\frac{1}{\sqrt 3} \langle 010, \mathcal{O}(010) \rangle
+ \frac{i}{\sqrt 3} \langle 011, \mathcal{O}(011) \rangle
- \frac{i}{\sqrt 3} \langle 110, \mathcal{O}(110) \rangle
$$</p>
<p>Notice that a quantum random oracle is at least as powerful as a random oracle, since you can always query with the superposition of just a single value:</p>
<p>$$
\mathcal{O}_\mathrm{quant}(\langle x, 0 \rangle) = \langle x, \mathcal{O}(x)\rangle
$$</p>
<p>In fact, quantum random oracles are strictly more powerful than classical random oracles. Boneh et al. show a <em>separation result</em>. In particular, they present an identification scheme that is secure in the ROM, and insecure in the QROM.<sup class="footnote-reference"><a href="#separation">9</a></sup></p>
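<p>One way to see why $\mathcal{O}_\mathrm{quant}$ is unitary: it permutes the basis labels, and is in fact its own inverse, since XORing with $\mathcal{O}(x)$ twice cancels. This is quick to sanity-check concretely (a toy in JavaScript with 3-bit inputs and outputs; the oracle's table values are my arbitrary choice):</p>

```javascript
// A fixed "random oracle" on 3-bit inputs, as a lookup table: O(x) for x = 0..7
const O = [5, 7, 7, 0, 3, 1, 6, 2];

// The induced map on basis labels: <x, y> -> <x, y XOR O(x)>
const Oquant = ([x, y]) => [x, y ^ O[x]];

// A "classical" query is just the map applied with y = 0:
console.log(Oquant([0b010, 0b000])); // [2, 7], i.e. <x, O(x)> with O(2) = 7

// Applying the map twice returns every basis label to itself, so as a
// matrix it is a permutation matrix (an involution), hence unitary --
// even though O itself is not injective.
let involution = true;
for (let x = 0; x < 8; x++) {
    for (let y = 0; y < 8; y++) {
        const [x2, y2] = Oquant(Oquant([x, y]));
        if (x2 !== x || y2 !== y) involution = false;
    }
}
console.log(involution); // true
```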
<h2 id="quantum-hash-functions-are-not-quantum-random-oracles"><a class="zola-anchor" href="#quantum-hash-functions-are-not-quantum-random-oracles" aria-label="Anchor link for: quantum-hash-functions-are-not-quantum-random-oracles">§</a>
Quantum Hash Functions are not Quantum Random Oracles</h2>
<p>Since the QROM is strictly more powerful than the ROM, it follows that QROM-security implies ROM-security. But we'd like to show that the ROM-secure $\mathcal{Evil}$ scheme above is also QROM-secure. Luckily, Boneh et al. also give us the conditions for when this converse statement is true.</p>
<div class="math-thm">
<p><strong>Theorem:</strong> If a reduction to a problem $P$ holds in the ROM, and the reduction is history-free, then the reduction to problem $P$ holds in the QROM as well.</p>
</div>
<p>A <strong>history-free</strong> reduction is defined formally in the paper, but it suffices to say that a history-free reduction is a reduction that does not rewind its solver $\mathcal{Y}$, does not record the oracle queries, and does not modify oracle behavior based on previous queries.</p>
<p>Recall that the above proof of security in the ROM was actually a reduction to the security of the underlying hypothetical signature scheme $\mathcal{S}$. To prove QROM-security, we need to tweak the assumption a bit: instead of assuming that $\mathcal{S}$ is secure against classical adversaries, we need to further assume it is secure against quantum-capable adversaries. Admitting this, the only thing that needs to be shown is that the proof of ROM-security of $\mathcal{Evil}$ is history-free. Indeed, the reduction never recorded queries, rewound the solver, or modified its own behavior, so it's pretty clear that it's history-free.<sup class="footnote-reference"><a href="#proof">10</a></sup></p>
<h1 id="conclusions"><a class="zola-anchor" href="#conclusions" aria-label="Anchor link for: conclusions">§</a>
Conclusions</h1>
<p>And that's it! We've just constructed a signature scheme (assuming the existence of a quantum-secure<sup class="footnote-reference"><a href="#euf">5</a></sup> signature scheme) that is secure against quantum adversaries when modeling the hash functions as quantum-accessible random oracles, but is completely insecure under <em>any</em> choice of hash function!</p>
<p>I want to reiterate that the failure of the (Q)ROM to reflect real-world security should not necessarily be taken as a sign that we should stop using it, though some argue precisely that. My personal, highly unqualified opinion is that this construct has given us far more than it's taken, and despite the fact that it sometimes fails to tell us something about our world, I am convinced, as Bellare and Rogaway are, that it captures "the properties that practical primitives really seem to possess."</p>
<div class="footnote-definition" id="efficient"><sup class="footnote-definition-label">1</sup>
<p>"Efficient" meaning polynomial-time in the size of its inputs. Also of course not every Turing reduction is polynomial time, but there's only so much couching I can do before I lose everyone.</p>
</div>
<div class="footnote-definition" id="randomly"><sup class="footnote-definition-label">2</sup>
<p>This is hard because the obvious definition of random is "given a bunch of input-output pairs, you are unable to tell anything about the output on any input you haven't already seen". In the context of a random oracle, this makes sense, because an oracle would be chosen at random at the beginning of the experiment. But for a hash function, if the hash function is publicly known, then the adversary can just make the queries itself. If we instead sample from a family of hash functions, that would eliminate that problem, but the ROM-secure schemes we're talking about aren't ones which randomly sample from a hash function family in every instantiation.</p>
</div>
<div class="footnote-definition" id="revisitedsig"><sup class="footnote-definition-label">3</sup>
<p>Section 4 in <a href="https://eprint.iacr.org/1998/011">Canetti, Goldreich, Halevi</a></p>
</div>
<div class="footnote-definition" id="maurer"><sup class="footnote-definition-label">4</sup>
<p>Section 2 in <a href="https://eprint.iacr.org/2003/161">Holenstein, Maurer, Renner</a></p>
</div>
<div class="footnote-definition" id="euf"><sup class="footnote-definition-label">5</sup>
<p>As a reminder, a digital signature scheme consists of algorithms $\mathsf{KeyGen}$, $\mathsf{Sign}$, and $\mathsf{Ver}$. By "secure" I mean EUF-CMA-secure, meaning that an adversary can't forge a valid signature for a message it's never seen before. And by "quantum-secure" I mean the same thing but against a quantum-capable adversary.</p>
</div>
<div class="footnote-definition" id="standardvulns"><sup class="footnote-definition-label">6</sup>
<p><a href="https://eprint.iacr.org/2015/140.pdf">This paper</a> by Koblitz and Menezes presents good arguments for what the ROM has bought us and what avoiding the ROM has wreaked.</p>
</div>
<div class="footnote-definition" id="revisited"><sup class="footnote-definition-label">7</sup>
<p><a href="https://eprint.iacr.org/1998/011">Canetti, Goldreich, Halevi</a> generally argue that the ROM should be avoided for this reason.</p>
</div>
<div class="footnote-definition" id="braket"><sup class="footnote-definition-label">8</sup>
<p>I know that this notation is nonstandard, but I'm not using bra-ket notation because it's not necessary for the rest of this piece.</p>
</div>
<div class="footnote-definition" id="separation"><sup class="footnote-definition-label">9</sup>
<p>This is section 3 in <a href="https://eprint.iacr.org/2010/428.pdf">Boneh et al.</a> The gist is there is an identification scheme that will either do the correct thing, or do the insecure thing if the adversary is able to compute some number of hash collisions in $O(\sqrt[3]{N})$ time. Since the best you can do in the classical case is find a collision with probability 1/2 in $O(\sqrt N)$ time, this is secure for all classical adversaries. But a quantum adversary running Grover's algorithm can compute several collisions in $O(\sqrt[3]{N})$ time with high probability.</p>
</div>
<div class="footnote-definition" id="proof"><sup class="footnote-definition-label">10</sup>
<p>I left out the details here only because they're very boring and take up a bunch of space. If you want to read the full proof, see section 3.2 in the aforementioned <a href="/assets/pdfs/qrom_survey.pdf">survey paper</a>.</p>
</div>
Supersingular Isogeny Key Exchange for Not-Quite BeginnersWed, 08 Jan 2020 16:55:20 +0000Michael Rosenberg
https://mrosenberg.pub/blog/sidh/
https://mrosenberg.pub/blog/sidh/<p>I recently read a <a href="https://eprint.iacr.org/2019/1321">great introductory paper</a> on Supersingular Isogeny Diffie-Hellman (SIDH) by Craig Costello and wanted to summarize just the math of it (with some simplifications) for myself. Hopefully this summary is clear enough to also be useful to people who aren't myself.</p>
<span id="continue-reading"></span><h1 id="background"><a class="zola-anchor" href="#background" aria-label="Anchor link for: background">§</a>
Background</h1>
<p>We deal with supersingular (we won't define this word) elliptic curves defined over finite fields of the form $\mathbb{F}_{p^2}$.</p>
<p>For SIDH, we deal only with curves such that $E(\mathbb{F}_{p^2}) \cong \mathbb{Z}_{p+1} \times \mathbb{Z}_{p+1}$. For $a \mid p+1$, the kernel of the multiplication-by-$a$ map $P \to [a]P$ is of the form $\mathbb{Z}_a \times \mathbb{Z}_a$. We say that a map $\phi: E \to E'$ is an $\ell$-isogeny iff $|\ker \phi| = \ell$. Notice that $\ell$-isogenies only exist for $\ell \mid p+1$. Every $\ell$-isogeny $\phi: E \to E'$ also has an associated $\ell$-isogeny, called the dual isogeny $\psi: E' \to E$, such that $\phi \circ \psi$ is the multiplication map $[\ell]$ on $E'$ (and likewise $\psi \circ \phi = [\ell]$ on $E$). Vélu's formulas tell us that every subgroup $G \leq E$ corresponds to a unique isogeny out of $E$ (to some curve $E'$) with kernel $G$. They also tell us how to construct an isogeny given its desired kernel.</p>
<div class="math-thm">
<p><strong>Claim:</strong> On a supersingular curve $E$ of the form above, for any prime $\ell \mid p+1$, there are exactly $\ell+1$ nontrivial $\ell$-isogenies out of $E$.</p>
<p><strong>Proof:</strong> Using Vélu's formulas, it suffices to show that $E$ has exactly $\ell+1$ many subgroups of prime order $\ell$. Firstly, since these are of order $\ell$, they all must be subgroups of $E[\ell] \cong \mathbb{Z}_\ell \times \mathbb{Z}_\ell$. Let $P$ and $Q$ denote the free generators of $E[\ell]$.
Consider the family of cyclic subgroups generated by $P + [k]Q$ for $0 \leq k < \ell$. We claim that these are not only distinct, but disjoint (i.e., any two of them intersect only at the identity). Suppose for distinct $k$, $k'$ that</p>
<p>$$
[\alpha](P + [k]Q) = [\beta](P + [k']Q)
$$</p>
<p>for some integers $\alpha$, $\beta$. (A nontrivial common element gives such a pair with $\alpha, \beta \not\equiv 0\ (\textrm{mod } \ell)$.) Rearranging,</p>
<p>$$
[\alpha]P - [\beta]P = [\beta k']Q - [\alpha k]Q
$$</p>
<p>Since $P$ and $Q$ are linearly independent, both sides of this equation must be the identity. Therefore,</p>
<p>$$
\alpha \equiv \beta\ (\textrm{mod } \ell)
$$</p>
<p>Putting this together with the fact that the RHS is the identity,</p>
<p>$$
\beta k' \equiv \alpha k \equiv \beta k\ (\textrm{mod } \ell),
$$</p>
<p>we conclude $k \equiv k'\ (\textrm{mod } \ell)$, contradicting the distinctness of $k$ and $k'$. This gives us a family of $\ell$ many disjoint subgroups. The final subgroup to include is the one generated by $Q$. It is clear that this is also disjoint from the previously mentioned subgroups. Finally, there cannot be any more order-$\ell$ subgroups, since we have listed all the possible cyclic subgroups of $\mathbb{Z}_\ell \times \mathbb{Z}_\ell$, and all groups of prime order are cyclic.</p>
</div>
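<p>If you don't feel like trusting the algebra, a brute-force check for small primes confirms the count (a quick Python sketch; the subgroup enumeration is mine, not from any paper):</p>

```python
# Brute-force check of the claim for small primes ell: count the distinct
# order-ell subgroups of Z_ell x Z_ell (elements represented as pairs mod ell).
def order_ell_subgroups(ell):
    subgroups = set()
    for px in range(ell):
        for py in range(ell):
            if (px, py) == (0, 0):
                continue
            # the cyclic subgroup generated by (px, py); since ell is prime,
            # every nonzero element generates a subgroup of order ell
            H = frozenset(((k * px) % ell, (k * py) % ell) for k in range(ell))
            subgroups.add(H)
    return subgroups

for ell in (2, 3, 5, 7, 11):
    assert len(order_ell_subgroups(ell)) == ell + 1   # exactly ell + 1 of them
```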
<p>A <strong>supersingular $\ell$-isogeny graph</strong> is a graph whose nodes are $j$-invariants (i.e., elements of $\mathbb{F}_{p^2}$ which are in bijective correspondence with isomorphism classes of elliptic curves) and whose edges represent $\ell$-isogenies between them. Note this is an undirected graph because every $\ell$-isogeny has a dual $\ell$-isogeny going in the other direction. Also, by the above claim, every node has degree $\ell+1$. It turns out that this is an expander graph, which means random walks on it mix quickly. In particular, the diameter of an expander graph on $N$ nodes is roughly $\log(N)$. SIDH exploits this property to make its security claims.</p>
<h1 id="sidh"><a class="zola-anchor" href="#sidh" aria-label="Anchor link for: sidh">§</a>
SIDH</h1>
<p>The purpose of SIDH is to compute a shared secret by exchanging public keys. Same idea as Diffie-Hellman.</p>
<h2 id="public-parameters"><a class="zola-anchor" href="#public-parameters" aria-label="Anchor link for: public-parameters">§</a>
Public Parameters</h2>
<ul>
<li>A prime $p = 2^{e_A}3^{e_B} - 1$. This way, $p+1$ is divisible by powers of 2 and 3. $e_A$ and $e_B$ are chosen so that $2^{e_A} \approx 3^{e_B}$.</li>
<li>A starting curve $E$ defined over $\mathbb{F}_{p^2}$</li>
<li>Two points $P_A,Q_A \in E$ such that $\langle P_A, Q_A \rangle \cong \mathbb{Z}_{2^{e_A}} \times \mathbb{Z}_{2^{e_A}}$.</li>
<li>Two points $P_B,Q_B \in E$ such that $\langle P_B, Q_B \rangle \cong \mathbb{Z}_{3^{e_B}} \times \mathbb{Z}_{3^{e_B}}$.</li>
</ul>
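<p>As a sanity check of the parameter shape, here's a quick Python sketch using the exponents from the SIKE p434 parameter set ($e_A = 216$, $e_B = 137$ — those specific values come from the SIKE spec, not this post):</p>

```python
import math

# Shape of a SIDH prime, using the exponents from the SIKE p434 parameter set.
e_A, e_B = 216, 137
p = 2**e_A * 3**e_B - 1

assert (p + 1) % 2**e_A == 0 and (p + 1) % 3**e_B == 0  # 2- and 3-isogenies exist
assert abs(e_A - e_B * math.log2(3)) < 2                # 2^e_A and 3^e_B are balanced
assert p.bit_length() == 434                            # hence the name "p434"
```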
<h2 id="protocol"><a class="zola-anchor" href="#protocol" aria-label="Anchor link for: protocol">§</a>
Protocol</h2>
<p>Here's the gist of the protocol: Alice takes a walk on a 2-isogeny graph and marks her stopping point. Bob takes a walk on a 3-isogeny graph and marks his stopping point. The constraint that $2^{e_A} \approx 3^{e_B}$ means that the number of possible stopping points that Alice can get to is roughly equal to the number of possible stopping points Bob can get to. They then share their stopping points with each other. Then both of them do the same walks as before, but starting at each other's stopping point. The final endpoints of Alice's and Bob's walks are isomorphic (as elliptic curves, since points on the isogeny graph are curves), which means they have the same $j$-invariant.</p>
<p>Concretely, here's the protocol sequence. Alice initiates:</p>
<p><strong>Alice:</strong></p>
<ol>
<li>Picks a random secret scalar $0 \leq k_A < 2^{e_A}$ and defines a secret generator $S_A = P_A + [k_A]Q_A$. Note that $S_A$ has order $2^{e_A}$ because $P_A$ and $Q_A$ are linearly independent. By an argument like the proof above, the possible subgroups $\langle S_A\rangle$ are all distinct.</li>
<li>Constructs a $2^{e_A}$-isogeny $\phi_A: E \to E/\langle S_A\rangle$. This is done iteratively by repeatedly constructing 2-isogenies with elements of $\langle S_A\rangle$ in the kernel. The way you do this is by getting a point $R_A = [2^{e_A-1}]S_A$ of order 2. Using Vélu's formulas, you can construct a 2-isogeny $\phi_0: E \to E/\langle R_A\rangle$. Now let $S'_A = \phi_0(S_A)$. Then $S'_A$ has order $2^{e_A-1}$. Let $R'_A = [2^{e_A-2}]S'_A$, and so on. Once you've constructed $\phi_0, \ldots, \phi_{e_A-1}$, let $\phi_A$ be their composition $\phi_{e_A-1} \circ \cdots \circ \phi_0$.</li>
<li>Sends Bob $(\phi_A(E), \phi_A(P_B), \phi_A(Q_B))$, where $\phi_A(E)$ is the description of the output curve. We call this tuple Alice's "public key".</li>
</ol>
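<p>To get a feel for the order bookkeeping in Alice's step 2 without any actual curve arithmetic, here's a toy Python sketch that swaps the curve out for the cyclic group $\mathbb{Z}_{2^{e_A}}$ and each 2-isogeny for the quotient map by an order-2 subgroup. This mirrors only the bookkeeping — real SIDH uses Vélu's formulas at each step:</p>

```python
# Toy stand-in for the iterated 2-isogeny construction: the curve becomes the
# cyclic group Z_{2^e}, and a "2-isogeny" with kernel {0, 2^(e-1)} becomes
# reduction mod 2^(e-1). Only the order bookkeeping matches real SIDH.
def order(x, modulus):
    # order of x in the additive group Z_modulus
    n, y = 1, x % modulus
    while y != 0:
        y = (y + x) % modulus
        n += 1
    return n

e = 10
modulus = 2**e
S = 37                                       # odd, so S generates Z_{2^e}
assert order(S, modulus) == 2**e

for step in range(e):
    R = (2**(e - step - 1) * S) % modulus    # "point" of order 2
    assert order(R, modulus) == 2
    modulus //= 2                            # quotient by <R>: the "2-isogeny"
    S %= modulus                             # image of S; its order halves

assert modulus == 1 and S == 0               # S's order fell from 2^e to 1
```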
<p><strong>Bob:</strong></p>
<ol>
<li>Picks a random secret scalar $0 \leq k_B < 3^{e_B}$, lets $S_B = P_B + [k_B]Q_B$, and constructs a $3^{e_B}$-isogeny $\phi_B: E \to E/\langle S_B\rangle$ in the same way as above.</li>
<li>Sends Alice $(\phi_B(E), \phi_B(P_A), \phi_B(Q_A))$. We call this tuple Bob's "public key".</li>
</ol>
<p><strong>Alice:</strong></p>
<ol>
<li>Uses the same method to construct $\psi_A: \phi_B(E) \to \phi_B(E)/\langle T_A\rangle$ where
$$
T_A = \phi_B(S_A) = \phi_B(P_A) + [k_A]\phi_B(Q_A)
$$</li>
<li>Computes the $j$-invariant of $\psi_A(\phi_B(E))$</li>
</ol>
<p><strong>Bob:</strong></p>
<ol>
<li>Same idea as above: Construct $\psi_B: \phi_A(E) \to \phi_A(E)/\langle T_B\rangle$ where
$$
T_B = \phi_A(S_B) = \phi_A(P_B) + [k_B]\phi_A(Q_B)
$$</li>
<li>Computes the $j$-invariant of $\psi_B(\phi_A(E))$</li>
</ol>
<p><strong>End of Protocol</strong></p>
<p>Note that, since $j$-invariants of isomorphic curves are equal, and
$$
(E/\langle S_A\rangle)/\langle T_B\rangle \cong E/\langle S_A, S_B\rangle \cong (E/\langle S_B\rangle)/\langle T_A\rangle,
$$</p>
<p>the computed $j$-invariants are the same. Thus, it makes sense to have the $j$-invariant be the shared secret.</p>
<h1 id="an-attack-on-static-public-keys"><a class="zola-anchor" href="#an-attack-on-static-public-keys" aria-label="Anchor link for: an-attack-on-static-public-keys">§</a>
An Attack on Static Public Keys</h1>
<p>This attack is due to <a href="https://eprint.iacr.org/2016/859">Galbraith, Petit, Shani, and Ti</a>. If Alice keeps reusing the same $k_A$, Bob can determine its value by initiating $\lceil\log_2(k_A)\rceil$ many SIDH exchanges. Here's the exploit:</p>
<p>Bob can figure out the bottom bit of $k_A$ by publishing $\phi_B(Q_A) + L_2$ instead of $\phi_B(Q_A)$, where $L_2$ is a point of order 2. If $k_A$ is odd, i.e., if $L_2$ is not killed by $k_A$, then the protocol does not produce agreement, because $T_A + L_2$ is not the image of $S_A$ in Bob's curve $\phi_B(E)$. If $k_A$ is even, then $L_2$ is killed by $k_A$ and the protocol completes successfully. Thus, the bottom bit is leaked.</p>
<p>Say Bob finds it was even. Now Bob wants to know if $k_A \equiv 0\ (\textrm{mod } 4)$ or $2\ (\textrm{mod } 4)$. Then he sends $\phi_B(Q_A) + L_4$ where $L_4$ is a point of order 4, and uses the same logic. If Bob had found $k_A$ was odd, then he would send $\phi_B(P_A) - L_4$ instead of $\phi_B(P_A)$, and $\phi_B(Q_A) + L_4$ instead of $\phi_B(Q_A)$. So if $k_A$ is $1\ (\textrm{mod } 4)$, then</p>
<p>$$
\begin{align*}
&\phi_B(P_A) - L_4 + [k_A](\phi_B(Q_A) + L_4)
\newline &= \phi_B(P_A) - L_4 + [k_A]\phi_B(Q_A) + L_4
\newline &= \phi_B(P_A) + [k_A]\phi_B(Q_A)
\end{align*}
$$</p>
<p>and the protocol succeeds. Otherwise,</p>
<p>$$
\begin{align*}
&\phi_B(P_A) - L_4 + [k_A](\phi_B(Q_A) + L_4)
\newline &= \phi_B(P_A) - L_4 + [k_A]\phi_B(Q_A) + [3]L_4
\newline &= \phi_B(P_A) + [k_A]\phi_B(Q_A) + [2]L_4
\end{align*}
$$</p>
<p>and the protocol fails. Bob can continue like this to recover each successive bit of $k_A$.</p>
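<p>The whole bit-by-bit recovery loop can be simulated with a toy oracle (all curve arithmetic abstracted away — we track only the coefficient of the low-order point that Alice's computation picks up):</p>

```python
# Toy simulation of the GPST attack loop. Curve arithmetic is abstracted away:
# Alice's tampered computation yields T' = T + [k_A - guess]L, where L has
# order 2^(i+1), so the protocol "succeeds" iff that extra term vanishes.
e = 8
k_A = 0b10110101                 # Alice's reused secret scalar (unknown to Bob)

def protocol_succeeds(guess, i):
    # Bob sends phi_B(P_A) - [guess]L and phi_B(Q_A) + L; agreement happens
    # iff 2^(i+1) divides k_A - guess
    return (k_A - guess) % 2**(i + 1) == 0

recovered = 0
for i in range(e):
    if not protocol_succeeds(recovered, i):
        recovered |= 1 << i      # bit i of k_A is 1

assert recovered == k_A          # Bob learns the whole secret in e exchanges
```

Each loop iteration corresponds to one malicious exchange, so a static key falls after about $e_A$ runs of the protocol.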
<h1 id="patching-it-up"><a class="zola-anchor" href="#patching-it-up" aria-label="Anchor link for: patching-it-up">§</a>
Patching it up</h1>
<p>How do you prevent the above attack from working? Ideally, Alice would like to be able to tell when she receives a malicious public key, i.e., one of the form $\phi_B(Q_A) + L$ for some low-order point $L$. If she could do this, then she would be able to terminate the protocol early and not reveal anything about the shared secret.</p>
<p>Unfortunately, given just a public key share, there's no way to tell whether it was modified in this way or not.</p>
<p>But what if Alice was given more than a public key share? What if Bob somehow gave Alice his <em>private</em> ephemeral key as well? Then Alice would be able to compute Bob's public key share from his private key and check that it matches the one she received from Bob. If they match, then the public key was honest and the protocol completes successfully. If they don't match, then Alice can conclude that Bob cheated and she can abort the protocol early. This is precisely what the <a href="https://sike.org/">SIKE</a> key encapsulation mechanism does. Here's the new protocol (a little simplified):</p>
<p><strong>Alice</strong></p>
<ol>
<li>Picks her secrets randomly, as before</li>
<li>Publishes her public key $pk_A$, computed as before (i.e., a description of a curve, and two points on that curve)</li>
</ol>
<p><strong>Bob</strong></p>
<ol>
<li>Picks a random bitstring $m$</li>
<li>Computes his secret scalar $k_B = G(m \| pk_A)$, where $G$ is a pseudorandom number generator. He also computes the corresponding public key $pk_B$.</li>
<li>Computes the $j$-invariant of the SIDH exchange (as described above) between $k_B$ and $pk_A$</li>
<li>Derives a symmetric key $\kappa = \textrm{KDF}(j)$, where $\textrm{KDF}$ is some key-derivation function.</li>
<li>Computes $c = \textrm{Enc}_\kappa(m)$, where $\textrm{Enc}$ is the encryption function for some symmetric encryption scheme.</li>
<li>Sends Alice the tuple $(pk_B, c)$</li>
<li>Computes the shared secret (to be used if the protocol completes successfully) $K = \textrm{KDF}(m \| k_B \| c)$.</li>
</ol>
<p><strong>Alice</strong></p>
<ol>
<li>Uses $pk_B$ and her secret $k_A$ to derive the $j$-invariant, denoted $j'$</li>
<li>Computes $\kappa' = \textrm{KDF}(j')$</li>
<li>Decrypts $m' = \textrm{Dec}_{\kappa'}(c)$ (recall Alice received $c$ from Bob)</li>
<li>Computes the secret scalar $k_B' = G(m' \| pk_A)$ and its corresponding public key $pk_B'$</li>
<li>Checks that $pk_B' = pk_B$. If these are not equal, the protocol aborts.</li>
<li>Derives the shared secret $K = \textrm{KDF}(m' \| k_B' \| c)$</li>
</ol>
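<p>Here's a toy Python skeleton of this re-encryption check. To keep it self-contained, the SIDH exchange is replaced with classical Diffie-Hellman mod a prime (so it has none of SIDH's security properties), $G$ and $\textrm{KDF}$ are SHA-256, and $\textrm{Enc}$ is a one-time pad — every name and parameter here is illustrative:</p>

```python
import hashlib, secrets

# Toy skeleton of the re-encryption check. The SIDH exchange is replaced by
# classical Diffie-Hellman mod a prime (illustrative only, not secure); G and
# KDF are SHA-256 with domain separation; Enc is a one-time pad.
P, g = 2**127 - 1, 3                          # 2^127 - 1 is a Mersenne prime

def G(data):    return int.from_bytes(hashlib.sha256(b"G" + data).digest(), "big") % P
def KDF(data):  return hashlib.sha256(b"KDF" + data).digest()
def otp(key, data): return bytes(a ^ b for a, b in zip(data, key))

def alice_keygen():
    k_A = secrets.randbelow(P)
    return k_A, pow(g, k_A, P)

def bob_encaps(pk_A):
    m = secrets.token_bytes(32)
    k_B = G(m + pk_A.to_bytes(16, "big"))
    pk_B = pow(g, k_B, P)
    j = pow(pk_A, k_B, P)                     # stand-in for the j-invariant
    c = otp(KDF(j.to_bytes(16, "big")), m)
    K = KDF(m + k_B.to_bytes(16, "big") + c)
    return (pk_B, c), K

def alice_decaps(k_A, pk_A, pk_B, c):
    j = pow(pk_B, k_A, P)
    m = otp(KDF(j.to_bytes(16, "big")), c)
    k_B = G(m + pk_A.to_bytes(16, "big"))
    if pow(g, k_B, P) != pk_B:                # the re-encryption check
        return None                           # abort: Bob's public key is dishonest
    return KDF(m + k_B.to_bytes(16, "big") + c)

k_A, pk_A = alice_keygen()
(pk_B, c), K_bob = bob_encaps(pk_A)
assert alice_decaps(k_A, pk_A, pk_B, c) == K_bob   # honest run agrees
```

Tampering with $pk_B$ or $c$ makes Alice's recomputed public key mismatch the one she received, so she aborts regardless of her own secret.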
<h2 id="analysis"><a class="zola-anchor" href="#analysis" aria-label="Anchor link for: analysis">§</a>
Analysis</h2>
<p>Let's give an informal sketch of how this prevents information leakage.</p>
<p>Say Bob wants Alice to leak the lowest bit of her secret scalar $k_A$. The only leakage mechanism he has access to is learning whether the protocol succeeded or failed (since this can be used to convey a bit of information). He needs the protocol to do one thing when $k_A$ is even, and the other thing if it's odd.</p>
<p>If Bob computes and conveys everything honestly (i.e., as described in this new protocol), then it should be clear that the protocol succeeds with 100% probability. Thus, this leaks nothing. If Bob lies about $c$, then $m' = \textrm{Dec}_{\kappa'}(c)$ is all but certain to differ from $m$,<sup class="footnote-reference"><a href="#caveat">1</a></sup> which results in a different $k'_B$ and thus a different $pk_B'$. So that strategy causes the protocol to fail with overwhelming probability, regardless of $k_A$, and thus leaks nothing.</p>
<p>So what if Bob lies about $pk_B$ and tries the "add a low-order point" trick from the previous section? As before, this gives Bob control over the value $j'$ that Alice derives. If $k_A$ is even, it will be the same as Bob's $j$, and if it's odd it will be different. So far so good. Continuing on the "good" branch, if $j' = j$, then Alice's steps 2, 3, and 4 all derive the same values as Bob's. But in step 5, Alice sees that the $pk_B$ she received doesn't match the public key corresponding to $k_B'$, so she aborts. On the "bad" branch, if $j' \neq j$, then all the values derived in steps 2, 3, and 4 are wrong, and Alice is all but certain to abort. Thus, this strategy also causes the protocol to fail with overwhelming probability regardless of $k_A$, and so it leaks nothing!</p>
<p>I hope that explanation was understandable. The way the authors of SIKE prove security is actually far more elegant than this: it turns out that the "patch" I described above is a result of a much more general theorem.<sup class="footnote-reference"><a href="#fuji">2</a></sup> Also, the patch I presented isn't totally accurate, so if you want more detail I encourage you to check out the <a href="https://sike.org/files/SIDH-spec.pdf">spec</a> (algorithms 1 and 2, and the proofs of security in section 4.3).</p>
<div class="footnote-definition" id="caveat"><sup class="footnote-definition-label">1</sup>
<p>This relies on some assumption about the symmetric key primitive and on $\textrm{KDF}$, so that even if you modified $j'$ as well, you couldn't get $c$ to decrypt to a chosen plaintext</p>
</div>
<div class="footnote-definition" id="fuji"><sup class="footnote-definition-label">2</sup>
<p>The FO transformation is precisely made to turn an IND-CPA PKE into an IND-CCA PKE with very little overhead. <a href="https://cs.uni-paderborn.de/fileadmin/informatik/fg/cuk/Lehre/Abschlussarbeiten/Bachelorarbeiten/2014/BA_Lippert_FOT_final.pdf">Here</a>'s a paper describing what it is and how it works. <a href="https://eprint.iacr.org/2017/604">Here</a> is the extremely influential paper that the authors of SIKE cite.</p>
</div>
<div class="footnote-definition" id="sike"><sup class="footnote-definition-label">3</sup>
<p>SIKE is currently in <a href="https://csrc.nist.gov/Projects/post-quantum-cryptography/round-2-submissions">round 2</a> of the NIST post-quantum cryptography competition</p>
</div>
Better Encrypted Group ChatWed, 10 Jul 2019 00:00:00 +0000Michael Rosenberg
https://mrosenberg.pub/blog/molasses/
https://mrosenberg.pub/blog/molasses/<h1 id="introducing-molasses"><a class="zola-anchor" href="#introducing-molasses" aria-label="Anchor link for: introducing-molasses">§</a>
Introducing molasses</h1>
<p>Broadly, an end-to-end encrypted messaging protocol is one that ensures that only the participants in a conversation, and no intermediate servers, routers, or relay systems, can read and write messages. An end-to-end encrypted <em>group</em> messaging protocol is one that ensures this for all participants in a conversation of three or more people.</p>
<p>End-to-end encrypted group messaging is a necessary problem to solve. Whether it be for limiting liability, providing verifiable client-side security, or removing a single point of failure, there <a href="https://medium.com/@RiotChat/why-we-need-end-to-end-encryption-for-online-communications-e7448e0be2c3">are</a> <a href="https://www.schneier.com/academic/paperfiles/paper-keys-under-doormats-CSAIL.pdf">good</a> <a href="https://www.nytimes.com/2019/07/01/opinion/slack-chat-hackers-encryption.html">reasons</a> for a group messaging host to use an end-to-end encrypted protocol.</p>
<p>Existing solutions such as Signal, WhatsApp, and iMessage have inherent problems with scaling, which I'll discuss in detail, that make it infeasible to conduct group chats of more than a few hundred people. The <a href="https://datatracker.ietf.org/doc/draft-ietf-mls-protocol/">Messaging Layer Security (MLS)</a> protocol aims to make end-to-end group chat more efficient while still providing security guarantees like forward secrecy and post-compromise security.<sup class="footnote-reference"><a href="#pcs">1</a></sup></p>
<p>To these ends, I have been working on <a href="https://github.com/trailofbits/molasses"><code>molasses</code></a>, a Rust implementation of MLS, designed with safety, ease-of-use, and difficulty-of-misuse in mind.</p>
<h2 id="molasses-has-helped-refine-the-mls-spec"><a class="zola-anchor" href="#molasses-has-helped-refine-the-mls-spec" aria-label="Anchor link for: molasses-has-helped-refine-the-mls-spec">§</a>
Molasses has helped refine the MLS spec</h2>
<p>The primary contribution of <code>molasses</code> has been in detecting errors in the specification and other implementations through unit and interoperability testing. <code>molasses</code> implements most of MLS draft 6. Why not all of draft 6? There was an error in the spec that made it impossible for members to be added to any group. This broke all the unit tests that create non-trivial groups. Errors like this are hard to catch just by reading the spec; they require some amount of automated digging. Once they are found, the necessary revisions tend to be pretty obvious, and they are swiftly incorporated into the subsequent draft.</p>
<p>Iterating this discovery/patching process using <code>molasses</code> has given me a chance to put the spec through its paces and help make things clearer. This winter internship (also called a "winternship" by nobody) project has been a great experience, especially as a first-time IETF contributor.</p>
<h1 id="how-to-build-encrypted-group-chat"><a class="zola-anchor" href="#how-to-build-encrypted-group-chat" aria-label="Anchor link for: how-to-build-encrypted-group-chat">§</a>
How to build encrypted group chat</h1>
<p>In this section we derive why MLS is constructed the way it is (hint: for efficiency reasons), and how it compares to other solutions (hint: it's better).</p>
<p>First off, MLS works on a lower level than most chat applications. It is a protocol upon which applications can be built. For example, MLS does not govern group permissions such as who can add people to the chat (this can be done at the application level while using MLS under the hood). Thus, we can leave things like formal rule systems out of the conversation entirely when analyzing the protocol. Here, we're only going to consider the sending of messages and the removal of members.</p>
<p>The constructions in this section make use of cryptographic primitives such as digital signatures, Diffie-Hellman key exchange, (a)symmetric encryption, and key-derivation functions. If the reader feels underprepared in any of these areas, a quick skim of the sections in <a href="https://nostarch.com/seriouscrypto"><em>Serious Cryptography</em></a> on ECIES and Authenticated Diffie-Hellman should be sufficient.</p>
<p>Without further ado,</p>
<h2 id="a-motivating-problem"><a class="zola-anchor" href="#a-motivating-problem" aria-label="Anchor link for: a-motivating-problem">§</a>
A Motivating Problem</h2>
<p>Wilna is planning a retirement party for an acquaintance, Vince. The logistics are a nightmare, so she invites her friends Xavier, Yolanda, and Zayne to help her plan. They would like to make a group chat on Slack so they can all stay on the same page, but they remember that Vince is an infrastructure manager for Slack—he can see <em>all</em> the messages sent over any Slack server in the world. This is a problem, since they want to give Vince a nice long vacation upstate and they want it to be a surprise. Vince's position poses even more problems: he happens to manage every single server in town. Even if Wilna purchases her own server to mediate the group chat, Vince will be tasked with managing it, meaning that he can read everything the server stores.</p>
<p>What Wilna needs is a <strong>centralized end-to-end encrypted group chat</strong>, i.e., a group chat where every member can broadcast messages and read all incoming messages, but the single server that mediates these messages cannot read anything. For clarity, we'll distinguish between <strong>application messages</strong>, which carry the textual content of what a group member wants to say to everyone else in the group, and <strong>auxiliary messages</strong> (called "Handshake messages" in MLS), which members use to manage group membership and cryptographic secrets. Since this is all mediated through one server, the members can rely on the server to broadcast their messages to the rest of the group.</p>
<p>With the setup out of the way, what are the options?</p>
<h2 id="solution-1-pairwise-channels"><a class="zola-anchor" href="#solution-1-pairwise-channels" aria-label="Anchor link for: solution-1-pairwise-channels">§</a>
Solution #1: Pairwise Channels</h2>
<p>Suppose Wilna, Xavier, Yolanda, and Zayne all know each other's public keys for digital signatures. This means that each pair of people can do an authenticated Diffie-Hellman key exchange over some auxiliary messages and derive a shared symmetric key called the <strong>pair key</strong>. This process produces six separate <strong>pairwise channels</strong>, represented here:</p>
<p><img src="/assets/images/post_molasses/pairwise_channels.svg" alt="pairwise channels between 4 members" /></p>
<p>If Wilna wants to send an application message <em>m</em> to the group, she has to encrypt it three separate times (once for each member of the group) and send all the ciphertexts:</p>
<figure>
<img src="/assets/images/post_molasses/pairwise_app_msg.svg" alt="Pairwise encryption of application message">
<figcaption>
The grey arrows represent application messages encrypted under a symmetric key.
</figcaption>
</figure>
<p>Note that Wilna isn't making use of the server's ability to broadcast messages, since each member in the group can only decrypt messages encrypted under their own pair keys. Generalizing this, if there is a group of size <em>N</em>, sending an application message requires a member to encrypt and send <em>N-1</em> times. Roughly speaking, this is how iMessage does group chat.<sup class="footnote-reference"><a href="#good_source">2</a></sup></p>
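<p>Here's a toy Python sketch of the pairwise-channel setup (unauthenticated Diffie-Hellman mod a prime, signatures omitted; the parameters are illustrative, not a secure choice):</p>

```python
import secrets
from itertools import combinations

# Toy pairwise-channel setup: each pair derives a shared key via Diffie-Hellman
# mod a prime (authentication omitted; parameters illustrative, not secure).
P, g = 2**127 - 1, 3
members = ["Wilna", "Xavier", "Yolanda", "Zayne"]
sk = {m: secrets.randbelow(P - 2) + 1 for m in members}
pk = {m: pow(g, sk[m], P) for m in members}

pair_key = {}
for a, b in combinations(members, 2):
    # both endpoints of the channel compute the same key
    assert pow(pk[b], sk[a], P) == pow(pk[a], sk[b], P)
    pair_key[frozenset((a, b))] = pow(pk[b], sk[a], P)

assert len(pair_key) == 6   # N(N-1)/2 channels for N = 4
# one application message from Wilna costs N-1 = 3 separate encryptions
```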
<p>Great, so that's just three encryptions per person. This probably takes at most a few milliseconds on a phone. What's the issue? The issue is what about the WhatsApp groups with >10,000 members where my aunts talk about who's getting married next?? Do you want them to do 9,999 encryptions every time they send something? I do, but they probably don't. To accommodate my aunts, we need to get cleverer.</p>
<h2 id="solution-2-sender-keys"><a class="zola-anchor" href="#solution-2-sender-keys" aria-label="Anchor link for: solution-2-sender-keys">§</a>
Solution #2: Sender Keys</h2>
<p>Instead of having a key between every user in the group, let's give every user a <strong>sender key</strong> that they use to encrypt application messages. This is roughly what Signal,<sup class="footnote-reference"><a href="#good_source">2</a></sup> WhatsApp,<sup class="footnote-reference"><a href="#good_source">2</a></sup> and Keybase<sup class="footnote-reference"><a href="#keybase_source">3</a></sup> do. If you're a group member, you have to go through the following setup:</p>
<ol>
<li>Randomly generate your sender key</li>
<li>For every user in the group, encrypt your sender key under the pair key you share with that user</li>
<li>Send every user their encrypted copy of your sender key as an auxiliary message</li>
</ol>
<p>After the setup, which requires <em>N-1</em> encryptions for each user in a group of size <em>N</em> (that's Θ(<em>N</em><sup>2</sup>) total auxiliary messages), we finally see some efficient behavior. To send an application message <em>m</em>, Wilna:</p>
<ol>
<li>Encrypts <em>m</em> with her sender key <em>precisely once</em></li>
<li>Broadcasts the ciphertext to the group</li>
</ol>
<figure>
<img src="/assets/images/post_molasses/sender_key_app_msg.svg" alt="Sender key encrypted message broadcast">
<figcaption>
The grey arrows represent application messages encrypted under a symmetric key.
</figcaption>
</figure>
<p>Although there are three arrows here, they are all the same ciphertext, so the application message only needs to be encrypted and broadcast <em>once</em>. Thus, after the setup phase, each outgoing application message only costs a single encryption. So we're done, right? Wrong, of course wrong. Because</p>
<h3 id="what-about-removal"><a class="zola-anchor" href="#what-about-removal" aria-label="Anchor link for: what-about-removal">§</a>
What about Removal?</h3>
<p>The fallacy here is that the setup phase runs once. It actually runs every time the group is modified. Suppose in the process of premeditating this "retirement party," the group finds out that Zayne has been leaking details to Vince the whole time. Naturally, they kick Zayne out. Now Zayne still knows all the sender keys, so if he talks to Vince and gets an encrypted transcript of the group conversation that happened after his departure, he would still be able to decrypt it. This is a no-no, since Zayne has already defected. To prevent this from happening, each remaining user in the group has to create a new sender key and share it with everyone else through their pairwise channels. Again, this is Θ(<em>N</em><sup>2</sup>) total auxiliary messages, which can be a <em>lot</em>. So if we want to tolerate tons of group modifications,<sup class="footnote-reference"><a href="#removal">4</a></sup> we're going to have to find a way to bring down the number of auxiliary messages sent during the setup phase, while still being able to keep using sender keys for application messages. A well-known secret in computer science is that when the naïve solutions of pairs and lists don't work, there's a next logical step:</p>
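<p>A toy sketch of the re-key accounting (encryption is stood in for by XOR with a hash-derived pad; all names here are mine):</p>

```python
import hashlib, secrets

# Toy re-key after Zayne's removal: each of the N remaining members generates a
# fresh sender key and sends one encrypted copy per pairwise channel. XOR with
# a hash-derived pad stands in for real symmetric encryption.
def stream_xor(key, data):
    pad = hashlib.sha256(key).digest()
    return bytes(a ^ b for a, b in zip(data, pad))

N = 3                                     # Wilna, Xavier, Yolanda remain
sender_key = {i: secrets.token_bytes(32) for i in range(N)}
aux_msgs = N * (N - 1)                    # Theta(N^2) auxiliary messages total
assert aux_msgs == 6

# after the re-key, an application message again costs exactly one encryption
m = b"ok, new keys everyone"
ct = stream_xor(sender_key[0], m)         # Wilna encrypts precisely once
assert stream_xor(sender_key[0], ct) == m # every recipient decrypts the broadcast
```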
<h2 id="solution-3-trees"><a class="zola-anchor" href="#solution-3-trees" aria-label="Anchor link for: solution-3-trees">§</a>
Solution #3: Trees</h2>
<p>We would like to have sender keys (since they make application messages efficient). We also want to be able to transmit new sender keys to subsets of the group without using too many auxiliary messages. The important insight here is that, when we remove a member, we shouldn't need to <em>individually</em> send new keying information to every single remaining member like we had to in the previous solution. After all, we need to send this to the whole group minus just one person. So why not have public keys that cover large subsets of the group, and use those for sending auxiliary messages? This is exactly what the MLS ratchet tree (a.k.a. TreeKEM) affords us.</p>
<p>The MLS <strong>ratchet tree</strong> is a binary tree<sup class="footnote-reference"><a href="#tree">5</a></sup> whose leaves correspond to members of the group, and whose non-leaf nodes, called <strong>intermediate nodes</strong>, carry a Diffie-Hellman public key and private key. Intermediate nodes don't represent people, computers, or locations on a network; they're just pieces of data that facilitate auxiliary message sending. We also allow nodes to be <strong>blank</strong>, meaning that they do not have an associated keypair. A node that does have an associated keypair is said to be <strong>filled</strong>. Every member in the group retains a copy of the ratchet tree, minus the private keys. Knowledge of the private keys follows the ratchet tree property:</p>
<p><strong>Ratchet Tree Property</strong> If a member <em>M</em> is a descendant of intermediate node <em>N</em>, then <em>M</em> knows the private key of <em>N</em>.</p>
<p><em>*deep breath*</em> Sender keys are derived via a key-derivation function (KDF) from the root node's private key, and each intermediate node's private key is derived via KDF from the private key of its most-recently updated child.<sup class="footnote-reference"><a href="#falsehoods">6</a></sup> Upon the removal of a user, new private keys are distributed to the <strong>resolutions of the copath nodes</strong>, i.e., the maximal non-blank nodes of the subtrees whose roots are the siblings of the updated nodes.</p>
<p>That paragraph alone took about 10 minutes to write, so let's just see</p>
<h3 id="a-small-example"><a class="zola-anchor" href="#a-small-example" aria-label="Anchor link for: a-small-example">§</a>
A Small Example</h3>
<p>We start off with a group like so</p>
<p><img src="/assets/images/post_molasses/initial_tree.svg" alt="Initial tree" /></p>
<p>Zayne wants out, so Yolanda removes him.<sup class="footnote-reference"><a href="#self_removal">7</a></sup> To remove him, Yolanda will first blank out Zayne and all his ancestors:</p>
<figure>
<img src="/assets/images/post_molasses/blanked_tree.svg" alt="Tree after blanking">
<figcaption>
The boxes with red slashes through them represent blank nodes
</figcaption>
</figure>
<p>Yolanda needs to contribute new keying information to the new group so that the new sender keys can be derived from the new root's private key. To do this, she generates a new personal keypair pub<sub><em>Y'</em></sub> and priv<sub><em>Y'</em></sub> and derives all her ancestors' keypairs by iteratively applying a KDF to the private key and computing its corresponding public key (this is called "ratcheting," whence "ratchet tree").</p>
<figure>
<img src="/assets/images/post_molasses/updating_tree_1.svg" alt="Tree with new secrets">
<figcaption>
The green circles indicate recently updated nodes
</figcaption>
</figure>
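<p>Yolanda's derivation chain can be sketched like so (the SHA-256-based KDF and the labels are my own stand-ins, not the actual MLS key schedule):</p>

```python
import hashlib

# Sketch of Yolanda's update path in the 4-leaf tree: her fresh leaf secret is
# ratcheted up to the root by repeated KDF application. The SHA-256-based KDF
# and the labels are illustrative stand-ins for the real MLS key schedule.
def kdf(secret):
    return hashlib.sha256(b"ratchet" + secret).digest()

priv_Y1 = hashlib.sha256(b"yolanda's fresh randomness").digest()  # priv_Y'
priv_Y2 = kdf(priv_Y1)   # priv_Y'': the parent node's private key
priv_Y3 = kdf(priv_Y2)   # priv_Y''': the new root private key

# everyone re-derives their sender keys from the new root secret
sender_keys = [hashlib.sha256(bytes([i]) + priv_Y3).digest() for i in range(4)]
assert len(set(sender_keys)) == 4   # per-member sender keys all differ
```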
<p>But Yolanda isn't done. Wilna and Xavier need to be told about these new keys somehow. It's Yolanda's job to share this info. In particular,</p>
<ol>
<li>
<p>Every member needs to get a copy of the public keys of all updated nodes (i.e., Yolanda's own public key and all her ancestors'). This is important. The public keys are part of the shared group state, and shared group state is how a bunch of values in the MLS protocol are derived.</p>
</li>
<li>
<p>Every member needs to get a copy of the private keys of their nearest modified ancestor. This is in order to preserve the ratchet tree property.</p>
</li>
</ol>
<p>Remember that the end goal is still to derive the sender keys, which means that Wilna and Xavier need to be told the value of the root private key, priv<sub><em>Y'''</em></sub>. This will be a consequence of item two above.</p>
<p>Since everyone needs public keys and public keys are not secret, Yolanda can just broadcast them as unencrypted auxiliary messages. But private keys are more sensitive. She needs to encrypt them for <em>just</em> the members who need them. This is where we use the ratchet tree property. If she wants Wilna and Xavier to be able to read an auxiliary message containing priv<sub><em>Y'''</em></sub>, she need only encrypt the message under pub<sub><em>WX</em></sub>, since Wilna and Xavier are descendants of the <em>WX</em> intermediate node, and will therefore be able to decrypt anything encrypted under pub<sub><em>WX</em></sub>.<sup class="footnote-reference"><a href="#hpke">8</a></sup> This describes how the auxiliary messages are sent to the rest of the group:</p>
<figure>
<img src="/assets/images/post_molasses/updating_tree_2.svg" alt="Tree and auxiliary messages">
<figcaption>
The solid black arrows above indicate public-key encrypted messages. The dashed arrows indicate plaintext messages. The arrows do not indicate who is doing the sending (since that's all Yolanda). They're just meant to illustrate where in the tree the values are coming from and whom they're intended for.
</figcaption>
</figure>
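<p>To give a feel for the encrypt-to-a-subtree step, here's a toy sketch using textbook ElGamal over the integers mod a Mersenne prime as a stand-in for the HPKE encryption MLS actually uses. The parameters and key names (<code>priv_wx</code>, <code>pub_wx</code>) are invented for illustration and are in no way secure:</p>

```python
import secrets

# Textbook ElGamal over Z_p^* standing in for HPKE. Yolanda encrypts the
# new root private key under pub_WX; anyone holding priv_WX (i.e., any
# descendant of the WX node) can decrypt it. Toy parameters only.
P = 2**127 - 1  # a Mersenne prime
G = 3

def keygen():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

def encrypt(pub: int, m: int):
    k = secrets.randbelow(P - 2) + 1          # ephemeral randomness
    return pow(G, k, P), (m * pow(pub, k, P)) % P

def decrypt(priv: int, ct) -> int:
    c1, c2 = ct
    # c1^(P-1-priv) = c1^(-priv) mod P, by Fermat's little theorem
    return (c2 * pow(c1, P - 1 - priv, P)) % P

priv_wx, pub_wx = keygen()          # the WX intermediate node's keypair
new_root_priv = 123456789           # the secret Yolanda wants to share
ciphertext = encrypt(pub_wx, new_root_priv)
```

One ciphertext suffices for the whole subtree, which is exactly why encrypting to the copath is cheap.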
<p>Now Wilna and Xavier will update their view of the tree by saving the public keys and decrypting the root private key. Thus, everyone is on the same page and the ratchet tree property is preserved. Finally, everyone re-derives their sender keys, and the removal is complete.</p>
<p><img src="/assets/images/post_molasses/post_remove_tree.svg" alt="Tree post-removal" /></p>
<p>Note that Zayne's position remains blank after the removal. This saves the members from the computational overhead of shuffling themselves around and recomputing their ancestors' keypairs. MLS defines two ways to prevent blank nodes left by removed members from overcrowding the tree: it allows blank nodes to be trimmed from the right end of the tree after removals (not applicable in the example above), and it allows new members to be added in the position of previously removed members. So if the "party-planners" above wanted to replace Zayne, they could do so without making the tree bigger.</p>
<p>This example illustrates the smaller details in updating keys, but it doesn't do a particularly good job of illustrating which node secrets are sent to which other nodes in the resolutions of the copath nodes. To give an idea, here's</p>
<h3 id="a-much-bigger-example"><a class="zola-anchor" href="#a-much-bigger-example" aria-label="Anchor link for: a-much-bigger-example">§</a>
A Much Bigger Example</h3>
<p>Suppose Zayne wants to break out and go solo, but still feels the desire to be in a boy band. After cloning himself 15 times, Zayne #1 notices that one of the clones, Zayne #11, keeps hinting at breaking off and doing a solo career of his own. Zayne #1 acquiesces and removes him from the group. He sees what he's created. Zayne #1 looks up at the stars. War soon.</p>
<p>Let's see what auxiliary messages were sent when Zayne #11 was booted. In this removal process, Zayne #1 generates new secrets, ratchets them all the way up the tree, and shares them with the appropriate subtrees:</p>
<figure>
<img src="/assets/images/post_molasses/updating_clone_tree.svg" alt="Updating tree of clones">
<figcaption>
The green circles still represent the updated nodes. Each solid arrow represents the private key at its tail being encrypted under the public key at its head.
</figcaption>
</figure>
<p>Notice that on the right-hand side of the tree, since you can't encrypt to a blank node, the root private key needs to be encrypted under three separate public keys. The dashed arrows are omitted for clarity, but it's still true that the public keys of all the circled nodes are broadcast in this step.</p>
<p>With this larger example, you might start to see a pattern in how many auxiliary messages are sent per tree update. Let's play</p>
<h3 id="can-you-eyeball-the-asymptotic-behavior"><a class="zola-anchor" href="#can-you-eyeball-the-asymptotic-behavior" aria-label="Anchor link for: can-you-eyeball-the-asymptotic-behavior">§</a>
Can You Eyeball the Asymptotic Behavior?</h3>
<p>We got efficient application messages with sender keys, and we'd like to say that we got efficient auxiliary messages with TreeKEM so we can call it a day. Is this true? Not entirely, as we'll see. Let's first talk about the example above, where we start off with a tree whose nodes are all filled.</p>
<h4 id="removal-in-a-filled-tree"><a class="zola-anchor" href="#removal-in-a-filled-tree" aria-label="Anchor link for: removal-in-a-filled-tree">§</a>
Removal in a Filled Tree</h4>
<p>The Zayne example is actually worst-case removal behavior in a filled tree in terms of number of auxiliary messages (you should prove this to yourself: what would happen if Zayne #1 removed Zayne #6 instead?). If there are <em>N</em> many members in the group, there are at most log(<em>N</em>)-1 encrypted auxiliary messages that don't have to deal with blank nodes, and another log(<em>N</em>)-1 that do. Plus, there are log(<em>N</em>) many public keys to share. So, to complete the sage wisdom from computer scientists of days past, if you use trees, you get <em>O</em>(log(<em>N</em>)) behavior. This is way better than the quadratic number of auxiliary messages we saw in solution #2. The same WhatsApp group of kibbitzing mumehs now only takes about 3log<sub>2</sub>(10,000) ≈ 40 total auxiliary messages to establish a new set of sender keys (assuming a filled tree) instead of the <em>N</em>(<em>N</em>-1) ≈ 99 million total auxiliary messages required previously.</p>
<h4 id="removal-in-a-tree-with-blanks"><a class="zola-anchor" href="#removal-in-a-tree-with-blanks" aria-label="Anchor link for: removal-in-a-tree-with-blanks">§</a>
Removal in a Tree with Blanks</h4>
<p>This logarithmic behavior is fantastic, but we only checked for the very specific case where we start with a full group and then remove one person. How efficient is it when we remove a single person from a group that already has some blanks? The good news is that it's still better than Θ(<em>N</em><sup>2</sup>). The bad news is that the worst case is...well let me just show you.</p>
<p>Suppose every odd-numbered Zayne was removed from the group besides Zayne #1. Finally, Zayne #2 deals the finishing blow, removing Zayne #1 and restoring peace. This is what the update looks like:</p>
<figure>
<img src="/assets/images/post_molasses/linear_tree.svg" alt="A tree with linear removal behavior">
</figure>
<p>That's <em>N</em>-1 messages to remove a single person! As mentioned before, this can be a prohibitively large number of auxiliary messages for large <em>N</em>. Even worse, it may be possible for malicious group members to strategically remove people until the tree reaches the worst-case state, thus slowing down group operations for everyone in the group.</p>
<p>Dealing with this situation is an open issue, and people are actively working on resolving or at least mitigating it. As of this writing, though, there are no proposed solutions that would materially improve the worst-case behavior.</p>
<h1 id="conclusion-and-more-info"><a class="zola-anchor" href="#conclusion-and-more-info" aria-label="Anchor link for: conclusion-and-more-info">§</a>
Conclusion and More Info</h1>
<p>It's underwhelming to end at an open issue, but this is where the protocol stands today. Efficiently updating keys is at the crux of end-to-end group messaging, and the TreeKEM method, edge cases and all, is one of the most important contributions that MLS makes. Given that there's still at least one open issue in the spec, you may wonder</p>
<h2 id="how-close-is-the-protocol-to-being-done"><a class="zola-anchor" href="#how-close-is-the-protocol-to-being-done" aria-label="Anchor link for: how-close-is-the-protocol-to-being-done">§</a>
How close is the protocol to being done?</h2>
<p>No clue. MLS has plenty of open issues (nine as of this writing) and is being tweaked constantly. Draft 7 landed just this month, and it completely overhauled the symmetric key schedule. Inefficiencies are being shaved down as issues around authenticity, confidentiality, deniability, etc. are being patched.</p>
<h2 id="what-are-the-other-implementations"><a class="zola-anchor" href="#what-are-the-other-implementations" aria-label="Anchor link for: what-are-the-other-implementations">§</a>
What are the other implementations?</h2>
<p>The unofficial reference implementation, <a href="https://github.com/cisco/mlspp"><code>mlspp</code></a>, is used to create test vectors that we implementers all test against. There's also <code>MLS*</code>, a project at Inria to implement and formally model the protocol in <a href="https://www.fstar-lang.org/">F*</a>. And there's even another Rust implementation, <a href="https://github.com/wireapp/melissa/"><code>melissa</code></a>, being written at Wire.</p>
<h2 id="remind-me-why-you-re-writing-yet-another-rust-implementation"><a class="zola-anchor" href="#remind-me-why-you-re-writing-yet-another-rust-implementation" aria-label="Anchor link for: remind-me-why-you-re-writing-yet-another-rust-implementation">§</a>
Remind me why you're writing yet another Rust implementation?</h2>
<p>The more implementations, the better. Writing this implementation has helped find errors in <code>mlspp</code> and the specification itself.</p>
<p>Errors found in <code>mlspp</code> include missing important fields (missing protocol version and missing hash of WelcomeInfo, which enforces sequencing), incorrect tree addressing (using leaf indices instead of node indices and vice-versa), and incorrectly generated test vectors. Errors in the specification that we found include ambiguities (how are removed nodes pruned from the ratchet tree?), logical impossibilities (how can you add a user to the group if your WelcomeInfo doesn't include the current decryption keys?), and deontological omissions (SHOULD<sup class="footnote-reference"><a href="#should">9</a></sup> a user verify the broadcasted pubkeys against their derived pubkeys or not?).</p>
<h2 id="ok-great-but-why-rust"><a class="zola-anchor" href="#ok-great-but-why-rust" aria-label="Anchor link for: ok-great-but-why-rust">§</a>
Ok great, but why Rust?</h2>
<p><em>*cracks knuckles*</em></p>
<p>I thought it would be nice to have an MLS implementation that has a clear API (thanks to <code>molasses</code>' careful design and Rust's strong typing), memory-safe semantics (thanks to the Rust borrow checker), thorough documentation (thanks to <code>cargo doc</code> and <code>molasses</code>' current 43% comment-code ratio), and good performance (thanks to <a href="/assets/images/resf_orig.jpg">ZERO-COST-ABSTRACTIONS</a>). Of course, none of these features make up for the fact that <code>molasses</code> is not formally verified like <code>MLS*</code> and may never be, but hey, nobody ever complained that cotton isn't as bulletproof as kevlar, cuz those are for different things.</p>
<h2 id="how-can-i-help"><a class="zola-anchor" href="#how-can-i-help" aria-label="Anchor link for: how-can-i-help">§</a>
How can I help?</h2>
<p>I don't recommend filing issues with <code>molasses</code> quite yet. The spec is moving too quickly and the library has to be redesigned accordingly each time. If you would like to contribute, the <a href="https://datatracker.ietf.org/wg/mls/about/">MLS IETF page</a> has a mailing list where you can read and participate in discussions. The organizers are helpful and patient, and I appreciate them immensely. If you want to write your own implementation, see the <a href="https://github.com/mlswg/mls-implementations">implementers' Github repo</a> for organizing info and test vectors.</p>
<p>If you are interested in reading more about the protocol and seeing some of the other open issues, you should give <a href="https://datatracker.ietf.org/doc/draft-ietf-mls-protocol/">the spec</a><sup class="footnote-reference"><a href="#spec">10</a></sup> a read.</p>
<div class="footnote-definition" id="pcs"><sup class="footnote-definition-label">1</sup>
<p>Full post-compromise security, i.e., the problem of non-deterministically deriving all new shared data so as to make the excluded parties unable to participate, is actually not easily achieved in this scheme. There is ongoing research in characterizing how post-compromise secure MLS is after a certain number of group updates.</p>
</div>
<div class="footnote-definition" id="good_source"><sup class="footnote-definition-label">2</sup>
<p><a href="https://eprint.iacr.org/2017/666.pdf">Source</a>. This is a fantastic paper which provides a lot of context for this article. Seriously, if you want to understand this topic better, you should read the MLS spec and this paper and compare the two, since they differ in pretty subtle but significant ways. E.g., the ART scheme used in the paper does not allow intermediate nodes to be blank, which affects confidentiality of messages sent to offline members.</p>
</div>
<div class="footnote-definition" id="keybase_source"><sup class="footnote-definition-label">3</sup>
<p><a href="https://rwc.iacr.org/2019/slides/keybase-rwc2019.pdf">Source</a></p>
</div>
<div class="footnote-definition" id="removal"><sup class="footnote-definition-label">4</sup>
<p>The problem of Removal in this article is a placeholder for (a weaker form of) post-compromise security. Here, "group modifications" includes updating key material without changing group membership.</p>
</div>
<div class="footnote-definition" id="self_removal"><sup class="footnote-definition-label">7</sup>
<p>MLS doesn't allow users to remove themselves. This is a quirk of the protocol, but it doesn't really affect anything.</p>
</div>
<div class="footnote-definition" id="tree"><sup class="footnote-definition-label">5</sup>
<p>Specifically, it is a left-balanced binary tree. This is fancy computer talk for "every left subtree is full," which itself is fancy computer talk for "it behaves good when stuffed into an array."</p>
</div>
<div class="footnote-definition" id="falsehoods"><sup class="footnote-definition-label">6</sup>
<p>Both these statements are technically false, but it’s way easier to think of things this way, and it’s close enough to the truth imo. In reality, sender keys are derived from a long chain of secret values relating to group state and state transitions. Node private keys are simpler, but they are also derived from chains of other secrets called "node secrets" and "path secrets." As always, see the spec for more details.</p>
</div>
<div class="footnote-definition" id="hpke"><sup class="footnote-definition-label">8</sup>
<p>If you're confused why I say all these keys are Diffie-Hellman keys and then use public-key encryption, it's because the public-key encryption in MLS is done with <a href="https://en.wikipedia.org/wiki/Integrated_Encryption_Scheme">ECIES</a>. More specifically, it's <a href="https://tools.ietf.org/html/draft-barnes-cfrg-hpke-01">HPKE</a>.</p>
</div>
<div class="footnote-definition" id="should"><sup class="footnote-definition-label">9</sup>
<p>The all-caps "SHOULD" means something specific in IETF RFCs. Its meaning is governed by not one but two RFCs, which are referred to as <a href="https://tools.ietf.org/html/bcp14">Best Current Practice 14</a>. The linguistic conventions of RFCs are super cool and alone make it worth skimming a few specs and paying attention to their "conventions and terminology" sections. <a href="https://tools.ietf.org/html/rfc8446">TLS</a> is as good a place to start as any.</p>
</div>
<div class="footnote-definition" id="spec"><sup class="footnote-definition-label">10</sup>
<p>If you want a particularly nice reading experience, you should compile the spec yourself from <a href="https://github.com/mlswg/mls-protocol">source</a>. It really is appreciably better.</p>
</div>
Siderophile: Expose your Crate's UnsafetyMon, 01 Jul 2019 00:00:00 +0000Michael Rosenberg
https://mrosenberg.pub/blog/siderophile/
https://mrosenberg.pub/blog/siderophile/<p><em>This article was originally posted on Trail of Bits' <a href="https://blog.trailofbits.com/2019/07/01/siderophile-expose-your-crates-unsafety/">blog</a> on July 1, 2019</em></p>
<p><strong><em>Siderophile ([ˈsidərəˌfīl]) — Having an affinity for metallic iron</em></strong></p>
<p>Today we released a tool, <a href="https://github.com/trailofbits/siderophile"><code>siderophile</code></a>, that helps Rust developers find fuzzing targets in their codebases.</p>
<p>Siderophile trawls your crate's dependencies and attempts to find every unsafe function, expression, trait method, etc. It then traces these up the callgraph until it finds the functions in your crate that use the unsafety. It ranks the functions it finds in your crate by badness: the more unsafety a function makes use of, the higher its badness rating.</p>
<p>We created Siderophile for an engagement where we were delivered a massive Rust codebase with a tight timeframe for review. We wanted to fuzz it but weren't even sure where to start. So, we created a tool to determine which functions invoked the most unsafe behavior. We were able to speed up our bug discovery by automating the targeting process with siderophile. We're now open-sourcing this tool so everyone can benefit from it!</p>
<h1 id="sample-output"><a class="zola-anchor" href="#sample-output" aria-label="Anchor link for: sample-output">§</a>
Sample Output</h1>
<p>Here is a sample of <code>siderophile</code> when run on <code>molasses</code>, a crate we're building that implements the MLS cryptographic protocol:</p>
<pre><code>Badness Function
012 molasses::crypto::hash::HashFunction::hash_serializable
005 molasses::crypto::hash::HashContext::feed_serializable
003 molasses::utils::derive_node_values
003 molasses::application::encrypt_application_message
003 molasses::application::decrypt_application_message
003 molasses::group_ctx::GroupContext::new_from_parts
003 molasses::group_ctx::GroupContext::from_welcome
003 molasses::group_ctx::GroupContext::update_transcript_hash
003 molasses::group_ctx::GroupContext::update_tree_hash
003 molasses::group_ctx::GroupContext::update_epoch_secrets
003 molasses::group_ctx::GroupContext::apply_update
...
</code></pre>
<p>As you can see, much of the unsafety comes from the serialization and crypto-heavy routines. We'll be sure to fuzz this bad boy before it goes 1.0.</p>
<h1 id="limitations"><a class="zola-anchor" href="#limitations" aria-label="Anchor link for: limitations">§</a>
Limitations</h1>
<p>This is not guaranteed to catch all the unsafety in a crate's deps. For instance, we can't inspect macros or resolve dynamically dispatched methods, since unsafe tagging only occurs at the source level. The ergonomics of the tool could be better, and we've already identified some incorrect behavior on certain crates. If you're interested in helping out, please do! We are actively maintaining the project and have some issues written out.</p>
<h1 id="try-it-out"><a class="zola-anchor" href="#try-it-out" aria-label="Anchor link for: try-it-out">§</a>
Try it Out</h1>
<p><code>siderophile</code> is on Github along with a better explanation of how it works and how to run the tool. You should run it on your Rust crate and set up fuzzers for what it finds. <a href="https://github.com/trailofbits/siderophile">Check it out</a>!</p>
<p>Finally, thanks to <a href="https://github.com/anderejd/cargo-geiger">cargo-geiger</a> and <a href="https://github.com/praezi/rust/">rust-praezi</a> for current best practices. This project is mostly due to their work.</p>
Confidential Transactions from Cryptographic PrimitivesFri, 07 Jun 2019 02:53:07 +0000Michael Rosenberg
https://mrosenberg.pub/blog/confidential-transactions/
https://mrosenberg.pub/blog/confidential-transactions/<p><em>This article was originally posted on NCC Group's crypto services <a href="https://cryptoservices.github.io/cryptography/2017/07/21/Sigs.html">blog</a> on July 21, 2017</em></p>
<p>During my time at NCC Group this summer, I had the opportunity to dig into all sorts of cryptocurrency software to see how they work and what kind of math they rely on. One sought-after property that some cryptocurrencies (ZCash, Monero, all CryptoNote-based coins) support is <a href="https://elementsproject.org/elements/confidential-transactions/">confidential transactions</a>. To explain what this means, we'll first look at what Bitcoin transactions do.</p>
<p>At its core, a Bitcoin transaction is just a tuple $(\{a_i\}, \{b_i\}, \{v_i\})$ where $\{a_i\}$ are the input addresses, $\{b_i\}$ are the output addresses, and $\{v_i\}$ are the amounts that go to each output. We'll ignore the proof-of-work aspect, since it isn't quite relevant to where we're going with this. Each transaction appears unencrypted for the whole world to see in the public ledger. This is all well and good, but it makes transactions <a href="https://bitcoin.org/en/protect-your-privacy">easy</a> to <a href="https://www.sciencemag.org/news/2016/03/why-criminals-cant-hide-behind-bitcoin">trace</a>, even when the coins go through multiple owners. One way to make this harder is to use a tumbler, which essentially takes in Bitcoin from many sources, mixes them around, and hands back some fresh uncorrelated coins (you might be familiar with this concept under the term "money laundering").</p>
<p>The goal of confidential transactions is to let <em>just</em> the participants of a transaction see the $v_i$ values, and otherwise hide them from the rest of the world. But at the same time, we want non-participants to be able to tell when a transaction is bogus. In particular, we don't want a user to be able to print money by spending more than they actually have. This property was easily achieved in the Bitcoin scheme, since the number of Bitcoin in each address $a_i$ is publicly known. So a verifier need only check that the sum of the outputs doesn't exceed the sum of account contents of the input addresses. But how do we do this when the account contents and the output values are all secret? To show how, we'll need a primer in some core cryptographic constructions. There is a lot of machinery necessary to make this work, so bear with me.</p>
<h1 id="schnorr-signatures"><a class="zola-anchor" href="#schnorr-signatures" aria-label="Anchor link for: schnorr-signatures">§</a>
Schnorr Signatures</h1>
<p>The purpose of a signature is to prove to someone who knows your public information that you have seen a particular value. In the case of Schnorr Signatures, I am working in an abelian group $\mathbb{G}$ of prime order $q$ with generator $G$ and I have a public key $P = xG$ where $x \in \mathbb{Z}_q$ is my secret key.</p>
<p>First, we'll start off with Schnorr proof of knowledge. As a prover, I would like to prove to a verifier that I know the value of $x$ without actually revealing it. Here's how I do it:</p>
<ol>
<li>
<p>First, I pick a random $\alpha \leftarrow \mathbb{Z}_q$ and send $Q = \alpha G$ to the verifier.</p>
</li>
<li>
<p>The verifier picks $e \leftarrow \mathbb{Z}_q$ and sends it to me.</p>
</li>
<li>
<p>I calculate $s = \alpha - ex$ and send $s$ to the verifier.</p>
</li>
<li>
<p>Lastly, the verifier checks that $sG + eP = Q$. Note that if all the other steps were
performed correctly, then indeed</p>
<p>$$
sG + eP
=(\alpha - ex)G + exG
=\alpha G - exG + exG
= \alpha G
= Q
$$</p>
</li>
</ol>
<p>We can quickly prove that this scheme is <em>sound</em> in the sense that being able to consistently pass verification implies knowledge of the secret $x$. To prove this, it suffices to show that an adversary with access to a prover $\mathcal{P}$ and the ability to rewind $\mathcal{P}$ can derive $x$ efficiently. Suppose I have such a $\mathcal{P}$. Running it the first time, I give it any value $e \leftarrow \mathbb{Z}_q$. $\mathcal{P}$ will return its proof $s$. Now I rewind $\mathcal{P}$ to just before I sent $e$. I send a different value $e' \neq e$ and receive its proof $s'$. With these two values, I can easily compute</p>
<p>$$
\frac{s - s'}{e' - e} = \frac{\alpha - ex - \alpha + e'x}{e' - e} = \frac{x(e' - e)}{e' - e} = x
$$</p>
<p>and, voilà, the private key is exposed.</p>
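<p>The extraction arithmetic is easy to check numerically. Here's a tiny sketch with made-up toy values (in a real extraction, $s$ and $s'$ would come from the rewound prover, not be computed directly from $x$):</p>

```python
# Rewinding soundness: two accepting transcripts that share the same
# commitment randomness alpha but have different challenges leak the
# secret x. All arithmetic is mod a toy prime group order q.
q = 101          # toy prime order
x = 57           # the prover's secret key
alpha = 30       # the prover's (reused) commitment randomness

e1, e2 = 11, 42                 # the two challenges we fed the prover
s1 = (alpha - e1 * x) % q       # response from the first run
s2 = (alpha - e2 * x) % q       # response after rewinding

# (s1 - s2) / (e2 - e1) = x, where division is modular inversion
recovered = ((s1 - s2) * pow(e2 - e1, -1, q)) % q
```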
<p>Ok that was pretty irrelevant for where I'm going, but I thought it was a nice quick proof. So how can we use proof of knowledge to construct a signature? Well, we can tweak the above protocol in order to "bind" our proofs of knowledge to a particular message $M \in \{0,1\}^* $. The trick is to use $M$ in the computation of $e$. This also makes the interactivity of this protocol unnecessary. That is, since I am computing $e$ myself, I don't need a challenger to give it to me. But be careful! If we are able to pick $e$ without any restrictions in our proof-of-knowledge algorithm, then we can "prove" we know the private key to any public key $P$ by first picking random $e$ and $s$ and then retroactively letting $Q = sG + eP$. So in order to prevent forgery, $e$ must be difficult to compute before $Q$ is determined, while also being linked somehow to $M$. For this, we make use of a hash function $H: \{0,1\}^* \to \mathbb{Z}_q$. Here's how the algorithm to sign $M \in \{0,1\}^*$ goes. Note that because this is no longer interactive, there is no verifier giving me a challenge:</p>
<ol>
<li>I pick a random $\alpha \leftarrow \mathbb{Z}_q$ and let $Q = \alpha G$.</li>
<li>I compute $e = H(Q \| M)$</li>
<li>I compute $s = \alpha - ex$</li>
<li>I return the signature, which is the tuple $\sigma = (s, e)$</li>
</ol>
<p>Observe that because hash functions are difficult to invert, this algorithm essentially guarantees that $e$ is determined after $Q$. To verify a signature $(s, e)$ of the message $M$, do the following:</p>
<ol>
<li>Let $Q = sG + eP$</li>
<li>Check that $e = H(Q \| M)$</li>
</ol>
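<p>Here's the whole sign/verify cycle as a toy Python sketch. The multiplicative group mod a Mersenne prime stands in for the prime-order elliptic-curve group (so "$sG + eP$" becomes $g^s \cdot P^e$), and SHA-256 plays the role of $H$. None of this is production-grade crypto:</p>

```python
import hashlib
import secrets

# Toy Schnorr signatures in the multiplicative group mod a Mersenne prime.
# Exponents ("scalars") are reduced mod the group order P_MOD - 1.
P_MOD = 2**127 - 1
ORDER = P_MOD - 1
G = 3

def H(Q: int, msg: bytes) -> int:
    data = Q.to_bytes(16, "big") + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER

def keygen():
    x = secrets.randbelow(ORDER - 1) + 1     # secret key
    return x, pow(G, x, P_MOD)               # (x, P = "xG")

def sign(x: int, msg: bytes):
    alpha = secrets.randbelow(ORDER - 1) + 1
    Q = pow(G, alpha, P_MOD)                 # commitment "alpha G"
    e = H(Q, msg)                            # challenge, bound to the message
    s = (alpha - e * x) % ORDER              # response
    return s, e

def verify(P: int, msg: bytes, sig) -> bool:
    s, e = sig
    Q = pow(G, s, P_MOD) * pow(P, e, P_MOD) % P_MOD   # recompute "sG + eP"
    return e == H(Q, msg)
```

Note that <code>verify</code> recomputes $Q = sG + eP$ and checks that it hashes back to $e$, exactly as in the steps above.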
<p>Fantastic! We're now a fraction of the way to confidential transactions! The next step is to extend this type of proof to a context with multiple public keys.</p>
<p>(Extra credit: prove that Schnorr is sound in the Random Oracle Model. That is, assume an adversary has the ability to run and rewind the prover $\mathcal{P}$ as before, but now also has the ability to intercept queries to $H$ and return its own responses, as long as those responses are <em>random</em> and <em>consistent</em> with responses on the same query input.)</p>
<h1 id="aos-ring-signatures"><a class="zola-anchor" href="#aos-ring-signatures" aria-label="Anchor link for: aos-ring-signatures">§</a>
AOS Ring Signatures</h1>
<p>The signatures that end up being used in confidential transactions are called ring signatures. It's the same idea as a regular signature, except less specific: a ring signature of the message $M$ over the public keys $\{P_1, P_2, \ldots, P_n\}$ proves that someone with knowledge of <em>one of the private keys</em> $\{x_1, x_2, \ldots, x_n\}$ has seen the message $M$. So this is a strict generalization of the signatures above, since regular signatures are just ring signatures where $n=1$. Furthermore, it is generally desired that a ring signature not reveal which private key it was that performed the signature. This property is called <em>signer ambiguity</em>.</p>
<p>The <a href="https://www.iacr.org/cryptodb/archive/2002/ASIACRYPT/50/50.pdf">Abe, Ohkubo, Suzuki</a> ring signature scheme is a generalization of Schnorr Signatures. The core idea of the scheme is: for each public key, we compute an $e$ value that depends on the <em>previous</em> $Q$ value, and all the $s$ values are random except for the one that's required to "close" the ring. That "closure" is performed on the $e$ value whose corresponding public key and private key belong to us.</p>
<p>I'll outline the algorithm in general and then give a concrete example. Denote the public keys by $\{P_0, \ldots, P_{n-1}\}$ and let $x_j$ be the private key corresponding to $P_j$; all indices are taken mod $n$. An AOS signature of $M \in \{0,1\}^*$ is computed as follows:</p>
<ol>
<li>Pick $\alpha \leftarrow \mathbb{Z}_q$, let $Q = \alpha G$, and let $e_{j+1} = H(Q \| M)$.</li>
<li>For each index $i = j+1, j+2, \ldots$, wrapping around mod $n$ until returning to $j$, pick $s_i \leftarrow \mathbb{Z}_q$ and let $e_{i+1} = H(s_i G + e_i P_i \| M)$.</li>
<li>Let $s_j = \alpha - e_jx_j$.</li>
<li>Output the signature $\sigma = (e_0, s_0, s_1, \ldots, s_{n-1})$.</li>
</ol>
<p>That's very opaque, so here's an example where there are the public keys $\{P_0, P_1, P_2\}$ and I know the value of $x_1$ such that $P_1 = x_1G$:</p>
<ol>
<li>I start making the ring at index 2: $\alpha \leftarrow \mathbb{Z}_q $. $e_2 = H(\alpha G \| M)$.</li>
<li>I continue making the ring. $s_2 \leftarrow \mathbb{Z}_q $. $e_0 = H(s_2 G + e_2 P_2 \| M) $.</li>
<li>I continue making the ring. $s_0 \leftarrow \mathbb{Z}_q $. $e_1 = H(s_0 G + e_0 P_0 \| M) $.</li>
<li>Now notice that $e_2$ has been determined in two ways: from before, $e_2 = H(\alpha G \| M)$, and also from the property which must hold for every $e$ value: $e_2 = H(s_1 G + e_1 P_1 \| M)$. The only $s_1$ that satisfies these constraints is $s_1 = \alpha - e_1x_1$, which I can easily compute, since I know $x_1$.</li>
<li>Finally, my signature is $\sigma = (e_0, s_0, s_1, s_2)$.</li>
</ol>
<p>The way to verify this signature is to just step all the way through the ring until we loop back around, and then check that the final $e$ value matches the initial one. Here are steps for the above example; the general process should be easy to see:</p>
<ol>
<li>Let $e_1 = H(s_0 G + e_0 P_0 \| M)$.</li>
<li>Let $e_2 = H(s_1 G + e_1 P_1 \| M)$.</li>
<li>Let $e'_0 = H(s_2 G + e_2 P_2 \| M)$.</li>
<li>Check that $e_0 = e'_0$.</li>
</ol>
<p>The verification process checks that <em>some</em> $s$ value was calculated <em>after</em> all the $e$ values were determined, which implies that some secret key is known. Which $s$ it is is well-hidden, though. Notice that all the $s$ values but the last one are random. And also notice that the final $s$ value has $\alpha$ as an offset. But that $\alpha$ was chosen randomly and was never revealed. So this final $s$ value is completely indistinguishable from randomness, and is thus indistinguishable from the truly random $s$ values. Pretty cool, huh?</p>
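<p>Here's the AOS construction as a toy Python sketch. As before, the multiplicative group mod a Mersenne prime stands in for the elliptic-curve group (so "$s_i G + e_i P_i$" is computed as $g^{s_i} P_i^{e_i}$), SHA-256 plays $H$, and the message is hashed at every step as in the base algorithm. Illustrative only, not production crypto:</p>

```python
import hashlib
import secrets

# Toy AOS ring signatures in the multiplicative group mod a Mersenne prime.
P_MOD = 2**127 - 1
ORDER = P_MOD - 1
G = 3

def H(point: int, msg: bytes) -> int:
    data = point.to_bytes(16, "big") + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER

def step(s_i: int, e_i: int, pub_i: int) -> int:
    """Compute "s_i G + e_i P_i" in the toy group."""
    return pow(G, s_i, P_MOD) * pow(pub_i, e_i, P_MOD) % P_MOD

def ring_sign(pubkeys, j, x_j, msg):
    """Sign msg knowing only x_j, the private key for pubkeys[j]."""
    n = len(pubkeys)
    e = [0] * n
    s = [0] * n
    alpha = secrets.randbelow(ORDER - 1) + 1
    e[(j + 1) % n] = H(pow(G, alpha, P_MOD), msg)
    for k in range(1, n):                       # walk the ring back to j
        i = (j + k) % n
        s[i] = secrets.randbelow(ORDER - 1) + 1
        e[(i + 1) % n] = H(step(s[i], e[i], pubkeys[i]), msg)
    s[j] = (alpha - e[j] * x_j) % ORDER         # close the ring
    return e[0], s

def ring_verify(pubkeys, msg, sig):
    e0, s = sig
    e = e0
    for i, pub in enumerate(pubkeys):
        e = H(step(s[i], e, pub), msg)          # recompute e_1, ..., e'_0
    return e == e0
```

With three keypairs, a signature made under any one index verifies, and nothing in the signature reveals which index it was.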
<p>There's one tweak we can make to this that'll slightly improve efficiency and make notation easier. Including $M$ at every step really isn't necessary. It just has to get mixed in at <em>some</em> point in the process. A natural place to put it is in $e_0 = H(s_{n-1} G + e_{n-1} P_{n-1} \| M)$, calculating the other $e$ values without the $M$, like $e_{i+1} = H(s_iG + e_iP_i)$.</p>
<h1 id="borromean-ring-signatures"><a class="zola-anchor" href="#borromean-ring-signatures" aria-label="Anchor link for: borromean-ring-signatures">§</a>
Borromean Ring Signatures</h1>
<p>If you thought we were done generalizing, you're dead wrong. We've got one more step to go. Consider the following situation (and withhold your cries for practical application for just a wee bit longer): there are multiple sets of public keys $\mathcal{A}_1, \mathcal{A}_2, \mathcal{A}_3 $. I, having one private key in each $\mathcal{A}_i $, would like to sign a message $M$ in each of these rings. In doing so, I am proving "Some key in $\mathcal{A}_1$ signed $M$ AND some key in $\mathcal{A}_2$ signed $M$ AND some key in $\mathcal{A}_3$ signed $M$." The naïve approach is to make a separate AOS signature for each set of public keys, giving us a final signature of $\sigma = (\sigma_1, \sigma_2, \sigma_3)$. But it turns out that there is an (admittedly small) optimization that can make the final signature smaller.</p>
<p>Gregory Maxwell's <a href="https://github.com/ElementsProject/borromean-signatures-writeup">Borromean ring signature scheme</a><sup class="footnote-reference"><a href="#1">1</a></sup> makes the optimization of pinning $e_0$ as a shared $e$ value for all rings $\mathcal{A}_i$. More specifically, the paper defines</p>
<p>$$ e_0 = H(R_0 \| R_1 \| \ldots \| R_{n-1} \| M) $$</p>
<p>where each $R_i = s_{i, m_i-1} G + e_{i, m_i-1} P_{i, m_i-1}$ when $j_i \neq m_i-1$, and $R_i = \alpha_i G$ otherwise, and $m_i$ denotes the number of public keys in the $i^\textrm{th}$ ring, and $j_i$ denotes the index of the known private key in the $i^\textrm{th}$ ring. The whole $R$ thing is a technicality. The gist is that the last $e$ and $s$ values of every ring (whether or not it corresponds to the known private key) are incorporated into $e_0$. Here's a pretty picture from the Maxwell paper to aid your geometric intuition (if one believes in such silly things).</p>
<figure>
<img src="/assets/images/borromean.png" alt="A Borromean ring signature on two rings">
<figcaption>
A Borromean ring signature for $(P_0 | P_1| P_2) \& (P_0' | P_3 | P_4)$
</figcaption>
</figure>
<p>The signature itself looks like</p>
<p>$$
\sigma =
\left(
e_0, (s_{0,0}, s_{0,1}, \ldots, s_{0,m_0-1})
, \ldots ,
(s_{n-1,0}, \ldots, s_{n-1,m_{(n-1)}-1})
\right)
$$</p>
<p>where $s_{i,j}$ is the $j^\textrm{th}$ $s$ value in the $i^\textrm{th}$ ring.</p>
<p>For clarity, I did slightly modify some details from this paper, but I don't believe that the modifications impact the security of the construction whatsoever. There is also the important detail of mixing the ring number and position in the ring into at least one $e$ value per ring so that rings cannot be moved around without breaking the signature. The mixing is done by simply hashing the values into some $e$.</p>
<p>Anyway, the end result of this construction is a method of constructing $n$ separate ring signatures using $\sum m_i + 1$ values (the $s$ values plus the one $e_0$) instead of the naïve way, in which we would have to include $e_{0,0}, e_{1,0}, \ldots, e_{(n-1),0}$. This saves us $n-1$ integers in the signature.</p>
<p>You might be wondering how large $n$ is that such savings are worth a brand-new signature scheme. If you are wondering that, stop reading, because you won't get an answer. Onwards towards more theory!</p>
<h1 id="pedersen-commitments"><a class="zola-anchor" href="#pedersen-commitments" aria-label="Anchor link for: pedersen-commitments">§</a>
Pedersen Commitments</h1>
<p>Alright, we have all the signature technology we need. Now let's turn that fear of math into fear of commitment(s). A commitment is a value that is published prior to the revealing of some information. The commitment proves that you knew that information before it was revealed. Suppose I wanted to prove to someone that I know the winner of tomorrow's horse race, but I don't want to tell them because they might make a massive bet and raise suspicion. I could tweet out the SHA256 hash</p>
<p><code>1c5d6a56ec257e5fe6f733e7</code><code>e81f6f2571475d44c09fa</code><code>a9ecdaa2ff1c4a49ecd</code></p>
<p>Once the race is over, I tweet again, revealing that the preimage of the hash was "Cloud Computing". Since finding the preimage of a hash function is capital-D-Difficult, I have effectively proven that I knew ahead of time that Cloud Computing would win (note: the set of possible winners is so small that someone can easily just try all the names and see what matches. In this case, I would pick a random number and commit to "Cloud Computing.ba9fd6d66f9bd53d" and then reveal <em>that</em> later.)</p>
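<p>The salted hash commitment from this example looks something like the following sketch (the helper names are mine; the salt matches the post's example nonce format):</p>

```python
import hashlib
import secrets

def hash_commit(msg: str) -> tuple[str, str]:
    """Commit to msg with a random salt so small message spaces can't be brute-forced."""
    salt = secrets.token_hex(8)   # e.g. "ba9fd6d66f9bd53d"
    digest = hashlib.sha256(f"{msg}.{salt}".encode()).hexdigest()
    return digest, salt

def hash_open(digest: str, msg: str, salt: str) -> bool:
    """Reveal the message and salt; anyone can recompute the hash and compare."""
    return digest == hashlib.sha256(f"{msg}.{salt}".encode()).hexdigest()

digest, salt = hash_commit("Cloud Computing")   # publish digest before the race
assert hash_open(digest, "Cloud Computing", salt)
assert not hash_open(digest, "Justify", salt)   # another horse's name fails
```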
<p>Pedersen commitments are a type of commitment scheme with some nice properties that the hashing technique above doesn't have. A Pedersen commitment in an abelian group $\mathbb{G}$ of prime order $q$ requires two public and unrelated generators, $G$ and $H$ (by unrelated, I mean there should be no obvious relation $aG = H$). If I want to commit to the value $v \in \mathbb{Z}_q$ I do as follows:</p>
<ol>
<li>Pick a random "blinding factor" $\alpha \leftarrow \mathbb{Z}_q $.</li>
<li>Return $Q = \alpha G + v H$ as my commitment.</li>
</ol>
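<p>The two steps above, sketched in Python over a toy group (these parameters are far too small to be secure, and $H$ here is just another fixed square mod $p$; in practice $H$ would be derived by hashing so that nobody knows its discrete log with respect to $G$):</p>

```python
import secrets

# Toy group (NOT secure): p = 2q + 1 is a safe prime; the squares mod p form
# a subgroup of prime order q. Additive aG + vH becomes g^a * h^v mod p.
q = 1019
p = 2 * q + 1   # 2039, also prime
g, h = 4, 9     # two squares mod p; a toy stand-in for unrelated generators

def commit(v, alpha):
    """Pedersen commitment Q = alpha*G + v*H (multiplicatively: g^alpha * h^v)."""
    return pow(g, alpha, p) * pow(h, v, p) % p

def open_commitment(Q, v, alpha):
    """Reveal (alpha, v) and let the verifier recompute the commitment."""
    return Q == commit(v, alpha)

alpha = secrets.randbelow(q)    # the random blinding factor
Q = commit(42, alpha)
assert open_commitment(Q, 42, alpha)
assert not open_commitment(Q, 43, alpha)   # a different value doesn't open Q
```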
<p>That's it. The way I reveal my commitment is simply by revealing my initial values $(\alpha, v)$. It's worth quickly checking that the scheme is <em>binding</em>, that is, if I make a commitment to $(\alpha, v)$, it's hard to come up with different values $(\alpha', v')$ that result in the same commitment. For suppose I were able to do such a thing; then</p>
<p>$$
\alpha G + v H = \alpha' G + v' H
\implies (\alpha - \alpha')G = (v' - v)H
\implies G = \frac{v' - v}{\alpha - \alpha'}H
$$</p>
<p>and we've found the discrete logarithm of $H$ with respect to $G$, which we assumed earlier was hard. Another cool property (which is totally unrelated to anything) is <em>perfect hiding</em>. That is, for any commitment $Q$ and any value $v$, there is a blinding factor $\alpha$ such that $Q$ is a valid commitment to $(\alpha, v)$. This is just by virtue of the fact that, since $G$ is a generator, there must be an $\alpha$ such that $\alpha G = Q - vH$ (and since $H$ is also a generator, this works just as well if you fix $Q$ and $\alpha$ and derive $v$). Perfect hiding means that, when $\alpha$ is truly random, you cannot learn anything about $v$ given just $Q$.</p>
<p>Lastly, and very importantly, Pedersen commitments are additively homomorphic. That means that if $Q$ commits to $(\alpha, v)$ and $Q'$ commits to $(\alpha', v')$, then</p>
<p>$$ Q + Q' = \alpha G + vH + \alpha' G + v'H = (\alpha + \alpha')G + (v + v')H $$</p>
<p>So the commitment $Q + Q'$ commits to $(\alpha + \alpha', v + v')$. We'll use this property in just a second.</p>
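<p>The homomorphism is easy to check numerically (same toy group as before; note that in multiplicative notation, "adding" two commitments means multiplying them mod $p$):</p>

```python
import secrets

# Same toy group as before (NOT secure)
q = 1019
p = 2 * q + 1
g, h = 4, 9

def commit(v, alpha):
    return pow(g, alpha, p) * pow(h, v, p) % p

a1, a2 = secrets.randbelow(q), secrets.randbelow(q)
Q1, Q2 = commit(3, a1), commit(7, a2)

# The product of the commitments is a commitment to the sums:
assert Q1 * Q2 % p == commit(3 + 7, (a1 + a2) % q)
```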
<h1 id="hiding-transaction-amounts"><a class="zola-anchor" href="#hiding-transaction-amounts" aria-label="Anchor link for: hiding-transaction-amounts">§</a>
Hiding Transaction Amounts</h1>
<p>Ok so back to the problem statement. We'll simplify it a little bit. A transaction has an input amount $a$, an output amount $b$, and a transaction fee $f$, all in $\mathbb{Z}_q$. To maintain consistency, every transaction should satisfy the property $a = b + f$, i.e., total input equals total output, so no money appears out of thin air and no money disappears into nothingness. We can actually already prove that this equation is satisfied without revealing any of the values by using Pedersen commitments. Pick random $\alpha_a \leftarrow \mathbb{Z}_q$, $\alpha_b \leftarrow \mathbb{Z}_q$, and let $\alpha_f = \alpha_a - \alpha_b$. Now make the Pedersen commitments</p>
<p>$$ P = \alpha_a G + aH \quad Q = \alpha_b G + bH \quad R = \alpha_f G + fH $$</p>
<p>and publish $(P,Q,R)$ as your transaction. Then a verifier won't be able to determine any of the values of $a$, $b$, or $f$, but will still be able to verify that</p>
<p>$$ P - Q - R = (\alpha_a - \alpha_b - \alpha_f) G + (a - b - f)H = 0G + 0H = \mathcal{O} $$</p>
<p>Remember, if someone tries to cheat and picks values so $a - b - f \neq 0$, then they'll have to find an $\alpha$ such that $-\alpha G = (a - b - f) H$, which is Hard. So we're done, right? Problem solved! Well, not quite yet. What we actually have here is a proof that $a - b - f \equiv 0 \pmod q$. See the distinction? For example, let $q$ be a large prime, say, 13. I'll have the input to my transaction be 1🔥TC (Litcoin; ICO is next week, check it out). I'd like to print some money, so I set my output to be 9🔥TC. I'll be generous and give the miner 5🔥TC as my fee. Then anyone can check via the generated Pedersen commitments that</p>
<p>$$ a - b - f = 1 - 9 - 5 = -13 \equiv 0 \pmod{13} $$</p>
<p>So this transaction passes the correctness test. What happened? I overflowed and ended up wrapping around the modulus. Since all our arithmetic is done modulo $q$, none of the above algorithms can tell the difference! So how can we prevent the above situation from happening? How do I prove that my inputs don't wrap around the modulus and come back to zero? One word:</p>
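<p>Both the honest balance check and the wraparound cheat can be reproduced in the toy group. Here $q = 1019$ stands in for the $q = 13$ above, so the cheat becomes $1 - 9 - 1011 = -1019 \equiv 0 \pmod{1019}$:</p>

```python
import secrets

# Same toy group as before (NOT secure)
q = 1019
p = 2 * q + 1
g, h = 4, 9

def commit(v, alpha):
    return pow(g, alpha, p) * pow(h, v, p) % p

def balanced(P, Q, R):
    """Check P - Q - R == identity; multiplicatively, P * (Q*R)^-1 == 1 mod p."""
    return P * pow(Q * R % p, p - 2, p) % p == 1   # Fermat inverse, p is prime

aa, ab = secrets.randbelow(q), secrets.randbelow(q)
af = (aa - ab) % q                                 # alpha_f = alpha_a - alpha_b

# Honest transaction: 10 = 7 + 3 over the integers
P, Q, R = commit(10, aa), commit(7, ab), commit(3, af)
assert balanced(P, Q, R)

# Wraparound cheat: 1 - 9 - 1011 = -1019, which is 0 mod q, so the
# "balance" check still passes even though 9 + 1011 != 1
P, Q, R = commit(1, aa), commit(9, ab), commit(1011, af)
assert balanced(P, Q, R)
```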
<h1 id="rangeproofs"><a class="zola-anchor" href="#rangeproofs" aria-label="Anchor link for: rangeproofs">§</a>
Rangeproofs</h1>
<p>To prove that our arithmetic doesn't wrap around the modulus, it suffices to prove that the values $a,b,f$ are small enough that their sum does not exceed $q$. To avoid thinking about negative numbers, we'll check that $a = b + f$ instead of $a - b - f = 0$; the two equations are equivalent, but the first is a bit easier to reason about. To show that $b + f < q$, we will actually show that $b$ and $f$ can be represented in binary with $k$ bits, where $2^{k+1} < q$ (this ensures that overflow can't happen, since $b,f < 2^k$ and $2^k + 2^k = 2^{k+1} < q$). In particular, for both $b$ and $f$, we will make $k$ Pedersen commitments, where each $v$ value is provably 0 or a power of two, and the sum of the commitments equals the commitment of $b$ or $f$, respectively. Let's do it step by step.</p>
<ol>
<li>
<p>I start with a value $v$ that I want to prove is representable with $k$ bits. First, pick a random $\alpha \leftarrow \mathbb{Z}_q$ and make a Pedersen commitment $P = \alpha G + v H$</p>
</li>
<li>
<p>Break $v$ down into its binary representation: $v = b_0 + 2b_1 + \ldots + 2^{k-1}b_{k-1} $.</p>
</li>
<li>
<p>For each summand, make a Pedersen commitment, making sure that the sum of the commitments is $P$. That is,</p>
<p>$$
\forall\, 0 \leq i < k-1 \quad \textrm{pick } \alpha_i \leftarrow \mathbb{Z}_q \newline
\textrm{and let } \alpha_{k-1} = \alpha - \sum_{i=0}^{k-2} \alpha_i
$$</p>
<p>Then for all $i$, commit</p>
<p>$$ P_i = \alpha_i G + 2^ib_i H $$</p>
<p>This ensures that $P = P_0 + P_1 + \ldots + P_{k-1}$. The verifier will be checking this property later.</p>
</li>
</ol>
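<p>Steps 1–3 above, sketched in the same toy group (here $k = 8$, which satisfies $2^{k+1} = 512 < q = 1019$):</p>

```python
import secrets

# Same toy group as before (NOT secure)
q = 1019
p = 2 * q + 1
g, h = 4, 9
k = 8   # prove v fits in 8 bits; 2^(k+1) = 512 < q

def commit(v, alpha):
    return pow(g, alpha, p) * pow(h, v, p) % p

# Step 1: commit to v
v = 0b10110101   # 181, fits in k = 8 bits
alpha = secrets.randbelow(q)
P = commit(v, alpha)

# Step 2: binary decomposition of v
bits = [(v >> i) & 1 for i in range(k)]

# Step 3: per-bit commitments whose blinding factors sum to alpha
alphas = [secrets.randbelow(q) for _ in range(k - 1)]
alphas.append((alpha - sum(alphas)) % q)   # alpha_{k-1} closes the sum
Ps = [commit((1 << i) * bits[i], alphas[i]) for i in range(k)]

# The verifier checks that the per-bit commitments add (multiply) back to P
prod = 1
for Pi in Ps:
    prod = prod * Pi % p
assert prod == P
```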
<p>Great. So far we've provably broken down a single number into $k$ constituents, while hiding all
the bits. But how does a verifier know that all the $b$ values are bits? What's preventing me
from picking $b_0 = 3^{200}$, for example? This is where we will use ring signatures! For each
commitment, we'll make the set $\mathcal{A}_i = \{P_i, P_i - 2^iH\}$ and treat that as a set
of public keys for a ring signature. Note that, because we know the binary expansion of $v$, we
know the private key to exactly one of the public keys in $\mathcal{A}_i$. This is because</p>
<p>$$
\begin{aligned}
b_i = 0 &\implies P_i = \alpha_i G + 0H = \alpha_i G \newline
b_i = 1 &\implies P_i - 2^iH = \alpha_i G + 2^iH - 2^iH = \alpha_i G
\end{aligned}
$$</p>
<p>So to prove that $b_i = 0 \textrm{ or } 1$, we construct a ring signature over $\mathcal{A}_i$. Since the ring signature is signer-ambiguous, a verifier can't determine which key did the signing. This means we get to hide all the bits, while simultaneously proving that they are indeed bits! We get some space savings by using Borromean signatures here, since we'll have $k$ total signatures of size 2 each. The final rangeproof of the value $v$ is thus</p>
<p>$$
R_v = (P_0, \ldots, P_{k-1}, e_0, s_0, \overline{s_0}, s_1, \overline{s_1}, \ldots, s_{k-1}, \overline{s_{k-1}})
$$</p>
<p>where $s_i$ and $\overline{s_i}$ are the $s$ values of the $i^\textrm{th}$ ring signature. Obviously, the choice of binary representation as opposed to, say, base-16 representation is arbitrary, since you can make rings as big as you want, where each public key corresponds to a digit in that representation. But note that the space savings that Borromean ring signatures give us come from the number of rings, not their size. So it appears to be a good strategy to make the rings as small as possible and let the center $e_0$ value take the place of as many $e$ values as possible.</p>
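<p>It's easy to check numerically that each two-key ring $\{P_i, P_i - 2^iH\}$ always contains exactly one key with a known discrete log, whichever value the bit takes (same toy group; this checks only the key-membership fact, not the ring signature itself):</p>

```python
import secrets

# Same toy group as before (NOT secure)
q = 1019
p = 2 * q + 1
g, h = 4, 9

def commit(v, alpha):
    return pow(g, alpha, p) * pow(h, v, p) % p

h_inv = pow(h, p - 2, p)   # h^-1, so P_i - 2^i*H becomes P_i * h^(-2^i) mod p

i = 3   # bit position; the ring hides whether b_i is 0 or 1
for b in (0, 1):
    alpha_i = secrets.randbelow(q)
    Pi = commit((1 << i) * b, alpha_i)
    ring = [Pi, Pi * pow(h_inv, 1 << i, p) % p]   # {P_i, P_i - 2^i*H}
    # Whichever entry equals g^alpha_i is the one we can sign with:
    # b = 0 gives P_i = alpha_i*G, while b = 1 gives P_i - 2^i*H = alpha_i*G
    assert ring[b] == pow(g, alpha_i, p)
```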
<h1 id="putting-it-all-together"><a class="zola-anchor" href="#putting-it-all-together" aria-label="Anchor link for: putting-it-all-together">§</a>
Putting It All Together</h1>
<p>So to recap, we have picked transaction input $a$, output $b$, and fee $f$, and hidden them with Pedersen commitments $P_a$, $P_b$, and $P_f$. This gives verifiers the ability to check correctness of the transaction up to modulus-wrapping. Then we constructed the commitments' corresponding rangeproofs $R_a$, $R_b$, and $R_f$ so that a verifier gets the last piece of assurance that the transaction is correct <em>and</em> there is no overflow. So, in total, a confidential transaction is the tuple</p>
<p>$$ (P_a, P_b, P_f, R_a, R_b, R_f) $$</p>
<p>And that's how confidential transactions work! If I want to send 🔥TC to someone, I can construct a confidential transaction that I make public, and then privately reveal the openings of $P_a$, $P_b$, and $P_f$ so that they can be sure that I actually sent what I claim. Because the commitments are binding, they can be certain that I can't claim to someone else that I sent different $a$, $b$, or $f$ values.</p>
<p>There's plenty more detail in how transactions are constructed that I didn't cover, but I hope I was able to explain the core of confidential transactions, and hopefully interest you in cryptography a little bit more. There's a lot of cool stuff out there, and cryptocurrencies are a massive playing field for novel constructions.</p>
<div class="footnote-definition" id="1"><sup class="footnote-definition-label">1</sup>
<p>Sorry, you're gonna have to compile the $\LaTeX$ yourself. Every PDF on the internet is
either outdated or erroneous.</p>
</div>