Multilinear Communication Protocol for Message Comparison

August 21, 2024

See Reed-Solomon Encoding and Fingerprinting for a univariate protocol based on Reed-Solomon encodings.

Protocol

Alice and Bob have messages of length nn. Bob wants to check if Alice's message is equal to his but transmitting the full messages is too expensive. This protocol uses multilinear extensions to enable message comparison in O(log2(n)2)O(log_2(n)^2) bits of communication.

  1. Fix a prime pp such that p>ncp>n^c where cc is a constant greater than 1 and p>sp>s where ss is the number of symbols in the character set used to write the messages.
  2. Map each symbol in the messages to its corresponding element of Fp\mathbb{F}_p and represent the messages as nn-dimensional vectors a=(a1,,an)Fpna = (a_1,\dots,a_n) \in \mathbb{F}_p^n and b=(b1,,bn)Fpnb = (b_1,\dots,b_n) \in \mathbb{F}_p^n.
  3. Let gag_a be a function mapping the binary representations of the indices {0,,n1}\{0,\dots,n-1\} to their corresponding entries in aa. For i{0,,n1}i \in \{0,\dots,n-1\} represented as binary vectors
    ga(i)=ai+1g_a(i) = a_{i+1}
  4. There are nn entries in aa, so k=log2(n)k = \lceil log_2(n) \rceil bits of binary are required to encode them. Since 2k2^k may be greater than nn, excess indices in the domain {0,1}k\{0,1\}^k will be mapped to 00.
    ga:{0,1}kFpg_a:\{0,1\}^k\rightarrow \mathbb{F}_p
  5. To specify gag_a, we use multivariate Lagrange interpolation. For interpolating set {0,1}k\{0,1\}^k, the Lagrange basis is the set of polynomials {δv:v{0,1}k}\{\delta_v : v \in\{0,1\}^k\}.
    • For v{0,1}kv \in \{0,1\}^k, the vvth Lagrange basis polynomial δv:{0,1}k{0,1}\delta_v: \{0,1\}^k \rightarrow \{0,1\} maps all x{0,1}k{v}x \in \{0,1\}^k \setminus \{v\} to 0 and x=vx = v to 1.
      δv(x1,,xk)=i=1kxivi+(1xi)(1vi)\delta_v(x_1,\dots,x_k)= \prod_{i=1}^{k}x_i \cdot v_i + (1 - x_i)(1-v_i)
  6. The interpolating polynomial is
    ga(x1,,xk)=i{0,1}kai+1δi(x1,,xk)g_a(x_1,\dots,x_k) = \sum_{i \in \{0,1\}^k} a_{i+1}\cdot \delta_i(x_1,\dots,x_k)
    • ai+1a_{i+1} represents the entry in vector aa corresponding to index i+1i+1.
  7. Let g~a:FpnFp\tilde{g}_a: \mathbb{F}_p^n \rightarrow \mathbb{F}_p be the extension of gag_a from the domain {0,1}k\{0,1\}^k to the larger domain Fpn\mathbb{F}_p^n.
    • {0,1}\{0,1\} map to the additive and multiplicative identities in Fpn\mathbb{F}_p^n.
  8. Use the same method to define a function g~b:FpnFp\tilde{g}_b: \mathbb{F}_p^n \rightarrow \mathbb{F}_p where g~b(i)=bi+1\tilde{g}_b(i) = b_{i+1} for i{0,,n1}i \in \{0,\dots,n-1\} represented as kk-dimensional binary vectors.
    gb(x1,,xk)=i{0,1}kbi+1δi(x1,,xk)g_b(x_1,\dots,x_k) = \sum_{i \in \{0,1\}^k} b_{i+1}\cdot \delta_i(x_1,\dots,x_k)
  9. Alice evaluates y=g~a(x)y = \tilde{g}_a(x) at a random vector xFpnx \in \mathbb{F}_p^n.
  10. Alice sends yy and xx to Bob.
  11. Bob evaluates z=g~b(x)z = \tilde{g}_b(x) and checks that z=yz = y. If they are not equal, he concludes that aba \neq b. If they are equal, he concludes it is likely that the a=ba = b.
    PrxFpn[g~a(x)=g~b(x)]kp\Pr_{x \in \mathbb{F}_p^n}[\tilde{g}_a(x) = \tilde{g}_b(x)] \leq \frac{k}{p}

Analysis

Error Probability

If Alice and Bob's messages are equal, a=ba = b and xFpk(g~a(x)=g~b(x))\forall x\in \mathbb{F}_p^k ( \tilde{g}_a(x) = \tilde{g}_b(x)). Then for any vector xFpkx\in \mathbb{F}_p^k picked by Alice, Bob will find that g~a(x)=g~b(x)\tilde{g}_a(x) = \tilde{g}_b(x) and output EQUAL, so the error probability is 0.

If Alice and Bob's messages are not equal, aba \neq b and the error probability is the probability that for the vector xFpkx\in \mathbb{F}_p^k picked by Alice, g~a(x)=g~b(x)\tilde{g}_a(x) = \tilde{g}_b(x), causing Bob to output EQUAL for two non equal messages.

  • See Multilinear Lagrange Interpolation for proof that if aba \neq b then g~a(x)g~b(x)\tilde{g}_a(x) \neq \tilde{g}_b(x).
  • Since g~a(x)\tilde{g}_a(x) and g~b(x)\tilde{g}_b(x) are both multilinear and kk variate, they both have total degree k\leq k.
  • Let q=g~ag~bq = \tilde{g}_a - \tilde{g}_b. Since g~a(x)g~b(x)\tilde{g}_a(x) \neq \tilde{g}_b(x), qq is nonzero. It is also multilinear and kk variate since it is the difference of two multilinear kk variate polynomials. The Schwartz-Zippel gives us the probability that a random vector xFpkx\in \mathbb{F}_p^k is a zero of qq
    PrxFpk[q(x)=0]kp\Pr_{x \in \mathbb{F}^k_p}[q(x) = 0]\leq \frac{k}{p}
  • This is the same as the probability that g~a(x)\tilde{g}_a(x) and g~b(x)\tilde{g}_b(x) agree at xFpkx\in \mathbb{F}_p^k
    PrxFpk[g~a(x)=g~b(x)]kp\Pr_{x \in \mathbb{F}^k_p}[\tilde{g}_a(x) = \tilde{g}_b(x)]\leq \frac{k}{p}
  • To bound the probability of Bob determining that a=ba = b to 1n\frac{1}{n}, we need it to be the case that k/p1/nk/p \leq 1/n. Since k=log2(n)k = log_2(n), picking p>log2(n)np > log_2(n)\cdot n will achieve this goal.
    kp=log2(n)nlog2(n)=1n\frac{k}{p} = \frac{log_2(n)}{n \cdot log_2(n)} = \frac{1}{n}

Communication Bits

  • Alice must send bob xFpkx\in \mathbb{F}_p^k and g~a(x)\tilde{g}_a(x) which is an element of Fp\mathbb{F}_p, so she sends k+1k+1 field elements.
  • There's pp elements in the field, so we need log2(p)\lceil log_2(p) \rceil bits to represent a field element.
    • Suppose we've selected p=ncp = n^c where cc is a constant and c>1c \gt1.
    • Then it is still the case that p>log2(n)np > log_2(n)\cdot n, bounding the error probability below 1n\frac{1}{n}
    • The number of bits required to transmit a field element is then clog2(n)\lceil c \cdot log_2(n) \rceil.
  • Since k=log2(n)k = \lceil log_2(n) \rceil, we can compute the number of bits to transmit k+1k+1 field elements:
    (k+1)(clog2(n))(k + 1)(\lceil c \cdot log_2(n) \rceil)
    =(log2(n)+1)(clog2(n))= (\lceil log_2(n) \rceil + 1)(\lceil c \cdot log_2(n) \rceil)
    =log2(n)clog2(n)+clog2(n)=\lceil log_2(n) \rceil \cdot \lceil c \cdot log_2(n) \rceil + \lceil c \cdot log_2(n) \rceil
  • Dropping the insignificant terms, we have log2(n)log2(n)log_2(n)\cdot log_2(n). Thus, the communication cost in big O is O(log2(n)2)O(log_2(n)^2).

Profile picture

Written by Breads