Theory Seminar: Verifying The unseen: Interactive proofs for Label-Invariant distribution Properties

Tal Herman (Weizmann Institute of Science)

Wednesday, 18.01.2023, 12:30

Taub 201

Given i.i.d. samples from an unknown distribution over a large domain [N], approximating several basic quantities, including the distribution’s support size, its entropy, and its distance from the uniform distribution, requires (NlogN) samples [Valiant and Valiant, STOC 2011].
Suppose, however, that we can interact with a powerful but untrusted prover, who knows the entire distribution (or a good approximation of it). Can we use such a prover to approximate (or rather, to approximately {\em verify}) such statistical quantities more efficiently? We show that this is indeed the case: the support size, the entropy, and the distance from the uniform distribution, can all be approximately verified via a 2-message interactive proof, where the communication complexity, the verifier’s running time, and the sample complexity are O(N) . For all these quantities, the sample complexity is tight up to \polylogN factors (for any interactive proof, regardless of its communication complexity or verification time).
More generally, we give a tolerant interactive proof system with the above sample and communication complexities for verifying a distribution’s proximity to any label-invariant property (any property that is invariant to re-labeling of the elements in the distribution’s support). The verifier’s running time in this more general protocol is also O(N) , under a mild assumption about the complexity of deciding, given a compact representation of a distribution, whether it is in the property or far from it.