pdf icon
Theory of Computing Library
Graduate Surveys 9
A Survey on Distribution Testing: Your Data is Big. But is it Blue?
Published: August 15, 2020    (100 pages)
Download article from ToC site:
[PDF (804K)] [PS (5696K)] [Source ZIP]
Keywords: property testing, distribution testing, probability distributions, algorithms, lower bounds
ACM Classification: G.3, F.2.2
AMS Classification: 68Q32, 68W20, 68Q17, 68Q87

Abstract: [Plain Text Version]

The field of property testing originated in work on program checking, and has evolved into an established and very active research area. In this work, we survey the developments of one of its most recent and prolific offsprings, distribution testing. This subfield, at the junction of property testing and Statistics, is concerned with studying properties of probability distributions.

We cover the current status of distribution testing in several settings, starting with the traditional sampling model where the algorithm obtains independent samples from the distribution. We then discuss different recent models, which either grant the testing algorithms more powerful types of queries, or evaluate their performance against that of an information-theoretically optimal “adversary” (for a given number of samples). In each setting, we describe the state of the art for a variety of testing problems.

We hope this survey will serve as a self-contained introduction for those considering research in this field.