Revised: January 22, 2019
Published: September 25, 2020
Abstract: [Plain Text Version]
The population recovery problem is a basic problem in noisy unsupervised learning that has attracted significant attention in recent years (Dvir et al., ITCS'12), (Wigderson and Yehudayoff, STOC'12), (Moitra and Saks, FOCS'13), (Batman et al., RANDOM'13), (Lovett and Zhang, STOC'15), (De et al., FOCS'16). A number of variants of this problem have been studied, often under assumptions on the unknown distribution (such as that it has restricted support size). In this article we study the sample complexity and algorithmic complexity of the most general version of the problem, under both the bit-flip noise and the erasure noise models. We give essentially matching upper and lower sample complexity bounds for both noise models, and efficient algorithms matching these sample complexity bounds up to polynomial factors.