Soledad Villar
NYU Center for Data Science
60 5th Ave, New York, NY 10011
Office 621
soledad.villar at nyu edu

I am a Research Fellow at the Center for Data Science at New York University. I also have a Collaboration Scientist appointment at the Algorithms and Geometry Simons Collaboration. My research is in mathematical data science and optimization. My papers can be found on my Google Scholar profile and my code can be found on my GitHub.

Before that, I was a Research Fellow at the Simons Institute, UC Berkeley. I got my PhD in Mathematics from the University of Texas at Austin, where my advisor was Rachel Ward. Here is my CV.

Optimization and learning techniques for clustering problems. Statistical Physics and Machine Learning back together, Cargese, August 2018.

Mathematical data science

I am interested in computational and mathematical aspects of extracting information from data. In particular, I am interested in clustering, quadratic assignment, and dimensionality reduction. I am also interested in deep learning, in particular graph neural networks and generative models. Recently I have been working on problems related to single-cell RNA sequencing data, thanks to my friend and collaborator Bianca Dumitrascu.


Selected publications
SqueezeFit: Label-aware dimensionality reduction by semidefinite programming.
With C. McWhirter and D. G. Mixon. In IEEE Transactions on Information Theory (to appear).
On the equivalence between graph isomorphism testing and function approximation with GNNs.
With Z. Chen, L. Chen and J. Bruna. In Advances in Neural Information Processing Systems (NeurIPS 2019) pp. 15868-15876.
Clustering subgaussian mixtures by semidefinite programming.
With D. G. Mixon and R. Ward. In Information and Inference: A Journal of the IMA 6 (4), pp. 389-415. [code]
Probably certifiably correct k-means clustering.
With T. Iguchi, D. G. Mixon and J. Peterson. In Mathematical Programming 2017 (165), pp. 605–642. [code]
Relax, no need to round: integrality of clustering formulations.
With P. Awasthi, A. S. Bandeira, M. Charikar, R. Krishnaswamy and R. Ward. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science. pp. 191-200.
Fair redistricting is hard.
With R. Kueng and D. G. Mixon. In Theoretical Computer Science 2019 (791), pp. 28-35.
Experimental performance of graph neural networks on random instances of max-cut.
With W. Yao and A. S. Bandeira. In SPIE Wavelets and Sparsity 2019 XVIII 11138, 111380S.
A note on learning algorithms for quadratic assignment with graph neural networks.
With A. Nowak, A. S. Bandeira and J. Bruna. In IEEE Data Science Workshop, 2018 pp. 229-233. [code]
Projected power iteration for network alignment.
With E. Onaran. In SPIE Wavelets and Sparsity 2017 XVII 10394, 103941C.
Manifold optimization for k-means clustering.
With T. Carson, D. G. Mixon and R. Ward. In IEEE International Conference on Sampling Theory and Applications (SampTA 2017) pp. 73-77. [code]
Preprints and working papers
Optimal gene selection for cell type discrimination in single cell analyses.
With B. Dumitrascu, D. G. Mixon and B. Engelhardt. [code]
Can graph neural networks count substructures?
With Z. Chen, L. Chen and J. Bruna.
MREC: a fast and versatile framework for aligning and matching point clouds with applications to single cell molecular data.
With A. J. Blumberg, M. Carriere, M. A. Mandell and R. Rabadan.
Utility Ghost: Gamified redistricting with partisan symmetry.
With D. G. Mixon.
SUNLayer: stable denoising with generative networks.
With D. G. Mixon.
Monte Carlo approximation certificates for k-means clustering.
With D. G. Mixon.
A polynomial-time relaxation of the Gromov-Hausdorff distance.
With A. S. Bandeira, A. J. Blumberg and R. Ward. [code]
On the tightness of an SDP relaxation of kmeans clustering.
With T. Iguchi, D. G. Mixon and J. Peterson.
Unpublished undergraduate and master's theses (Number Theory)
I used to study number theory. My advisor was Gonzalo Tornaria, at the Universidad de la República, Uruguay.
Gross formula on heights and special values of L-series.
My master's thesis on modular forms and quaternion algebras (in Spanish).
Pell curves cryptography and generalizations.
My undergraduate thesis (in Spanish).

NYU Center for Data Science. Fall 2019

Inference and Representation.

University of Texas at Austin (2012-2015)

I have worked as a Teaching Assistant (leading discussion sessions, holding office hours and grading) for the following courses.
  • Differential equations
  • From numbers to chaos
  • Differential calculus
  • Integral calculus
  • Functions of a complex variable
  • Introduction to mathematics

Universidad de la República, Uruguay (2008-2012)

College of Engineering (2012)
I worked as an Instructor (giving lectures and grading) for Calculus 1.
College of Natural Sciences (2008-2012)
I worked as a Teaching Assistant (leading discussion sessions and grading) in the following courses.
  • Introduction to topology
  • Programming (Python)
  • Programming (Haskell)
  • Linear algebra
  • Mathematics for life sciences

Universidad Católica del Uruguay (2010-2011)

I worked as a teaching assistant and lecturer in the following courses.
  • Complex analysis (teaching assistant)
  • Linear algebra (lecturer)

Math Olympiads

I started doing math for fun thanks to the Uruguay Math Olympiads. After I graduated from high school, I joined the organization as a volunteer. I worked as a trainer and a jury member for the National Math Olympiads. I have conducted workshops and seminars for high school students and teachers in which I taught problem-solving tools.