When Do You Need Billions of Words of Pretraining Data? (code) Unpublished manuscript, 2020
Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options (code and data) Proceedings of AACL, 2020
English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too (code) Proceedings of AACL, 2020
Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually) (code and data) Proceedings of EMNLP, 2020
Precise Task Formalization Matters in Winograd Schema Evaluations (code) Proceedings of EMNLP (short paper), 2020
New Protocols and Negative Results for Textual Entailment Data Collection (code and data) Proceedings of EMNLP, 2020
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models (code and data) Proceedings of EMNLP, 2020
Counterfactually-Augmented SNLI Training Data Does Not Yield Better Generalization Than Unaugmented Data (code and data) Proceedings of the Workshop on Insights from Negative Results in NLP, 2020
Do self-supervised neural networks acquire a bias towards structural linguistic generalizations? (code and data) Proceedings of CogSci, 2020
Self-Training for Unsupervised Parsing with PRPN (code) Proceedings of IWPT, 2020
Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work? Proceedings of ACL, 2020
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models (project site) Proceedings of ACL (demonstration track), 2020
BLiMP: The Benchmark of Linguistic Minimal Pairs for English (project site) Transactions of the ACL (TACL), 2020
Learning to Learn Morphological Inflection for Resource-Poor Languages Proceedings of AAAI, 2020
Do Attention Heads in BERT Track Syntactic Dependencies? Unpublished manuscript, 2019
Inducing Constituency Trees through Neural Machine Translation Unpublished manuscript, 2019
Neural Unsupervised Parsing Beyond English Proceedings of the Workshop on Deep Learning for Low-Resource NLP (DeepLo), 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems (project site, baseline code) Proceedings of NeurIPS, 2019
Can Unconditional Language Models Recover Arbitrary Sentences? Proceedings of NeurIPS, 2019
Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs (code and data) Proceedings of EMNLP, 2019
Towards Realistic Practices In Low-Resource Natural Language Processing: The Development Set Proceedings of EMNLP, 2019
Neural Network Acceptability Judgments (corpus page) Transactions of the ACL (TACL), 2019
Can You Tell Me How to Get Past Sesame Street: Sentence-Level Pretraining Beyond Language Modeling (code) Proceedings of ACL, 2019
Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark Proceedings of ACL, 2019
Probing What Different NLP Tasks Teach Machines about Function Word Comprehension (code) Proceedings of *SEM, 2019. Best Paper Award
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks Unpublished manuscript, 2019
On Measuring Social Biases in Sentence Encoders Proceedings of NAACL, 2019
Identifying and Reducing Gender Bias in Word-Level Language Models Proceedings of the NAACL Student Research Workshop, 2019
Grammatical Analysis of Pretrained Sentence Encoders with Acceptability Judgments (data) Unpublished manuscript, 2019
Looking for ELMo's Friends: Sentence-Level Pretraining Beyond Language Modeling (code) Unpublished manuscript, 2019; superseded by Can You Tell Me How to Get Past Sesame Street, above
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (project site) Proceedings of ICLR, 2019
What do you learn from context? Probing for sentence structure in contextualized word representations (code) Proceedings of ICLR, 2019
Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis Unpublished manuscript, 2018
Verb Argument Structure Alternations in Word and Sentence Embeddings (corpus page) Proceedings of SCiL, 2018
A Stable and Effective Learning Strategy for Trainable Greedy Decoding Proceedings of EMNLP, 2018
XNLI: Cross-lingual Sentence Understanding through Inference (corpus page) Proceedings of EMNLP, 2018
Grammar Induction with Neural Language Models: An Unusual Replication (code) Proceedings of EMNLP (short paper), 2018
The Lifted Matrix-Space Model for Semantic Composition Proceedings of CoNLL, 2018
ListOps: A Diagnostic Dataset for Latent Tree Learning (code and data) Proceedings of the NAACL Student Research Workshop, 2018
Training a Ranking Function for Open-Domain Question Answering Proceedings of the NAACL Student Research Workshop, 2018
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference (corpus page) Proceedings of NAACL, 2018
Do latent tree learning models identify meaningful structure in sentences? (code) Transactions of the ACL (TACL), 2018
Annotation Artifacts in Natural Language Inference Data (data on the MultiNLI corpus page) Proceedings of NAACL (short paper), 2018
Ruminating Reader: Reasoning with Gated Multi-Hop Attention Proceedings of the Workshop on Machine Reading for Question Answering, 2018
The RepEval 2017 Shared Task: Multi-Genre Natural Language Inference with Sentence Representations Proceedings of RepEval 2017: The Second Workshop on Evaluating Vector Space Representations for NLP, 2017
Detecting and Explaining Crisis Proceedings of the 2017 Computational Linguistics and Clinical Psychology Workshop, 2017
Sequential Attention Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017
Discourse-Based Objectives for Fast Unsupervised Sentence Representation Learning Unpublished manuscript, 2017
Modeling natural language semantics in learned representations Stanford University Dissertation, 2016
Generating Sentences from a Continuous Space Proceedings of CoNLL, 2016
A Fast Unified Model for Parsing and Sentence Understanding (code) Proceedings of ACL, 2016
Tree-structured composition in neural networks without tree-structured architectures Proceedings of the NIPS Workshop on Cognitive Computation: Integrating Neural and Symbolic Approaches, 2015
A large annotated corpus for learning natural language inference (corpus page) Proceedings of EMNLP, 2015. Best New Data Set or Resource Award
Recursive Neural Networks Can Learn Logical Semantics (code and data, poster) Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality, 2015
Learning Distributed Word Representations for Natural Logic Reasoning Proceedings of the AAAI Spring Symposium on Knowledge Representation and Reasoning, 2015
A Gold Standard Dependency Corpus for English Proceedings of LREC, 2014
Can recursive neural tensor networks learn logical reasoning? (code and data) Unpublished manuscript, 2014
Idiosyncratic transparent vowels in Kazakh Proceedings of AMP, 2013. A typo in item (39) in the published version is corrected here.
More constructions, more genres: Extending Stanford Dependencies Proceedings of DepLing, 2013
Two arguments for vowel harmony by trigger competition Proceedings of CLS, 2013
Automatic animacy classification (poster) Proceedings of the NAACL Student Research Workshop, 2012
Vowel harmony, opacity, and finite-state OT Technical report TR-2011-03, Department of Computer Science, The University of Chicago, 2011
Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 Summer Workshop Proceedings of ICASSP, 2011
Modeling pronunciation variation with context-dependent articulatory feature decision trees Proceedings of Interspeech, 2010
An aside: My Erdős number is 4, by way of Karen Livescu, Kamalika Chaudhuri, and Fan Chung; by way of Chris Manning, Val Spitkovsky, and Daniel Kleitman; or by way of Victor O.K. Li, Kuang Xu, and Joel H. Spencer.
How do we fix natural language understanding evaluation? Invited talk slides for the CMU ML Department (virtual), 2020
Evaluating Recent Progress Toward General-Purpose Language Understanding Models Invited talk slides for Google Research (virtual), 2020
Evaluating Recent Progress Toward General-Purpose Language Understanding Models (video) Invited talk slides for the Allen Institute for AI and the University of Washington, 2019
Task-Independent Language Understanding Invited talk slides for Cornell and IBM Research, 2019
Task-Independent Sentence Understanding Models *SEM/SemEval joint invited talk slides, 2019
Deep Learning for Natural Language Inference NAACL tutorial, 2019
A large annotated corpus of entailments and contradictions Talk slides from California Universities Semantics and Pragmatics, 2015
Computational Linguistics Guest lecture for an introductory linguistics class taught by Asya Pereltsvaig, 2015
Neural networks for natural language understanding Guest lecture for Chris Potts and Bill MacCartney's computational natural language understanding class, 2015
vector-entailment: A MATLAB toolkit for tree-structured recursive neural networks, 2015
Transparent vowels in ABC: open issues ABC↔Conference invited talk handout, 2014
Seto vowel harmony and neutral vowels Presentation at LSA, 2013
Measuring amok Course paper, 2012