CDS Lunch Seminar: Finding Interpretable Word Embedding Subspaces using Covariance and Correlation Maximization

Speaker: Sophie Hao

Location: 60 Fifth Avenue, 7th Floor Open Space

Date: Wednesday, September 14, 2022

This talk describes a new method for estimating a subspace of a word embedding space that corresponds to an interpretable semantic property such as gender, race, or religion. Our technique assumes that words can be assigned numerical scores that quantify their association with the target property. We estimate the subspace by maximizing the covariance or correlation between these scores and the projections of the word embeddings onto the subspace. Using our technique, we show that word embedding spaces in English, French, and Simplified Chinese contain subspaces that encode gender, race, religion, sentiment, word length, and national population. We then apply our technique to the mitigation of gender and racial bias in word embeddings. We find that using our technique to estimate a gender or race subspace improves performance on several benchmarks.
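As a rough illustration of the idea in the abstract, the sketch below handles only the simplest case, a one-dimensional subspace (a single unit direction w). It is not the speaker's implementation. Under that assumption, the covariance between the scores and the projections is maximized by w proportional to Xc^T sc, where Xc and sc are the mean-centered embeddings and scores, and the correlation is maximized by the ordinary least-squares solution for predicting sc from Xc. The function names, the synthetic embeddings, and the planted direction are all hypothetical and stand in for real word vectors and real word scores.

```python
import numpy as np


def covariance_direction(X, s):
    """Unit vector w maximizing cov(s, X @ w). Assumed 1-D sketch only."""
    Xc = X - X.mean(axis=0)   # center each embedding dimension
    sc = s - s.mean()         # center the word scores
    w = Xc.T @ sc             # proportional to the cross-covariance vector
    return w / np.linalg.norm(w)


def correlation_direction(X, s):
    """Unit vector w maximizing corr(s, X @ w). Assumed 1-D sketch only."""
    Xc = X - X.mean(axis=0)
    sc = s - s.mean()
    # The least-squares solution points along the correlation-maximizing direction.
    w, *_ = np.linalg.lstsq(Xc, sc, rcond=None)
    return w / np.linalg.norm(w)


# Synthetic stand-in data (hypothetical): 500 "words" with 20-dimensional embeddings.
rng = np.random.default_rng(0)
n_words, dim = 500, 20
X = rng.normal(size=(n_words, dim))

# Plant a known direction and derive noisy scores from it.
true_dir = np.zeros(dim)
true_dir[0] = 1.0
s = X @ true_dir + 0.1 * rng.normal(size=n_words)

w_cov = covariance_direction(X, s)
w_corr = correlation_direction(X, s)

# Cosine similarity between each estimate and the planted direction.
print("covariance estimate: ", float(w_cov @ true_dir))
print("correlation estimate:", float(w_corr @ true_dir))
```

On this toy data both estimates should align closely with the planted direction. With real embeddings the two criteria can diverge, since the covariance criterion favors directions of high embedding variance while the correlation criterion normalizes that variance away.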