Word Embedding Association Test

Learn how to systematically evaluate the fairness of word vectors.

So far, we’ve utilized various metrics to assess the fairness of our model. Each metric had its advantages and disadvantages, but importantly, they were quantifiable. Evaluating fairness in natural language processing is more challenging due to the inherent ambiguity of language. However, we have tools at our disposal. One notable method is the Word Embedding Association Test (WEAT), which allows us to evaluate embeddings to detect bias systematically.

Understanding WEAT

WEAT is a statistical test. If you’re unfamiliar with statistical testing, fear not! We’ll break it down for you. First, we establish a null hypothesis, which here says that the embeddings are unbiased. The null hypothesis is our baseline assumption, and we need substantial evidence to reject it. We also formulate an alternative hypothesis, which opposes the null hypothesis: the embeddings are biased. If the evidence is strong enough to reject the null hypothesis, we accept the alternative hypothesis.

For this test, we define two sets of target words (X and Y). The first target set contains words we suspect are perceived as masculine (e.g., engineer, scientist, architect), while the second contains words perceived as feminine (e.g., nurse, receptionist). The target sets must be consistent, containing similar types of words like job titles, adjectives, or names rather than a mix.

We also define two attribute sets (A and B): one for male descriptors (e.g., he, him, his, man) and another for female descriptors (e.g., she, her, woman).

If the null hypothesis holds true, interchanging words between the target sets shouldn’t result in significant shifts in the similarity between target (X, Y) and attribute (A, B) vectors. Intuitively, if vectors are gender-neutral, then, on average, the similarity remains consistent.
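The interchanging idea can be sketched as a permutation test. The snippet below is a minimal illustration, not a definitive implementation. It assumes each target word has already been reduced to a single association score (positive means closer to A, negative means closer to B). The function name permutation_p_value, the sample scores, and the choice to sample random shuffles rather than enumerate every split are all assumptions made for this example.

```python
import random

def permutation_p_value(scores_x, scores_y, n_permutations=10_000, seed=0):
    """One-sided permutation test on per-word association scores.

    Returns the observed statistic and the fraction of random
    re-splits whose statistic is at least as large.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = sum(scores_x) - sum(scores_y)
    pooled = list(scores_x) + list(scores_y)
    n_x = len(scores_x)

    at_least_as_large = 0
    for _ in range(n_permutations):
        # Interchange words between the two target sets at random.
        rng.shuffle(pooled)
        stat = sum(pooled[:n_x]) - sum(pooled[n_x:])
        if stat >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_permutations

# Made-up scores, for illustration only.
biased_x = [0.30, 0.25, 0.28, 0.35]      # e.g., engineer, scientist, ...
biased_y = [-0.27, -0.31, -0.22, -0.29]  # e.g., nurse, receptionist, ...
observed, p_value = permutation_p_value(biased_x, biased_y)

neutral_x = [0.02, -0.01, 0.03, -0.02]
neutral_y = [0.01, -0.03, 0.02, 0.00]
neutral_observed, neutral_p_value = permutation_p_value(neutral_x, neutral_y)

print(f"biased:  statistic={observed:.2f}, p={p_value:.3f}")
print(f"neutral: statistic={neutral_observed:.2f}, p={neutral_p_value:.3f}")
```

A small p-value means that random re-splits rarely produce a gap as large as the observed one, which counts as evidence against the null hypothesis. With the neutral scores, many re-splits match or exceed the observed gap, so the null hypothesis is not rejected.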

We introduce a function, s(w, A, B), to gauge the similarity of a word with its attributes:
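In the original WEAT formulation (Caliskan et al., 2017), s(w, A, B) is the mean cosine similarity between w and the words in A, minus the mean cosine similarity between w and the words in B. The sketch below follows that definition. The two-dimensional vectors are hypothetical toy values chosen for this example, not real embeddings, and the helper name cosine is likewise an assumption.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def s(w, A, B):
    """Association of word vector w with attribute set A relative to B."""
    mean_a = sum(cosine(w, a) for a in A) / len(A)
    mean_b = sum(cosine(w, b) for b in B) / len(B)
    return mean_a - mean_b

# Hypothetical 2-D toy embeddings, for illustration only.
embeddings = {
    "he": [1.0, 0.1], "man": [0.9, 0.2],
    "she": [0.1, 1.0], "woman": [0.2, 0.9],
    "engineer": [0.8, 0.3], "nurse": [0.3, 0.8],
}
A = [embeddings["he"], embeddings["man"]]     # male attribute set
B = [embeddings["she"], embeddings["woman"]]  # female attribute set

score_engineer = s(embeddings["engineer"], A, B)
score_nurse = s(embeddings["nurse"], A, B)
print(f"engineer: {score_engineer:+.3f}, nurse: {score_nurse:+.3f}")
```

A positive score means the word sits closer to A than to B, and a negative score means the opposite. A score near zero suggests no preference for either attribute set.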
