This repository hosts code and datasets relating to Responsible NLP projects from Meta AI.
- From Eric Michael Smith, Melissa Hall, Melanie Kambadur, Eleonora Presani, Adina Williams. "I'm sorry to hear that": finding bias in language models with a holistic descriptor dataset. 2022.
- Code to generate a dataset, HolisticBias, consisting of nearly 600 demographic terms in over 450k sentence prompts
- Code to calculate Likelihood Bias, a metric of the amount of bias in a language model, defined on HolisticBias demographic terms
- From Rebecca Qian, Candace Ross, Jude Fernandes, Eric Smith, Douwe Kiela, Adina Williams. Perturbation Augmentation for Fairer NLP. 2022.
- PANDA, an annotated dataset of 100K demographic perturbations of diverse text, rewritten to change gender, race/ethnicity and age references.
- The perturber, pretrained models, code and other artifacts related to the Perturbation Augmentation for Fairer NLP project will be released shortly.