GeniL: A Multilingual Dataset on Generalizing Language

  • 2024-04-08 21:58:06
  • Aida Mostafazadeh Davani, Sagar Gubbi, Sunipa Dev, Shachi Dave, Vinodkumar Prabhakaran

Abstract

LLMs are increasingly transforming our digital ecosystem, but they often inherit societal biases learned from their training data, for instance stereotypes associating certain attributes with specific identity groups. While whether and how these biases are mitigated may depend on the specific use cases, being able to effectively detect instances of stereotype perpetuation is a crucial first step. Current methods to assess presence of stereotypes in generated language rely on simple template or co-occurrence based measures, without accounting for the variety of sentential contexts they manifest in. We argue that understanding the sentential context is crucial for detecting instances of generalization. We distinguish two types of generalizations: (1) language that merely mentions the presence of a generalization ("people think the French are very rude"), and (2) language that reinforces such a generalization ("as French they must be rude"), from non-generalizing context ("My French friends think I am rude"). For meaningful stereotype evaluations, we need to reliably distinguish such instances of generalizations. We introduce the new task of detecting generalization in language, and build GeniL, a multilingual dataset of over 50K sentences from 9 languages (English, Arabic, Bengali, Spanish, French, Hindi, Indonesian, Malay, and Portuguese) annotated for instances of generalizations. We demonstrate that the likelihood of a co-occurrence being an instance of generalization is usually low, and varies across different languages, identity groups, and attributes. We build classifiers to detect generalization in language with an overall PR-AUC of 58.7, with varying degrees of performance across languages. Our research provides data and tools to enable a nuanced understanding of stereotype perpetuation, a crucial step towards more inclusive and responsible language technologies.

 
