Toward Informal Language Processing: Knowledge of Slang in Large Language Models

Abstract

Recent advancement in large language models (LLMs) has offered a strong potential for natural language systems to process informal language. A representative form of informal language is slang, used commonly in daily conversations and online social media. To date, slang has not been comprehensively evaluated in LLMs due partly to the absence of a carefully designed and publicly accessible benchmark. Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang. For both evaluation and finetuning, we show the effectiveness of our dataset on two core applications: 1) slang detection, and 2) identification of regional and historical sources of slang from natural sentences. We also show how our dataset can be used to probe the output distributions of LLMs for interpretive insights. We find that while LLMs such as GPT-4 achieve good performance in a zero-shot setting, smaller BERT-like models finetuned on our dataset achieve comparable performance. Furthermore, we show that our dataset enables finetuning of LLMs such as GPT-3.5 that achieve substantially better performance than strong zero-shot baselines. Our work offers a comprehensive evaluation and a high-quality benchmark on English slang based on the OpenSubtitles corpus, serving both as a publicly accessible resource and a platform for applying tools for informal language processing.
