Abstract
While Large Language Models (LLMs) exhibit remarkable capabilities in zero-shot and few-shot scenarios, they often require computationally prohibitive sizes. Conversely, smaller Masked Language Models (MLMs) like BERT and RoBERTa achieve state-of-the-art results through fine-tuning but struggle to extend to few-shot and zero-shot settings due to their architectural constraints. Hence, we propose Statement-Tuning, a technique that models discriminative tasks as a finite set of statements and trains an encoder model to discriminate between the candidate statements to determine the label. We apply Statement-Tuning across multiple tasks to enable cross-task generalization. Experimental results demonstrate that Statement-Tuning achieves competitive performance compared to state-of-the-art LLMs with significantly fewer parameters. Moreover, the study investigates the impact of several design choices on few-shot and zero-shot generalization, revealing that Statement-Tuning can achieve sufficient performance with modest training data and benefits from task and statement diversity when generalizing to unseen tasks.
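To make the statement formulation concrete, the following is a minimal sketch of how a labeled example might be turned into one statement per candidate label, and how a label could then be chosen at inference time. The template wording, the names `SENTIMENT_TEMPLATES`, `build_statements`, and `predict`, and the keyword-based `toy_scorer` are all assumptions made for illustration; in practice the scorer would be a fine-tuned encoder that outputs the probability that a statement is true.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical templates: one natural-language statement per candidate label.
SENTIMENT_TEMPLATES: Dict[str, str] = {
    "positive": 'The review "{text}" expresses a positive sentiment.',
    "negative": 'The review "{text}" expresses a negative sentiment.',
}


def build_statements(
    text: str, gold_label: str, templates: Dict[str, str]
) -> List[Tuple[str, int]]:
    """Training view: one (statement, is_true) pair per candidate label."""
    return [
        (template.format(text=text), int(label == gold_label))
        for label, template in templates.items()
    ]


def predict(
    text: str, templates: Dict[str, str], score_true: Callable[[str], float]
) -> str:
    """Inference view: return the label whose statement scores as most likely true."""
    return max(
        templates,
        key=lambda label: score_true(templates[label].format(text=text)),
    )


def toy_scorer(statement: str) -> float:
    """Stand-in for an encoder's P(statement is true); a keyword rule, not a model."""
    claims_positive = "a positive sentiment" in statement
    looks_positive = "great" in statement.lower()
    return 1.0 if claims_positive == looks_positive else 0.0


if __name__ == "__main__":
    for statement, is_true in build_statements(
        "A great film", "positive", SENTIMENT_TEMPLATES
    ):
        print(is_true, statement)
    print(predict("A dull film", SENTIMENT_TEMPLATES, toy_scorer))
```

Under this framing every task reduces to the same binary true/false decision over statements, so in principle a single encoder can be trained on statements drawn from many tasks and applied to an unseen task by writing new templates for it.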