TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages

  • 2024-04-19 13:26:28
  • Aleksei Dorkin, Kairit Sirts

Abstract

We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. We developed a simple, uniform, and computationally lightweight approach based on the adapters framework using parameter-efficient fine-tuning. We applied the same adapter-based approach uniformly to all tasks and 16 languages by fine-tuning stacked language- and task-specific adapters. Our submission obtained an overall second place out of three submissions, with the first place in word-level gap-filling. Our results show the feasibility of adapting language models pre-trained on modern languages to historical and ancient languages via adapter training.
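
The idea of stacking a language adapter under a task adapter can be illustrated with a small, dependency-free sketch. This is not the authors' implementation, which uses the adapters framework on top of XLM-RoBERTa. The abstract does not specify the adapter architecture, so the residual bottleneck form, the zero-initialized up-projection, the class name BottleneckAdapter, the helper stacked, and the toy sizes (hidden size 8, bottleneck size 2) are all assumptions made for illustration.

```python
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class BottleneckAdapter:
    """Hypothetical residual bottleneck adapter: h + W_up * relu(W_down * h)."""

    def __init__(self, hidden, bottleneck, rng):
        self.down = make_matrix(bottleneck, hidden, rng)
        # Assumption: zero-initialised up-projection, so an untrained
        # adapter leaves the frozen model's hidden state unchanged.
        self.up = [[0.0] * bottleneck for _ in range(hidden)]

    def __call__(self, h):
        z = [max(0.0, a) for a in matvec(self.down, h)]  # ReLU bottleneck
        return [x + d for x, d in zip(h, matvec(self.up, z))]  # residual add

    def num_params(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])


def stacked(h, language_adapter, task_adapter):
    """Apply the language adapter first, then the task adapter on its output."""
    return task_adapter(language_adapter(h))


rng = random.Random(0)
hidden, bottleneck = 8, 2  # toy sizes, far smaller than a real encoder
lang = BottleneckAdapter(hidden, bottleneck, rng)
task = BottleneckAdapter(hidden, bottleneck, rng)

h = [rng.uniform(-1, 1) for _ in range(hidden)]  # stand-in for a frozen hidden state
out = stacked(h, lang, task)
trainable = lang.num_params() + task.num_params()
```

In this sketch only the two small adapters would be trained while the underlying model stays frozen, which is what makes the approach parameter-efficient: each adapter adds 2 x hidden x bottleneck weights per layer rather than updating the full model.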

 
