Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis

  • 2024-04-09 16:35:41
  • Mikel Zubillaga, Oscar Sainz, Ainara Estarrona, Oier Lopez de Lacalle, Eneko Agirre

Abstract

Cross-lingual transfer-learning is widely used in Event Extraction for low-resource languages and involves a Multilingual Language Model that is trained in a source language and applied to the target language. This paper studies whether the typological similarity between source and target languages impacts the performance of cross-lingual transfer, an under-explored topic. We first focus on Basque as the target language, which is an ideal target language because it is typologically different from surrounding languages. Our experiments on three Event Extraction tasks show that the shared linguistic characteristic between source and target languages does have an impact on transfer quality. Further analysis of 72 language pairs reveals that for tasks that involve token classification such as entity and event trigger identification, common writing script and morphological features produce higher quality cross-lingual transfer. In contrast, for tasks involving structural prediction like argument extraction, common word order is the most relevant feature. In addition, we show that when increasing the training size, not all the languages scale in the same way in the cross-lingual setting. To perform the experiments we introduce EusIE, an event extraction dataset for Basque, which follows the Multilingual Event Extraction dataset (MEE). The dataset and code are publicly available.

 
