Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT

  • 2024-05-03 12:56:13
  • Patrick Krauss, Jannik Hösch, Claus Metzner, Andreas Maier, Peter Uhrig, Achim Schilling

Abstract

The ability to transmit and receive complex information via language is unique to humans and is the basis of traditions, culture and versatile social interactions. Through the disruptive introduction of transformer based large language models (LLMs) humans are not the only entity to "understand" and produce language any more. In the present study, we have performed the first steps to use LLMs as a model to understand fundamental mechanisms of language processing in neural networks, in order to make predictions and generate hypotheses on how the human brain does language processing. Thus, we have used ChatGPT to generate seven different stylistic variations of ten different narratives (Aesop's fables). We used these stories as input for the open source LLM BERT and have analyzed the activation patterns of the hidden units of BERT using multi-dimensional scaling and cluster analysis. We found that the activation vectors of the hidden units cluster according to stylistic variations in earlier layers of BERT (1) than narrative content (4-5). Despite the fact that BERT consists of 12 identical building blocks that are stacked and trained on large text corpora, the different layers perform different tasks. This is a very useful model of the human brain, where self-similar structures, i.e. different areas of the cerebral cortex, can have different functions and are therefore well suited to processing language in a very efficient way. The proposed approach has the potential to open the black box of LLMs on the one hand, and might be a further step to unravel the neural processes underlying human language processing and cognition in general.
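
The layer-wise comparison described in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' pipeline: the activation vectors are synthetic rather than taken from BERT, and a simple between/within distance ratio stands in for the multi-dimensional scaling and cluster analysis used in the study. The function names `synthetic_layer` and `separation`, the vector dimension, and the mixing weights are all assumptions made for illustration. Only the design of seven styles and ten narratives comes from the abstract.

```python
import math
import random
from itertools import combinations


def synthetic_layer(style_weight, content_weight, n_styles=7, n_stories=10,
                    dim=16, seed=0):
    """Build fake 'activation vectors' (NOT real BERT activations).

    Each vector is a weighted sum of a style centre and a story centre plus
    small Gaussian noise. The weights are a hypothetical stand-in for how
    strongly a layer encodes style versus narrative content.
    """
    rng = random.Random(seed)
    style_centres = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_styles)]
    story_centres = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_stories)]
    vectors, styles, stories = [], [], []
    for s in range(n_styles):
        for t in range(n_stories):
            vectors.append([
                style_weight * style_centres[s][k]
                + content_weight * story_centres[t][k]
                + rng.gauss(0, 0.1)
                for k in range(dim)
            ])
            styles.append(s)
            stories.append(t)
    return vectors, styles, stories


def separation(vectors, labels):
    """Mean between-cluster distance divided by mean within-cluster distance.

    Values well above 1 mean the labels describe compact, well-separated
    clusters. This is an illustrative score, not the paper's measure.
    """
    within, between = [], []
    for i, j in combinations(range(len(vectors)), 2):
        d = math.dist(vectors[i], vectors[j])
        (within if labels[i] == labels[j] else between).append(d)
    return (sum(between) / len(between)) / (sum(within) / len(within))


# Hypothetical "early" layer dominated by style, "later" layer by content.
early = synthetic_layer(style_weight=1.0, content_weight=0.1)
late = synthetic_layer(style_weight=0.1, content_weight=1.0)

for name, (vecs, styles, stories) in (("early", early), ("late", late)):
    print(name,
          "style separation:", round(separation(vecs, styles), 2),
          "content separation:", round(separation(vecs, stories), 2))
```

With real data, the synthetic vectors would be replaced by per-story activation vectors extracted from each BERT layer, and the same score could be computed once with style labels and once with narrative labels for every layer.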

 
