When can transformers reason with abstract symbols?

Abstract

We investigate the capabilities of transformer models on relational reasoning tasks. In these tasks, models are trained on a set of strings encoding abstract relations, and are then tested out-of-distribution on data that contains symbols that did not appear in the training dataset. We prove that for any relational reasoning task in a large family of tasks, transformers learn the abstract relations and generalize to the test set when trained by gradient descent on sufficiently large quantities of training data. This is in contrast to classical fully-connected networks, which we prove fail to learn to reason. Our results inspire modifications of the transformer architecture that add only two trainable parameters per head, and that we empirically demonstrate improve data efficiency for learning to reason.
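To make the train/test setup concrete, here is a minimal sketch of one possible relational task. It is not taken from the paper: the "same/different" relation, the function name `make_split`, and the symbol alphabets are all illustrative assumptions. The only property drawn from the abstract is that the test strings use symbols that never occur in training, so a model can succeed only by learning the relation rather than memorizing symbols.

```python
import random

def make_split(train_symbols, test_symbols, n_per_split, seed=0):
    """Build a toy same/different dataset (hypothetical task, for illustration).

    Each example is a pair of symbols with label 1 if the two symbols are
    identical and 0 otherwise. Train and test draw from separate alphabets.
    """
    rng = random.Random(seed)

    def sample(symbols, n):
        data = []
        for _ in range(n):
            a, b = rng.sample(symbols, 2)  # two distinct symbols
            if rng.random() < 0.5:
                data.append(((a, a), 1))   # relation holds: "same"
            else:
                data.append(((a, b), 0))   # relation fails: "different"
        return data

    return sample(train_symbols, n_per_split), sample(test_symbols, n_per_split)

# Assumed alphabets; the test alphabet is disjoint from the training one.
train_symbols = list("abcdefgh")
test_symbols = list("wxyz")
train, test = make_split(train_symbols, test_symbols, 100)
```

The label depends only on whether the two positions match, not on which symbols fill them, which is the sense in which the relation is abstract in this sketch.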
