Please use this identifier to cite or link to this item: doi:10.22028/D291-42325
Title: Entity Tracking in Language Models
Author(s): Kim, Najoung
Schuster, Sebastian
Editor(s): Rogers, Anna
Language: English
Title: The 61st Conference of the Association for Computational Linguistics : July 9-14, 2023 : ACL 2023 : Volume 1: Long papers
Pages: 3835-3855
Publisher/Platform: ACL
Year of Publication: 2023
Place of publication: Stroudsburg, PA
Place of the conference: Toronto, Canada
DDC notations: 004 Computer science, internet
400 Language, linguistics
Publication type: Conference Paper
Abstract: Keeping track of how states of entities change as a text or dialog unfolds is a key prerequisite to discourse understanding. Yet, there have been few systematic investigations into the ability of large language models (LLMs) to track discourse entities. In this work, we present a task probing to what extent a language model can infer the final state of an entity given an English description of the initial state and a series of state-changing operations. We use this task to first investigate whether Flan-T5, GPT-3 and GPT-3.5 can track the state of entities, and find that only GPT-3.5 models, which have been pretrained on large amounts of code, exhibit this ability. We then investigate whether smaller models pretrained primarily on text can learn to track entities, through finetuning T5 on several training/evaluation splits. While performance degrades for more complex splits, we find that even when evaluated on a different set of entities from training or longer operation sequences, a finetuned model can perform non-trivial entity tracking. Taken together, these results suggest that language models can learn to track entities but pretraining on text corpora alone does not make this capacity surface.
Link to this record: urn:nbn:de:bsz:291--ds-423252
hdl:20.500.11880/37991
http://dx.doi.org/10.22028/D291-42325
ISBN: 978-1-959429-72-2
Date of registration: 3-Jul-2024
Faculty: MI - Fakultät für Mathematik und Informatik
Department: MI - Informatik
Professorship: MI - Prof. Dr. Vera Demberg
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes


