Working Memory based assessment of Large Language Models
Evalvacija jezikovnih modelov obsega na osnovi delovnega spomina

No. of contract:

from 01.07.2022 to 30.06.2024


Neural language models are a core language technology. When we start typing the phrase “Can computers…” in Google search, the suggested completions are produced by neural language models (LMs): brain-inspired computer algorithms trained to predict upcoming words in language sequences. Because they can learn relevant statistical patterns of language from large collections of text, LMs are becoming a core component of language technologies, fueling applications that range from question answering, information retrieval, and text summarization to machine translation.
Problem: the secret to success. Do neural language models develop working memory? The success of LMs on such diverse language tasks suggests that, during the course of learning, LMs develop a form of working memory (WM) capacity: the ability to store and access recent context when processing text sequences. In human intelligence, WM is hypothesized to be the core component underlying flexible cognitive behavior, including language. But what is the nature of the working memory that LMs learn? And does it mediate the reported ability of LMs to be flexibly reused across a diverse range of applied language tasks (e.g. question answering)?
Approach: a WM test suite for quantifying the short-term memory capacity of LMs. To test the conjecture that WM allows flexible performance on diverse language tasks, researchers at the Honey Lab at Johns Hopkins University (JHU) have developed a novel WM test suite that makes it possible to characterize the WM abilities of any LM. In this paradigm, the model is presented with a memorization task of the format: <preface text> <a list of words> <intervening text> <the same list of words>. This format makes it possible to quantify how the memory trace of the first list affects the LM's ability to process the second list. Current results from the JHU team show that state-of-the-art LMs can remember both the order and the identity of previously seen words in context.
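
The trial format above can be illustrated with a minimal Python sketch. It is not the JHU test suite: the function names, the example sentences, and the choice of measure (the ratio of mean per-token surprisal on the second list to that on the first) are assumptions made here for illustration, and `toy_surprisal` is a stand-in for a real LM that simply treats any previously seen token as easy to predict.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps a token sequence to one surprisal value per token.
# With a real LM this would be -log p(token | preceding tokens).
Surprisal = Callable[[Sequence[str]], List[float]]


def build_trial(
    preface: str, words: Sequence[str], intervening: str
) -> Tuple[List[str], range, range]:
    """Assemble <preface> <list> <intervening> <same list> as whitespace tokens.

    Returns the tokens plus the index ranges of the first and second list.
    """
    pre, mid = preface.split(), intervening.split()
    tokens = pre + list(words) + mid + list(words)
    first = range(len(pre), len(pre) + len(words))
    start = first.stop + len(mid)
    second = range(start, start + len(words))
    return tokens, first, second


def repeat_surprisal_ratio(
    surprisal: Surprisal, tokens: Sequence[str], first: range, second: range
) -> float:
    """Mean surprisal on the second list divided by mean surprisal on the first.

    Values well below 1 suggest that the first list left a usable memory trace.
    """
    per_token = surprisal(tokens)
    mean_first = sum(per_token[i] for i in first) / len(first)
    mean_second = sum(per_token[i] for i in second) / len(second)
    return mean_second / mean_first


def toy_surprisal(tokens: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for an LM: seen tokens cost 1.0, new tokens 8.0."""
    return [1.0 if tok in tokens[:i] else 8.0 for i, tok in enumerate(tokens)]


words = ["patience", "notion", "movie"]
tokens, first, second = build_trial(
    "Mary wrote down a list of words:",
    words,
    "After a short break she read the list again:",
)
ratio = repeat_surprisal_ratio(toy_surprisal, tokens, first, second)
print(ratio)  # 0.125 for the toy scorer
```

To evaluate an actual LM, `toy_surprisal` would be replaced by a function that returns the model's negative log-probability for each token given its preceding context; the trial construction and the ratio stay the same.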