BI-US/22-24-170 | Department of Knowledge Technologies

No. of contract:

BI-US/22-24-170

Type of project:

ARIS Bilateral Projects | National Projects

Duration:

from 01.07.2022 to 30.06.2024

Contact:

Senja Pollak

Areas:

Language Technologies and Digital Humanities

Neural language models are a core language technology. When we start typing the phrase “Can computers…” in Google search, the offered text completions are produced by neural language models (LMs): brain-inspired computer algorithms trained to predict upcoming words in language sequences. Because they can learn relevant statistical patterns of language from large collections of text, LMs are becoming a core component of language technologies, fueling applications that range from question answering, information retrieval, and text summarization to machine translation.
Problem: the secret to success. Do neural language models develop working memory? The success of LMs on such diverse language tasks suggests that, during the course of learning, LMs develop a form of working memory (WM) capacity: the ability to store and access recent context when processing text sequences. In human intelligence, WM is hypothesized to be the core component underlying flexible cognitive behavior, including language. But what is the nature of the working memory that LMs learn? And does it mediate the reported ability of LMs to be flexibly reused across a diverse range of applied language tasks (e.g. question answering)?
Approach: a WM test suite for quantifying the short-term memory capacity of LMs. To test the conjecture that WM allows flexible performance on diverse language tasks, the Honey Lab researchers at Johns Hopkins University (JHU) have developed a novel WM test suite that makes it possible to characterize the WM abilities of any LM. In this paradigm, the model is presented with a memorization task of the format: <preface text> <a list of words> <intervening text> <the same list of words>. With this paradigm it is possible to quantify how the memory trace of the first list affects the ability of the LM to process the second list. Current results by the JHU team show that state-of-the-art LMs can remember both the order and the identity of previously seen words in context.
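
The trial format described above can be illustrated with a minimal Python sketch. This is not the Honey Lab test suite. The function names (`build_trial`, `cache_lm_surprisal`, `repeat_surprisal_ratio`), the example sentences, and the scoring ratio are assumptions made for illustration only. A toy count-based cache model stands in for a neural LM so that the example runs without external libraries. In a real experiment the per-token surprisal values would come from the LM under study.

```python
import math
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int]  # half-open token index range [start, end)


def build_trial(preface: str, words: Sequence[str],
                intervening: str) -> Tuple[List[str], Span, Span]:
    """Assemble <preface> <list> <intervening> <same list> as whitespace tokens.

    Returns the token sequence plus the spans of the first and second list.
    """
    tokens = preface.split()
    first = (len(tokens), len(tokens) + len(words))
    tokens += list(words)
    tokens += intervening.split()
    second = (len(tokens), len(tokens) + len(words))
    tokens += list(words)
    return tokens, first, second


def cache_lm_surprisal(tokens: Sequence[str], vocab_size: int = 1000,
                       alpha: float = 1.0) -> List[float]:
    """Toy stand-in for an LM (assumption): add-alpha smoothed unigram cache.

    Each token is scored from the counts of the tokens that precede it.
    Returns surprisal, -log2 p(token | preceding context), in bits.
    """
    counts: Dict[str, int] = {}
    surprisal = []
    for i, tok in enumerate(tokens):
        p = (counts.get(tok, 0) + alpha) / (i + alpha * vocab_size)
        surprisal.append(-math.log2(p))
        counts[tok] = counts.get(tok, 0) + 1
    return surprisal


def repeat_surprisal_ratio(surprisal: Sequence[float],
                           first: Span, second: Span) -> float:
    """Mean surprisal on the second list divided by that on the first list.

    Values below 1 indicate that the earlier presentation made the repeated
    list easier to predict, i.e. a memory trace of the first list.
    """
    def mean(span: Span) -> float:
        return sum(surprisal[span[0]:span[1]]) / (span[1] - span[0])
    return mean(second) / mean(first)


words = ["patience", "notion", "movie"]
tokens, first, second = build_trial(
    "Mary wrote down a list of words",
    words,
    "After the meeting she read the list again",
)
ratio = repeat_surprisal_ratio(cache_lm_surprisal(tokens), first, second)
print(f"repeat surprisal ratio: {ratio:.3f}")
```

The toy cache model only tracks which words have occurred, so it can reflect memory for word identity but not for word order. Separating the two would require, for example, comparing the ratio for an exactly repeated list against a shuffled repetition, scored by a model that is sensitive to order.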
