Department of Knowledge Technologies

SEEDS communication kit webinar

Anja Glusic — Mon, 29 Jun 2026 10:51:58 +0000

I’m pleased to be speaking at the upcoming online SEEDS webinar “Beyond the Boiler: Integrated Pathways for Building Decarbonisation”, organised in the framework of the EU Sustainable Energy Days.

Buildings should no longer be considered only as passive energy consumers. By combining renewable energy, electrified heating, energy storage and smart monitoring, they can become flexible assets within the wider energy system.

During this interactive session, we will explore practical and cost-effective pathways for accelerating building decarbonisation and moving from theory to real-world applications.

📅 30 June 2026
🕘 09:30–11:00 CEST
💻 Online

🔗 REGISTER HERE: https://events.ringcentral.com/events/beyond-the-boiler-integrated-pathways-for-building-decarbonisation-2026-06-30

#SEEDSProject #HorizonEurope #BuildingDecarbonisation #EnergyTransition #SmartBuildings #Sustainability

Strategic days of the department

Anja Glusic — Thu, 28 May 2026 13:41:16 +0000

Highlights from the E8 Strategy Days, where big ideas met collaborative teamwork to set us up for a successful year ahead. We held the strategy days with our colleagues at the Marine Biological Station Piran, combining team-building activities with, of course, a lot of planning for the future of the department. The strategy days went flawlessly.

A member of our department received the Best Poster Award at the Future of Integrated Assessment Models: Pathways Towards Carbon Neutrality for Climate, Environment, Health and Socio‐economic Co‐benefits

Anja Glusic — Thu, 28 May 2026 09:19:40 +0000

A member of our department, Sabin Roman, received the award for best poster at the conference Future of Integrated Assessment Models: Pathways Towards Carbon Neutrality for Climate, Environment, Health and Socio‐economic Co‐benefits with his poster titled: Modelling the collapse of complex societies.

Congratulations!

Taja Kuzman Pungeršek successfully defended her doctoral thesis

Anja Glusic — Fri, 15 May 2026 11:34:48 +0000

Taja Kuzman Pungeršek successfully defended her doctoral thesis titled Robust Multilingual Automatic Genre Identification in Texts.

Congratulations!

Abstract:

Collecting texts from the web has significantly accelerated the creation of large text datasets which are essential for the development of advanced language technologies, including large language models. However, because these texts are gathered automatically, their linguistic and functional characteristics are largely unknown, limiting their reliable use in research and applications. Automatic genre identification, a text classification task that categorizes texts into specific genre categories, provides key insights into such large-scale text collections and enables their filtering for applications in language technology and linguistic research.

This thesis advances automatic genre identification for web-scale multilingual text data through the creation of novel genre schemata, the development of manually-annotated genre datasets, and the exploration of robust machine learning methods. First, the study introduces new genre schemata designed to improve annotation reliability and cross-schema comparability, thereby addressing a key limitation of prior work where incompatible schemata hindered meaningful comparison across studies. We demonstrate that high-quality manual annotation with acceptable inter-annotator agreement can be achieved through a carefully designed genre schema, detailed guidelines, and the employment of expert annotators.

To address the lack of high-coverage genre datasets in our target languages, we develop high-quality, manually-annotated genre datasets in Slovenian and English, as well as multilingual test collections spanning eleven typologically diverse languages and different scripts. Building on the test datasets, we establish the AGILE benchmark, which enables standardized, reproducible cross-dataset and cross-lingual evaluation of genre classifiers. The benchmark also supports a systematic comparison of modern large language models across languages.

Using this infrastructure, we develop genre classifiers that achieve robust performance across various languages, datasets, and evaluation settings. To ascertain which machine learning methodology offers the most robust generalization, we experiment with a range of text classification techniques, ranging from traditional non-neural machine learning methodologies to cutting-edge approaches based on large language models. Our findings indicate that a BERT-like model, fine-tuned on our newly developed training dataset, achieves state-of-the-art performance across various datasets and languages, including languages using non-Latin scripts and languages not closely related to the fine-tuning languages. The resulting model is publicly released, providing a practical tool for large-scale automatic genre annotation of multilingual web text collections.

Furthermore, we propose the LLM Teacher-Student Framework, a novel approach for training text classifiers without manually-annotated data. By leveraging a large language model to generate training labels, this method enables scalable, cost-efficient development of genre classifiers, particularly for low-resource languages and specialized domains, and is applicable beyond genre identification to other text classification tasks.
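
The teacher-student idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function teacher_label is a hypothetical keyword-based stand-in for prompting a large language model, the genre names "Promotion", "News" and "Other" are illustrative assumptions, and the student is a small naive Bayes classifier rather than a fine-tuned BERT-like model.

```python
import math
from collections import Counter, defaultdict


def teacher_label(text: str) -> str:
    """Hypothetical stand-in for an LLM teacher (a real setup would prompt a model)."""
    lowered = text.lower()
    if "buy" in lowered or "price" in lowered:
        return "Promotion"
    if "said" in lowered or "reported" in lowered:
        return "News"
    return "Other"


class NaiveBayesStudent:
    """Tiny multinomial naive Bayes classifier with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Unlabeled texts: no manual annotation is involved at any point.
unlabeled = [
    "buy now at a low price",
    "best price when you buy two",
    "the minister said the budget passed",
    "officials reported heavy rain",
]

# Step 1: the teacher produces the training labels.
silver_labels = [teacher_label(t) for t in unlabeled]

# Step 2: the student is trained only on teacher-generated labels.
student = NaiveBayesStudent().fit(unlabeled, silver_labels)
```

After training, the student can label new texts (for example student.predict("price drop buy today")) without calling the teacher again, which is where the cost saving of such a setup would come from.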

AI for Materials Science: Tuning Laser-Induced Graphene Production

Anja Glusic — Wed, 08 Apr 2026 13:42:34 +0000

AI and machine learning have advanced the state of the art in many application domains. We present an application to materials science; in particular, we use surrogate models with Bayesian optimization for automated parameter tuning to optimize the fabrication of laser-induced graphene. This process makes it possible to create thin conductive lines in thin layers of insulating material, enabling the development of next-generation nano-circuits, which is of interest, for example, for in-space manufacturing. We achieve improvements of up to a factor of two, in a reproducible manner, compared both to existing approaches in the literature and to what human experts are able to achieve. Our implementation is based on the open-source mlr and mlrMBO frameworks and generalizes to other applications.
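
To make the idea of surrogate-based parameter tuning concrete, here is a minimal, self-contained Python sketch of a Bayesian optimization loop. It does not use mlr or mlrMBO and is not the speakers' implementation. The assumptions are: a single normalized parameter in [0, 1], a Gaussian-process surrogate with a fixed RBF kernel, expected improvement as the acquisition criterion, and toy_objective as a hypothetical stand-in for one fabrication-and-measurement run.

```python
import math


def rbf(a, b, length=0.2):
    """RBF kernel with a fixed, assumed length scale."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def gp_posterior(xs, ys, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (jitter if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k = [rbf(x, x_new) for x in xs]
    y_mean = sum(ys) / n
    alpha = solve(K, [y - y_mean for y in ys])
    v = solve(K, k)
    mu = y_mean + sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(1.0 - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mu, math.sqrt(var)


def expected_improvement(mu, sigma, best):
    """Expected improvement over the best observed value (maximization)."""
    if sigma < 1e-12:
        return 0.0
    z = (mu - best) / sigma
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return (mu - best) * cdf + sigma * pdf


def toy_objective(x):
    """Hypothetical quality score of one run; higher is better."""
    return -(x - 0.3) ** 2


def bayes_opt(objective, n_iter=10):
    grid = [i / 100 for i in range(101)]   # candidate parameter values
    xs = [0.0, 0.5, 1.0]                   # initial design
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        candidates = [x for x in grid if x not in xs]
        x_next = max(
            candidates,
            key=lambda x: expected_improvement(*gp_posterior(xs, ys, x), best),
        )
        xs.append(x_next)                  # run the next "experiment"
        ys.append(objective(x_next))
    i_best = max(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best], xs


best_x, best_y, history = bayes_opt(toy_objective)
```

The point of the design is that each expensive evaluation is chosen by the surrogate rather than by a fixed grid or by hand, so only 13 evaluations of the objective are spent here. In a real setting the objective would be a physical experiment and the parameter space would have several dimensions.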

Zoom link: Click me.

Note: The lecture room is different this time: instead of the usual E-lecture room, the lecture will be held in the JSI Physics Seminar Room (Jamova 39, Building A, Ground Floor, Room 106).

From Understanding People to Securing AI: A Human-Centered Research Journey Through Large Language Models

Anja Glusic — Tue, 10 Mar 2026 08:38:53 +0000

The talk presents a connected view of recent research on large language models (LLMs) through a human-centered lens, moving from what these systems can infer about people to how they interact with them, where they present security risks, and why such vulnerabilities matter in socially consequential settings. It begins with work on how LLMs can infer personality from short texts and on the role of communication style in shaping user experience and task outcomes, showing both the potential of LLMs for personalization and the importance of designing interaction carefully across contexts. It then broadens to questions of trustworthiness and toxicity, covering security risks in prompt-based interaction, attack surfaces, and the growing challenge posed by multilingual, multimodal, and autonomous jailbreaks. Further, it examines representational harms through methods for measuring gender bias in gendered and under-resourced languages. The talk concludes with a high-stakes application, mental health crisis response, where clinically informed evaluation reveals that increasing model capability does not automatically translate into safe, appropriate, or context-aware behavior. Across these topics, the unifying theme is that progress in LLMs should be matched by rigorous work on evaluation, safety, fairness, and responsible deployment.

Zoom link: Click me. (Audio issues from last time are being actively addressed.)

The seminar is also organized under the ELLIOT project.

AI-enabled Mammography

Anja Glusic — Thu, 05 Mar 2026 12:18:05 +0000

Breast cancer is the most common malignancy and the leading cause of cancer deaths in women around the world. The preferred diagnostic and prevention method used to fight breast cancer is mammography, as it is effective, cheap, reliable and suitable for screening large populations. The analysis of mammograms requires considerable effort from qualified radiologists, who have been increasingly relying on computer-aided diagnosis. The development of tools that require little supervision could save the lives of hundreds of thousands of women across the world.

In recent years, the integration of Artificial Intelligence (AI) in the field of medicine has brought about a significant transformation in healthcare delivery. Computer vision technology finds itself at the forefront of this process. Its applications to mammography data represent an active research field, stimulated by the fact that among the clinical imaging modalities, mammography stands out in terms of high spatial resolution (full-field digital mammography systems usually produce images at resolutions ranging from 1920×2304 to 4708×5844 pixels).

I will discuss research work stemming from projects conducted at the Institute for Artificial Intelligence Research and Development of Serbia, aimed at implementing AI-assisted mammography screening at the national level. Viewed through the lens of medical applications, the talk will cover the evolution of state-of-the-art research in the domain of computer vision and multimodal AI over the last 5 years, from the applications of “classical” convolutional neural networks and visual transformer models, explainable AI extensions, generative (diffusion) models, to self-supervised and multiple instance learning approaches to address specific challenges and (downstream) tasks such as image classification, lesion detection and synthetic mammogram generation.

Zoom link: Click me. (Audio issues from last time are being actively addressed.)

SLaLoM 2026 workshop and EMMA meeting

Anja Glusic — Tue, 17 Feb 2026 09:10:33 +0000

On the 12th and 13th of February 2026, the SLaLoM 2026 workshop (2nd Slovenian workshop on Large Language Models Techniques and Applications) and an intensive EMMA (EMMA: Embeddings-based techniques for Media Monitoring Applications) meeting were held in Kranjska Gora, Slovenia. The SLaLoM 2026 proceedings will be made available on the EMMA website.

Grand opening of the Slovenian Artificial Intelligence Factory (SLAIF)

Anja Glusic — Tue, 10 Feb 2026 13:18:35 +0000

Prof. Sašo Džeroski, PhD, the technical coordinator of the SLAIF project, presented the project with a total value of EUR 135 million, co-funded by the Republic of Slovenia and the European programme of the EuroHPC Joint Undertaking. The SLAIF project is part of the broader European EuroHPC initiative and brings together the expertise of leading research and educational institutions and connects them with the needs of industry.

👉 The project’s development and activities focus on four key thematic areas: artificial intelligence for the green transition, for health and biotechnology, for the digital society, and for science. The goal is to actively promote collaboration between industry, academia, and research institutions, create opportunities for joint projects, and enable the transfer of knowledge and technologies into practice.

Video

Diffusion Language Models: Problem Solving and Reasoning

Anja Glusic — Tue, 03 Feb 2026 13:08:09 +0000

Masked diffusion models (MDMs) offer a compelling alternative to traditional autoregressive language models. They generate strings by iteratively refining partially masked inputs in parallel. This makes them efficient, but their computational capabilities and the limitations inherent to the parallel generation process remain largely unexplored.
In this talk, I will discuss what types of reasoning problems MDMs can provably solve and how efficiently they can do it. We will describe the relationship between MDMs and the well-understood reasoning frameworks of chain of thought (CoT) and padded looped transformers (LTs): We will see that MDMs and polynomially padded LTs are, in fact, equivalent, and that MDMs can solve all problems that CoT-augmented transformers can. Moreover, we will showcase classes of problems (including regular languages) for which MDMs are inherently more efficient than CoT transformers, where parallel generation allows for substantially faster reasoning.
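
The generation process described above, iteratively refining a partially masked sequence and filling several positions in parallel, can be sketched in a few lines of Python. This is only a toy illustration of the sampling loop, not a trained model and not the construction from the talk: make_toy_denoiser is a hypothetical stand-in that already knows the target string, and the confidence rule (higher when a neighbor is already revealed) is an assumption made for the example.

```python
MASK = "_"


def make_toy_denoiser(target):
    """Hypothetical stand-in for a trained masked diffusion model."""

    def denoise(seq):
        # Propose a (token, confidence) pair for every masked position.
        proposals = {}
        for i, tok in enumerate(seq):
            if tok == MASK:
                known_neighbors = sum(
                    1
                    for j in (i - 1, i + 1)
                    if 0 <= j < len(seq) and seq[j] != MASK
                )
                proposals[i] = (target[i], 0.5 + 0.25 * known_neighbors)
        return proposals

    return denoise


def generate(denoise, length, per_step):
    """Start fully masked; reveal up to `per_step` positions per refinement step."""
    seq = [MASK] * length
    steps = 0
    while MASK in seq:
        proposals = denoise(seq)
        # Most confident positions first; ties broken by position.
        chosen = sorted(proposals, key=lambda i: (-proposals[i][1], i))[:per_step]
        for i in chosen:  # all chosen positions are filled in the same step
            seq[i] = proposals[i][0]
        steps += 1
    return "".join(seq), steps


denoise = make_toy_denoiser("abcabcab")
parallel_text, parallel_steps = generate(denoise, length=8, per_step=4)
serial_text, serial_steps = generate(denoise, length=8, per_step=1)
```

With per_step=4 the eight tokens are produced in 2 refinement steps, while per_step=1 needs 8 steps, which mirrors one-token-at-a-time generation. The sketch says nothing about which problems can be solved this way; that is the subject of the talk.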

Zoom link: Click me.