Senja Pollak


About me

I am an assistant professor of Language Technologies and work as a postdoctoral researcher at the Department of Knowledge Technologies, Jožef Stefan Institute (JSI). I am the coordinator of the H2020 project EMBEDDIA (Cross-Lingual Embeddings for Less-Represented Languages in European News Media) and am involved in several other national and EU projects. My research interests include natural language processing, text mining, corpus linguistics and computational creativity. I also teach language technologies and computational creativity at the Jožef Stefan International Postgraduate School. From 2018 to 2019, I was a research fellow at the Usher Institute of the University of Edinburgh.


2014-now: postdoctoral researcher at Jožef Stefan Institute

Domains of interest: natural language processing, language technologies, text mining, corpus linguistics, computational creativity, digital humanities


Current projects:
  • EMBEDDIA: Cross-Lingual Embeddings for Less-Represented Languages in European News Media (EU H2020 RIA), project coordinator
  • SAAM: Supporting Active Ageing through Multimodal coaching (EU H2020 RIA), task leader
  • TERMFRAME: Terminology and Knowledge Frames across Languages (National research project), PI at JSI
  • RSDO: Development of Slovene in digital environment - language resources and technologies, task leader
  • SDM-Open-SLO (Semantic Data Mining for linked open data), researcher

    Past EU projects: MUSE: Machine understanding for Interactive StorytElling, WP leader and co-PI at JSI; PROSECCO: Promoting the Scientific Exploration of Computational Creativity, co-PI at JSI; WHIM: The What-if Machine; ConCreTe: Concept Creation Technology

    Past national projects (funded by the Slovenian Research Agency): FORMICA: Influence of formal and informal corporate communications on capital markets, co-PI at JSI; JANES: Resources, Tools and Methods for the Research of Nonstandard Internet Slovene, WP leader; HinLife: Analysis of heterogeneous information networks for knowledge discovery in life sciences, researcher

    Past industrial projects: TermIOLAR 1 and TermIOLAR 2 (development of a prototype software solution to support semi-automatic terminology management and extraction in mono- and bilingual corpora; industrial projects for the Slovene language service provider Iolar), principal investigator


  • Program committee member of: ICCC 2016-2020, REPROLANG 2020, JTDH 2020, IJCAI-PRICAI 2020, ECIR 2019, SLATE 2016, 4REAL 2016-2018, ICCBR-ExpCrea 2015
  • Organizing committee member of: DHandNLP 2020 (co-chair), SLSP 2019 (co-chair), ConCreTe workshop on Metaphor 2016, ICCC 2014, LTSP 2014, ESSLLI 2011

    Education

    2014: PhD in Translation Studies (spec. Language Technologies), Department of Translation, Faculty of Arts, University of Ljubljana, Slovenia
    Title: Semi-automatic domain modeling from multilingual corpora
    advisor: Prof. Špela Vintar, co-advisor: Prof. Paola Velardi

    2009: MSc in Computational Linguistics, University of Antwerp, Belgium
    Title: Text classification of press articles on Kenyan elections
    advisor: Prof. Walter Daelemans

    2007: BSc in French Language and Literature and BSc in Sociology of Culture, Faculty of Arts, University of Ljubljana, Slovenia

    2004: BSc in Modern Languages - French Linguistics (maîtrise), University Paris 3 - Sorbonne Nouvelle, Paris, France

    Selected publications

    Selected Journal Papers

    Selected Conference Papers


    CC resources for download (Paper: SMAILOVIĆ, Jasmina, POLLAK, Senja (2011). Semi-automated construction of a topic ontology from research papers in the domain of language technologies. LTC'11, 5th Language & Technology Conference, Poznan, Poland, November 2011.)

    Contact and other personal pages

  • E-mail: senja.pollak@ijs.si
  • Skype: Senja80

  • Google Scholar: available here