2021 Steven Krauwer Award for CLARIN Achievements Awarded to Tomaž Erjavec


Tomaž Erjavec received the 2021 Steven Krauwer Award of the CLARIN research infrastructure for his work on the compilation of the multilingual ParlaMint corpus of parliamentary debates.

Recently, Tomaž Erjavec participated in the CLARIN.SI-funded ParlaMint Project. This ambitious data engineering task included both creating a multilingual set of uniformly annotated corpora of parliamentary proceedings and processing the corpora linguistically to add Universal Dependencies syntactic structures and named-entity annotation. He devised the interoperable annotation format used for the corpus, based on the Parla-CLARIN recommendations, created validation schemata and conversion scripts, and managed the repository and distribution of the resulting datasets. This unique data collection represents a crucial milestone for research in the digital humanities and political sciences.

Tomaž Erjavec’s contributions to Parla-CLARIN (a framework for encoding corpora of parliamentary proceedings) and ParlaMint establish an innovative strategy for handling and processing parliamentary data. The novelty of this strategy lies in the proper and unified handling of data that is comparable across languages and parliaments, and in making this data uniformly available. The ParlaMint framework is becoming a de facto standard for national parliamentary data and will be further developed to cover more detailed and specific metadata across languages and parliaments.

The visibility of Tomaž Erjavec’s work goes beyond ParlaMint. He maintains the certified CLARIN.SI repository, which currently contains more than 200 language resources and tools (approximately 200 GB of data for 80 languages), 65 of which were (co-)authored by Tomaž Erjavec himself. Besides his proficiency in language resources and evaluation, linguistic standards and parliamentary corpora, what makes Tomaž Erjavec a great colleague, according to the jury, is his knowledge and scientific professionalism, commitment, sense of humour and confidence, which make working with him a real pleasure.