Knowledge Technologies, Jožef Stefan Institute

Knowledge Technologies

Data Mining and Machine Learning

Machine learning (knowledge discovery) explores the algorithms of learning in general, while data mining uses a variety of mostly automatic processes for analysing large amounts of data. In these areas, we focus on inductive, relational and constraint-based methods (databases, inductive logic programs), meta-learning (combining classifiers), subgroup discovery and equation discovery. We have developed a series of systems for learning logic programs and various kinds of equations (polynomial algebraic, difference and partial differential equations), learning both the structure and the parameter values of equations.
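As an illustration of what learning both the structure and the parameter values of an equation can mean, the following is a minimal sketch for polynomial algebraic equations. It is not one of the department's systems: the candidate term library `TERMS`, the complexity `penalty`, and the helper names `solve`, `fit` and `discover` are all assumptions made for this example. The sketch enumerates subsets of candidate terms (the structure), fits coefficients by least squares (the parameters), and keeps the subset with the lowest penalised error.

```python
from itertools import combinations

# Hypothetical library of candidate terms for this sketch: name -> function of x.
TERMS = {
    "1": lambda x: 1.0,
    "x": lambda x: x,
    "x^2": lambda x: x ** 2,
    "x^3": lambda x: x ** 3,
}

def solve(a, b):
    """Solve the square linear system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def fit(names, xs, ys):
    """Fit parameters for a fixed structure via the normal equations; return (weights, SSE)."""
    rows = [[TERMS[t](x) for t in names] for x in xs]
    k = len(names)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    w = solve(ata, aty)
    sse = sum((sum(wi * ri for wi, ri in zip(w, r)) - y) ** 2
              for r, y in zip(rows, ys))
    return w, sse

def discover(xs, ys, penalty=1e-3):
    """Search all term subsets; score = squared error + penalty per term (an assumed criterion)."""
    best = None
    for k in range(1, len(TERMS) + 1):
        for names in combinations(TERMS, k):
            w, sse = fit(names, xs, ys)
            score = sse + penalty * k
            if best is None or score < best[0]:
                best = (score, names, w)
    return best[1], best[2]

# Noise-free data generated from y = 2*x^2 + 3.
xs = [i / 2 for i in range(-6, 7)]
ys = [2 * x ** 2 + 3 for x in xs]
structure, params = discover(xs, ys)
print(structure, params)
```

Exhaustive search is only feasible for a handful of candidate terms; it is used here to keep the example short, and the penalty is what makes the search prefer the smaller structure over supersets that fit equally well.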

Contact: Nada Lavrač, Sašo Džeroski

Projects:
  • Development and applications of new semantic data mining methods in life sciences
  • ConCreTe
  • WHIM
  • KT
  • PD_manager
  • Development of a multimethod approach to study wildlife behavior: investigating human-bear conflicts in contrasting landscapes of Europe
  • Structured output prediction with applications in sustainable agricultural production
  • HinLife
  • Molecular bases of interactions among the grapevine and phytoplasmal causing agents of the grapevine yellows diseases

Text and Web mining

Text mining, which aims to extract useful information from document collections, is a well-developed field of computer science, driven by the growth of document collections in corporate and governmental environments and especially on the Web. In many real-life scenarios, documents are also embedded in information networks. Examples of such networks include multimedia repositories (containing multimedia descriptions, subtitles, slide titles, etc.), social networks of professionals (containing CVs), citation networks (containing publications), and even software code (heterogeneously interlinked software artifacts containing code comments). The abundance of such document-enriched networks motivates the development of new methodologies that join the two worlds, text mining and mining of heterogeneous information networks, and handle both types of data in a common data mining framework.

Handling vast document streams is a relatively new challenge, emerging mainly from the self-publishing activities of Web users (e.g., blogging, tweeting, and participating in discussion forums). Furthermore, news streams (e.g., Dow Jones, BusinessWire, Bloomberg, Reuters) are growing in number and rate, which makes it impossible for users to systematically follow the topics that interest them. One of the challenges is thus to investigate techniques for online data mining, machine learning, and sentiment analysis that support decision making in near-real time over vast amounts of constantly evolving data.

Contact: Miha Grčar, Igor Mozetič

Human Language Technologies

Most of the information humans deal with consists of text, and Human Language Technologies enable computers to help us exploit and manage this information. Texts, in whatever language, need to be processed in various ways, from ensuring uniform encoding to complex linguistic analyses such as assigning syntactic and semantic structure. Such methods find application in text mining, machine translation, search engines, exploratory instruments for linguists and lexicographers, digital publishing, etc. In this research area the department is developing general methods for text processing and mark-up, with a special focus on the Slovene language. We are especially concerned with the production of standardised and available language resources, such as annotated mono- and multilingual corpora, lexica, and complex digital editions, e.g., of Slovenian literature (ZRC eLibrary). While such resources can be used directly for language study, they are intended mainly for machine learning programs that automatically induce various language models from them.

Contact: Tomaž Erjavec

Decision Support

Decision Support (DS) aims to provide computational support to (groups of) people faced with difficult decisions. DS provides a rich collection of decision analysis, simulation, optimization and modeling techniques, including hierarchical multi-attribute models, decision trees, influence diagrams and belief networks. DS also involves software tools such as decision support systems, group decision support and mediation systems. We have developed a series of decision models and support systems, focusing on qualitative, multi-attribute decision making and models of uncertainty, necessary for capturing realistic aspects of complex decision problems. We continue to develop and expand our main software tool, DEXi.
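To make the idea of a hierarchical, qualitative multi-attribute model more concrete, here is a minimal sketch. It does not reproduce DEXi's file format, interface, or algorithms: the attribute names (`CAR`, `PRICE`, `BUY`, `MAINT`, `QUALITY`), their value scales, the rule tables, and the `evaluate` function are all hypothetical and chosen only for illustration. Basic attributes take their values from the option being assessed, and each aggregate attribute is determined by a table that maps the qualitative values of its children to its own value.

```python
# Hypothetical hierarchy: CAR <- (PRICE, QUALITY), PRICE <- (BUY, MAINT).
# Each aggregate attribute lists its inputs and a table of decision rules.
RULES = {
    "PRICE": {
        "inputs": ("BUY", "MAINT"),
        "table": {
            ("high", "high"): "high",
            ("high", "low"): "high",
            ("low", "high"): "medium",
            ("low", "low"): "low",
        },
    },
    "CAR": {
        "inputs": ("PRICE", "QUALITY"),
        "table": {
            ("high", "bad"): "unacceptable",
            ("high", "good"): "unacceptable",
            ("medium", "bad"): "unacceptable",
            ("medium", "good"): "acceptable",
            ("low", "bad"): "unacceptable",
            ("low", "good"): "good",
        },
    },
}

def evaluate(attribute, option):
    """Evaluate an attribute bottom-up.

    Basic attributes are read from the option; aggregate attributes are
    looked up in their rule table using the evaluated values of their inputs.
    """
    if attribute not in RULES:
        return option[attribute]
    node = RULES[attribute]
    key = tuple(evaluate(child, option) for child in node["inputs"])
    return node["table"][key]

# One option described by qualitative values of the basic attributes.
option = {"BUY": "low", "MAINT": "high", "QUALITY": "good"}
result = evaluate("CAR", option)
print(result)
```

Because every rule is an explicit table entry over qualitative values, the reason for an overall assessment can be traced back through the hierarchy, which is one motivation for qualitative models of this kind.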

Contact: Marko Bohanec, Martin Žnidaršič
