Knowledge Technologies Jožef Stefan Institute

SOFTWARE
SearchPoint
A new way of searching, with graphic control over distances to related topics

Areas: Semantic web, Text, web and multimedia mining


Document Summarizer
Summarizes documents using semantic graphs, in which each document is represented as a set of (subject, verb, object) triplets.

Areas: Knowledge management, Human language technologies
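
As a rough illustration of the triplet representation used by the Document Summarizer, the sketch below builds a small graph from hand-written (subject, verb, object) triplets and ranks entities by how many triplets mention them. The triplets, the function names, and the degree-based ranking are assumptions made for this example; the summarizer's own extraction and selection steps are not described on this page.

```python
from collections import defaultdict

def build_semantic_graph(triplets):
    """Map each subject to a list of (verb, object) edges."""
    graph = defaultdict(list)
    for subject, verb, obj in triplets:
        graph[subject].append((verb, obj))
    return dict(graph)

def rank_entities(triplets):
    """Rank entities by the number of triplets they take part in."""
    degree = defaultdict(int)
    for subject, _verb, obj in triplets:
        degree[subject] += 1
        degree[obj] += 1
    # Sort by descending degree, then alphabetically for a stable order.
    return sorted(degree, key=lambda entity: (-degree[entity], entity))

# Hypothetical triplets, written by hand for this example.
TRIPLETS = [
    ("Slovenia", "joined", "European Union"),
    ("Slovenia", "adopted", "euro"),
    ("European Union", "introduced", "euro"),
    ("Ljubljana", "is capital of", "Slovenia"),
]

GRAPH = build_semantic_graph(TRIPLETS)
RANKING = rank_entities(TRIPLETS)
```

A summary could then keep only the sentences whose triplets involve the top-ranked entities, although that selection rule is likewise an assumption.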


OntoGen
OntoGen is a semi-automatic, data-driven ontology editor that focuses on editing topic ontologies (sets of topics connected by different types of relations). The system combines text-mining techniques with an efficient user interface to reduce both the time spent and the complexity for the user. In this way it bridges the gap between complex ontology editing tools and the domain experts who construct the ontology but do not necessarily have ontology engineering skills.

Areas: Semantic web, Data mining and machine learning, Knowledge management


Semantic Search
What is an embargo? What can a country join? What is the strongest currency? Why do children go to school? How many people have died in Vietnam? How many companies made profit? Where do birds fly? Where did an earthquake happen? Where is Slovenia? Where does Ronaldo play?

Areas: Text, web and multimedia mining, Human language technologies


Wikipedia History Browser
Timeline display of search results (lifelines, events, images), with links to the Wikipedia articles.

Areas: Semantic web, Knowledge management


Document Atlas - Text Corpora Visualization
A utility for visualizing large corpora of text documents. First, it identifies the relevant semantics from the documents in the input corpus, using Latent Semantic Indexing. Then the whole corpus is projected onto the discovered semantics and positioned on a 2D plane using multidimensional scaling. The user can explore the 2D plane through an intuitive interface. The density of documents is used to generate the background relief, so that the visualization of documents resembles a map. Keywords describing specific areas are also written on the map. Together, these features give the user an easier path towards understanding the corpus.

Areas: Knowledge management, Human language technologies
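
The background relief in Document Atlas is derived from document density. The sketch below shows one way such a relief could be computed once documents already have 2D positions: a Gaussian kernel density evaluated on a regular grid. The kernel choice, the `bandwidth` parameter, the grid layout, and the sample positions are assumptions for illustration, and the Latent Semantic Indexing and multidimensional scaling steps are not reproduced here.

```python
import math

def density_grid(points, width, height, bandwidth=1.0):
    """Return a height x width grid of Gaussian kernel density values.

    points: non-empty list of (x, y) document positions on the 2D plane.
    Cell (row, col) is evaluated at coordinates x=col, y=row.
    """
    grid = []
    for row in range(height):
        line = []
        for col in range(width):
            total = 0.0
            for x, y in points:
                dist_sq = (x - col) ** 2 + (y - row) ** 2
                total += math.exp(-dist_sq / (2.0 * bandwidth ** 2))
            line.append(total / len(points))
        grid.append(line)
    return grid

# Hypothetical positions: three documents close together, one far away.
POINTS = [(1.0, 1.0), (1.5, 1.0), (1.0, 1.5), (8.0, 8.0)]
RELIEF = density_grid(POINTS, width=10, height=10)
```

Cells near the cluster of three documents get higher values than cells near the isolated document, and empty regions stay close to zero, which is the property a map-like relief needs.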


Project JOS - Linguistic Annotation of Slovene (Jezikoslovno označevanje slovenskega jezika)
Morphosyntactic specifications, two annotated corpora, Web concordancer, and service for text markup. The first major and freely available (Creative Commons) set of resources for morphosyntactic annotation and lemmatisation of Slovene.

Areas: Human language technologies, Semantic web


Text-Garden
Text-Garden is a set of text-mining software tools that enable easy handling of text documents for data analysis, including automatic model generation, document classification, clustering, visualization, handling of Web documents, and Web crawling. Text-Garden includes the OntoGen ontology construction system.

Areas: Semantic web


ECOGEN Soil Quality Index
ESQI is a qualitative multi-attribute model, developed within the ECOGEN project, that calculates an index of soil quality relative to a selected standard soil condition ("medium" value of attributes). The model is implemented in a server-side script, and accessed through an interactive Web page.

Areas: Decision support
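
To make the idea of a qualitative multi-attribute model such as ESQI concrete, the sketch below aggregates two qualitative attributes through a decision table and reports the result relative to a "medium" standard. The scale, the table entries, and the function names are hypothetical; the actual ESQI attributes and rules are not given on this page.

```python
# Qualitative scale shared by all attributes in this sketch.
SCALE = ["low", "medium", "high"]

# Hypothetical decision table: (first, second) -> aggregated value.
# A real model defines such a table for every aggregate attribute.
RULES = {
    ("low", "low"): "low",
    ("low", "medium"): "low",
    ("low", "high"): "medium",
    ("medium", "low"): "low",
    ("medium", "medium"): "medium",
    ("medium", "high"): "high",
    ("high", "low"): "medium",
    ("high", "medium"): "high",
    ("high", "high"): "high",
}

def evaluate(first, second):
    """Look up the aggregated qualitative value for two attribute values."""
    return RULES[(first, second)]

def relative_index(value, standard="medium"):
    """Position of a value on the scale relative to the standard condition."""
    return SCALE.index(value) - SCALE.index(standard)
```

With this table, `relative_index(evaluate("high", "medium"))` is positive (better than the standard) and `relative_index(evaluate("low", "medium"))` is negative.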


			
RSD
Relational Subgroup Discovery through first-order feature construction. The source code of the system, written in Yap Prolog, is available for download, together with samples and a user manual.

Areas: Data mining and machine learning, Text, web and multimedia mining
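
Subgroup discovery looks for rules that cover many examples while having an unusual class distribution. A quality measure commonly used for this purpose is weighted relative accuracy, WRAcc = p(Cond) * (p(Class | Cond) - p(Class)). The sketch below computes it for a single rule over boolean features. Treating it as the measure used by RSD is an assumption, and the feature name `has_ring` and the data are invented for the example.

```python
def wracc(examples, condition, target_class):
    """Weighted relative accuracy of the rule 'condition -> target_class'.

    examples: list of (features, label) pairs.
    condition: function mapping features to True or False.
    """
    total = len(examples)
    covered = [label for features, label in examples if condition(features)]
    if total == 0 or not covered:
        return 0.0
    p_cond = len(covered) / total
    p_class = sum(1 for _f, label in examples if label == target_class) / total
    p_class_given_cond = covered.count(target_class) / len(covered)
    return p_cond * (p_class_given_cond - p_class)

# Hypothetical data: each example is ({feature: value}, label).
EXAMPLES = [
    ({"has_ring": True}, "active"),
    ({"has_ring": True}, "active"),
    ({"has_ring": True}, "inactive"),
    ({"has_ring": False}, "inactive"),
    ({"has_ring": False}, "inactive"),
    ({"has_ring": False}, "active"),
]

SCORE = wracc(EXAMPLES, lambda features: features["has_ring"], "active")
```

A positive score means the subgroup contains the target class more often than the data set as a whole; a rule that covers everything scores zero.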


Social Browsing (LiveNetLive)
A developing Web service based on people's interests and the semantics of the pages they visit

Areas: Semantic web, Text, web and multimedia mining


nl.ijs.si on-line services
Concordancing and lemmatization

Areas: Human language technologies


SEGS
SEGS (Search for Enriched Gene Sets) is a web tool for descriptive analysis of microarray data. The analysis is performed by looking for descriptions of gene sets that are statistically significantly over- or under-expressed between different scenarios within the context of genome-scale experiments (DNA microarrays).

Areas: Data mining and machine learning, Text, web and multimedia mining
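
One standard way to check whether a gene set is over-represented among differentially expressed genes is a one-sided hypergeometric test. The sketch below computes that tail probability with the standard library only. This page does not say which statistical test SEGS applies, so the choice of test, the function name, and the numbers in the example are assumptions.

```python
from math import comb

def enrichment_p_value(population, successes, sample, hits):
    """P(X >= hits) for X ~ Hypergeometric(population, successes, sample).

    population: number of genes measured in the experiment.
    successes:  number of differentially expressed genes.
    sample:     number of genes in the gene set being tested.
    hits:       number of gene-set members that are differentially expressed.
    """
    upper = min(successes, sample)
    total = comb(population, sample)
    tail = sum(
        comb(successes, k) * comb(population - successes, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 20 genes, 5 differentially expressed,
# a gene set of 4 genes of which 3 are differentially expressed.
P_VALUE = enrichment_p_value(population=20, successes=5, sample=4, hits=3)
```

A small value suggests that the overlap is unlikely to arise by chance; when many gene sets are tested, the p-values would normally also need a multiple-testing correction.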


MLC4.5 and MLJ4.8
Learn to combine classifiers with meta decision trees.

Areas: Data mining and machine learning, Text, web and multimedia mining


LemmaGen
A system for learning Ripple Down Rules, specialized for the automatic generation of lemmatizers. So far, LemmaGen has been used to produce lemmatizers for 12 different languages.

Areas: Human language technologies
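
The structure of a Ripple Down Rule set for lemmatization can be sketched as a tree in which every rule has a default suffix transformation and a list of more specific exceptions that override it. The class below is a minimal illustration with hand-written English rules; the class name, the rule format, and the rules themselves are assumptions, and LemmaGen's own rule representation and learning algorithm are not shown.

```python
class Rule:
    """A ripple-down rule: a default transformation plus more specific exceptions."""

    def __init__(self, suffix, replacement, exceptions=()):
        self.suffix = suffix
        self.replacement = replacement
        self.exceptions = list(exceptions)

    def lemmatize(self, word):
        """Return the lemma, or None if this rule does not apply to the word."""
        if not word.endswith(self.suffix):
            return None
        # An applicable exception overrides this rule's own transformation.
        for exception in self.exceptions:
            lemma = exception.lemmatize(word)
            if lemma is not None:
                return lemma
        stem = word[: len(word) - len(self.suffix)]
        return stem + self.replacement

# Hand-written rules for illustration only. The root rule matches every
# word and leaves it unchanged unless an exception applies.
ROOT = Rule("", "", [
    Rule("s", "", [Rule("ies", "y"), Rule("ss", "ss")]),
    Rule("ing", "", [Rule("ning", "")]),
])
```

For example, `ROOT.lemmatize("cities")` follows the "s" rule into its "ies" exception and yields "city", while a word that matches no exception is returned unchanged by the root rule.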


DEXi (DEX for Instruction)
An educational computer program for qualitative decision modelling, developed within the Slovenian Ro (Computer Literacy) Programme, 1999-2000.

Areas: Decision support


proDEX
proDEX is a tool for qualitative multi-attribute modelling in the basic and extended DEX methodology.

Areas: Decision support


GMOtrack
GMOtrack is a program that supports traceability of genetically modified organisms. Given a table of GMOs (along with the probabilities of their presence and the genetic elements present in their genomes), GMOtrack computes the optimal set of screening assays for a two-phase testing strategy.

Areas: Decision support


Lagrange/Lagramge
Lagrange and Lagramge are programs for inducing algebraic and ordinary differential equations from observational data. While Lagrange is a completely data-driven approach to inducing equations, Lagramge allows for knowledge-driven induction, where the user can tailor the space of candidate equation structures according to background knowledge from the domain of interest.

Areas: Data mining and machine learning


LINUS
ILP learning of constrained logic programs.

Areas: Data mining and machine learning


GOVOREC (Speaker)
A system for Slovene speech synthesis and pronunciation of text files, available for on-line testing and download.

Areas: Human language technologies


CIPER - Constrained Inductive Polynomial Equation for Regression
Regression methods aim at inducing models of numeric data. While most state-of-the-art machine learning methods for regression focus on inducing piecewise regression models (regression and model trees), we investigate the predictive performance of regression models based on polynomial equations. We present Ciper, an efficient method for inducing polynomial equations, and empirically evaluate its predictive performance on standard regression tasks.

Areas: Data mining and machine learning