J2-2505

Predictive clustering on data streams

Napovedno razvrščanje na podatkovnih tokovih

https://www.ijs.si/ijsw/ARRSProjekti/2020/Napovedno%20razvr%C5%A1%C4%8Danje%20na%20podatkovnih%20tokovih

No. of contract:

J2-2505

Type of project:

Basic ARIS Projects | National Projects

Duration:

from 01.09.2020 to 31.08.2023

Contact:

Sašo Džeroski

Areas:

Machine Learning

Data streams are high-frequency information sources that have recently become ubiquitous. Their distinguishing properties include the rapid arrival of new examples and the temporal order in which they arrive. The most consequential property is that the data, and the underlying mechanisms generating it, can change over time; this is called concept drift. Data stream mining methods must therefore be able to detect concept drift and adapt to it.
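To make the notion of drift detection concrete, the following is a minimal sketch of the Page-Hinkley test, a standard change-detection test for an upward shift in the mean of a monitored signal (for example, a model's prediction error). It is an illustration only and not one of the methods developed in this project; the class name, the parameter values (`delta`, `threshold`, `min_samples`) and the synthetic stream are all assumptions chosen for the example.

```python
import random


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    All parameter values are illustrative assumptions, not tuned settings.
    """

    def __init__(self, delta=0.05, threshold=10.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm threshold (lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.reset()

    def reset(self):
        self.n = 0          # number of values seen since the last reset
        self.mean = 0.0     # running mean of the values
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # minimum of the cumulative deviation so far

    def update(self, x):
        """Process one value; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        drift = (self.n >= self.min_samples
                 and self.cum - self.cum_min > self.threshold)
        if drift:
            # Adapt by forgetting the statistics of the old concept.
            self.reset()
        return drift


def simulate(n_before=1000, n_after=1000, seed=0):
    """Synthetic stream whose mean jumps from 0.0 to 1.0 at index n_before."""
    rng = random.Random(seed)
    detector = PageHinkley()
    alarms = []
    for t in range(n_before + n_after):
        mu = 0.0 if t < n_before else 1.0
        if detector.update(rng.gauss(mu, 0.1)):
            alarms.append(t)
    return alarms


if __name__ == "__main__":
    print("drift signalled at stream positions:", simulate())
```

In this sketch, adaptation is simply a reset of the detector's statistics; in a learning setting, the alarm would typically trigger retraining or replacement of the affected model.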

The need for mining data streams has increased, and so has the complexity of the streams, which can be categorized along several dimensions. One is the complexity of the prediction target: multi-target prediction tasks are increasingly common. Another is the need to handle examples with missing target values, in the context of semi-supervised learning or clustering. Finally, specific to data streams is the phenomenon of concept drift and the need to detect it and adapt to it.

Responding to the need to handle complex data streams, this project will develop online learning methods that can 1) handle tasks of both flat and hierarchical multi-target regression and multi-label classification; 2) efficiently perform unsupervised learning (clustering), as well as semi-supervised learning for (hierarchical) multi-target prediction tasks; 3) estimate the importance of features for supervised and semi-supervised multi-target prediction tasks; and 4) detect and handle changes during the learning of predictive models for different types of structured outputs, including in the semi-supervised setting.

The project will systematically evaluate the developed methods using appropriate evaluation methodology. The developed methods will be made publicly available through a major data stream mining platform. Their use will also be promoted and facilitated by annotating the methods with terms from an ontology of data stream mining, making them easier to find and use. Finally, the utility of the developed methods will be demonstrated on real-world case studies from the challenging areas of environmental and health monitoring, as well as space operations monitoring and optimization.
