Embeddings-based techniques for Media Monitoring Applications (EMMA)
Tehnike vektorskih vložitev za medijske aplikacije

No. of contract:

from 01.10.2023 to 30.09.2026

Optimization problems arise everywhere in our everyday lives; across all geographic regions, industrial branches, and academic disciplines. The better methods we have to tackle these problems, the more efficiently we can exploit our limited resources. Many, if not most, practical problems can therefore not be solved by exact analytics. In these situations, one has to resort to heuristic approaches, which trade the accuracy of a solution for a simple algorithmic structure, fast running times, or otherwise efficient use of computational resources. Under many circumstances heuristics offer the only possibility to obtain any feasible solution at all. Ideally, one would like to develop algorithms that perform well across broad sets of problems and instances.

In practice, however, we observe a distinct complementarity of different algorithmic ideas: algorithms performing very well on some problems may perform poorly on others. The user therefore needs to select carefully which heuristic to apply to the problem that she wishes to solve. This algorithm selection problem is complicated by the fact that iterative heuristics are parametrized. Different hyper-parameter settings can result in very different performances. The user therefore needs not only to decide which algorithm to apply, but also how to configure it. To explore and eventually exploit the complementarity of these heuristics and their interactions, and to design new heuristics that are even more suitable for a given problem instance, supervised machine learning techniques can be applied.

Such machine-trained approaches require meaningful features that quantify different aspects of the problem instance. Additionally, they need information about hyper-parameter configuration and heuristics performance. In a more general way, an automate algorithm design needs to understand and fuse the information from the problem space, hyper-parameter space, and performance space. With regard to the problem space, exploratory landscape analysis has been introduced with the goal to support automated algorithm design techniques through sets of features (i.e., characteristics) which measure various properties of the optimization problem at hand.

In the context of numerical black-box optimization, a large number of features has been suggested over the last decades, and each of them measures a different numerical black-box optimization, a large number of features has been suggested over the last decades, and each of them measures a different characteristic of the problem. It is well known that a proper feature selection is essential to obtain peak performance in supervised learning approaches, not only to remove features with limited or even deceptive information, but also to remove an overestimated relevance of features that are correlated to each other. From the other side, the performance space should also be explored with a great care, by trying to understand the relations between feature space and the performance space. The main objective of the proposed project is to enable users to obtain good solutions of a given optimization problem instance without in-depth apriori knowledge about optimization algorithms or the problem instance. To achieve this, a number of different problem instance characteristics need to be considered to determine the suitability of a given optimization algorithm for a given problem instance. This is a complex multidimensional problem with interdependencies between individual characteristics. Given a suitable portfolio of known benchmark problems with their characteristics, and a portfolio of optimization algorithms, the actual performance of the optimization algorithms on the set of problem instances can be determined by solving each of the problem instances with each of the optimization algorithms. Using explainable machine learning techniques, we can find the relationships between problem instance characteristics and optimization algorithm performance. If sufficiently general relationships can be found, they can be used to predict optimization algorithm performance on unseen problem instances.