Improvements of the Method of Stacking Classifiers (Izboljšave metode skladanja klasifikatorjev)

Abstract

Methods for combining classifiers attract considerable attention in the field of inductive machine learning. In principle, they enable us to achieve better classification accuracy, because a false prediction of one base-level classifier can be corrected by taking into account the predictions of the other base-level classifiers. Two groups of methods for combining classifiers can be distinguished. Methods in the first group use a single learning algorithm to build the base-level classifiers and achieve diversity by manipulating the training set. Methods in the second group combine classifiers built by different learning algorithms (heterogeneous classifiers). Within this group, stacking is the best known. Stacking combines classifiers in two steps. In the first step, several learning algorithms are used to build base-level classifiers. Their predictions are then described by meta-level attributes and collected into a meta-level learning set. In the second step, this meta-level learning set is used to build a meta-level classifier, which combines the predictions of the base-level classifiers into a final prediction.
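The two-step procedure described above can be sketched as follows. This is a minimal illustration using scikit-learn, which is an assumption on my part; the thesis does not specify an implementation, and the particular base-level learning algorithms chosen here (a decision tree, naive Bayes, and nearest neighbours) are hypothetical examples.

```python
# Minimal sketch of two-step stacking (illustrative, not the thesis implementation).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = load_iris(return_X_y=True)
base_learners = [DecisionTreeClassifier(random_state=0),
                 GaussianNB(),
                 KNeighborsClassifier()]

# Step 1: build base-level classifiers; their cross-validated class-probability
# predictions serve as the meta-level attributes of the meta-level learning set.
meta_X = np.hstack([cross_val_predict(clf, X, y, cv=5, method="predict_proba")
                    for clf in base_learners])

# Step 2: build a meta-level classifier on the meta-level learning set.
meta_clf = LogisticRegression(max_iter=1000).fit(meta_X, y)
print(meta_clf.score(meta_X, y))
```

Using cross-validated rather than resubstitution predictions in step 1 is the standard way to keep the meta-level attributes honest: each base-level prediction is made on examples the base-level classifier did not see during training.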

In the work presented here, we have compared several known methods for combining heterogeneous classifiers using a unified evaluation methodology. The results show that the accuracy of the best stacking approach is only slightly higher than the accuracy achieved by selection by cross-validation. This motivated us to search for possible improvements of existing stacking approaches, and we have developed two new methods. The first uses an extended set of meta-level attributes, while the second uses multi-response model trees as the meta-level learning algorithm. The latter performs significantly better than previously known methods for combining heterogeneous classifiers. Both methods were also experimentally evaluated on several practical problems.
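One plausible form of an "extended" meta-level attribute set is sketched below: alongside each base-level classifier's predicted class-probability distribution, add summary attributes that express how confident that prediction is, such as the distribution's entropy and its maximum probability. This is a hedged illustration; the exact attribute set used in the thesis may differ.

```python
# Sketch of extending meta-level attributes with confidence indicators
# (hypothetical attribute set for illustration).
import numpy as np

def extended_meta_attributes(proba):
    """proba: (n_samples, n_classes) probability predictions of one base-level
    classifier. Returns the probabilities plus two derived attributes per
    example: the entropy of the distribution and its maximum probability."""
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(proba * np.log(proba + eps), axis=1, keepdims=True)
    max_p = proba.max(axis=1, keepdims=True)
    return np.hstack([proba, entropy, max_p])

proba = np.array([[0.80, 0.10, 0.10],   # confident prediction: low entropy
                  [0.34, 0.33, 0.33]])  # uncertain prediction: high entropy
print(extended_meta_attributes(proba).shape)
```

The intuition is that the meta-level classifier can then learn not only *what* each base-level classifier predicts, but also *when* that prediction is likely to be trustworthy.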

Publication
Magistrsko delo (MSc Thesis)