The field of machine learning is concerned with methods that can automatically learn from experience. The experience is usually given in the form of learning examples from which machine learning methods can automatically learn a model. Sometimes it is very important that the learned model can be read by human, and one of the most understandable types of models are rule sets. The methods for learning models in the form of rule sets are the topic of this thesis.
The predictive clustering approach to rule learning presented here is based on ideas from two machine learning subareas, predictive modeling and clustering. Both areas are usually regarded as completely different tasks. Predictive modeling is concerned with the construction of models that can be used to predict some object’s target property from the description of this object. Clustering, on the other hand, is concerned with grouping of objects into classes of similar objects, called clusters; there is no target property to be predicted, and usually no symbolic description of discovered clusters. However, predictive modeling methods that partition the example space, such as decision trees and rules are very similar to clustering. They partition the set of examples into subsets in which examples have similar values of the target variable, while clustering produces subsets in which examples have similar values of all descriptive variables. Predictive clustering approach builds on this similarity. It constructs clusters of examples that are similar to each other, but in general takes both the descriptive and the target variables into account, and associates a predictive model to each constructed cluster. Methods for predictive clustering enable us to construct models for predicting multiple target variables, which are normally simpler and more comprehensible than the corresponding set of models, each predicting a single variable.
To this day, predictive clustering has been restricted to decision tree methods. The goal of this thesis is to extend predictive clustering approach to methods for learning rules, i.e., to develop a method that deals with rule learning and clustering in a uniform way. Of the existing rule learning methods, majority are based on the sequential covering algorithm, originally designed for learning ordered rule lists for binary classification domains. We have developed a generalized version of this algorithm that enables learning of ordered or unordered rules, on single or multiple target classification or regression domains.
The newly developed algorithm is empirically evaluated on several single and multiple target classification and regression problems. Performance of the new method compares favorably to existing methods. Comparison of single target and multiple target prediction models shows that multiple target models offer comparable performance and drastically lower complexity than the corresponding sets of single target models.