Hierarchical models are used in various domains of science and technology. They are extensively used within decision support systems for problems in which an option is to be selected from a given set so as to best satisfy the goals of the decision maker. The procedure for option evaluation is defined by a multi-attribute hierarchical model. In addition to evaluation, such models are used also for various analyses and simulations. In artificial intelligence, the hierarchical models are used in (1) expert systems for inference and (2) machine learning to decrease the complexity and increase the transparency of the learned classification rule. Hierarchical models and corresponding methods were also developed and are used within pattern recognition and computer aided design of switching circuits.
The most difficult problem in the development of hierarchical models is the design of model structure. In most cases, the structure is defined manually in a tiresome and lengthy process based on knowledge about the problem domain, which requires a lot of skill and experience. Only within few specific domains this process has been automated, for example in switching cirucuit design. However, these methods are typically limited to two-valued variables and a predefined set of logical operators.
Another important problem is due to the fact that the research of hierarchical models took place separately within different areas of research; resulting in etremely specific solutions. For example, in multi-attribute decision making, the models incorporate only numerical variables, which are then interrelated by functions based on utility theory, Bayesian probability or continuous logic. On the other hand, artificial intelligence generally uses the models with symbolic variables, and operators, defined by rules, decision tables, or decision trees. Many real-world problems require both, and this can be achieved only by an appropriate combination of the approaches.
Within this project we propose research, development, and verification of the methods that would, compared to existing methods, provide a better and more active support to the development of hierarchical classification and decision models. The proposed solutions are based on four approaches:
Machine learning methods are used to derive a concept that is described with learning examples. The methods are based on the identification of problem subspaces, for which the concept can be described with given description language. In this project we treat the problem of developing hierarchical models as a special case of machine learning from examples that automatically derives both intermediate concepts (model variables) and operators that define the relationships between the concepts. Such a method belongs to constructive induction (Michalski 83), which, in order to induce a concept description, generates new intermediate concepts. Some well-known constructive induction systems are DUCE (Muggleton 90), LFC (Ragavan, Rendell 93), CIGOL (Muggleton, Buntine 92). The main limitation of these systems is that the generation of new concepts does not discover the operators of the model, but rather selects them from a predefined set. In general, such methods generate new concepts in a way that makes it difficult to order them into a hierarchy.
The development of hierarchical models was addressed by a pioneer of artificial intelligence, Samuel (59). Given the model's structure, his algorithm used learning examples to develop models for playing checkers. This approach was later improved by Biermann et al. (82). Although their method could automatically derive the structure of variables, the approach was inefficient and could only handle cases with a complete set of learning examples. Recently, the method was substantially improved by research groups of Ross (94) and Perkowski (95). Their approach is based on Ashenhurst-Curtis function decomposition (Ashenhurst 52). Besides the fact that these methods are designed for specific application of switching circuit design and use two-valued variables, they are much more efficient and can be used also with incomplete sets of learning examples.
An important contribution to the use of hierarchical models in machine learning is structured induction, developed by Shapiro (87). His approach is based on a manual decomposition of the problem. For every intermediate concept either a special set of learning examples is used or an expert is consulted to build a corresponding decision tree. In comparison with standard decision tree induction techniques, Shapiro's approach exhibits about the same classification accuracy, but increased transparency and lower complexity of the developed models. Michie (95) emphasizes the important role the structural induction will have in the future developments and presents several real problems that where solved by structured induction.
Decision support is an area where hierarchical models have been used extensively. Most common are the models that use numerical variables and analytical (utility) functions as operators (Mallach 94); a typical example is AHP (Saaty 93). In all cases, the structure of the model is defined manually. However, the acquisition of utility functions is well supported by various methods based on interactive dialogs, graphs, and tables. Within this project we will make use of these methods primarily as a reference to generalize the learning algorithms for (1) continuous variables, (2) different types of operators, and (3) the derivation of hybrid models that include both numerical and symbolical variables.
The members of the Project group have actively participated in the research and development of both machine learning and multi-attribute decision making methods. They have developed several original machine learning methods, such as Assistant (Cestnik et al. 87) and Retis (Karalič 92), and proposed several improvements of existing methods, like m-estimate of probability (Cestnik 90) and an optimal tree-pruning algorithm (Bohanec, Bratko 94). Special emphasis was also payed on real application of these methods in medicine, machine engineering, metallurgy, process control and others (Urbančič et al. 91).
Within the domain of decision support systems we have developed an expert system shell for multi-attribute decision making DEX (Bohanec, Rajkovič 90). The system was successfully applied in Slovenia and abroad in more than 50 realistic decision making problems. Typical application areas were: selection and evaluation of technologies, evaluation of enterprises and business partners, personnel management and project evaluation. For the proposed project, specifically relevant are the approaches for combining qualitative and quantitative decision models (Bohanec et al. 92, 95, Rajkovič et al. 95). In this project, these methods will be upgraded, generalized and supported with appropriate methods.
Biermann, A.W., Fairfield, J., Beres, T.: Signature table systems and learning, IEEE Trans. System, Man, and Cybernetics 12(5), 635-684, 1982.
Bohanec, M., Rajkovič, V.: DEX: An expert system shell for decision support, Sistemica 1(1), 145-157, 1990.
Bohanec, M., Urh, B., Rajkovič, V.: Evaluating options by combined qualitative and quantitative methods, Acta Psychologica 80, 67-89, North-Holland, 1992.
Bohanec, M., Bratko, I.: Trading accuracy for simplicity in decision trees, Machine Learning 15, Kluwer Academic Publishers, 223-250, 1994.
Bohanec, M., Rajkovič, V., Semolič, B., Pogačnik, A.: Knowledge-based portfolio analysis for project evaluation, Information & Management 28, 293-302, Elsevier, 1995.
Cestnik, B., Kononenko, I., Bratko, I.: ASSISTANT 86: A knowledge-elicitation tool for sophisticated users, Progress in Machine Learning (eds. I. Bratko, N. Lavrač), Sigma Press, 1987.
Cestnik, B.: Estimating probabilities: A crucial task in machine learning, Proc. ECAI 90, 1990.
Karalič, A.: Linear regression in regression tree leaves, Proc. ISSEK 92, Bled, 1992.
Mallach, E.G.: Understanding decision support systems and expert systems, Irwin, 1994.
Michalski, R.: Understanding the nature of learning: issues and research directions, Machine Learning: An Artificial Intelligence Approach (eds. R. Michalski, J. Carbonell, T. Mitchell), Kaufmann, 3-25, 1986.
Michie, D.: Problem decomposition and the learning of skills, Lecture Notes in Artificial Intelligence 912 (eds. N. Lavrač, S. Wrobel), Springer, 17-31, 1995.
Muggleton, S.: Inductive Acquisition of Expert Knowledge, Addison-Wesley, 1990.
Muggleton, S., Buntine, W.: Machine invention of first-order predicates by inverting resolution, Machine Learning: An Artificial Intelligence Approach (ed. S. Muggleton), Academic Press, 1992.
Perkowski, M.A., et al.: Unified approach to functional decomposition of switching functions, Technical report, Warsaw University of Technology and Eindhoven University of Technology, 1995.
Ragavan, H., Rendell, L.: Lookahead feature construction for learning hard concepts, Proc. 10th International Machine Learning Conference, Morgan Kaufmann, 252-259, 1993.
Rajkovič, V., Bohanec, M., Leskošek, B., Kapus, V.: Zasnova ekspertnega sistema za usmerjanje šolske mladine v športne panoge, Računalniška analiza medicinskih podatkov CADAM-95 (ed. Lavrač, N.), IJS Scientific Publishing IJS-SP-95-1, 234-244, 1995.
Ross, T.D., Noviskey, M.J., Gadd, D.A., Goldman, J.A.: Pattern theoretic feature extraction and constructive induction, Proc. ML-COLT-94 Workshop on Constructive Induction and Change of Representation, New Brunswick, New Jersey, 1994.
Samuel, A.: Some studies in machine learning using the game of checkers, IBM Journal 3(3), 535-554, 1959.
Saaty, T.L.: Multicriteria decision making: The analytic hierarchy process, RWS Publications, 1993.
Shapiro, A.D.: Structured induction in expert systems, Turing Institute Press with Addison-Wesley, 1987.
Urbančič, T., Kononenko, I., Križman, V.: Review of applications by Ljubljana Artificial Intelligence Laboratories, Technical Report DP-6218, Institut Jožef Stefan, Ljubljana, 1991.
The most important problems with existing methods of this type are: (1) high time complexity, (2) inability to cope with multi-valued attributes, and (3) high sensitivity to incomplete and/or noisy learning examples. In this project we intend to:
All the methods developed in the project will be evaluated on artificial and real-world problems. The latter will specifically include: