Equation discovery - Introduction

What is the task of equation discovery?

Equation discovery is the area of machine learning that develops methods for the automated discovery of quantitative laws, expressed in the form of equations, in collections of measured data. It is closely related to system identification. However, mainstream system identification methods assume that the structure of the model, i.e., the form of the equations, is known; they are concerned only with determining the values of the constant parameters that minimize the discrepancy between measured and simulated data. Equation discovery systems, on the other hand, aim to identify both an adequate structure for the equations and appropriate values of the constant parameters. Here too, the goal is to minimize the discrepancy between measured and simulated data.
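The contrast between the two tasks can be illustrated with a minimal sketch. It is not taken from any of the systems listed below. It assumes synthetic data, a hypothetical handful of one-parameter candidate structures of the form y = c * f(x), and the sum of squared errors as the discrepancy measure.

```python
import math

# Synthetic measurements generated from y = 2 * x**2 (an assumed example law).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x ** 2 for x in xs]

def fit_constant(f, xs, ys):
    """Least-squares c for the fixed structure y = c * f(x); returns (c, error)."""
    fx = [f(x) for x in xs]
    c = sum(a * y for a, y in zip(fx, ys)) / sum(a * a for a in fx)
    error = sum((y - c * a) ** 2 for a, y in zip(fx, ys))
    return c, error

# System identification: the structure y = c * x**2 is given, only c is estimated.
c_known, err_known = fit_constant(lambda x: x ** 2, xs, ys)

# Equation discovery: the structure itself is searched for as well.
# This candidate set is hypothetical and chosen only for illustration.
STRUCTURES = {
    "y = c * x": lambda x: x,
    "y = c * x**2": lambda x: x ** 2,
    "y = c * sqrt(x)": math.sqrt,
    "y = c / x": lambda x: 1.0 / x,
}

def discover(xs, ys):
    """Fit every candidate structure and return the one with the smallest error."""
    fits = {name: fit_constant(f, xs, ys) for name, f in STRUCTURES.items()}
    best = min(fits, key=lambda name: fits[name][1])
    return best, fits[best][0]

best_structure, best_c = discover(xs, ys)
print(best_structure, best_c)
```

Real systems differ mainly in how they define and search the space of structures, which is what the overview below summarizes.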

A very brief overview of equation discovery systems

BACON [Langley et al. 1987] is the pioneer among equation discovery systems. It uses a set of data-driven heuristics for finding regularities (constancies and trends) in data and for formulating hypotheses based on them.
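The idea of trend and constancy heuristics can be sketched as follows. This is a simplified illustration, not BACON's actual implementation: it assumes two variables, noise-free synthetic data of the form P = D**1.5, and a breadth-first loop. The rules it applies are: when two terms rise together, consider their ratio; when one rises as the other falls, consider their product; when a term is constant, report it as a law. The function names are hypothetical.

```python
from itertools import combinations

# Synthetic data obeying P = D**1.5 (an assumed example, noise-free).
D = [1.0, 2.0, 3.0, 4.0, 5.0]
P = [d ** 1.5 for d in D]

def trend(a, b):
    """+1 if b rises as a rises, -1 if b falls as a rises, 0 otherwise."""
    ordered = [bv for _, bv in sorted(zip(a, b))]
    steps = [q - p for p, q in zip(ordered, ordered[1:])]
    if all(s > 0 for s in steps):
        return 1
    if all(s < 0 for s in steps):
        return -1
    return 0

def is_constant(values, tol=1e-9):
    """True if all values lie within a relative tolerance of their mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tol * abs(mean) for v in values)

def find_invariant(D, P, max_rounds=4):
    """Return exponents (i, j) such that D**i * P**j is constant, or None."""
    terms = {(1, 0): D, (0, 1): P}  # exponents of (D, P) -> term values
    for _ in range(max_rounds):
        new_terms = {}
        for (ea, a), (eb, b) in combinations(list(terms.items()), 2):
            direction = trend(a, b)
            if direction == 0:
                continue
            if direction == 1:   # terms rise together: consider the ratio a / b
                exps = (ea[0] - eb[0], ea[1] - eb[1])
                values = [x / y for x, y in zip(a, b)]
            else:                # opposite trends: consider the product a * b
                exps = (ea[0] + eb[0], ea[1] + eb[1])
                values = [x * y for x, y in zip(a, b)]
            if exps in terms or exps in new_terms:
                continue         # this term has already been generated
            if is_constant(values):
                return exps      # a constancy: hypothesize a law
            new_terms[exps] = values
        terms.update(new_terms)
    return None

invariant = find_invariant(D, P)
print(invariant)
```

On this data the loop passes through the intermediate terms D/P and D**2/P before reaching a constant term.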

COPER [Kokar 1986] uses information about the dimension units of the system variables to restrict the space of possible equation structures.
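How dimension units can prune the space of structures is shown in the sketch below. It does not reproduce COPER's algorithm. It assumes a hypothetical pendulum example, candidate structures that are monomials with exponents from a small fixed set, and a simple check that the units of a candidate match the units of the target variable.

```python
from fractions import Fraction
from itertools import product

# Dimension vectors as exponents of (metre, kilogram, second).
# The variables are an assumed pendulum example, not taken from the source.
DIMENSIONS = {
    "length": (1, 0, 0),
    "gravity": (1, 0, -2),
    "mass": (0, 1, 0),
}
TARGET = (0, 0, 1)  # the period, measured in seconds

# Allowed exponents: -1, -1/2, 0, 1/2, 1
EXPONENTS = [Fraction(n, 2) for n in range(-2, 3)]

def dimension_of(exponents):
    """Dimension vector of the monomial prod(variable ** exponent)."""
    return tuple(
        sum(e * DIMENSIONS[v][k] for v, e in zip(DIMENSIONS, exponents))
        for k in range(3)
    )

# Every monomial length**a * gravity**b * mass**c with allowed exponents.
candidates = list(product(EXPONENTS, repeat=len(DIMENSIONS)))

# Keep only the structures whose units match the units of the target.
consistent = [c for c in candidates if dimension_of(c) == TARGET]

print(len(candidates), "candidates,", len(consistent), "dimensionally consistent")
for a, b, c in consistent:
    print(f"period ~ length**({a}) * gravity**({b}) * mass**({c})")
```

In this toy setting the unit check alone removes all but one structure, so only a single constant would remain to be fitted.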

FAHRENHEIT/EF [Langley and Zytkow 1989], [Zembowitz and Zytkow 1992]: EF (Equation Finder) is used as an equation discovery subsystem of the scientific discovery system FAHRENHEIT. It discovers bivariate equations only; the user can specify the set of operators and functions to be used within the equations.

ABACUS [Falkenhainer and Michalski 1990] experiments with different strategies for searching the space of equation structures. It also allows the discovery of piecewise equations, using clustering to identify the boundaries between pieces.

IDS [Nordhausen and Langley 1990]

ARC [Moulet 1992]

E* [Schaffer 1993] discovers bivariate equations using a small set of predefined equation structures.

LAGRANGE [Dzeroski and Todorovski 1995] extends the scope of the equation discovery systems towards the discovery of differential equations.
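One simple way to bring differential equations into scope is to estimate time derivatives numerically and then treat them as ordinary variables when fitting. The sketch below illustrates that idea only and is not LAGRANGE itself. It assumes a synthetic trajectory x(t) = exp(-0.5 t), central differences for the derivative, and a single fixed candidate structure dx/dt = c * x.

```python
import math

# Synthetic trajectory sampled at a fixed step (an assumed example system).
h = 0.1
ts = [i * h for i in range(51)]
xs = [math.exp(-0.5 * t) for t in ts]

# Central-difference estimate of dx/dt at the interior sample points.
dxs = [(xs[i + 1] - xs[i - 1]) / (2 * h) for i in range(1, len(xs) - 1)]
inner = xs[1:-1]

# Least-squares fit of the constant c in the candidate equation dx/dt = c * x.
c = sum(x * d for x, d in zip(inner, dxs)) / sum(x * x for x in inner)

print(f"dx/dt = {c:.4f} * x")
```

The fitted constant comes out close to the -0.5 used to generate the data. With noisy measurements, numerical differentiation amplifies the noise, which is the kind of difficulty that discovery from noisy data has to address.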

GOLDHORN [Krizman et al. 1995] extends LAGRANGE towards discovery from noisy data.

SDS [Washio and Motoda 1997] uses information about the scale types of the dimension units of the system variables to restrict the space of possible equations.

LAGRAMGE [Todorovski and Dzeroski 1997] allows the user to specify the space of possible equations with a context-free grammar.
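To make the role of a grammar concrete, the sketch below enumerates the equation structures generated by a small context-free grammar. The grammar, the depth bound on parse trees, and the function name `expand` are all assumptions made for illustration; this is not LAGRAMGE's input format or search procedure.

```python
# A hypothetical grammar for sums of terms of the form const * variable.
# Keys are nonterminals; every other symbol is treated as a terminal.
GRAMMAR = {
    "E": [["T"], ["E", "+", "T"]],
    "T": [["const", "*", "V"]],
    "V": [["x"], ["y"]],
}

def expand(symbol, depth):
    """Yield every string derivable from symbol with a parse tree of at most the given depth."""
    if symbol not in GRAMMAR:
        yield symbol          # terminal symbols stand for themselves
        return
    if depth == 0:
        return                # depth budget exhausted for a nonterminal
    for production in GRAMMAR[symbol]:
        partials = [""]
        for part in production:
            partials = [
                (p + " " + e) if p else e
                for p in partials
                for e in expand(part, depth - 1)
            ]
        yield from partials

equations = list(expand("E", 4))
for eq in equations:
    print("z =", eq)
```

Each generated structure still contains `const` placeholders, so a discovery system would fit those constants against the data and rank the structures by their error. Changing the grammar changes the search space without touching the search code, which is the appeal of stating the bias declaratively.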

Bibliography

Dzeroski, S. and Todorovski, L. (1995) Discovering dynamics: From inductive logic programming to machine discovery. Journal of Intelligent Information Systems, 4: 89-108.

Falkenhainer, B. and Michalski, R. (1990) Integrating quantitative and qualitative discovery in the ABACUS system. In Kodratoff, Y. and Michalski, R. (editors) Machine Learning: An Artificial Intelligence Approach. Morgan Kaufmann, San Mateo, CA.

Kokar, M.M. (1986) Determining arguments of invariant functional descriptions. Machine Learning, 1(4): 403-422.

Krizman, V., Dzeroski, S. and Kompare, B. (1995) Discovering dynamics from measured data. Electrotechnical Review, 62: 191-198.

Langley, P., Simon, H. and Bradshaw, G. (1987) Heuristics for empirical discovery. In Bolc, L. (editor) Computational Models of Learning. Springer, Berlin.

Langley, P. and Zytkow, J. (1989) Data-driven approaches to empirical discovery. Artificial Intelligence, 40: 283-312.

Moulet, M. (1992) Iterative model construction with regression.

Nordhausen, B. and Langley, P. (1990) A robust approach to numeric discovery. In Proceedings of the Seventh International Conference on Machine Learning, pages 411-418.

Schaffer, C. (1993) Bivariate scientific function finding in a sampled, real-data testbed. Machine Learning, 12: 167-183.

Todorovski, L. and Dzeroski, S. (1997) Declarative bias in equation discovery. In Proceedings of the Fourteenth International Conference on Machine Learning, pages 376-384. Morgan Kaufmann, San Mateo, CA.

Washio, T. and Motoda, H. (1997) Discovering admissible models of complex systems based on scale-types and identity constraints. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, pages 810-817. Morgan Kaufmann, San Mateo, CA.

Zembowitz, R. and Zytkow, J. (1992) Discovery of equations: experimental evaluation of convergence. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 101-117. Morgan Kaufmann, San Mateo, CA.

Ljupco Todorovski

Created: November, 2000
Updated: May, 2001