On Construction of Inductive Modeling Ontology as a Metamodel of the Subject Field

Halyna Pidnebesna, Volodymyr Stepashko

International Research and Training Centre for Information Technologies and Systems of the NAS and MES of Ukraine, Glushkov ave., 40, Kyiv, 03680, Ukraine, e-mails: stepashko@irtc.org.ua, pidnebesna@ukr.net

ACIT 2018, June 1-3, 2018, Ceske Budejovice, Czech Republic

Abstract: The paper considers the issue of constructing an ontology for the GMDH-based inductive modeling domain. It examines the main components of GMDH algorithms in terms of their synthesis for designing the domain ontology used to construct inductive modeling tools. Such an approach significantly expands the opportunities for constructing GMDH-based tools for building forecast models of complex processes of different nature.

Keywords: inductive modeling, GMDH, metamodel, domain ontology, model class, structure generator, selection criterion, computation tool.

I. INTRODUCTION

With the development of technologies and the Internet, and with an extremely large and constantly growing number of information resources, a need has arisen to develop tools for automatic processing and analysis of data of different nature that take into account the semantics (content) of this information. Some intelligent knowledge-based tools have been rapidly developed.

The "intelligence" of computer systems (systems with artificial intelligence properties) may be understood as the ability to find ways to solve any task automatically, without (or with minimum) human intervention. The necessary features of such systems are adaptability, the ability to take into account results obtained earlier, obtaining problem solutions by analogy with other cases, and building valid (effective) algorithms using the knowledge the system contains. That is, an intelligent computer system should be able to simulate the process of constructing an algorithm for solving the current problem, much as a human would reason.

In the design of computer modeling systems with artificial intelligence properties, a promising direction of research is the use of an ontological approach to knowledge representation. This expands computer capabilities, increases their "intelligence", and also simplifies the process of developing and modifying software products that solve specific tasks of constructing models and forecasts.

The advantage of an ontological approach is that an ontology defines a conceptual structured environment in which the process of constructing a model of an object occurs [1]. This environment should be independent of the choice of a particular simulation object.

The automation task for intelligent computer modeling systems may be interpreted as modeling of the modeling process, which may be called metamodeling. A metamodel is a model that describes the structure and operating principles of other models.

The paper considers ways to increase the efficiency of developing inductive modeling software.

The Group Method of Data Handling (GMDH) [2] is one of the most effective inductive modeling methods having intelligent properties [3]. It is a method of building models with automatic selection of structure and parameters on the basis of a short data sample with incomplete and uncertain input information, in order to identify unknown relationships of the object or process under study.

In this paper, an approach is considered for the "intellectualization" of inductive modeling software tools by applying an ontological approach to representing the knowledge of the subject field in order to design a knowledge base, computing tools, and an intelligent interface.

The aim of this research is to build an ontology of the inductive modeling subject area. For this purpose, the modeling process is analyzed and its main stages are characterized. The results of the analysis and structuring of this area are presented. The basic components and their characteristics are determined, and the main principles of GMDH ontology construction are outlined.

II. STRUCTURING KNOWLEDGE OF INDUCTIVE MODELING DOMAIN

Modeling is a process of studying a real object in which only some of its specific characteristics, its description, and a conditional image are used. We consider mathematical modeling, that is, studying the properties of an object by constructing and analyzing its mathematical model.

There are two main approaches to constructing mathematical models of objects: the theory-driven (or deductive) and the data-driven (or inductive) ones (Fig. 1).

Fig. 1 Mathematical modelling approaches

Inductive modeling is the construction of a model based on the analysis and generalization of statistical data about the object, obtained through observations or experiments. Methods in the field include the following algorithms for finding hidden patterns in data: GMDH, discovery of association rules, sequence analysis, classification, regression, random forest, neural networks, support vector machine (SVM), genetic algorithms, least absolute shrinkage and selection operator (LASSO), etc.

The inductive modeling algorithms solve a range of tasks:
• building mathematical models of objects/processes;
• forecasting processes specified by time series;
• construction of classification rules (supervised learning) for attributing an object to a given class;
• clustering (unsupervised learning or self-training: identification of effective features, forms and rules of distinction); in GMDH this problem is called "Objective Computer Clusterization" (OCC);
• objective system analysis (OSA), when one needs to find out which variables among the measured ones are independent (inputs), dependent (outputs) and irrelevant (uninformative) for building an appropriate model.

Inductive modeling based on statistical data is a process of sequential decision making consisting of certain successive stages (Fig. 2). All methods of inductive modeling have standard components. This can be the basis of the metamodel of inductive modeling.

Fig. 2 Components of inductive modeling process (fragment of the metamodel)

A metamodel provides the logical level of the domain and is interpreted dynamically at the application level. This adds flexibility to the system, since the domain logic can be changed without modifying the code. To allow or prohibit a particular type of communication at the logical level, it suffices to specify it in the formal terms of the metamodel.

In fact, the metamodel may be defined as a high level ontology, in terms of concepts of solution methods, key stages and constraints (Fig. 3). The ontological model of the subject domain of the lower level describes the algorithmic components of each particular modeling method in more detail. To solve a practical task, an ontological model of the task is used, which has its own parameters, specific characteristics and areas of admissible values.

Fig. 3 Hierarchy of GMDH domain ontologies

Any real problem can be characterized by the following main stages of the process of its solution: preparation; preliminary analysis; formulation of the task; solving the task; analysis of results; their application. This upper level of structuring is supplemented by more detailed classifiers of subsequent hierarchical levels, depending on the specificity of the problems under consideration.

The preparation of the task consists in determining the type of task (modeling of statics, time series or dynamics), the modeling goals (approximation, interpolation, extrapolation, prediction, search for regularity), experiment planning (if the simulated system allows experimentation), obtaining a set of data (as a result of an active or passive experiment), and its preliminary processing and organization of storage in the relevant database [4].

As a result of the field structuring, the principles of formation of algorithmic modules for solving a class of specific problems are determined. Depending on the type of task, adequate methods for solving it are selected. Each of the available methods corresponds to certain characteristics. According to them, it is possible to choose (possibly automatically) a better method for a specific task. To do this, each method in the set of solution methods should have some weight of importance for assessing the adequacy of choosing this particular method at this stage. By choosing the appropriate (for a particular case) method at each stage of the modeling process, we get an algorithm (possibly the best) for solving a specific problem as a result of sequential synthesis in a structured set of possible options.

III. METAMODEL AS THE HIGH LEVEL ONTOLOGY

To significantly expand the scope of computer modeling systems, they must be independent of the particular simulation object and of the means of its implementation. That means there should be a high level of abstraction of the subject area.

Raising the level of abstraction is difficult, and developers are forced to extract part of the information model for a given application, which means fixing this part of the model. As a result, the flexibility of tuning to the subject area is lost.

The solution to this problem is seen in the introduction of metamodels. Metamodels reduce uncertainty in the description of the subject area and make it possible to get rid of rigid fixation on the task specificity.

First of all, the metamodel helps to determine the structure of the process and allows developers to state specific requirements for the means of process automation. The metamodel defines the "design details" from which a modeling system may subsequently be created.

Metamodels are closely related to ontologies, because they are used to structure information and to analyze the relationships between concepts. An ontology divides the variables needed for some set of computations and establishes the relationships between them [5]. Metamodeling can be considered as an explicit description (design and rules) of how a problem-oriented model is constructed. As a rule, metamodels are a strict set of rules. A real metamodel is an ontology, but not all ontologies are represented explicitly as metamodels.

The internal structure of an intelligent computer system is a reflection of certain knowledge that needs to be expressed explicitly, in a formal way. The use of ontologies can facilitate the description of the task of designing complex systems from components and the implementation of a program that makes such a configuration independent of the product and of the components themselves, which makes reuse possible.

An ontology is an exact specification of some field that contains a glossary of terms and a set of subject area links describing the relations between these terms. It is actually a hierarchical conceptual skeleton of the subject area.

A formal ontology model O is an ordered triplet [6]

O = <T, R, F>,

where:
T is a finite set of terms of the subject area described by the ontology O;
R is a finite set of relations between the given terms;
F is a finite set of interpretation functions given on the terms and/or relations of the ontology O.

The purpose of creating and using ontologies is to support activities of accumulating, distributing and reusing knowledge in a particular subject area.

An ontology allows one to specify a complex structure that can contain different types of data, and provides a simple understanding of the presentation of structured knowledge and relatively easy updating.

In the general case, the ontological model of the presentation contains a description of the situation/task (data, the purpose of the modeling) and the appropriate solution (an algorithm for obtaining an adequate model). In most cases, in order to obtain an algorithm for solving a problem, a parametric representation is sufficient, in the form of a set of corresponding parameters given by the ontology, with specific values. In what follows, there is an example of an ontological representation of knowledge of the domain of inductive modeling.

IV. GMDH AS A METHOD OF MODEL BUILDING

The Group Method of Data Handling (GMDH) is one of the most effective methods of modeling from statistical data; it fully implements the essence of the inductive approach in modeling and has intelligent properties.

GMDH is a method for constructing models with automatic determination of model structure and parameters from a data sample under conditions of incompleteness and uncertainty of input information, in order to detect an unknown operation rule of an object or process under study.

GMDH is characterized by the application of the principles of automatic model generation with inductive complication of variants, non-definitive decisions, and sequential selection according to external criteria for constructing models of optimal complexity. For comparison and selection of the best models, external criteria are used which are based on splitting the sample of input data into two or more parts. Estimation of parameters and quality assessment of models are carried out on different subsamples, which makes it possible to automatically take into account different types of a priori uncertainty when constructing a model. These principles can be considered as metamodel characteristics for the process of building a mathematical model of an object (process).

V. ONTOLOGICAL MODEL OF GMDH-BASED INDUCTIVE MODELING PROCESS

To structure knowledge in a domain, one needs to address the following problems:
• define the main stages for solving typical problems in a specific domain to obtain the basis for constructing the metamodel of the inductive modeling process;
• identify the main methods for effectively solving these problems to form the basis of the domain ontology;
• generalize the experience of applying these methods to develop relevant intelligent software tools.

Obviously, each of these problems has a complex multilevel structure. The results of the analysis of these problems are used to create the ontology of the subject field.

GMDH, as one of the methods of inductive modeling, also has a standard sequence of stages for solving a specific problem, as discussed in [7].

Ontology development is an integrated, sequential and iterative process. At the top level, the ontology contains a list of concepts and their general properties. In fact, it is a thesaurus. A dictionary or a list of concepts is collected as a result of structuring the knowledge domain. The next important step is to rank and organize the terms and build a hierarchy.

In [8] a fragment of the GMDH thesaurus was given, and general principles and main stages were described. The ideas given in [8] are substantially generalized in this paper.

The next step is a more detailed study of GMDH algorithms and a definition of the ontology structure and of the characteristics of the stages of choosing a model class, structure generators, and model evaluation criteria. The result is the construction of the corresponding ontological models.

The ontology of classes of models CM (Fig. 4) is characterized by such key parameters as the number of input and output variables, and the number of past values (latencies) taken into account for input and output variables, respectively. Depending on the specific values of these parameters, one can obtain most of the variants of linear models that are used in practice to describe static objects, time series, and dynamic objects and processes.

Fig. 4 The ontology of model classes

The model generators ontology GS (Fig. 5) contains two main types of GMDH structure generators: sorting-out and iterative ones. In turn, typical sorting-out algorithms for forming different model structures are COMBI (exhaustive search) and multistage MULTI (directed search) [9]. The two main architectures of iterative structure generators are multilayer MIA and relaxational RIA.

In recent years, new kinds of GMDH algorithms have been developed: the generalized iterative algorithm GIA [10] and the hybrid combinatorial-genetic algorithm Combi-GA [11]. So, taking into account current trends, the ontology of structure generators may be presented in the form of Fig. 5.

Fig. 5 The ontology of model generators

The ontology of model criteria CR (Fig. 6) may be defined by key parameters that describe the penalty functions for the model complexity, the model quality, and estimations of the unknown variance. They characterize a set CR of criteria which are applied in practice for tasks of structural identification of models of optimal complexity.

Fig. 6 Ontology of model criteria

In [8] an example is given of an ontological model of the sorting-out GMDH algorithm COMBI. Ontological models for other GMDH algorithms can be defined in the same way. For instance, let us consider the following set of parameters:
• an element of the model classes set CM is k_i^* = <linear regression models>,
• an element of the structure generators set GS is g_i^* = <sorting-out algorithm :: directed search>,
• an element of the parameter estimators set EP is p_i^* = <least-squares method>,
• an element of the selection criteria set CR is r_i^* = <regularity criterion>.

This set of parameters of the inductive modeling ontology defines the sorting-out GMDH algorithm MULTI [9].

When the element of the structure generators set GS is g_i^* = <iterative algorithm :: relaxational>, it defines the relaxational iterative GMDH algorithm RIA.

When the element of the structure generators set GS is g_i^* = <iterative algorithm :: multilayered>, it defines the multilayered iterative GMDH algorithm MIA.

These are examples of ontology models as part of the GMDH-based domain ontology. Preliminary analysis of the subject field enables the generalization of many different methods by identifying the key parameters. The ontology allows defining both general rules for constructing the algorithm and specific parameters when building an application.

VI. CONCLUSION

The paper considers a way to generalize inductive modeling software tools by applying an ontological approach, with the ontology serving as a metamodel that represents the knowledge of the GMDH-based domain. This enables substantial simplification of developing specifications and software tools for solving various applied tasks.

The paper presents the results of structuring the inductive modeling domain. Examples of the main components of the modeling process are provided, with their basic characteristics defined for building ontologies. Some fragments of the ontology constructed using Protégé are presented as significant modules of the domain metamodel.

REFERENCES

[1] T. Gruber, "Toward principles for the design of ontologies used for knowledge sharing," International Journal of Human-Computer Studies, 43(5-6), 1995, pp. 907-928.
[2] H.R. Madala, A.G. Ivakhnenko, Inductive Learning Algorithms for Complex Systems Modeling. New York: CRC Press, 1994, 384 p.
[3] V.S. Stepashko, Conceptual fundamentals of intellectual modeling. Control Systems and Computers. Kyiv: IRTC ITS, #4, pp. 3-15. (in Russian)
[4] V.S. Stepashko, On the problem of structuring the expert's knowledge in the field of modeling by empirical data. ISSN 0454-9910. Kibernetika i vychisl. tekhnika, 1991, Issue 92, pp. 80-83. (in Russian)
[5] Metamodeling, [cited 2017 Oct. 16]. Available from: https://en.wikipedia.org/wiki/Metamodeling.
[6] T.A. Gavrilova, V.P. Khoroshevsky, Knowledge Base Intelligent Systems, SPb.: Piter, 2000, 384 p. (in Russian)
[7] V. Stepashko, G. Pidnebesna, "Generalized Multifunctional Modules Concept for Construction of Inductive Modeling Tools," Proc. of the 4th Int. Conf. on Inductive Modelling ICIM-2013, Kyiv, Ukraine, Kyiv: IRTC ITS NASU, 2013, pp. 225-230.
[8] H. Pidnebesna, On Constructing Ontology of the GMDH-based Inductive Modeling Domain, Proc. of 8th International Workshop on Inductive Modeling IWIM 2017, Lviv, Ukraine, 2017, pp. 511-513.
[9] V.S. Stepashko, "A Finite Selection Procedure for Pruning an Exhaustive Search of Models," Soviet Automatic Control, 1983, vol. 16, no. 4, pp. 84-88.
[10] V. Stepashko, O. Bulgakova, V. Zosimov, Construction and Research of the Generalized Iterative GMDH Algorithm with Active Neurons. In: Advances in Intelligent Systems and Computing II. CSIT 2017 / Shakhovska N., Stepashko V. (eds). Advances in Intelligent Systems and Computing, vol 689. Springer, Cham, 2018, pp. 492-510.
[11] V. Stepashko, O. Moroz, "Hybrid Searching GMDH-GA Algorithm for Solving Inductive Modeling Tasks," IEEE Int. Conf. on Data Stream Mining & Processing, Lviv, Ukraine, pp. 350-355, August 2016.
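
As a supplementary illustration of the parameter sets discussed in Section V, the following minimal Python sketch shows how a choice of one element from each of the sets CM, GS, EP and CR could be resolved to a named GMDH algorithm. It is only a sketch under stated assumptions: the string labels, the `ParameterSet` class and the `resolve_algorithm` function are hypothetical names introduced here and are not taken from the authors' Protégé ontology. The COMBI entry additionally assumes the same model class, estimator and criterion as the MULTI example, which the text does not state explicitly.

```python
from dataclasses import dataclass

# Component sets named after Section V. The string labels are illustrative
# assumptions, not identifiers from the authors' ontology.
CM = {"linear regression models"}
GS = {
    "sorting-out::exhaustive search",
    "sorting-out::directed search",
    "iterative::multilayered",
    "iterative::relaxational",
}
EP = {"least-squares method"}
CR = {"regularity criterion"}

# Algorithm names that the text associates with each structure generator.
ALGORITHM_BY_GENERATOR = {
    "sorting-out::exhaustive search": "COMBI",  # assumed to share the other parameters
    "sorting-out::directed search": "MULTI",
    "iterative::multilayered": "MIA",
    "iterative::relaxational": "RIA",
}


@dataclass(frozen=True)
class ParameterSet:
    """One choice (k*, g*, p*, r*) from the four component sets."""
    model_class: str
    generator: str
    estimator: str
    criterion: str


def resolve_algorithm(params: ParameterSet) -> str:
    """Return the GMDH algorithm name defined by a parameter set."""
    checks = [
        (params.model_class, CM, "model class"),
        (params.generator, GS, "structure generator"),
        (params.estimator, EP, "parameter estimator"),
        (params.criterion, CR, "selection criterion"),
    ]
    # Reject any value that is not a term of the corresponding set.
    for value, allowed, label in checks:
        if value not in allowed:
            raise ValueError(f"unknown {label}: {value!r}")
    return ALGORITHM_BY_GENERATOR[params.generator]


multi = ParameterSet(
    model_class="linear regression models",
    generator="sorting-out::directed search",
    estimator="least-squares method",
    criterion="regularity criterion",
)
print(resolve_algorithm(multi))  # MULTI
```

Replacing only the `generator` field with `"iterative::relaxational"` or `"iterative::multilayered"` yields RIA or MIA respectively, mirroring the way the text obtains different algorithms by changing a single ontology parameter.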