System of Ontologies for Data Processing Applications Based on Implementation of Data Mining Techniques

Alexander Vodyaho1, Nataly Zhukova2

1Saint-Petersburg State Electrotechnical University, Saint Petersburg, Russia
2Saint-Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, Saint Petersburg, Russia
{aivodyaho,nazhukova}@mail.ru

Abstract. The paper describes a system of ontologies developed for applications oriented on solving problems of situation recognition and assessment based on the results of data processing and analysis. The main attention is focused on the problems of processing measurements of the parameters of various objects represented in the form of time series. The considered applications process data using knowledge extracted from historical data with the help of Data Mining techniques. Such applications are highly knowledge centric, and their core element is a knowledge base that is represented as a system of ontologies. The proposed system of ontologies is a set of upper level ontologies for which techniques of adaptation for solving applied tasks in one or several related subject domains are developed.

Keywords: knowledge representation, data analysis, data fusion, measurements processing, situation recognition and assessment.

1 Introduction

Nowadays multiple problems in various subject domains are required to be solved at the level of situations [1, 2]. Results obtained at this level are much easier for an end user to interpret than results represented at lower levels of information generalization. Solving problems at the level of situations involves such problems as recognition of situations, formal description of situations, analysis of situations, their estimation, assessment, prediction and awareness. The main sources of information about situations are results of measurements received from different types of instruments that measure parameters of technical and / or environmental objects.
Real systems have to process huge volumes of information, including information of bad quality. The majority of real life problems require that measurements are processed in real time or in a mode close to real time, which considerably increases the complexity of the problems. The problems can be solved with the desired quality and in limited time only using knowledge-oriented technologies. These intelligent technologies are based on the application of data mining algorithms along with other means of artificial intelligence such as expert systems and inference machines. A set of basic solutions for developing intelligent technologies for measurements processing (IMPT) and examples of their implementation are proposed in [3, 4, 5, 6].

The intelligent measurements processing technologies are described in general form using the Web Ontology Language (OWL). When new measurements are received, an appropriate technology is selected and detailed using an a priori defined set of production rules. The rules are two-part structures that use first order logic for reasoning over knowledge representation [7]. The detailed technologies are processes described in the business process modeling language (BPML); they can be executed using standard engines. Execution of the processes requires that the input data, information and knowledge are represented using standard formats. It is reasonable to use the same standards for representing the results of measurements processing.

For formal description of data, information and knowledge about initial and processed measurements a hierarchy of information models has been developed [8]. In [6] a set of general classifiers for technologies, methods, algorithms and procedures for measurements processing is proposed. To use the intelligent technologies in end user applications it is necessary to implement the models and to integrate them into the information models of the applications.
For implementing the models it is proposed to use the ontological approach because, first, it has in fact become a standard for describing models of subject domains and, second, the information models of the applications are commonly described using ontologies.

In the paper a structure of the system of ontologies built according to the models for measurements processing is proposed. The main data mining techniques and models required for measurements processing are enumerated in the second section. In the third section the developed system of ontologies is described. An example of the adaptation of the ontologies for the subject domain of telemetric information (TMI) processing is given in the fourth and fifth sections.

2 Models and techniques for measurements processing and analysis

The actual standard of data and information processing and analysis is defined by the JDL model [9]. The JDL model is a general functional model of data and information fusion. The model has five levels. The first four are the signal level, the object level, the situation level and the level of threats; the highest, fifth level is the level of decision making support. Measurements processing and analysis includes three steps: measurements harmonization, integration and fusion. Optionally, measurements exploration can be executed as a fourth step. For each level of the models, the functions and the processes of the level are defined. The detailed descriptions of the models are given in [10], and the technologies of data harmonization, integration and fusion that implement the models can be found in [11].

Input and output parameters of the levels of the functional models are represented using three specialized information models for the description of different types of initial measurements and of information and knowledge about them: a model of time series of measurements, a model of separate measurements and a combined model of different types of measurements. The description of each model is given in [3].
Processing of measurements at each level according to the developed technologies assumes application of an a priori defined set of intelligent technologies or of separate statistical and data mining methods and algorithms adapted for solving tasks of measurements processing.

The set of intelligent technologies used for measurements harmonization is oriented on processing and analysis of initial binary data streams and of the measurements, represented in the form of single values or time series, that are extracted from the streams. Processing and analysis of initial data streams assumes application of technologies for identification of the structures of the streams and estimation of the quality of the received data. Extracted measurements are transformed into standard formats and described in terms of the dictionary of the subject domain. The harmonization technology uses methods for transformation of measurements into different formats, methods based on computing correlation functions, methods based on statistical laws of linguistic distribution, and methods for building formalized descriptions of the initial data streams and measurements.

Intelligent technologies oriented on measurements integration include two key technologies: a technology for measurements preprocessing and a technology for preparing measurements for solving applied tasks. The first technology is implemented using algorithms for denoising measurements, removing single and group outliers, filling gaps and removing duplicate values, as well as specialized procedures developed for different types of measurement instruments. The second technology uses methods for estimating compliance of the measurements with the requirements of the end user tasks and methods for computing various features of measurements and characteristics of the analyzed objects.
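The preprocessing steps named above (removing duplicate values, removing single outliers, filling gaps) can be illustrated with a minimal sketch. It assumes a time series stored as (timestamp, value) pairs in which None marks a gap. The function names, the outlier rule based on the median absolute deviation and the threshold 3.5 are illustrative assumptions; they are not the specialized procedures referred to in the text.

```python
from statistics import median


def drop_duplicates(samples):
    """Keep the first sample for every timestamp (hypothetical duplicate rule)."""
    seen, result = set(), []
    for t, v in samples:
        if t not in seen:
            seen.add(t)
            result.append((t, v))
    return result


def mask_outliers(samples, threshold=3.5):
    """Replace single outliers with None using the median absolute deviation."""
    values = [v for _, v in samples if v is not None]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(samples)  # no spread, nothing can be flagged
    return [
        (t, None if v is not None and 0.6745 * abs(v - med) / mad > threshold else v)
        for t, v in samples
    ]


def fill_gaps(samples):
    """Fill None values by linear interpolation between the nearest known neighbours."""
    known = [(t, v) for t, v in samples if v is not None]
    result = []
    for t, v in samples:
        if v is None:
            left = max((k for k in known if k[0] < t), default=None)
            right = min((k for k in known if k[0] > t), default=None)
            if left and right:
                v = left[1] + (right[1] - left[1]) * (t - left[0]) / (right[0] - left[0])
            elif left or right:
                v = (left or right)[1]  # gap at the edge: repeat the nearest value
        result.append((t, v))
    return result


def preprocess(samples):
    """Sort by time, then remove duplicates, mask outliers and fill gaps."""
    ordered = sorted(samples, key=lambda s: s[0])
    return fill_gaps(mask_outliers(drop_duplicates(ordered)))


raw = [(0, 1.0), (1, 1.1), (1, 1.1), (2, None), (3, 1.3), (4, 50.0), (5, 1.5), (6, 1.4)]
clean = preprocess(raw)
```

In this sketch a detected outlier is turned into a gap and then interpolated, so the order of the steps matters. Denoising and the removal of group outliers are not covered.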
Technologies of data fusion include technologies for extracting information and knowledge from initial measurements, for revealing dependencies in the behavior of the parameters of the measured objects, for grouping measurements, for building grids on the base of separate measurements and for solving separate highly complicated computational tasks. The technology of extracting information and knowledge from measurements is based on algorithms of classification, cluster analysis and segmentation. The technology of revealing dependencies applies algorithms of association mining and of building temporal patterns. The technology of measurements grouping is oriented on identifying groups of similar measurements and uses methods of cluster analysis. For the identified groups, classes and association rules are defined. The technology of building grids is used to build both regular and non-regular hierarchical grids with various levels of detail. The list of the computational tasks can include various tasks that are solved at the level of situations or oriented on decision making support.

The list of technologies and methods given above is intended to show the multiplicity of the directions in which data mining techniques are applied for processing measurements. The detailed description of each technology can be found in [6]. The data, information and knowledge required to execute the methods and the algorithms directly affect the structure of the information models of measurements and of the results of their processing and, consequently, the structure of the system of ontologies for measurements processing.

3 A system of ontologies for measurements processing

The proposed interconnected ontologies are intended to store and to provide data, information and knowledge about measurements and the results of their processing. They are developed according to [12] and form the core of the system of ontologies of the subject domain of measurements processing.
The system includes 3 main groups of ontologies: ontologies that contain information and knowledge about measurements; ontologies that describe technologies, methods, algorithms and procedures for measurements processing and analysis; and ontologies for representing information and knowledge about objects and situations using measurements of the parameters of objects.

The first group contains the ontologies of time series, of time series segments, of time series features, of time series formal descriptions, and of the criteria for estimation of the initial measurements and of the results of their processing. The second group includes ontologies that provide information and knowledge about technologies of measurements processing and about the applied methods, algorithms and procedures, including semantic descriptions of their input and output parameters, conditions of their application, the criteria for estimating results, the history of the application of the methods as well as other parameters. Ontologies of objects contain information about the structures of objects, their life cycles, functionality, possible interaction, defined regular states and faults. Ontologies of situations define the possible types of situations and provide extended formalized descriptions of situations and of the objects involved in the situations.

Different kinds of external ontologies that are required for measurements processing or that contain information about related subject domains can be used, for example, an ontology of data providers or an ontology of statistical distributions. For adaptation to applied subject domains the system can be extended with specialized ontologies. The set of relations defined for the ontologies is given in Fig. 1.
[Figure 1: diagram of the three groups of ontologies (information and knowledge about measurements; information and knowledge about technologies, methods and algorithms; information and knowledge about objects and situations), the ontologies of the related and applied subject domains, and the relations between them.]
Fig 1. Relations defined for the system of ontologies

A. Description of the ontology of time series. The ontology of time series is intended to provide information about the different types of time series that can be processed. Types are formed according to the behavior of time series and consequently define the groups of algorithms that one can use for processing time series. The behavior of time series is described using five base features.

Feature 1. According to the types of the parameters of objects, 3 types of time series of measurements can be defined: functional, signal and constant. Functional time series are represented with continuous functions. For signal time series stepwise behavior is typical. Constant time series do not change in time.

Feature 2. Depending on the dynamics of changes, functional time series are divided into slow changing time series and fast changing time series.
The first type of time series is characterized by a frequency spectrum in the interval from 0 up to 20-50 Hz, the second type by a spectrum up to 2-3 kHz or even more.

Feature 3. Depending on behavior, functional time series can be stationary, non-stationary or piece-wise stationary. The majority of time series are non-stationary, but they contain comparatively long stationary segments.

Feature 4. For slow changing time series the existence of gaps in the first and the second derivatives is considered as a feature.

Feature 5. For functional time series the possibility of describing them using parametric models is considered. For non-stationary time series a set of parametric models is built for each of the stationary segments. To select an appropriate model, the models are matched using the least squares method or the method of maximum likelihood estimation.

To define the type of a time series, a set of various features is computed for it and classifiers of the time series types are used. The classifiers can be built on the base of historical data using algorithms for building decision trees [13].

B. Description of the ontology of time series segments. Segments are defined for piece-wise stationary and non-stationary time series. The ontology contains information about the possible types of segments that can be observed in a time series. For defining types of segments 2 approaches are proposed. The first approach is based on using an a priori defined set of typical segments that are described in the ontology. To define the type of a segment, similar segments are found in the data base. The data base contains segments that have constant, linear increasing / decreasing, convexly / concavely increasing / decreasing behavior. The data base can be extended with segments that describe specialized behavior of time series typical for the applied subject domain. Specialized segments can be defined by experts or revealed from the historical data.
The second approach assumes that a set of features is computed for the analyzed segment. The computed features fall into several groups: features that reflect the general behavior of the segment, features that describe the segment without taking into account its local peculiarities, and features that are focused on describing all tiny peculiarities of the segment. To define the methods and algorithms for computing features, ontologies of methods are used.

C. Description of the ontology of time series features. The ontology is intended to define features for describing stationary, piece-wise stationary and non-stationary functional time series and segments of time series. The sets of features computed for other types of time series are fixed. The features can be grouped according to the time required for computing them, according to the domain of the time series representation (time, frequency, time-frequency or spatio-temporal domain) and according to the information density of the features for the solved task or for the applied subject domain.

The first group of features contains statistical features (median, mode, range, rank, standard deviation, coefficient of variation, moments including mean, variance, skewness, kurtosis), measurements frequency, features of the behavior of the curve that corresponds to the time series in the time domain (convexity / concavity of the curve, variability of the curve, the error of the piece-wise constant / piece-wise linear approximation, the error of the approximation using polynomials of the second and higher degrees, values of the characteristic points, the curvature), entropy, and variability of the first derivative. The considered list contains commonly used features; it can be extended or modified.
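Several of the statistical features of the first group can be computed with a few lines of code. The sketch below is an illustration only: the function name and the dictionary keys are hypothetical, the moments are taken in population form, and the "variability of the first derivative" is approximated here by the standard deviation of the first differences, which is an assumption rather than a definition taken from the paper.

```python
from statistics import mean, median, pstdev


def statistical_features(values):
    """Compute a few first-group features of a time series or of a segment."""
    n = len(values)
    m = mean(values)
    sd = pstdev(values)
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    # First differences stand in for the first derivative (assumption).
    diffs = [b - a for a, b in zip(values, values[1:])]
    return {
        "mean": m,
        "median": median(values),
        "range": max(values) - min(values),
        "std": sd,
        "variance": m2,
        "coefficient_of_variation": sd / m if m != 0 else None,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "first_derivative_variability": pstdev(diffs) if len(diffs) > 1 else 0.0,
    }


features = statistical_features([1, 2, 3, 4, 5])
```

For a strictly linear segment such as the one used here, the skewness and the variability of the first differences are both zero, which is one simple way to tell a linear segment from a convex or concave one.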
The second group includes feature that consider time series as stochastic processes, in particular, one-dimensional and multi-dimensional distribu- tion functions, one-dimensional and multi-dimensional probability density of the so- phisticated processes, the distributions of the probabilities of the sophisticated discrete variables, spectral density. The list of features of the third group that are computed for both initial and transformed time series is given in table 1. Table 1. Extended set of time series features Transformation types Computed features initial measurements; ranging of values error of a time series description using a constant of initial measurements; computation of / linear / quadratic function for a time series derivative using the finite difference approximation method; computing of upper and lower envelopes computation of variation of upper and deviation from zero lower envelopes of a time series interpolation using cubic splines error of interpolation transformation approximation using a defined function; error of approximation transformation using computation of a curve length power / exponential / logarithmic / user function computation of a curve complexity local complexity, global complexity and weighted complexity computation of a curve variability variability indices computation of the characteristic points number of minimums, maximums, intersections of a curve with the defined level of the values computation of a curve curvature minimum, maximum and median of a curvature computation of area of a figure that is value of an area limited by the curve and the line that connects the edge points [14] computation of the first component error of a time series description using a constant using the method of principle / linear / quadratic function for a time series components [15] approximation The alternative approach for building the ontology of the time series features is pro- posed in [16]. 
It is based on computing linear, non-linear and other features. To define linear features, measures based on computing linear correlation, frequency parameters of the time series and autoregressive models are used. The nonlinear features comprise 19 features. Defining measures for these features assumes computation of nonlinear correlation and of time series dimension and complexity, and building nonlinear models of time series.

D. Ontology of time series formal descriptions. The ontology is used for building formal descriptions of stationary, piece-wise stationary and non-stationary functional time series. Descriptions are built according to the computed features of the time series. The time series can be described using adaptive and non-adaptive approaches [17]. The adaptive approach assumes computing coefficients of piece-wise constant and piece-wise linear approximations and coefficients of singular decomposition, and building symbolic representations of time series. To build non-adaptive descriptions one can use such features as coefficients of wavelet transformations, coefficients of time series spectral representation, and results of piece-wise aggregate approximation. Depending on the complexity of the time series, one or several descriptions can be built.

E. Description of the ontology of criteria for estimation of initial measurements and of the results of their processing. In the ontology 3 groups of criteria for initial measurements are considered. The first group allows one to estimate measurements using knowledge about the object / environmental area whose parameters are measured, the second group using results of matching new data with historical data, and the third group using specialized procedures selected according to the types of the processed measurements and the applied methods. The criteria of the first group are usually defined by experts and / or producers of the measurement instruments.
They are represented as a set of features for which admissible intervals of the measured values are given. The second group of criteria is based on computing distances between the analyzed measurements or their features and measurements that were acquired earlier in similar conditions. The third group includes criteria that estimate separate measurements and sets of measurements, separate time series and their groups. The criteria significantly depend on the solved tasks. Examples of the criteria are uniqueness, accuracy, consistency, completeness, timeliness, actuality, interpretability, relatedness to other data.

Results of measurements processing are estimated twice: just after the measurements are processed and at subsequent stages of their processing and analysis. Both stages assume application of procedures for revealing contradictions between the acquired results and the available information, for comparing results received using different methods, for comparing results with the results of historical data processing, for comparing the results of processing separate measurements and separate time series with the results of joint analysis, and for computing complex features on the base of separate features. An example of criteria for cluster analysis methods can be found in [18].

The system of ontologies described above can be used for solving tasks in intelligent applications specialized for measurements processing, by experts and common users, and by different external applications. The use case diagram for the proposed system of ontologies is given in Fig. 2.

Fig 2. Use case diagram for the system of the ontologies for measurements processing

4 Application of the system of ontologies for TMI processing

The developed set of ontologies for measurements processing was adapted for processing TMI [19] received from remote space objects. A hierarchy of the solved tasks is given in Fig. 3.
[Figure 3: hierarchy of the tasks solved using TMI from remote space objects. Node labels: exploration of the objects behaviour; control of the objects state; localization of the faults on the objects; identification of the objects; identification of the objects characteristics; control of the objects state on the base of comparing with mathematical models of the parameters; control of the objects state on the base of the defined functional dependencies between parameters; control of the objects state on the base of separate functional parameters; control of the objects state on the base of the code parameters.]
Fig 3. Ontology of the tasks

Table 2. Time series of measurements of specialized parameters
Types of specialized parameters: constant, code, meander, counter, mantissa, order, lower part, upper part.

Table 3. Standard dependences of telemetric parameters
Dependencies (the examples of the graphical representation of the initial data are not preserved):
pairs of sine and cosine: $x^2 + y^2 = 1$
integro-differential pairs: x [...] y [...] 0
elements of the matrix of the coordinates transformation

The adaptation required extension of the ontology of the types of time series, of the ontology for representing dependencies between the parameters of objects and of the ontology of methods and algorithms for measurements processing. The set of types of time series was extended with types intended to describe measurements of specialized parameters (table 2). The sets of features for the specialized types are defined in [20]. The standard dependencies of telemetric parameters include pairs of sine and cosine, integro-differential pairs and elements of the matrix of the coordinates transformation (table 3). The upper level ontology of methods and algorithms for TMI processing is given in Fig. 4. Several branches of the ontology are detailed in Fig. 5-7.
[Figure 4: upper level of the ontology. Top-level branches: methods for processing the structures of the initial binary data streams; methods for measurements processing at the semantic level; methods for measurements analyses. The last branch includes methods for time series sequential analyses, for identification of the types of parameters behaviour, for time series segmentation, for time series cluster analyses, for building and comparing patterns for time series, and for building association rules for time series.]
Fig 4. Ontology of methods and algorithms for TMI processing and analyses

[Figure 5: branch of methods for processing the structures of the initial binary data streams. Node labels include methods for express and for complex processing of the initial binary streams, algorithms for identifying the length of the frames and of the words in the initial streams, binary streams segmentation methods, methods for building and classifying frequency rank distributions (Zipf's law, Zipfian distributions), methods for computing edit distance for graphs, methods of correlation analysis and methods of potential functions calculation.]
Fig 5. A fragment of the ontology of methods for processing structures of binary streams

[Figure 6: branch of methods for measurements processing at the semantic level. Node labels include methods for identification of the types of measurements representation and of the types of measured parameters (constant and code parameters, counters, meanders, mantissas and orders, lower and upper parts of the parameters), methods for building semantic descriptions of the binary streams of measurements, methods for restoring complex parameters, and methods for revealing functional dependencies in parameters (integro-differential pairs, pairs of sine and cosine, elements of the matrix of the coordinates transformation).]
Fig 6. A fragment of the ontology of methods for measurements processing at the semantic level

The system of ontologies was implemented in a number of applications oriented on processing TMI from space objects in the delayed mode, which have been successfully used for about ten years. The description of the developed systems and examples of their application can be found in [6, 21].
[Figure 7: branch of methods for measurements analyses. Node labels include methods for identification of the types of behavior of slow changing parameters, methods for segmentation of time series of slow changing and of fast changing parameters, algorithms based on building optimal partitions, on computing derivatives and on computing spectral density, algorithms for symbolic and for piecewise aggregate approximation, algorithms for computing edit distances between strings, algorithms for spline, wavelet and spline-wavelet based approximation, and methods for building and comparing patterns for time series with piece-wise constant behavior, piece-wise linear behavior or two continuous derivatives.]
Fig 7. A fragment of the ontology of methods for measurements analyses

5 Case Study

The control of the state of space objects using code parameters assumes analysis of the time points at which the values of the parameters changed. These points correspond to the moments of execution of commands on the controlled objects. In table 4 a subset of code parameters for three different objects of one type is given.
For each parameter the time points at which its value changes are defined.

Table 4. The time of the values change points of the code parameters

№   PRMp    KND     SC      Ki      PRMb    OPKi    OHKi    ST      KZ
1   362789  344936  348956  350428  350429  359539  359535  361435  361746
2   453563  464518  468542  470111  470113  479124  479121  481018  481328
3   190444  201398  205418  206915  206917  216025  216018  217898  218208

№   KP4b    KP4c    KP4d    KP4e    KK4a    KK4b    KK4c    KK4d    KK4e
1   361479  361478  361483  361483  361944  361943  361940  361944  361955
2   481061  481061  481062  481063  481475  481476  481476  481477  481477
3   217941  217941  217942  217943  218355  218356  218357  218357  218358

№   KD1a    KD1b    KD1c    KD1d    KD1e    PRK     KD3b    KD3c    KD3d
1   362327  362373  362366  362387  362388  362789  363040  363040  363042
2   481930  482010  481990  481991  481970  482372  482623  482605  482606
3   218832  218833  218870  218891  218871  219252  219482  219497  219482

№   KD3e    GK      KD5a    RPbc    RPcd    RPde    RPeb    VOGb    VOGc
1   363042  363042  363295  363300  363299  363307  363300  363330  363329
2   482645  482653  482906  482910  482908  482915  482909  482933  482927
3   219494  219504  219746  219748  219746  219755  219750  219777  219775

№   VOGd    VOGe    VNNb    VNNc    VNNd    VNNe    KP
1   363330  363330  363332  363331  363332  363330  363356
2   482919  482930  482940  482938  482938  482939  482950
3   219778  219777  219781  219780  219784  219783  219791

The time points of the values change were processed using data mining techniques, in particular, statistical and cluster analysis methods. Clustering the objects using all parameters showed that the behavior of the first object differs significantly from the behavior of the second and the third objects. The first object is the only element of the first cluster. The second and the third objects form the second cluster. The differences between the clusters are represented in the form of a histogram (Fig. 8 a). The order of the parameters in the histogram is the same as in table 4.
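The idea of grouping the objects by their time points can be sketched on a small subset of Table 4. The paper does not state which clustering algorithm or normalization was used, so everything below is an assumption made for illustration: only the first five parameters are taken, the change times are expressed relative to the parameter KND to remove the different absolute time offsets of the objects, and the closest pair of objects is found by Euclidean distance, which corresponds to the first merge step of an agglomerative clustering.

```python
from itertools import combinations
from math import dist

# Subset of Table 4: change times of five code parameters for the three objects.
PARAMS = ["PRMp", "KND", "SC", "Ki", "PRMb"]
TIMES = {
    1: [362789, 344936, 348956, 350428, 350429],
    2: [453563, 464518, 468542, 470111, 470113],
    3: [190444, 201398, 205418, 206915, 206917],
}


def relative_profile(times, reference="KND"):
    """Express change times relative to a reference parameter (an assumption)."""
    ref = times[PARAMS.index(reference)]
    return [t - ref for t in times]


def closest_pair(profiles):
    """Return the pair of objects with the smallest Euclidean distance."""
    return min(
        combinations(sorted(profiles), 2),
        key=lambda pair: dist(profiles[pair[0]], profiles[pair[1]]),
    )


profiles = {obj: relative_profile(times) for obj, times in TIMES.items()}
pair = closest_pair(profiles)
outlier = (set(profiles) - set(pair)).pop()
print(f"closest objects: {pair}, separate object: {outlier}")
```

On this subset the second and the third objects form the closest pair and the first object stays apart, mainly because of the value of PRMp. This agrees with the grouping reported in the text, but it is not a reproduction of the analysis of the paper, which used all parameters.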
Cluster analysis of like parameters of different blocks of identically constructed objects (the block to which a parameter refers is written in lowercase letters after the parameter name) revealed deviations from normal behavior for the following parameters (Fig. 8 b-d):
- RPde: the time points of the disconnection of the spherical locks between blocks ‘d’ and ‘e’ differ from the time points defined for the same parameter between other blocks;
- KD3: the time points of the contacts breaking of blocks ‘b’ and ‘d’ differ from the time points defined for blocks ‘c’ and ‘e’;
- VNN: the time points of the output of the tooth for blocks ‘d’ and ‘e’ differ from the time points defined for blocks ‘b’ and ‘c’.
The clusters in Fig. 8 are shown in the feature space built using principal component analysis [22].

Fig 8. Application of Data Mining techniques for processing the time points of the value changes of code parameters (panels a-d)

6 Conclusion

The paper presents a system of ontologies for processing and analysing measurements of the parameters of various objects, represented as time series or single values. The structure of the ontologies and the relations that link them into a system are defined. For each ontology a detailed description is provided and its relations with external ontologies are enumerated.
The proposed system of ontologies has the following distinguishing features:
- the system allows measurement processing tasks to be solved taking into account the peculiarities of the processed data and of the tasks themselves;
- multiple technological solutions for measurements processing based on intelligent methods and algorithms can be implemented using the considered set of ontologies;
- the structure of the system and of the separate ontologies is simple and can be easily extended and modified when new methods are developed or new types of measurements are defined;
- information and knowledge represented in the form of ontologies can be interpreted by both experts and machines and can be reused;
- the system of ontologies can be easily adapted to different subject domains if ontological descriptions of the domains are available.

Further development of the described system involves refining the ontologies on the basis of knowledge acquired while operating the developed applications for telemetry processing. A set of applications for other subject domains is to be developed and tested.

References

1. Steinberg A.N. Foundations of Situation and Threat Assessment. In: Hall D., Liggins M., Llinas J. (eds.) Handbook of Multisensor Data Fusion. LLC Books (2008)
2. Steinberg A.N., Rogova G. Situation and context in data fusion and natural language understanding. Proceedings of the 11th FUSION, Cologne (2008)
3. Vitol A., Zhukova N., Pankin A. Adaptive multidimensional measurements processing using IGIS technologies. Proceedings of the 6th International Workshop on Information Fusion and Geographic Information Systems: Environmental and Urban Challenges, St. Petersburg (2013)
4. Pankin A., Vodyaho A., Zhukova N. Operative Measurements Analyses in Situation Early Recognition Tasks.
Proceedings of the 11th International Conference on Pattern Recognition and Image Analysis, Samara (2013)
5. Zhukova N. Method for adaptive multidimensional measurements processing based on IGIS technologies. Proceedings of the 11th International Conference on Pattern Recognition and Image Analysis, Samara (2013)
6. Vitol A., Deripaska A., Zhukova N., Sokolov I. Technology of adaptive measurements processing. SPbSTU «LETI», Saint-Petersburg (2012)
7. Browne P. JBoss Drools Business Rules. Packt Publishing (2009)
8. Vitol A., Zhukova N., Pankin A. Model for knowledge representation of multidimensional measurements processing results in the environment of intelligent GIS. Proceedings of the 20th International Conference on Conceptual Structures for Knowledge Representation for STEM Research and Education, Mumbai (2013)
9. Steinberg A., Bowman C., White F. Revisions to the JDL Data Fusion Model. Sensor Fusion: Architectures, Algorithms, and Applications. Proceedings of the SPIE, vol. 3719 (1999)
10. Zhukova N. Harmonization, integration and fusion of multidimensional measurements of technical and natural objects parameters in monitoring systems [in Russian]. Izvestiya SPbETU “LETI”, vol. 2, Saint-Petersburg (2013)
11. Popovich V., Voronin M. Data Harmonization, Integration and Fusion: three sources and three major components of Geoinformation Technologies. Proceedings of IF&GIS, St. Petersburg (2005)
12. http://www.w3.org/
13. Quinlan R. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo (1993)
14. Feng S., Kogan I., Krim H. Classification of curves in 2D and 3D via affine integral signatures. Acta Applicandae Mathematicae, vol. 109, issue 3, Springer, Netherlands (2010)
15. Chang K., Ghosh J. Principal curve classifier - a nonlinear approach to pattern classification. Proceedings of Neural Networks, Anchorage (1998)
16. Kugiumtzis D., Tsimpiris A.
Measures of Analysis of Time Series (MATS): A MATLAB Toolkit for Computation of Multiple Measures on Time Series Data Bases. Journal of Statistical Software, vol. 33, issue 5 (2010)
17. Lin J., Keogh E., Lonardi S., Chiu B. A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, San Diego (2003)
18. Halkidi M., Batistakis Y., Vazirgiannis M. Clustering Validity Checking Methods. ACM Sigmod Record 31(2,3) (2001)
19. Nazarov A., Kozyrev G., Shitov I. et al. Modern Telemetry in Theory and in Practice. Training Course [in Russian]. Nauka i Tekhnika, St. Petersburg (2007)
20. Vasiljev A., Vitol A., Zhukova N. Detecting the semantic structure of the group telemetric signal [in Russian]. SPbSTU «LETI», Saint-Petersburg (2010)
21. Vasiljev A., Geppener V., Zhukova N., Tristanov A., Ecalo A. Automatic control system of complex dynamic objects state on the base of telemetering information analysis [in Russian]. 8th International Conference on Pattern Recognition and Image Analysis: New Information Technologies, vol. 2, no. 4 (2007)
22. Jolliffe I. Principal Component Analysis. 2nd ed. Springer (2002)

System of Ontologies for Data Processing Applications Based on Data Analysis Techniques

Alexander Vodyaho1, Nataly Zhukova2
1 Saint-Petersburg State Electrotechnical University, Saint Petersburg, Russia
2 Saint-Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, Saint Petersburg, Russia
{aivodyaho,nazhukova}@mail.ru

Abstract. The paper describes a system of ontologies designed for applications aimed at solving problems of situation recognition and assessment based on the results of data processing and analysis. The main focus is on the problems of processing measurements from various objects whose parameters are represented as time series.
The considered applications process data using knowledge extracted from historical data by means of data analysis techniques. Such applications depend heavily on the knowledge base, which is a system of ontologies. The presented system of ontologies is a set of upper-level ontologies for which methods of solving tasks in one or several subject domains have been developed.

Keywords: knowledge representation, data analysis, data fusion, measurements processing, situation recognition and assessment.