=Paper=
{{Paper
|id=None
|storemode=property
|title=None
|pdfUrl=https://ceur-ws.org/Vol-1000/ICTERI-2013-CEUR-WS-Volume.pdf
|volume=Vol-1000
}}
==None==
Vadim Ermolayev, Heinrich C. Mayr, Mykola Nikitchenko, Aleksander Spivakovskiy, Grygoriy Zholtkevych, Mikhail Zavileysky, Hennadiy Kravtsov, Vitaliy Kobets, Vladimir Peschanenko (Eds.)

ICT in Education, Research and Industrial Applications: Integration, Harmonization and Knowledge Transfer

Proceedings of the 9th International Conference, ICTERI 2013

Kherson, Ukraine, June, 2013

Ermolayev, V., Mayr, H. C., Nikitchenko, M., Spivakovskiy, A., Zholtkevych, G., Zavileysky, M., Kravtsov, H., Kobets, V. and Peschanenko, V. (Eds.): ICT in Education, Research and Industrial Applications: Integration, Harmonization and Knowledge Transfer. Proc. 9th Int. Conf. ICTERI 2013, Kherson, Ukraine, June 19-22, 2013, CEUR-WS.org, online

This volume constitutes the refereed proceedings of the 9th International Conference on ICT in Education, Research, and Industrial Applications, held in Kherson, Ukraine, in June 2013. The 49 papers were carefully reviewed and selected from 124 submissions. The volume opens with the contributions of the invited speakers. Further, the part of the volume containing the papers of the main ICTERI conference is structured in four topical parts: ICT Infrastructures, Integration and Interoperability; Machine Intelligence, Knowledge Engineering and Management for ICT; Model-Based Software System Development; and Methodological and Didactical Aspects of Teaching ICT and Using ICT in Education. This part of the volume is concluded by the two papers describing the tutorials presented at the conference. The final part of the volume comprises the selected contributions of the three workshops co-located with ICTERI 2013, namely: the 1st International Workshop on Methods and Resources of Distance Learning (MRDL 2013); the 2nd International Workshop on Information Technologies in Economic Research (ITER 2013); and the 2nd International Workshop on Algebraic, Logical, and Algorithmic Methods of System Modeling, Specification and Verification (SMSV 2013).

Copyright © 2013 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.

===Preface===

It is our pleasure to present to you the proceedings of ICTERI 2013, the ninth edition of the International Conference on Information and Communication Technologies in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer, held at Kherson, Ukraine on June 19-22, 2013. ICTERI is concerned with interrelated topics ranging from information and communication technology (ICT) infrastructures to teaching these technologies or using them in education or industry. These aspects of ICT research, development, technology transfer, and use in real-world cases are vibrant for both the academic and industrial communities.

The conference scope was outlined as a constellation of the following themes:
* ICT infrastructures, integration and interoperability
* Machine Intelligence, knowledge engineering, and knowledge management for ICT
* Cooperation between academia and industry in ICT
* Model-based software system development
* Methodological and didactical aspects of teaching ICT and using ICT in education

A visit to Google Analytics proves the broad and continuously increasing interest in the ICTERI themes. Indeed, between November 15, 2012 and May 15, 2013 we received circa 4 400 visits to the conference web site, http://icteri.org/, from 110 countries (568 cities). These numbers are 1.5 – 2 times higher than those observed in the similar period for ICTERI 2012.

ICTERI 2013 continues the tradition of hosting co-located focused events under its umbrella. As a complement to the main conference this year, the program offered three co-located workshops, two tutorials, and an IT Talks panel. The main conference program was composed of the top-rated submissions, evenly covering all the themes of the ICTERI scope.
The workshops formed the corolla around the main ICTERI conference by focusing on particular sub-fields relevant to the conference theme. In particular:
* The 1st International Workshop on Methods and Resources of Distance Learning (MRDL 2013) dealt mainly with the methodological and didactical aspects of teaching ICT and using ICT in education
* The scope of the 2nd International Workshop on Information Technologies in Economic Research (ITER 2013) was more within the topic of cooperation between academia and industry
* The 2nd International Workshop on Algebraic, Logical, and Algorithmic Methods of System Modeling, Specification and Verification (SMSV 2013) focused on model-based software system development

The IT Talks panel was the venue for the invited industrial speakers who wished to present their cutting-edge ICT achievements. This year we also accepted two focused short tutorials to the program: on ontology alignment and the industrial applications of this technology; and on the time model and Clock Constraint Specification Language for the UML profile used in modeling and analysis of real-time and embedded systems.

Overall, ICTERI attracted a substantial number of submissions: a total of 124, comprising the main conference and workshops. Out of the 60 paper submissions to the main conference we accepted the 22 highest-quality and most interesting papers to be presented at the conference and published in our proceedings. The acceptance rate was therefore 36.7 percent. Our three workshops received 64 submissions overall, of which 27 were accepted by their organizers and included in the second part of this volume.

Those selected publications are preceded by the contributions of our invited speakers. The talk by our keynote speaker Wolf-Ekkehard Matzke expressed his industrial views on the knowledge-based bio-economy and the “Green Triple-Helix” of biotechnology, synthetic biology, and ICT. The invited talk by Gary L.
Pratt focused on the movement of higher education institutions toward forming consortiums that create a position of strength in the face of contemporary economic challenges. The invited talk by Alexander A. Letichevsky presented a general theory of interaction and cognitive architectures based on this theory.

The conference would not have been possible without the support of many people. First of all we would like to thank all the authors who submitted papers to ICTERI 2013 and thus demonstrated their interest in the research problems within our scope. We are also very grateful to the members of our Program Committee for providing timely and thorough reviews and also for being cooperative in doing additional review work. We would also like to thank the local organizers of the conference, whose devotion and efficiency made this instance of ICTERI a very comfortable and effective scientific forum. Finally, a special acknowledgement is given to our editorial assistant Olga Tatarintseva, who invested considerable effort in checking and proofing the final versions of our papers.

June, 2013

Vadim Ermolayev, Heinrich C. Mayr, Mykola Nikitchenko, Aleksander Spivakovskiy, Grygoriy Zholtkevych, Mikhail Zavileysky, Hennadiy Kravtsov, Vitaliy Kobets, Vladimir Peschanenko

===Organization===

====Organizers====
* Ministry of Education and Science of Ukraine
* Kherson State University, Ukraine
* Alpen-Adria-Universität Klagenfurt, Austria
* Zaporizhzhya National University, Ukraine
* Institute of Information Technology and Teaching Resources, Ukraine
* V. N. Karazin Kharkiv National University, Ukraine
* Taras Shevchenko National University of Kyiv, Ukraine
* DataArt Solutions Inc.

====General Chair====
* Aleksander Spivakovsky, Kherson State University, Ukraine

====Steering Committee====
* Vadim Ermolayev, Zaporizhzhya National University, Ukraine
* Heinrich C. Mayr, Alpen-Adria-Universität Klagenfurt, Austria
* Natalia Morse, National University of Life and Environmental Sciences, Ukraine
* Mykola Nikitchenko, Taras Shevchenko National University of Kyiv, Ukraine
* Aleksander Spivakovsky, Kherson State University, Ukraine
* Mikhail Zavileysky, DataArt, Russian Federation
* Grygoriy Zholtkevych, V. N. Karazin Kharkiv National University, Ukraine

====Program Co-chairs====
* Vadim Ermolayev, Zaporizhzhya National University, Ukraine
* Heinrich C. Mayr, Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria

====Workshops Chair====
* Mykola Nikitchenko, Taras Shevchenko National University of Kyiv, Ukraine

====Tutorials Chair====
* Grygoriy Zholtkevych, V. N. Karazin Kharkiv National University, Ukraine

====IT Talks Co-chairs====
* Aleksander Spivakovsky, Kherson State University, Ukraine
* Mikhail Zavileysky, DataArt, Russian Federation

====Program Committee====
* Rajendra Akerkar, Western Norway Research Institute, Norway
* Eugene Alferov, Kherson State University, Ukraine
* Costin Badica, University of Craiova, Romania
* Tobias Buerger, PAYBACK, Germany
* Andrey Bulat, Kherson State University, Ukraine
* David Camacho, Universidad Autonoma de Madrid, Spain
* Michael Cochez, University of Jyväskylä, Finland
* Maxim Davidovsky, Zaporizhzhya National University, Ukraine
* Anatoliy Doroshenko, National Technical University of Ukraine "Kyiv Polytechnic Institute", Ukraine
* Vadim Ermolayev, Zaporizhzhya National University, Ukraine
* David Esteban, Techforce, Spain
* Lyudmila Gavrilova, Slovyansk State Pedagogical University, Ukraine
* Vladimir Gorodetsky, St. Petersburg Institute for Informatics and Automation of The Russian Academy of Science, Russian Federation
* Marko Grobelnik, Jozef Stefan Institute, Slovenia
* Brian Hainey, Glasgow Caledonian University, United Kingdom
* Sungkook Han, Wonkwang University, South Korea
* Mitja Jermol, Jozef Stefan Institute, Slovenia
* Jason Jung, Yeungnam University, South Korea
* Natalya Keberle, Zaporizhzhya National University, Ukraine
* Nick Kings, Connected Shopping, United Kingdom
* Christian Kop, Alpen-Adria-Universität Klagenfurt, Austria
* Hennadiy Kravtsov, Kherson State University, Ukraine
* Nataliya Kushnir, Kherson State University, Ukraine
* Frédéric Mallet, Université de Nice-Sophia Antipolis, France
* Mihhail Matskin, Royal Institute of Technology, Sweden
* Heinrich C. Mayr, Alpen-Adria-Universität Klagenfurt, Austria
* Mykola Nikitchenko, Taras Shevchenko National University of Kyiv, Ukraine
* Andriy Nikolov, Knowledge Media Institute, The Open University, United Kingdom
* Inna Novalija, Jozef Stefan Institute, Slovenia
* Tomás Pariente Lobo, ATOS Origin, Spain
* Vladimir Peschanenko, Kherson State University, Ukraine
* Carlos Ruiz, playence, Spain
* Abdel-Badeeh Salem, Ain Shams University, Cairo, Egypt
* Wolfgang Schreiner, Research Institute for Symbolic Computation (RISC), Johannes Kepler University, Austria
* Vladimir A. Shekhovtsov, Alpen-Adria-Universität Klagenfurt, Austria
* Mikhail Simonov, Istituto Superiore Mario Boella, Italy
* Marcus Spies, Ludwig-Maximilians-Universität München, Germany
* Aleksander Spivakovsky, Kherson State University, Ukraine
* Martin Strecker, IRIT, Universite Paul Sabatier, France
* Olga Tatarintseva, Zaporizhzhya National University, Ukraine
* Vagan Terziyan, University of Jyväskylä, Finland
* Nikolay Tkachuk, National Technical University "Kharkiv Polytechnic Institute", Ukraine
* Leo Van Moergestel, Utrecht University of Applied Sciences, Netherlands
* Maryna Vladymyrova, V. N. Karazin Kharkov National University, Ukraine
* Paul Warren, Knowledge Media Institute, the Open University, United Kingdom
* Iryna Zaretska, V. N. Karazin Kharkov National University, Ukraine
* Mikhail Zavileysky, DataArt Solutions Inc., Russian Federation
* Grygoriy Zholtkevych, V. N. Karazin Kharkov National University, Ukraine

====Additional Reviewers====
* Fahdi Al Machot, Alpen-Adria-Universität Klagenfurt, Austria
* Antonio Gonzalez-Pardo, Universidad Autonoma de Madrid, Spain
* Alexey Vekschin, National Technical University "Kharkiv Polytechnic Institute", Ukraine

===Sponsors===

====Organizations and Companies====
* DataArt (http://dataart.com/) develops custom applications, helping clients optimize time-to-market and save costs
* Kherson State University (http://www.ksu.ks.ua/) is a multidisciplinary scientific, educational, and cultural center in the south of Ukraine
* Zaporizhzhya National University (http://www.znu.edu.ua/) is a renowned educational and research center of Ukraine that offers a classically balanced diversity of high quality academic curricula and many opportunities to build your scientific career

====Individuals====
* Aleksandr Spivakovsky is the chair of the Department of Informatics and the first vice-rector of Kherson State University
* Dmitriy Shchedrolosev is the head of DataArt’s R&D Center at Kherson

===Table of Contents===
* Preface
* Organization
* Sponsors
* Invited Contributions
** The Knowledge-Based Bio-Economy and the “Green Triple-Helix” of Biotechnology, Synthetic Biology and ICT
** A Movement of Higher Education Institutions to Consortiums of Institutions Banding Together to Create a Position of Strength
** General Theory of Interaction and Cognitive Architectures
* Part 1. Main ICTERI Conference
** 1.1 ICT Infrastructures, Integration and Interoperability
*** Wireframe Model for Simulating Quantum Information Processing Systems
*** Modeling, Algorithms and Implementation of the Microcontroller Control System for the Ion Beam Forming Process for Nanostructures Etching
*** Using Algebra-Algorithmic and Term Rewriting Tools for Developing Efficient Parallel Programs
** 1.2 Machine Intelligence, Knowledge Engineering and Management for ICT
*** An Intelligent Approach to Increase Efficiency of IT-Service Management Systems: University Case-Study
*** Refining an Ontology by Learning Stakeholder Votes from their Texts
*** Answering Conjunctive Queries over a Temporally-Ordered Finite Sequence of ABoxes sharing one TBox
*** An Adaptive Forecasting of Nonlinear Nonstationary Time Series under Short Learning Samples
*** Application of an Instance Migration Solution to Industrial Ontologies
*** Extracting Knowledge Tokens from Text Streams
** 1.3 Model-Based Software System Development
*** Use of Neural Networks for Monitoring Beam Spectrum of Industrial Electron Accelerators
*** Lazy Parallel Synchronous Composition of Infinite Transition Systems
*** Selecting Mathematical Software for Dependability Assessment of Computer Systems Described by Stiff Markov Chains
*** Asymptotical Information Bound of Consecutive Qubit Binary Testing
*** A Data Transfer Model of Computer-Aided Vehicle Traffic Coordination System for the Rail Transport in Ukraine
*** Quantitative Estimation of Competency as a Fuzzy Set
** 1.4 Methodological and Didactical Aspects of Teaching ICT and Using ICT in Education
*** New Approaches of Teaching ICT to Meet Educational Needs of Net Students Generation
*** Pedagogical Diagnostics with Use of Computer Technologies
*** The Use of Distributed Version Control Systems in Advanced Programming Courses
*** Comparative Analysis of Learning in Three-Subjective Didactic Model
*** Conception of Programs Factory for Representing and E-Learning Disciplines of Software Engineering
*** Public Information Environment of a Modern University
*** Designing Massive Open Online Courses
*** The Role of Informatization in the Change of Higher School Tasks: the Impact on the Professional Teacher Competences
** 1.5 ICTERI Tutorials
*** UML Profile for MARTE: Time Model and CCSL
*** Ontology Alignment and Applications in 90 Minutes
* Part 2. ICTERI Workshops
** 2.1 2nd International Workshop on Information Technologies in Economic Research (ITER 2013)
*** Foreword
*** Binary Quasi Equidistant and Reflected Codes in Mixed Numeration Systems
*** Mechanism Design for Foreign Producers of Unique Homogeneity Product
*** Features of National Welfare Innovative Potential Parametric Indication Information-Analytical Tools System in the Globalization Trends’ Context
*** Matrix Analogues of the Diffie-Hellman Protocol
*** Are Securities Secure: Study of the Influence of the International Debt Securities on the Economic Growth
*** How to Make High-tech Industry Highly Developed? Effective Model of National R&D Investment Policy
*** Econometric Analysis on the Site “Lesson Pulse”
*** Decision Supporting Procedure for Strategic Planning: DEA Implementation for Regional Economy Efficiency Estimation
*** Applying of Fuzzy Logic Modeling for the Assessment of ERP Projects Efficiency
*** Mathematical Model of Banking Firm as Tool for Analysis, Management and Learning
** 2.2 1st International Workshop on Methods and Resources of Distance Learning (MRDL 2013)
*** Foreword
*** What Should be E-Learning Course for Smart Education
*** TIO – a Software Toolset for Mobile Learning in MINT Disciplines
*** Holistic Approach to Training of ICT Skilled Educational Personnel
** 2.3 2nd International Workshop on Algebraic, Logical, and Algorithmic Methods of System Modeling, Specification and Verification (SMSV 2013)
*** Foreword
*** An Abstract Block Formalism for Engineering Systems
*** Multilevel Environments in Insertion Modeling System
*** Clocks Model for Specification and Analysis of Timing in Real-Time Embedded Systems
*** Specializations and Symbolic Modeling
*** On a Dynamic Logic for Graph Rewriting
*** Logical Foundations for Reasoning about Transformations of Knowledge Bases
*** Program Algebras with Monotone Floyd-Hoare Composition
*** A Formal Model of Resource Sharing Conflicts in Multithreaded Java
*** Implementation of Propagation-Based Constraint Solver in IMS
*** UniTESK: Component Model Based Testing
*** Protoautomata as Models of Systems with Data Accumulation
*** Models of Class Specification Intersection of Object-Oriented Programming
* Author Index

===Invited Contributions===

====The Knowledge-Based Bio-Economy and the “Green Triple-Helix” of Biotechnology, Synthetic Biology and ICT====

Wolf-Ekkehard Matzke

MINRES Technologies GmbH, Neubiberg, Germany

wolf@minres.com

Abstract. Over the last decades, economies around the globe have transformed into knowledge-based economies (KBE). Information and Communication Technology (ICT) has served as the principal enabling technology for this transformation. Now biology becomes another major pillar – producing a knowledge-based bio-economy (KBBE). The challenges faced by biotechnology push the requirements for ICT in many ways to the extreme and far beyond its basic utility function. This holds in particular for synthetic biology, which aims to break ground on the rational design and construction of artificial biological systems with ICT as its backbone for bio-design automation (BDA). This can best be illustrated using the metaphor of a “green triple-helix”, where “green” stands for environmental consciousness and “triple-helix” visualizes the inter-dependency of biotechnology, synthetic biology, and ICT as the helical strands. The talk will explore this inter-dependency in dynamics.
High-level ICT requirements will be identified and discussed along the dimensions of education, research and industry, with the emphasis on synthetic biology and BDA. The guidelines for the architecture and implementation of an open BDA platform will be presented so that interested ICT researchers and practitioners will better understand the biology-specific ICT challenges of the KBBE.

Keywords. Knowledge-Based Bio-Economy, Biotechnology, Synthetic Biology, Bio-Design Automation

Key terms. ICTInfrastructure, Industry, Management, Research

====A Movement of Higher Education Institutions to Consortiums of Institutions Banding Together to Create a Position of Strength====

Gary L. Pratt

Eastern Washington University, 202 Huston Hall, Cheney, Washington 99004, USA

gpratt@ewu.edu

Abstract. Colleges and universities compete for students, faculty, and business, industry, and research partnerships with quality programs, strong faculty, research opportunities, affordable cost, and high student success factors. Yet, at the infrastructure level, most of these institutions provide many similar information technology services and support. On top of this, many of these institutions struggle to provide this quality infrastructure because of a variety of factors, including shrinking budgets, minimal strategic planning, and a lack of institutional vision of information technology as a strategic asset. This presentation will showcase best practice examples of how higher education institutions can band together to create strong consortium relationships that can help all partners in this relationship move forward as a strong force.
Examples will include actual successes experienced by the Kentucky Council on Postsecondary Education's Distance Learning Advisory Committee (DLAC), the Washington Legislative Technology Transformation Taskforce (TTT), and the Washington Higher Education Technology Consortium (WHETC). These successes range from statewide strategic planning efforts, to significant consortial purchasing contracts, to collaborative technology systems, services, and training opportunities. This presentation will show that institutions can be stronger working together than working individually.

Keywords. University consortium, best practice, competition, infrastructure, information technology, strategic asset, strategic planning, collaborative technology system

Key terms. Academia, Information Technology, Infrastructure, Cooperation, Management

====General Theory of Interaction and Cognitive Architectures====

Alexander Letichevsky

Glushkov Institute of Cybernetics, Academy of Sciences of Ukraine, 40 Glushkova ave., 03187, Kyiv, Ukraine

let@cyfra.net

Abstract. The challenge of creating a real-life computational equivalent of the human mind is now attracting the attention of many scientific groups from different areas of cybernetics and Artificial Intelligence such as computational neuroscience, cognitive science, biologically inspired cognitive architectures etc. The paper presents a new cognitive architecture based on insertion modeling, one of the paradigms of a general theory of interaction, and a basis for multiagent system development. Insertion cognitive architecture is represented as a multilevel insertion machine which realizes itself as a high-level insertion environment. It has a center for evaluating the success of its behavior, a special type of agent that can observe the interaction of the system with its external environment. The main goal of a system is achieving maximum success repeatedly.
As an agent this machine is inserted into its external environment and has the means to interact with it. The internal environment of an intelligent cognitive agent creates and develops its own model and the model of the external environment. If the external environment contains other agents, they can be modeled by the internal environment, which creates corresponding machines and interprets those machines using corresponding drivers, comparing the behaviors of models and external agents. Insertion architecture is now under development on the basis of the Insertion Modeling System, developed at the Glushkov Institute of Cybernetics.

Keywords. AgentBasedSystem, DistributedArtificialIntelligence, Reasoning, FormalMethod, Simulation

Key terms. AgentBasedSystem, DistributedArtificialIntelligence, Reasoning, FormalMethod, Simulation

=====1 Introduction=====

General theory of interaction is a theory of information interaction in complex distributed multi-agent systems. It has a long history. The contemporary part of this history can be considered as starting from the neural networks of McCulloch and Pitts [23]. The model of neural nets gave rise to abstract automata theory, a theory which helps to study the behavior and interaction of evolving systems independently of their structure. The Kleene-Glushkov algebra [13, 7] is the main tool for the description of the behaviors of finite state systems. Automata theory originally concentrated on the study of analysis and synthesis problems, generalization of finite state automata, and complexity. Interaction in explicit form appeared only in the 1970s as a general theory of interacting information processes. It includes CCS (Calculus of Communicating Systems) [24, 25] and the π-calculus of R. Milner [26], CSP (Communicating Sequential Processes) of T. Hoare [10], ACP (Algebra of Communicating Processes) [3], and many other branches of these basic theories.
Now all these calculi and algebras are the basis for modern research in this area. A fairly complete survey of classical process theory is presented in the Handbook of Process Algebra [4], published in 2001.

Insertion modeling is a trend that has been developing over the last decade as an approach to a general theory of interaction of agents and environments in complex distributed multi-agent systems. The first works in this direction were published in the mid-1990s [6, 15, 16]. In these studies, a model of interaction between agents and environments based on the notion of an insertion function and the algebra of behaviors (similar to some kind of process algebra) was proposed. The paradigm shift from computing to interaction was extensively discussed in computer science at that time, and our work was in some sense a response to this trend. But the real roots of the insertion model should be sought even earlier, in a model of interaction of control and operational automata, proposed by V. Glushkov back in the 1960s [8, 9] to describe the structure of computers. In the 1970s the algebraic abstraction of this model was studied in the theory of discrete processors and provided a number of important results on the problem of equivalence of programs, their equivalent transformations, and optimization. Macroconveyor models of parallel computing, which were investigated in the 1980s [11], are even closer to the model of interaction of agents and environments. In these models, the processes corresponding to the parallel processors can be considered as agents that interact in an environment of distributed data structures.

In recent years, insertion modeling has been applied to the development of systems for the verification of requirements and specifications of distributed interacting systems [2, 12, 19–21].
The system VRS, developed on order from Motorola, has been successfully applied to verify the requirements and specifications in the field of telecommunication systems, embedded systems, and real-time systems. A new insertion modeling system IMS [17], which is under development in the Glushkov Institute of Cybernetics of the National Academy of Sciences of Ukraine, is intended to extend the area of insertion modeling applications. We found many common features between the tools used in software development based on formal methods and the techniques used in biologically inspired cognitive architectures. This gives us hope of introducing some new ideas into the development of this subject domain.

This paper presents the main principles of insertion modeling and the conception of a cognitive architecture based on insertion modeling. To understand the formal part of the paper, the reader must be familiar with the concepts of labeled transition system, bisimilarity, and the basic notions of general process theory. The mathematical foundation of insertion modeling is presented in [18].

=====2 The Basic Principles=====

Insertion modeling deals with the construction of models and the study of the interaction of agents and environments in complex distributed multi-agent systems. Informally, the basic principles of the paradigm of insertion modeling can be formulated as follows.
# The world is a hierarchy of environments and agents inserted into them.
# Environments and agents are entities evolving in time.
# Insertion of an agent into an environment changes the behavior of the environment and produces a new environment which is, in general, ready for the insertion of new agents.
# Environments as agents can be inserted into a higher-level environment.
# New agents can be inserted from the external environment as well as from internal agents (environments).
# Agents and environments can model other agents and environments on different levels of abstraction.

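The structural side of principles 1, 3, and 4 can be pictured with a small data-structure sketch. This is only a toy illustration under stated assumptions: the class names `Agent` and `Environment`, the `insert` method, and the example names are hypothetical and are not the formal insertion function of the paper. The sketch records only which entities are inserted where and says nothing about behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A named agent (hypothetical stand-in; behavior is not modeled here)."""
    name: str


@dataclass(frozen=True)
class Environment:
    """An environment holding the agents and environments inserted into it."""
    name: str
    inserted: tuple = ()

    def insert(self, entity):
        # Principle 3: insertion yields a NEW environment, which is itself
        # ready for further insertions; the old environment is left unchanged.
        return Environment(self.name, self.inserted + (entity,))

    def depth(self):
        # Principle 1: count the levels of the hierarchy rooted here.
        nested = [e.depth() for e in self.inserted if isinstance(e, Environment)]
        return 1 + max(nested, default=0)


empty_cell = Environment("cell")
cell = empty_cell.insert(Agent("a1")).insert(Agent("a2"))

# Principle 4: an environment is inserted, as an agent, into a higher-level one.
world = Environment("world").insert(cell).insert(Agent("observer"))
```

Making the classes immutable is a design choice of this sketch: it mirrors the statement that insertion produces a new environment rather than modifying the old one.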
All these principles can be formalized in terms of transition systems, behavior algebras, and insertion functions. This formalization can be used as a high-level abstraction of the biological entities needed for computer modeling of the human mind.

The first and the second principles are commonly used in information modelling of different kinds of systems, for example in object-oriented or agent programming. They also resemble M. Minsky's society-of-mind approach [27].

The third principle is intuitively clear, but has a special refinement in insertion modelling. We treat agents as transition systems with states considered up to bisimilarity (or up to behavior, which is the same). The type of an agent is the set of actions it can perform. We use the term action as a synonym of label for transitions; it can denote signals or messages to send, events in which an agent can participate, etc. This is the most general notion of agent, which must be distinguished from the more special notions of autonomous or intelligent agents in AI.

A transition system consists of states and transitions that connect states. Transitions are labeled by actions (signals, events, instructions, statements, etc.). Transition systems evolve in time, changing their states, and actions are observable symbolic structures used for communication. We use the well-known notation $s \xrightarrow{a} s'$ to express the fact that a transition system can evolve from the state $s$ to the state $s'$ by performing action $a$. Usually transition systems are nondeterministic, and there can be several transitions coming from the same state, even labeled by the same action. If we abstract from the structure of states and concentrate only on (branching) sequences of observable actions, we obtain the state equivalence called bisimilarity (originating from [28] and [24]; the exact definition can be found in [18]). Bisimilar states generate the same behavior of transition systems.
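The notions of labeled transition system and bisimilarity used above can be illustrated by a minimal Python sketch. It is not taken from the paper or from [18]: the dictionary encoding, the state names and the naive fixed-point algorithm are assumptions made only for illustration. The sketch starts from the relation containing all pairs of states and removes pairs that violate the transfer condition until nothing changes.

```python
from itertools import product

# A labelled transition system: state -> set of (action, next_state) pairs.
# The states and actions are hypothetical illustration data.
LTS = {
    "s0": {("a", "s1"), ("a", "s2")},   # nondeterministic: two a-transitions
    "s1": {("b", "s0")},
    "s2": {("b", "s0")},
    "t0": {("a", "t1")},
    "t1": {("b", "t0")},
    "r0": {("a", "r1")},
    "r1": set(),                         # no outgoing transitions
}


def _matches(lts, rel, p, q):
    """Every move p --a--> p2 is matched by some q --a--> q2 with (p2, q2) in rel."""
    return all(
        any(b == a and (p2, q2) in rel for b, q2 in lts[q])
        for a, p2 in lts[p]
    )


def bisimilarity(lts):
    """Largest bisimulation on lts, computed as a naive greatest fixed point."""
    rel = set(product(lts, repeat=2))    # start with all pairs of states
    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            if not (_matches(lts, rel, p, q) and _matches(lts, rel, q, p)):
                rel.discard((p, q))      # the pair violates the transfer condition
                changed = True
    return rel


REL = bisimilarity(LTS)
```

In this toy system `("s0", "t0")` belongs to `REL`, because both states generate the same branching behavior, while `("s0", "r0")` does not, because after action `a` the state `r1` cannot perform `b`.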
An environment is, by definition, an agent that possesses an insertion function. Given the state of environment $s$ and the state of agent $u$, the insertion function computes the new state of the environment, which is denoted $s[u]$. Note that we consider states up to bisimilarity, and if we have some representation of behaviors, the behaviors of environment and agent can be used as states. The state $s[u]$ is a state of the environment, and we can use the insertion function to insert a new agent $v$ into the environment $s[u]$: $(s[u])[v] = s[u, v]$. Repeating this construction, we can obtain the state of environment $s[u_1, u_2, \ldots]$ with several agents inserted into it. The insertion function can be considered as an operator over the states of the environment, and if the states are identified with behaviors, then the insertion of a new agent changes the behavior of the environment.

An environment is an agent with an insertion function, so if we forget the insertion function, the environment can be inserted as an agent into a higher-level environment, and we can obtain a hierarchical structure like

$$s[\,s_1[u_{11}, u_{12}, \ldots]_{E_1},\; s_2[u_{21}, u_{22}, \ldots]_{E_2},\; \ldots\,]_E$$

Here the notation $s[u_1, u_2, \ldots]_E$ explicitly shows the environment $E$ to which the state $s$ belongs (environment indexes can be omitted if they are known from the context). This refines the fourth principle: an environment is an agent that can be inserted into an external environment while having agents inserted into itself.

The evolution of agents can be defined by rules for transitions. The rules $s[u] \xrightarrow{a} s[u, v]$ and $s[t[u, v]] \xrightarrow{a} s[t[u], v]$ can be used to illustrate the fifth principle.

We consider the creation and manipulation of models of external and internal environments to be the main property of the cognitive processes of an intelligent agent. Formalization of this property in terms of insertion modeling supports the sixth principle.

Cognitive architecture will be constructed as a multilevel insertion environment.
Below we shall define the main kinds of blocks used for the construction of cognitive architecture. They are local description units and insertion machines.

3 Multilevel Environments

To represent behaviors of transition systems we use behavior algebras (a kind of process algebra). A behavior algebra is defined by the set of actions and the set of behaviors (processes). It has two operations and termination constants. The operations are prefixing $a.u$ ($a$ is an action, $u$ is a behavior) and nondeterministic choice $u + v$ ($u$ and $v$ are behaviors). The termination constants are successful termination $\Delta$, deadlock $0$, and undefined behavior $\bot$. It also has an approximation relation $\subseteq$, which is a partial order with minimal element $\bot$ and is used for constructing a complete algebra with a fixed point theorem.

To define infinite behaviors we use equations in behavior algebra. These equations have the form of recursive definitions $u_i = F_i(u_1, u_2, \ldots)$, $i = 1, 2, \ldots$, and define the left-hand-side functions as the components of a minimal fixed point. The left-hand sides of these definitions can depend on parameters of different types: $u_i(x) = F_i(u, x)$. In a complete behavior algebra each behavior has a representation (normal form)

$$u = \sum_{i \in I} a_i.u_i + \varepsilon_u$$

which is defined uniquely (up to commutativity and associativity of nondeterministic choice) if all $a_i.u_i$ are different ($\varepsilon_u$ is a termination constant).

The type of an environment is defined by two action sets: the set of environment actions and the set of agent actions. The latter defines the type of agents which can be inserted into this environment: if the set of actions of an agent is included in the set of agent actions of the environment, then this agent can be inserted into this environment. This relation is called the compatibility relation between agents and environments (an agent is compatible with an environment if it can be inserted into this environment).
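Recursive behavior definitions in normal form and the compatibility relation can be sketched in Python as follows. This is only an illustration under stated assumptions: the encoding of equations as a dictionary, the behavior names `u`, `v`, `done` and the helper functions are hypothetical and are not part of the insertion modeling system. The agent type is approximated here by the set of actions occurring in the behaviors reachable from the initial one.

```python
# Termination constants of the behavior algebra (encoded as plain strings).
DELTA, DEADLOCK, BOTTOM = "Delta", "0", "bottom"

# Hypothetical system of recursive equations in normal form:
# each name maps to (list of summands (a_i, u_i), termination constant).
EQUATIONS = {
    "u": ([("a", "v"), ("b", "done")], DEADLOCK),   # u = a.v + b.done
    "v": ([("c", "u")], DEADLOCK),                  # v = c.u
    "done": ([], DELTA),                            # done = Delta
}


def reachable_actions(equations, start):
    """Collect every action occurring in the behaviors reachable from start."""
    seen, stack, actions = set(), [start], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        summands, _termination = equations[name]
        for action, successor in summands:
            actions.add(action)
            stack.append(successor)
    return actions


def is_compatible(equations, agent, env_agent_actions):
    """Agent is compatible if its actions are included in the environment's agent actions."""
    return reachable_actions(equations, agent) <= set(env_agent_actions)
```

For example, `is_compatible(EQUATIONS, "u", {"a", "b", "c", "d"})` holds, while an environment whose agent action set is `{"a", "b"}` rejects the same agent because of the action `c`.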
A multilevel environment is a family of environments with a distinguished outermost environment. The compatibility relation on the set of environments defines a directed graph, and for a multilevel environment we demand that the outermost environment be reachable in this graph from any environment of the family.

To define the insertion function for some environment it is sufficient to define the transition relation for all states of the environment, including states with inserted agents. The common approach is to define behavior by means of rules. The following is an example of such a rule:

$$\frac{s \xrightarrow{b} s', \quad u \xrightarrow{a} u'}{s[u] \xrightarrow{c} s'[u']} \; P(a, b, c)$$

This rule can be interpreted as follows. The agent in the state $u$ can make a transition $u \xrightarrow{a} u'$. The environment allows this transition if the predicate $P(a, b, c)$ is true. This rule defines a behavior property of the environment in some local neighborhood of the state $s[u]$. So such a rule belongs to the class of local description units discussed in the next section.

At a given moment of time an agent belongs to (is inserted into) only one environment. But if the type of an agent is compatible with the type of another environment, it can move to this environment. Such movements can be described by the following types of rules:

$$\frac{u \xrightarrow{\mathrm{moveup}\ E} u'}{E[F[u, v], w] \xrightarrow{\mathrm{moveup}(F \to E)} E[F[v], u', w]} \; P_1(E, F, u, \mathrm{moveup}(E))$$

moving from an internal to an external environment;

$$\frac{u \xrightarrow{\mathrm{movedn}\ F} u'}{E[F[v], u, w] \xrightarrow{\mathrm{movedn}(E \to F)} E[F[u', v], w]} \; P_2(E, F, u, \mathrm{movedn}(F))$$

moving from an external environment to an internal one;

$$\frac{u \xrightarrow{\mathrm{moveto}\ G} u'}{E[F[u, v], G[w]] \xrightarrow{\mathrm{moveto}(F \to G)} E[F[v], G[u', w]]} \; P_3(E, F, u, \mathrm{moveto}(F))$$

moving to another environment on the same level. In all cases the permitting conditions must include the compatibility conditions for the corresponding agents and environments.
The rules above define the property of a system called mobility, which underlies the calculus of mobile ambients of Luca Cardelli [5].

4 Local Description Units over Attribute Environments

A special type of environment is considered in cognitive architecture in order to have a sufficiently rich language for the description of the properties of environment states. These environments are called attribute environments. There are two kinds of attribute environments: concrete and symbolic.

The state of a concrete attribute environment is a valuation of attributes, that is, of symbols that change their values as the state changes in time. Each attribute has a type (numeric, symbolic, enumerated, agent and behavior types, functional types, etc.). Some of the functional and predicate symbols are interpreted symbols.

Logic formulas can now be used for the description of properties of agent or environment states. We use first-order logic formulas as the basis, which can be extended by fuzzy logic, temporal logic, etc. The general form of a local description unit is the following:

$$\forall x\, (\alpha(x, r) \to \langle P(x, r) \rangle\, \beta(x, r)),$$

where $x$ is a list of typed parameters, $r$ is a list of attributes, $\alpha(x, r)$ and $\beta(x, r)$ are logic formulas, and $\langle P(x, r) \rangle$ is a process, a finite behavior of an environment. Local descriptions can be considered as formulas of dynamic logic, or Hoare triples, or productions, the most popular units of procedural memory in AI. In any case they describe local dynamic properties of environment behavior: for all possible values of the parameters, if the precondition is true, then the process of the local description unit can start, and after successful termination of this process the postcondition must be true.

The states of a symbolic environment are formulas of the basic logic language of the environment. Such formulas are abstractions of classes of concrete states. Each symbolic state covers a set of concrete states, and the traces generated by local description units cover sets of concrete traces.
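The reading of a local description unit as a Hoare triple over a concrete attribute environment can be sketched in Python as follows. This is a hedged illustration, not the representation used in IMS: the class `LocalDescriptionUnit`, the function `apply_unit`, the attributes `level` and `capacity` and the unit `FILL` are all hypothetical, and the process is modeled simply as a function that transforms the valuation of attributes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

State = Dict[str, int]  # a concrete state: a valuation of the attributes r


@dataclass
class LocalDescriptionUnit:
    pre: Callable[[State, int], bool]        # alpha(x, r)
    process: Callable[[State, int], State]   # <P(x, r)>, modeled as a state transformer
    post: Callable[[State, int], bool]       # beta(x, r)


def apply_unit(unit: LocalDescriptionUnit, state: State, x: int) -> Optional[State]:
    """Apply the unit in forward mode; return None if the precondition fails."""
    if not unit.pre(state, x):
        return None
    new_state = unit.process(dict(state), x)  # work on a copy of the valuation
    if not unit.post(new_state, x):
        raise ValueError("postcondition violated after the process terminated")
    return new_state


# Hypothetical unit: add x to the attribute "level" without exceeding "capacity".
FILL = LocalDescriptionUnit(
    pre=lambda r, x: x > 0 and r["level"] + x <= r["capacity"],
    process=lambda r, x: {**r, "level": r["level"] + x},
    post=lambda r, x: x <= r["level"] <= r["capacity"],
)
```

With the state `{"level": 3, "capacity": 10}` and parameter `4` the unit is applicable and yields the level `7`; with the level `8` the precondition is false and `apply_unit` returns `None`.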
Local description units are the main units of knowledge representation in cognitive architecture. A set of local description units can be used for the definition of the transitions of an environment. In this case they can be considered as procedural knowledge units. Logic knowledge can be represented as an environment with states representing the current knowledge and local description units corresponding to the rules of inference in the corresponding calculus. Local description units can be applied in forward and backward modes. Forward mode can be used for generating new knowledge, backward mode for answering queries.

5 Insertion Machines

Other building blocks for cognitive architecture are insertion machines, intended for the implementation of insertion environments. The input of an insertion machine is the description of a multilevel environment (a model of an environment) and its initial state; its output depends on the goal set for the machine.

Multilevel environments are represented in cognitive architecture by means of environment descriptions for the different levels and a set of local description units for insertion functions. An environment description contains the signature of the environment, which includes the types of attributes, the types of inserted agents, and also the description of the sets of environment and agent actions. The local description units used for the definition of the insertion function are organized as a knowledge base with special data structures providing efficient access to the needed descriptions and the history of their use.

To implement a multilevel environment, different kinds of insertion machines are used. But all of them have the general architecture shown in Fig. 1.

Fig. 1. Architecture of Insertion Machine

The three main components of an insertion machine are the model driver (MD), the behavior unfolder (Unf), and the interactor (Intr). The model driver is a component which controls the machine traversal along the behavior tree of a model.
The state of a model is represented as a text in the input language of the insertion machine and is considered as an algebraic expression. The input language includes the recursive definitions of agent behaviors, the notation for the insertion function, and possibly some compositions for environment states. Before the insertion function is computed, the state of the system must be represented in the form $s[u_1, u_2, \ldots]$. This functionality is performed by the agent behavior unfolder. To make a move, the state of the environment must be reduced to the normal form

$$\sum_{i \in I} a_i.u_i + \varepsilon$$

where $a_i$ are actions, $u_i$ are environment states, and $\varepsilon$ is a termination constant. This functionality is performed by the environment interactor module. It computes the insertion function, recursively calling the agent behavior unfolder when necessary.

Two kinds of insertion machines are distinguished: real-time (or interactive) and analytical insertion machines. The former function in a real or virtual environment, interacting with it in real or virtual time. Analytical machines are intended for model analysis, investigation of its properties, solving problems, etc. The drivers for the two kinds of machines are correspondingly divided into interactive and analytical drivers.

After normalizing the state of the environment, an interactive driver must select exactly one alternative and perform the action specified as the prefix of this alternative. An insertion machine with an interactive driver operates as an agent inserted into an external environment whose insertion function defines the laws of functioning of this environment. The external environment, for example, can change a behavior prefix of the insertion machine according to its insertion function.
A cognitive interactive driver has criteria of successful functioning in the external environment; it accumulates information about its past in episodic memory, develops models of the external environment, and uses learning algorithms to improve the strategy of selecting actions and to increase the level of successful functioning. In addition, it should have specialized tools for exchanging signals with the external environment (for example, perception of visual or acoustical information, movement in space, etc.).

An analytical insertion machine, as opposed to an interactive one, can consider different variants of decisions about the actions to perform, returning to choice points (as in logic programming) and considering different paths in the behavior tree of a model. The model of a system can include a model of the external environment of this system, and the driver performance depends on the goals of the insertion machine. In the general case, an analytical machine solves problems by searching for states that have the corresponding properties (goal states) or states in which given safety properties are violated. The external environment of an insertion machine can be represented by a user who interacts with the insertion machine, sets problems, and controls its activity. Analytical machines enriched with logic and deductive tools are used for generating traces of symbolic models of systems. The state of a symbolic model is represented by means of properties of the values of attributes rather than their concrete values.

An insertion machine with a separated external environment interface can be implemented as a transition system with a hidden structure that separates the kernel environment state and the states of the inserted agents. Such an implementation can be more efficient and can be constructed using partial computations or other specialization and optimization programming tools.
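The difference between the two kinds of drivers can be illustrated by a small Python sketch. It is a simplification under stated assumptions and not the IMS implementation: `normal_form` is assumed to be a function that returns the alternatives $(a_i, u_i)$ of a state, and the functions `interactive_run`, `analytical_search` and the toy model `toy_normal_form` are hypothetical names introduced here. The interactive driver commits to exactly one alternative at each step, whereas the analytical driver searches the behavior tree depth-first and returns to choice points.

```python
def interactive_run(normal_form, state, choose, steps):
    """Interactive driver: at each step select exactly one alternative and perform it."""
    trace = []
    for _ in range(steps):
        alternatives = normal_form(state)
        if not alternatives:          # no alternatives: the model has terminated
            break
        action, state = choose(alternatives)
        trace.append(action)
    return trace, state


def analytical_search(normal_form, start, is_goal, max_depth):
    """Analytical driver: depth-first search for a goal state with backtracking."""
    def dfs(state, depth, on_path):
        if is_goal(state):
            return []
        if depth == 0:
            return None
        for action, nxt in normal_form(state):
            if nxt in on_path:        # do not revisit states on the current path
                continue
            rest = dfs(nxt, depth - 1, on_path | {nxt})
            if rest is not None:
                return [action] + rest
        return None                   # all alternatives failed: backtrack
    return dfs(start, max_depth, {start})


def toy_normal_form(n):
    """Hypothetical model: a counter that can be incremented or doubled below 20."""
    return [("inc", n + 1), ("dbl", n * 2)] if n < 20 else []
```

For instance, `interactive_run(toy_normal_form, 1, lambda alts: alts[0], 3)` always takes the first alternative, while `analytical_search(toy_normal_form, 1, lambda s: s == 10, 5)` explores alternatives until it finds a sequence of actions leading to the goal state 10. The same search skeleton can be used to look for states that violate a safety property by passing the negated property as `is_goal`.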
6 Cognitive Architecture

Like well-known cognitive architectures such as Soar [14], ACT-R [1] and many others from the list of the BICA Society [29], the insertion cognitive architecture ICAR is an environment for the construction of cognitive agents. The main blocks of this architecture are local description units, agents represented by their behaviors, and insertion machines. Building blocks are collected in memory units that have the structure of knowledge bases or associative memories. On the abstract level ICAR has the same architecture as the cognitive agents that can be created in it. From this point of view it can be considered as an intelligent assistant for a user who interacts with ICAR in the process of creating cognitive agents. The general architecture of a cognitive agent of ICAR is shown in Fig. 2.

Fig. 2. Insertion cognitive architecture

In general, a cognitive agent is constructed as a real-time multilevel insertion machine which realizes itself as the highest-level internal environment. As an agent, this machine is inserted into its external environment and has the means to interact with it. This external environment includes a user and the objects of the external (real or virtual) world to which the agent has access. One or several self-models can be inserted into the internal environment of a cognitive agent, to be used when interacting with the external environment or when making decisions and planning future activities.

An agent has an estimation mechanism to evaluate the success of its behavior. This mechanism is realized in the form of a special agent that can observe the interaction of the system with the external environment and make estimations according to some criteria. These criteria can be predefined initially and evolve later according to the experience obtained. The main goal of the system is to achieve maximum success repeatedly.
The self-models of a cognitive agent are created and developed together with the models of the external environment. If the external environment contains other agents, they can be modeled by the internal environment, which creates the corresponding machines and interprets those machines using the corresponding drivers, comparing the behaviors of the models and of the external agents. All these models evolve and develop in the process of accumulating experience in interaction with the external world.

Some mechanisms that model emotional or psychological features (humor and concentration, pleasure and anger, etc.) can be implemented at the higher levels of the cognitive structure. The mechanisms of decision making, planning and executing plans are also at the higher levels.

The main part of the cognitive structure is the base of models describing the history of the functioning of the cognitive agent at different levels of abstraction. The interface with the external world provides language (symbolic) communication and image processing. All interaction histories are processed in the working memory of the self-level insertion machines and then transferred to the appropriate levels of the model base.

The model base is always active. The analytical insertion machines which control and manage the structure of the model base are always busy searching for solutions of problems, performing tasks that have unsatisfactory answers, or creating new models. All this activity models the subconscious levels of cognition and from time to time interacts with the higher levels of the cognitive structure. Independent levels of the cognitive structure work in parallel.

The hierarchy of environments of a cognitive agent is in some sense similar to the six layers of the neocortex. Moving from the low levels to the higher ones, the level of abstraction increases and more and more abstract symbolic models are used. How to create such models is a big challenge, and we are working on it now.
Cognitive analytical insertion machines of ICAR are used by cognitive agents to learn their models and their interaction with the external environment in order to solve problems better, to accept the user's help as a teacher, and to teach the user how to interact better with ICAR. General learning mechanisms are parts of the model drivers of different types.

7 Conclusions

The description of cognitive architecture in the last section is a very tentative reflection of our long-term goals. The nearer goals include the further development of our system for proving program correctness [22], communication in natural language, and living in virtual reality. The insertion modeling system [17] is used as a zero approximation of ICAR.

References

1. Anderson, J.R., Lebiere, C.: The Atomic Components of Thought. Mahwah: Lawrence Erlbaum Associates (1998)
2. Baranov, S., Jervis, C., Kotlyarov, V., Letichevsky, A. and Weigert, T.: Leveraging UML to Deliver Correct Telecom Applications. In: Lavagno, L., Martin, G. and Selic, B. (eds.) UML for Real: Design of Embedded Real-Time Systems, pp. 323–342, Kluwer Academic Publishers (2003)
3. Bergstra, J. A. and Klop, J. W.: Process Algebra for Synchronous Communications. Information and Control, 60 (1/3), 109–137 (1984)
4. Bergstra, J. A., Ponse, A. and Smolka, S. A. (eds.): Handbook of Process Algebra. North-Holland (2001)
5. Cardelli, L. and Gordon, A. D.: Mobile Ambients. In: Foundations of Software Science and Computation Structures: First International Conference, FOSSACS '98, Springer-Verlag (1998)
6. Gilbert, D. R. and Letichevsky, A. A.: A Universal Interpreter for Nondeterministic Concurrent Programming Languages. In: Gabbrielli, M. (Ed.) Fifth Compulog Network Area Meeting on Language Design and Semantic Analysis Methods, September (1996)
7. Glushkov, V.M.: On an Algorithm of Abstract Automata Synthesis. Ukrainian Mathematical Journal, 12(2), 147–156 (1960)
8. Glushkov, V.M.: Automata Theory and Questions of Design Structure of Digital Machines. Cybernetics 1, 3–11 (1965)
9. Glushkov, V.M. and Letichevsky, A. A.: Theory of Algorithms and Discrete Processors. In: Tou, J. T. (Ed.) Advances in Information Systems Science, vol. 1, Plenum Press, 1–58 (1969)
10. Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall (1985)
11. Kapitonova, J. and Letichevsky, A.: Mathematical Theory of Computational Systems Design. Moscow, Science (1988) (in Russian)
12. Kapitonova, J., Letichevsky, A., Volkov, V. and Weigert, T.: Validation of Embedded Systems. In: R. Zurawski (Ed.) The Embedded Systems Handbook, CRC Press, Miami (2005)
13. Kleene, S. C.: Representation of Events in Nerve Nets and Finite Automata. In: Shannon, C. E., McCarthy, J. (eds.) Automata Studies, Princeton University Press, pp. 3–42 (1956)
14. Laird, J. E., Newell, A., Rosenbloom, P. S.: SOAR: an Architecture for General Intelligence. Artificial Intelligence, 33, 1–64 (1987)
15. Letichevsky, A. and Gilbert, D.: A General Theory of Action Languages. Cybernetics and System Analyses, 1 (1998)
16. Letichevsky, A. and Gilbert, D.: A Model for Interaction of Agents and Environments. In: Bert, D., Choppy, C. and Moses, P. (eds.) Recent Trends in Algebraic Development Techniques. LNCS, vol 1827, Springer Verlag (1999)
17. Letichevsky, A., Letychevskyi, O. and Peschanenko, V.: Insertion Modeling System. In: Proc. PSI 2011, LNCS, vol 7162, pp. 262–274, Springer Verlag (2011)
18. Letichevsky, A.: Algebra of Behavior Transformations and its Applications. In: Kudryavtsev, V. B. and Rosenberg, I.G. (eds.) Structural Theory of Automata, Semigroups, and Universal Algebra. NATO Science Series II. Mathematics, Physics and Chemistry, vol 207, pp. 241–272, Springer Verlag (2005)
19. Letichevsky, A., Kapitonova, J., Letichevsky, A. Jr., Volkov, V., Baranov, S., Kotlyarov, V. and Weigert, T.: Basic Protocols, Message Sequence Charts, and the Verification of Requirements Specifications. ISSRE 2004, WITUL (Workshop on Integrated reliability with Telecommunications and UML Languages), Rennes, 4 November (2005)
20. Letichevsky, A., Kapitonova, J., Letichevsky, A. Jr., Volkov, V., Baranov, S., Kotlyarov, V. and Weigert, T.: Basic Protocols, Message Sequence Charts, and the Verification of Requirements Specifications. Computer Networks, 47, 662–675 (2005)
21. Letichevsky, A., Kapitonova, J., Letichevsky, A. Jr., Volkov, V., Baranov, S., Kotlyarov, V. and Weigert, T.: System Specification with Basic Protocols. Cybernetics and System Analyses, 4 (2005)
22. Letichevsky, A., Letichevsky, O., Morokhovets, M. and Peschanenko, V.: System of Programs Proving. In: Velichko, V., Volosin, A. and Markov, K. (eds.) Problems of Computer Intellectualization, Kyiv, V. M. Glushkov Institute of Cybernetics, pp. 133–140 (2012)
23. McCulloch, W.S. and Pitts, W.: A Logical Calculus of the Ideas Immanent in Nervous Activity. Bull. of Math Biophy., 5, 115–133 (1943)
24. Milner, R.: A Calculus of Communicating Systems. LNCS, vol 92, Springer Verlag (1980)
25. Milner, R.: Communication and Concurrency. Prentice Hall (1989)
26. Milner, R.: The Polyadic π-calculus: a Tutorial. Tech. Rep. ECS–LFCS–91–180, Laboratory for Foundations of Computer Science, Department of Computer Science, University of Edinburgh, UK (1991)
27. Minsky, M.: The Society of Mind. Touchstone Book (1988)
28. Park, D.: Concurrency and Automata on Infinite Sequences. LNCS, vol 104, Springer-Verlag (1981)
29. Samsonovich, A. V.: Toward a Unified Catalog of Implemented Cognitive Architectures (Review). In: Samsonovich, A.V., Johansdottir, K.R., Chella, A. and Goertzel, B. (eds.) Biologically Inspired Cognitive Architectures 2010: Proc. 1st Annual Meeting of BICA Society, Frontiers in Artificial Intelligence and Applications, vol 221, pp. 195–244 (2010)

Part 1.
Main ICTERI Conference

1.1 ICT Infrastructures, Integration and Interoperability

Modeling, Algorithms and Implementation of the Microcontroller Control System for the Ion Beam Forming Process for Nanostructures Etching

Aleksandr Ralo, Andrii Derevianko, Aleksandr Kropotov, Sergiy Styervoyedov and Oleksiy Vozniy

V.N. Karazin Kharkiv National University, Svobody sq. 4, 61022, Kharkiv, Ukraine

ralo-a-n@mail.ru

Abstract. In this work, a three-level control system architecture for a vacuum-plasma system with an ion source based on high-frequency induction discharge is described. The system is built on smart sensors and, at the middle hierarchical level, on programmable logic controllers (PLCs) specially designed for operation under high EMI. The structure of the PLC and the algorithms of the system are presented. Results of a comparative simulation of a classical control system and the created one are reported, as well as the results of applying the system to obtain beams of positive and negative ions and beams of neutralized particles.

Keywords. Programmable logic controller (PLC), control system, plasma technology, empirical model

Key terms. InformationTechnology, CommunicationTechnology, SoftwareSystem, Integration, Process

1 Introduction

Ion beams are an effective means of surface treatment. They are applied in the integrated circuit manufacturing industry, as well as for research purposes as a powerful tool for the synthesis of micro- and nanoscale structures. However, during the treatment of materials by ion beams, the repulsion of like charges and the accumulation of charge near the processed surface cause defects in the form of islands of unetched film and, conversely, side etches. These are unwanted and unpredictable changes in the structure of the processed sample that can irreversibly change the characteristics of the product [1]. There are several methods of excluding or compensating the charge accumulation.
One of the most promising techniques that can increase the yield of usable products in plasma-beam processing is alternating etching by pulsed beams of ions of different signs and etching by high-energy beams of neutral atoms, where the charge of the ion beam particles is neutralized away from the treated sample [2], [3], [4].

The short times and complex algorithms that must be taken into account during these experiments do not allow the operator to run the process without modern computer automation systems that work using a process model. Therefore, the aim of this work was to create an intelligent management and control system that meets the requirements of the operating conditions of a pulsed plasma-process plant for etching and for the formation of micro- and nanostructures. An urgent task today is to build a system capable of independently analysing historical data and using them to provide process control. Given the specifics of the pulsed plasma-ion process, the system must ensure the fastest possible response to interference, which is impossible with information-analytical systems (IAS) that are based on the classical scheme and used for slower processes.

2 The Use of Programmable Logic Controllers in Control Systems

Programmable logic controllers (PLCs) are widely used in control systems such as industrial automation and the automation of scientific research. They provide significantly higher reliability than personal computers and the same flexibility of operation. Difficulties with changing their operation make the microcontroller systems popular today an insufficient automation tool.

In order to improve the efficiency of the system, it was proposed to build the hardware using a three-tier architecture, as shown in Fig. 1. The lower level consists of intelligent sensors and control elements, which work under the control of the PLCs.
PLCs are much faster and are responsible for acquiring data from the sensors, filtering it, forming data packets and communicating with the control centre, which is based on a personal computer. The algorithm of such a system can be represented as follows:

1. Information about physical parameters obtained by the sensors is converted into digital codes and enters the programmable logic controller through the corresponding interface.
2. The PLC forms an information packet and transmits the received information to the managing node (in our case, the system arbiter).
3. The central node requests the data it needs from the historical data storage, analyzes them and, if necessary, generates a control command. Information received from the PLC is also stored in the repository for further analysis.
4. This control command is received by the PLC, which transmits it to the appropriate control element.

This design outperforms the classical one because, first, damaged or incorrect data can be filtered out by the PLC and, second, the PLC data is transferred in an optimal packet format.

A modern PLC consists of three main parts:

• PLC processor module
• I/O modules
• Programming mechanism

Typically, these parts are combined using crates on the physical level and a number of buses on the logical one.

Despite the structural similarity of the PLC to the PC, the PLC must have some specific characteristics. For use in experiments and in building automation systems, PLCs should be able to work under varied, sometimes very harsh conditions while ensuring high availability.

[Figure: block labels: Experimental plant for obtaining ions and neutral particles; Sensors; Control elements; PLC; System arbiter; Storage system]

Fig. 1.
The structure of node communication using PLC

When developing the PLC, the system bus was selected from a list of the most popular, fully standardized and well-studied buses. The choice was made in favour of the VMEbus [5]. Using this bus, it is possible to build scalable systems in which both the number of input/output modules and the number of processor modules can be increased, which makes it possible to distribute the control program if it is complex.

The block diagram of the created PLC is shown in Fig. 2. The 32-bit microcontroller AT91SAM7S from Atmel was selected as the PLC's control processor.

[Figure: block labels: Processor part; PLC interface; Memory; Application memory; AT91SAM7S128; Front panel; Power source; Timers; Logic; Indication and control; USB 2.0; System memory; User memory; Data memory; VME bus controller; VME bus; Join panel; Input/output modules; Outer bus RS-485]

Fig. 2. PLC Block diagram

3 The Model Used

To control complex technological processes, including the management of the induction plasma source, according to a given algorithm, it is necessary to develop a model of the process. The best way to obtain it is to process the results of a passive experiment, which are stored in a specially designed storage that provides quick access to historical data. After the necessary amount of experimental data has been received and the model created, the control system can proceed to the active phase, i.e. manage the process.

To construct the control system, it was decided to use the model of the 5th class by Shantikumar [6]. In the proposed model, various simulation results for different initial conditions are used. In addition, an analytical model is generated on the basis of these results.
The model has the following properties:

- The simulation model is used to determine the relationship between the variable parameters and the target factor.
- The analytical model is a generalization of the results of various simulations; such models are more accurate than solutions obtained by theoretical calculations alone, since they are based on multiple real data.
- Once an analytical solution is obtained that no longer changes under new simulations, the model can be used in the process control system to predict the results for given parameters; hence, it provides accurate information for quick decision-making [7].

The proposed class of simulation models plays an important role in investigating system behaviour. The results are used to derive analytical models that predict system performance. There may be situations in which an analytical model cannot be created without simplifications, while an approximate model is not sufficiently accurate. Thus, models of this class are applicable in the following cases:

- The correlation between the target factor and the system parameters is not known, which makes an analytical model very difficult to develop. When analytical models are very expensive, unreliable or impractical, simulation can help in understanding the relationships between all factors, which makes it possible to develop an analytical model.
- In many practical problems the useful signal is superimposed with a lot of noise that is almost impossible to take into account. Simulation makes it possible to investigate the behaviour of dynamic systems and to identify the key parameters for evaluation, so that these parameters are then included in the analytical model.

The process of constructing a model of this class was described in [8]. As a result of processing the simulation data, the system obtains functional relationships between the experiment factors, together with the coefficients of these relationships.
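The step of deriving a functional relationship and its coefficients from simulation data can be illustrated with a minimal sketch. It is not the procedure of [8]: the five candidate functions, the model form y = a + b·f(x), the least-squares criterion and all names (`CANDIDATES`, `fit_linear`, `best_dependency`) are assumptions made only for illustration, and the data are synthetic.

```python
import math

# Hypothetical candidate dependencies f(x); the paper's own list is not reproduced here.
CANDIDATES = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sqrt(x)": math.sqrt,
    "ln(x)": math.log,
    "1/x": lambda x: 1.0 / x,
}

def fit_linear(us, ys):
    """Ordinary least squares for y = a + b*u; returns (a, b, sum of squared errors)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    sxx = sum((u - mean_u) ** 2 for u in us)
    sxy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    b = sxy / sxx
    a = mean_y - b * mean_u
    sse = sum((y - (a + b * u)) ** 2 for u, y in zip(us, ys))
    return a, b, sse

def best_dependency(xs, ys):
    """Fit y = a + b*f(x) for every candidate f and keep the one with the smallest error."""
    fits = {name: fit_linear([f(x) for x in xs], ys) for name, f in CANDIDATES.items()}
    best = min(fits, key=lambda name: fits[name][2])
    return best, fits[best]

# Synthetic data generated from y = 2 + 3*x^2 (not measurements from the paper).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 3.0 * x * x for x in xs]
name, (a, b, sse) = best_dependency(xs, ys)
print(name, round(a, 6), round(b, 6))
```

In a real setting the error criterion would also have to account for noise and for the number of coefficients, which this sketch ignores.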
To find the dependencies, a list of 34 functions [9] was used, which is sufficient to describe most physical processes.

4 Simulation of the Control System

During the system development phase, simulation was carried out to select the most efficient architecture. A number of software products for simulating data networks were considered:

- OPNET and NetSim++, graphical network modelling environments;
- SMURPH (University of Alberta) and Ptolemy (Berkeley), which use an eclectic language for describing data lines;
- OMNet++, which uses its own language to describe the architecture, which is then translated by a pre-processor to standard C++.

The choice was made in favour of the last tool, as it is open, easily embeddable in third-party software products and contains a large amount of trusted code.

OMNet++ is not a programming language; it is a class library for simulation. Its classes include modules, gateways, connections, settings, reports, histograms, assemblers, precision registers, etc.

First, a model of a system built on a "common bus" architecture with central arbitration was composed. The algorithm works as follows: the arbiter sends a request to the sensors. The intensity of the queries depends on the particular sensor. The request is broadcast, so it contains the sensor's address. The sensor, recognizing its address, sends the data back. The size of the datagram also depends on the sensor's type. The simulated system as it appears in the simulation software is shown in Figure 3 (a).

Fig. 3. OMNet++ work for simplified systems, panels (a) and (b)

To simulate operation with the PLC, part of the terminal devices was connected to an intermediate device that operates as a PLC.
The resulting system as it appears in OMNet++ is presented in Figure 3 (b).

The time and composition of each message sent and received are saved in a special vector file, which is then processed to obtain statistical information. As a result of multiple system runs with varying parameters, such as the number of devices, data rate, processing time, etc., it was found that the use of the PLC increases productivity by a factor of 1.5 on average. Figure 4 shows the number of messages at different loads on the medium for the classical architecture and for the architecture with the PLC. Because OMNet++ allows programming in C++, it was possible to sweep the parameters automatically.

Fig. 4. Dependence of the number of sent messages on the bus bandwidth

5 The Results of the Developed System Application

A concrete implementation of the information management system was developed for the experimental setup for vacuum-plasma etching of nanostructures, schematically depicted in Figure 5 [10]. The vacuum chamber is a cylindrical volume 600 mm in diameter and 800 mm long. An RF ion source based on an inductive discharge is mounted on the side flange of the chamber. As the extracted ions pass through the electrode system, a beam of particles of the desired charge or of neutral particles is formed. A quadrupole mass spectrometer with an integrated energy analyzer with a resolution of 0.1 eV is mounted in front of the source. The plasma is excited by an inductor connected to an RF generator operating at a frequency of 13.56 MHz. The task of the management system was to search for optimal etching parameters and to stabilize these parameters to ensure the repeatability of the experiment, as well as to change the programmable modes of the control elements.
The modes of the RF discharge source and of the power blocks that form the voltages on the beam extraction system are controlled by the PLC, which uses the internal bus to send commands to the control elements responsible for setting the pulse amplitude, polarity, duration and duty cycle. The PLC was connected via USB 2.0 to the computer on which the experimental data were accumulated and the functional relationships between them were sought. The system supports an operator control mode as well as an independent search for the optimal parameters for a given type of beam, and keeps these parameters constant. The user interface and the corresponding signals taken from the outputs of the power blocks are shown in Figure 6.

Fig. 5. Structural diagram of the setup for nanostructure etching. VC – vacuum chamber for ion-plasma treatment, QMS – quadrupole mass spectrometer with energy analyzer, RFS – RF source, AD – matching device, PB – power blocks, IND – inductor, G1-3 – three-grid electrode system, GIS – gas inlet system, MMS – microcontroller management system.

Fig. 6. The user interface and the corresponding control signals

6 Conclusion

A three-level information management system has been designed for a vacuum-plasma setup whose ion source, based on a high-frequency induction discharge, produces beams of positive and negative ions and a neutralized particle beam used in nanostructure etching. At the middle hierarchical level of the system are specially designed programmable logic controllers based on the 32-bit AT91SAM7S microcontroller from Atmel, which are characterized by increased noise immunity, advanced computing power and optimal capacity for the given task. The PLCs perform the primary information processing and greatly accelerate information exchange, particularly at high pulse loads.
Despite the overall effectiveness of constructing process models from accumulated statistical data in order to manage the processes, in practice the large amounts of data to be processed and the requirements for high speed make this method not always usable. Therefore, the task of allocating modelling and process control between the main control mechanism and the PLC is important.

Testing of the system on the management of a real process has shown its efficiency and made it possible to obtain an adequate empirical model of the controlled ion source based on a high-frequency discharge with three-grid control.

References

1. Kinoshita, T., Hane, M., McVittee, J. P.: Journal of Vacuum Science and Technology B 14, 560 (1996)
2. Yunogami, T., Yokogawa, K., Mizutani, T.: Development of neutral-beam-assisted etcher. Journal of Vacuum Science and Technology A 13(3), 952 (1995)
3. Yokogawa, K., Yunogami, T., Mizutani, T.: Neutral-Beam-Assisted Etching System for Low-Damage SiO2 Etching of 8-Inch Wafers. Japanese Journal of Applied Physics 35, 1901 (1996)
4. Vozniy, O. V., Yeom, G. Y.: High-energy negative ion beam obtained from pulsed inductively coupled plasma for charge-free etching process. Appl. Phys. Lett. 94, 231502 (2009)
5. VITA. Open standards, open markets, http://www.vita.com (2009)
6. Shanthikumar, J. G., Sargent, R. G.: Unifying view of hybrid simulation/analytic models and modeling. Operations Research 31(6), 1030-1052 (1983)
7. Hsieh, T.: Hybrid analytic and simulation models for assembly line design and production planning. Simulation Modeling Practice and Theory 10, pp. 87-108 (2002)
8. Derevianko, A. V.: Constructing empirical models for the management of complex technological processes. Bulletin of V.N. Karazin Kharkiv National University. No. 12: Mathematical modeling. Information technology. Automated control systems. № 863 - Kharkov: Publishing House of the KNU, P.101-110 (2009) (In Russian)
9. Kuri-Morales, A., Rodriguez-Erazo, F.: A search space reduction methodology for data mining in large databases. Engineering Applications of Artificial Intelligence 22, pp. 57-65 (2009)
10. Vozniy, O., Polozhiy, K., Yeom, G. Y.: Journal of Applied Physics 102, 083306 (2007)

Using Algebra-Algorithmic and Term Rewriting Tools for Developing Efficient Parallel Programs

Anatoliy Doroshenko¹, Kostiantyn Zhereb¹ and Olena Yatsenko¹

¹ Institute of Software Systems of National Academy of Sciences of Ukraine, Glushkov prosp. 40, 03187 Kyiv, Ukraine
doroshenkoanatoliy2@gmail.com, zhereb@gmail.com, oayat@ukr.net

Abstract. An approach to program design and synthesis using algebra-algorithmic specifications and rewriting rules techniques is proposed. An algebra-algorithmic toolkit based on the approach allows building syntactically correct and easy-to-understand algorithm specifications. The term rewriting system supplements the algebra-algorithmic toolkit with facilities for transformation of sequential and parallel algorithms, enabling their improvement.

Keywords. Algebra of algorithms, code generation, formalized design of programs, parallel computation, term rewriting

Key terms. FormalMethod, HighPerformanceComputing, ConcurrentComputation, Integration

1 Introduction

Nowadays uniprocessor systems have been almost completely displaced by multiprocessor ones, as the latter allow a considerable increase in program performance. Thus, the need for program parallelization arises [10]. There are libraries, such as pthreads, OpenMP, TBB and others [1], that allow developers to write parallel programs. Using these libraries, a programmer manually divides the code into independent sections and describes data exchange and synchronization between them. However, this method has substantial drawbacks, in particular the risk of introducing errors into the program code and the time required for parallelization and debugging.
Therefore, the parallelization process has to be automated as much as possible and, ideally, should be carried out fully automatically, without the participation of a programmer.

This paper continues our research on automating the design and development of efficient parallel programs, started in [2], [9], [10], [11]. Our approach is based on the Integrated toolkit for Designing and Synthesis of programs (IDS) [2], [19]. Algorithm design in IDS consists in composing reusable algorithmic components (language operations, basic operators and predicates) represented in Systems of Algorithmic Algebras (SAA) [2], [9], [19]. We used IDS to generate sequential and parallel programs in Java and C++ on the basis of high-level algorithm specifications (schemes). To automate the transformations of algorithms and programs we use the term rewriting system Termware [8], [11]. The novelty of this paper is 1) adjusting IDS to generate parallel code in the Cilk++ language, an extension of the C and C++ programming languages designed for multithreaded parallel computing [7], and 2) closer integration between the IDS and Termware systems. The approach is illustrated on a recursive sorting algorithm (quicksort).

The problem of automated synthesis of program code from specifications has been studied extensively and many approaches have been proposed [13], [14]. Important aspects of program synthesis include 1) the format of inputs (specifications), 2) methods for supporting concrete subject domains and 3) techniques for implementing the transformation from specifications to output program code (these aspects roughly correspond to the three dimensions of program synthesis discussed in [14]). For input specification, a popular option is to use domain-specific languages (DSLs) [4], [17], which allow capturing the requirements of the subject domain.
Other options include graphical modeling languages [5], [17], formal specification languages [16], ontologies [6] and algebraic specifications [3]. Using such formalisms enables analysis and verification of specifications and generated code. There are also approaches in which the specification describes not a program or algorithm but the problem to be solved, in the form of functional and non-functional constraints [18], examples of input/output pairs [15], or natural language descriptions [14].

Another crucial aspect of program synthesis is specialization for a subject domain. Some approaches are restricted to a single domain, such as statistical data analysis [12] or mobile application development [17]; others provide facilities for changing domain-specific parts by using ontological descriptions [6] or grammars [16], or by providing a generic framework that is complemented by domain-specific tools [18].

Finally, an important aspect is the transformation from the input specification into source code in a target language. A transformation algorithm can be hand-coded [12], but this reduces the flexibility of the system. Therefore, the transformation is often described in a declarative form, such as rewriting rules [16], visualized graph transformations [17] or code templates [6]. More complex approaches require searching the space of possible programs [18], possibly using genetic programming or machine learning approaches [14]. In [4], partial synthesis is proposed: generic parts of an application are generated and then completed manually with specific details.

In comparison, our approach uses algebraic specifications based on the Glushkov algebra of algorithms [2], which can be represented in three equivalent forms: algebraic (formal language), natural-linguistic and graphical. This simplifies the understanding of specifications and facilitates achieving the demanded program quality.
Another advantage of IDS is its method of interactive design of syntactically correct algorithm specifications [2], [19], which eliminates syntax errors during the construction of algorithm schemes. Specialization for a subject domain is done by describing basic operators and predicates from this domain. Our approach uses code templates to specify implementations of operators and predicates; program transformations, such as from a sequential to a parallel algorithm, are implemented as rewriting rules. Such separation simplifies changing the subject domain or the transformations.

2 Formalized Design of Programs in IDS and Termware

The developed IDS toolkit is based on Systems of Algorithmic Algebras (SAA), which are used for formalized representation of algorithmic knowledge in a selected subject domain [2], [9], [19]. SAA is the two-based algebra SAA = <{U, B}; Ω>, where U is a set of logical conditions (predicates) and B is a set of operators, defined on an informational set; Ω = Ω1 ∪ Ω2 is the signature of operations, consisting of the systems Ω1 and Ω2 of logical operations and operators respectively (these will be considered below). Operator representations of algorithms in SAA are called regular schemes. The algorithmic language SAA/1 [2] is based on the mentioned algebra and is used to describe algorithms in a natural language form. Algorithms represented in SAA/1 are called SAA schemes.

Operators and predicates can be basic or compound. A basic operator (predicate) is an operator (predicate) that is considered in SAA schemes as a primary atomic abstraction. Compound operators are built from elementary ones by means of the operations of sequential and parallel execution of operators, branching and loops, and the synchronizer WAIT 'condition', which delays the computation until the value of the condition is true (see also Table 1 in the next section).
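The way compound operators are composed from basic ones can be sketched as a small tree of operations that is rendered to SAA-style text. This is only an illustration of the idea, not the IDS data model: the class names (`Basic`, `Seq`, `Branch`, `Par`, `Wait`), the function `to_text` and the operator texts are hypothetical; the rendered notation follows the textual SAA forms used in this paper, with operators in double quotes and predicates in single quotes.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Basic:            # basic operator, an atomic abstraction
    name: str

@dataclass
class Seq:              # sequential execution: first; second
    first: "Node"
    second: "Node"

@dataclass
class Branch:           # branching on a basic predicate
    condition: str
    then_part: "Node"
    else_part: "Node"

@dataclass
class Par:              # parallel execution of two operators
    left: "Node"
    right: "Node"

@dataclass
class Wait:             # synchronizer: delay until the condition is true
    condition: str

Node = Union[Basic, Seq, Branch, Par, Wait]

def to_text(node: Node) -> str:
    """Render an operator tree as SAA-style text."""
    if isinstance(node, Basic):
        return f'"{node.name}"'
    if isinstance(node, Seq):
        return f"{to_text(node.first)}; {to_text(node.second)}"
    if isinstance(node, Branch):
        return (f"IF '{node.condition}' THEN {to_text(node.then_part)} "
                f"ELSE {to_text(node.else_part)} END IF")
    if isinstance(node, Par):
        return f"({to_text(node.left)} PARALLEL {to_text(node.right)})"
    if isinstance(node, Wait):
        return f"WAIT '{node.condition}'"
    raise TypeError(f"unknown node type: {type(node).__name__}")

# A compound operator: two parts run in parallel, followed by a synchronizer.
scheme = Seq(Par(Basic("sort left part"), Basic("sort right part")),
             Wait("all threads completed"))
text = to_text(scheme)
print(text)
```

Because every node is built through a constructor rather than typed as free text, a tree of this kind cannot contain an unbalanced or misspelled construct, which is the same property the interactive design in IDS is meant to guarantee.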
The advantage of using SAA schemes is the ability to describe algorithms in an easy-to-understand form, which facilitates achieving the demanded quality of programs.

IDS is intended for the interactive design of algorithm schemes in SAA and the generation of programs in target programming languages (Java, C++, Cilk++). In IDS, algorithms are designed as syntactically correct programs, which ensures the syntactical regularity of schemes. IDS integrates three forms of design-time representation of algorithms: regular schemes, SAA schemes (textual representation of SAA formulae) and flow graphs. For integration with Termware, in this paper IDS was also adjusted to generate programs in the Termware language.

The IDS toolkit consists of the following components (Fig. 1): the constructor, intended for dialogue-based design of syntactically correct sequential and concurrent algorithm schemes and for generation of programs; the flow graph editor; the generator of SAA schemes on the basis of higher-level schemes, called hyper-schemes [19]; and the database, containing the descriptions of SAA operations, basic operators and predicates in the three mentioned forms, as well as their program implementations.

Fig. 1. Architecture of the IDS toolkit

The constructor is intended for the stepwise (unfolding) design of algorithm schemes by superposition of SAA language constructs, which the user chooses from a list of reusable components for constructing algorithms. The design process is represented by a tree of an algorithm [2], [19]. At each step of the design process the constructor allows the user to select only those operations whose insertion into the algorithm tree does not break the syntactical correctness of the scheme. The constructed algorithm tree is then used for automatic generation of the text of the SAA scheme, the flow graph and the program code in a target programming language.

Example 1.
We illustrate the use of SAA on the Quicksort algorithm, which is given below in the form of an SAA scheme. The identifiers of basic operators in the SAA scheme are written in double quotes and those of basic predicates in single quotes. Notice that identifiers can contain any text explaining the meaning of the operator or predicate. This text is not interpreted: it has to match exactly the specification in the database (however, since constructs are not entered manually but selected from a list, misspellings are prevented). The comments and the implementations of compound operators and predicates in SAA schemes begin with a string of "=" characters.

    SCHEME QUICKSORT_SEQUENTIAL ====
    "main(n)"
    ==== Locals (
            "Declare an array (a) of type (int) and size (n)";
            "Declare a variable (i) of type (int)";
            "Declare a variable (end) of type (int)");
         "Fill the array (a) of size (n) with random values";
         "end := a + n";
         "qsort(a, end)";

    "qsort(begin, end)"
    ==== IF NOT('begin = end')
            "Reduce (end) by (1)";
            "Reorder array (a) with range (begin) and (end) so that elements less than pivot (end) come before it and greater ones come after it; save pivot position to variable (middle)";
            "qsort(begin, middle)";
            "Increase (middle) by (1)";
            "Increase (end) by (1)";
            "qsort(middle, end)"
         END IF
    END OF SCHEME QUICKSORT_SEQUENTIAL

To automate the transformation (e.g. parallelization) of programs we augment the capabilities of IDS with the rewriting rules technique [8], [11]. At the first step we construct high-level algebraic models of algorithms based on SAA in IDS (see also [2], [9], [19]). After the high-level program model has been created, we use parallelizing transformations to implement a parallel version of the program on a given platform (multicore in this paper). Transformations are represented as rewriting rules and can therefore be applied in an automated manner. The declarative nature of the rewriting technique simplifies adding new transformations.
Also, transformations are separated from language definitions (unlike the approach used in [16]), which simplifies the addition of new transformations or new languages.

We use the rewriting rules system Termware [8], [11]. Termware is used to describe transformations of terms, i.e. expressions of the form f(t1, …, tn). Transformations are described as Termware rules, i.e. expressions of the form source [condition] -> destination [action]. Here source is the source term (a pattern to match), condition is the condition of rule application, destination is the transformed term, and action is an additional action that is performed when the rule fires. Each of the four components can contain variables (denoted as $var), so that rules are more generally applicable. The components condition and action are optional. They can execute any procedural code, in particular use additional data about the program.

3 Generation of Terms and Programs and Experimental Results

The IDS system generates program code on the basis of an algorithm tree, obtained as a result of designing an algorithm in the IDS constructor (see Section 2), and of code templates, i.e. implementations of basic operators and predicates in a target language (Java, C++, Cilk++) that are stored in the IDS database. During generation, IDS translates SAA operations into the corresponding operators of the programming language. Compound operators can be represented as subroutines (methods). The IDS database contains various code patterns for generating parallel programs, namely those using WinAPI threads, the Message Passing Interface (MPI), and Cilk++ operations [7]. For the parallel version of our illustrative example (the Quicksort algorithm), we used Cilk++, as it facilitates programming of recursive parallel programs [7]. Cilk++ is a general-purpose programming language based on C/C++ and designed for multithreaded parallel computing.
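The rule form source [condition] -> destination can be made concrete with a minimal term rewriter. This is a sketch under stated assumptions, not Termware's implementation or API: terms are modelled as nested Python tuples, the functions `match`, `substitute` and `rewrite_once` and the outermost-first search order are choices made for this illustration, and the optional action component is omitted. The example rule has the same shape as the parallelizing rule used in Example 2.

```python
# Terms are nested tuples (functor, arg1, ..., argN); atoms are plain strings.
# Strings starting with "$" are pattern variables, as in the $var notation.

def match(pattern, term, env):
    """Match term against pattern; return the extended bindings or None."""
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term) or pattern[0] != term[0]:
            return None
        for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
            env = match(sub_pattern, sub_term, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None

def substitute(template, env):
    """Replace the variables of template by their bound terms."""
    if isinstance(template, str):
        return env.get(template, template)
    return (template[0],) + tuple(substitute(arg, env) for arg in template[1:])

def rewrite_once(term, rules):
    """Apply the first applicable rule, trying the whole term before its arguments."""
    for source, condition, destination in rules:
        env = match(source, term, {})
        if env is not None and (condition is None or condition(env)):
            return substitute(destination, env), True
    if isinstance(term, tuple):
        for i in range(1, len(term)):
            new_arg, changed = rewrite_once(term[i], rules)
            if changed:
                return term[:i] + (new_arg,) + term[i + 1:], True
    return term, False

# then(CALL($x), then($y, $z)) [$x is a qsort call] -> Parallel(CALL($x), then($y, $z))
rule = (
    ("then", ("CALL", "$x"), ("then", "$y", "$z")),
    lambda env: env["$x"][0] == "qsort",        # hypothetical condition
    ("Parallel", ("CALL", "$x"), ("then", "$y", "$z")),
)

term = ("then", ("Partition", "a", "begin", "end", "end"),
        ("then", ("CALL", ("qsort", "begin", "middle")),
         ("then", ("Inc", "middle", "1"),
          ("CALL", ("qsort", "middle", "end")))))

result, changed = rewrite_once(term, [rule])
print(result)
```

The rule and the term are plain data, so adding a transformation means adding another tuple to the rule list; the matching and substitution code stays unchanged.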
Table 1 gives a list of the main SAA operations and the templates of their implementation in Termware and Cilk++, which are stored in the IDS database. The implementations contain placeholders like ^condition1^, ^operator1^ etc., which are replaced with program code during program generation.

To transform an algorithm, IDS generates the corresponding term and the developer specifies a set of transformation rules. Then Termware carries out the actual transformation, the result of which can further be used for code generation in a programming language.

Table 1. The main SAA operations and templates of their implementation in the Termware and Cilk++ languages

    Text of SAA operation                                              | Termware implementation                           | Cilk++ implementation
    -------------------------------------------------------------------+---------------------------------------------------+-------------------------------------------------------
    "operator1"; "operator2"                                           | then(^operator1^, ^operator2^)                    | ^operator1^; ^operator2^
    IF 'condition' THEN "operator1" ELSE "operator2" END IF            | IF(^condition1^, ^operator1^, ELSE(^operator2^))  | if (^condition1^) { ^operator1^ } else { ^operator2^ }
    FOR '(var) from (begin) to (end)' LOOP "operator1" END OF LOOP     | FOR(%1, %2, %3, ^operator1^)                      | for (%1, %2, %3) { ^operator1^ }
    ("operator1" PARALLEL "operator2")                                 | Parallel(^operator1^, ^operator2^)                | cilk_spawn ^operator1^; ^operator2^
    WAIT 'condition'                                                   | WAIT(^condition1^)                                | cilk_sync;

Example 2. We will parallelize the sequential Quicksort algorithm (see Example 1) using IDS and Termware. For the parallelization, the function qsort has to be transformed, so we generated the term for this function:

    qsort(Params(begin, end),
      IF(NOT(Equal(begin, end)),
        then(Dec(end, 1),
        then(Partition(a, begin, end, end),
        then(CALL(qsort(begin, middle)),
        then(Inc(middle, 1),
        then(Inc(end, 1),
             CALL(qsort(middle, end)))))))))

Then the operation of parallel execution has to be added to this term. This is done by applying the following two Termware rules:

    1. then(CALL($x), then($y, $z)) -> Parallel(CALL($x), then($y, $z))
    2. then($x1, Parallel($x2, $x3)) -> then($x1, then(Parallel($x2, $x3), WAIT(AllThreadsCompleted(n))))

The first rule replaces the operation of sequential execution of operators with parallel execution. The second rule adds the synchronizer WAIT(AllThreadsCompleted(n)), which delays the computation until all threads complete their work. The result of the transformation is given below.

    qsort(Params(begin, end),
      IF(NOT(Equal(begin, end)),
        then(Dec(end, 1),
        then(Partition(a, begin, end, end),
        then(Parallel(
               CALL(qsort(begin, middle)),
               then(Inc(middle, 1),
               then(Inc(end, 1),
                    CALL(qsort(middle, end))))),
             WAIT(AllThreadsCompleted(n)))))))

Thus, as a result of parallelization, the first operator (thread) of the Parallel operation executes the operator qsort(begin, middle), and the second one calls the two Inc operators and qsort(middle, end). The operation WAIT(AllThreadsCompleted(n)) performs the synchronization of threads. The threads are created recursively; their quantity is specified as an input parameter of the function main. Notice that these transformations are only valid if the two qsort calls are independent. The system does not check this property: it has to be asserted by the developer.

The resulting parallel Quicksort algorithm scheme was used to generate code in Cilk++ using the IDS system. The parallel program was executed on a machine with an Intel Core 2 Quad CPU, 2.51 GHz, running Windows XP. Fig. 2 shows the program execution time in seconds. The speedup when executing the program on 2, 3 and 4 processors was 2, 2.9 and 3.8 respectively, which shows that the program has a good degree of parallelism and is scalable.

Fig. 2.
The execution time of the parallel Quicksort program on a quad-core processor; the size of the input array is 5·10⁷ elements

4 Conclusion

We have described our approach to constructing efficient parallel programs using high-level algebra-algorithmic specifications and the rewriting rules technique. The algebra-algorithmic toolkit IDS and the rewriting rules engine Termware are combined to enable formal yet easy-to-understand algorithm specifications and to automate the program synthesis and parallelization process. The combined development toolkit can be retargeted to various subject domains and implementation languages, as exemplified by Cilk++. The developed system could be further extended with automated code analysis facilities based on the rewriting technique.

References

1. Akhter, S., Roberts, J.: Multi-Core Programming. Intel Press, Hillsboro (2006)
2. Andon, F. I., Doroshenko, A. Y., Tseytlin, G. O., Yatsenko, O. A.: Algebra-Algorithmic Models and Methods of Parallel Programming. Akademperiodika, Kyiv (2007) (in Russian)
3. Apel, S. et al.: An Algebraic Foundation for Automatic Feature-Based Program Synthesis. Science of Computer Programming. 75(11), 1022–1047 (2010)
4. Bagheri, H., Sullivan, K.: Pol: Specification-Driven Synthesis of Architectural Code Frameworks for Platform-Based Applications. In: Proc. 11th Int. Conf on Generative Programming and Component Engineering, pp. 93–102, ACM, New York (2012)
5. Batory, D.: Program Refactoring, Program Synthesis, and Model-Driven Development. In: Proc. 16th Int. Conf. on Compiler Construction. LNCS 4420, pp. 156–171 Springer-Verlag, Berlin Heidelberg (2007)
6. Bures, T. et al.: The Role of Ontologies in Schema-Based Program Synthesis. In: Proc. Workshop on Ontologies as Software Engineering Artifacts, Vancouver (2004)
7. Cilk Home Page, http://cilkplus.org/
8. Doroshenko A., Shevchenko R.: A Rewriting Framework for Rule-Based Programming Dynamic Applications, Fundamenta Informaticae, 72(1–3), 95–108 (2006)
9. Doroshenko, A., Tseytlin, G., Yatsenko, O., Zachariya, L.: A Theory of Clones and Formalized Design of Programs. In: Proc. Int. Workshop on Concurrency, Specification and Programming (CS&P’2006), pp. 328–339, Wandlitz, Germany (2006)
10. Doroshenko, A. Y., Zhereb, K. A., Yatsenko, Ye. A.: On Complexity and Coordination of Computation in Multithreaded Programs. Problems in Programming, 2, 41–55 (2007) (in Russian)
11. Doroshenko, A., Zhereb, K.: Parallelizing Legacy Fortran Programs Using Rewriting Rules Technique and Algebraic Program Models. In: Ermolayev, V. et al. (eds.) ICT in Education, Research, and Industrial Applications. CCIS 347, pp. 39–59. Springer Verlag, Berlin Heidelberg (2013)
12. Fischer, B., Schumann, J.: AutoBayes: a System for Generating Data Analysis Programs from Statistical Models. J. Funct. Program. 13(3), 483–508 (2003)
13. Flener, P.: Achievements and Prospects of Program Synthesis. In: Kakas, A. C., Sadri, F. (eds.) Computational Logic: Logic Programming and Beyond. LNCS 2407, pp. 310–346, Springer Verlag, London (2002)
14. Gulwani, S.: Dimensions in Program Synthesis. In: 12th Int. ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming, pp. 13–24. ACM, New York (2010)
15. Kitzelmann, E.: Inductive Programming: a Survey of Program Synthesis Techniques. Approaches and Applications of Inductive Programming, LNCS 5812, pp. 50–73. Springer Verlag, Berlin Heidelberg (2010)
16. Leonard, E. I., Heitmeyer, C. L.: Automatic Program Generation from Formal Specifications using APTS. In: Automatic Program Development. A Tribute to Robert Paige, pp. 93–113. Springer Science, Dordrecht (2008)
17. Mannadiar, R., Vangheluwe, H.: Modular Synthesis of Mobile Device Applications from Domain-Specific Models. In: Proc. 7th Int. Workshop on Model-Based Methodologies for Pervasive and Embedded Software, pp. 21–28. ACM, New York (2010)
18. Srivastava, S., Gulwani, S., Foster, J. S.: From Program Verification to Program Synthesis. In: Proc. 37th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 313–326. ACM, New York (2010)
19. Yatsenko, O.: On Parameter-Driven Generation of Algorithm Schemes. In: Proc. Int. Workshop on Concurrency: Specification and Programming (CS&P’2012), pp. 428–438, Berlin, Germany (2012)

1.2 Machine Intelligence, Knowledge Engineering and Management for ICT

An Intelligent Approach to Increase Efficiency of IT-Service Management Systems: University Case-Study

Nikolay Tkachuk¹, Vladyslav Sokol¹ and Kateryna Glukhovtsova¹

¹ National Technical University “Kharkov Polytechnic Institute”, Frunze str., 21, Kharkov, Ukraine
tka@kpi.kharkov.ua, vladislav.sokol@gmail.com, kat_1109@mail.ru

Abstract. A comprehensive framework to increase the efficiency of IT-service management systems (ITSMS) is proposed, which resolves 3 interconnected tasks in a target organization: 1) providing an effective configuration of ITSM modules according to their specific features and needs; 2) integrating a given ITSMS with the existing enterprise architecture; 3) advanced incident management in ITSMS. The applicability of this approach was tested successfully on a case study at the National Technical University “Kharkov Polytechnic Institute” (www.kpi.kharkov.ua).

Keywords. IT-service management, effectiveness, multi-criteria ranking, data integration, adaptive ontology, case-based reasoning, e-learning

Key terms.
Academia, ICTInfrastructure, KnowledgeManagementProcess, Model, SoftwareEngineeringProcess

1 Introduction: Problem Actuality and Research Objectives

Nowadays the concept of ITIL (IT Infrastructure Library) [1] and a new kind of computerized management system, namely the IT Service Management System (ITSMS), have become a growing and promising approach to a problem that is both technically complex and business-focused: how to organize a well-structured and controllable IT-environment in a given organization?

According to ISO/IEC 20000 [2], an ITSMS provides “…a framework to enable the effective management and implementation of all IT-services”. Because the IT-services that ITSMS deal with in large modern business organizations are highly complex and multi-dimensional, recent publications in this domain present some sophisticated approaches to designing and using these facilities. One important topic in the ITIL-ITSM domain is the integration of ITSMS functionality into the enterprise architecture (see, e.g., [3,4]). Another recent trend in ITSMS-development is the use of ontologies and model-driven architecture (MDA) [5, 6] for knowledge handling and reuse. The authors of these works emphasize the need to elaborate and apply knowledge-oriented approaches to requirements analysis within ITSMS-development, and to the quantitative quality assessment of the resulting project solutions.

Taking into account the ITSMS-issues mentioned above, the main objective of the research presented in this paper is to propose a first vision of an intelligent complex approach to increasing the efficiency of a typical ITSMS, with a proof of concept based on a university ITSMS case study.
The rest of this paper is organized as follows. Section 2 analyses some existing ITSMS, introduces our vision of their typical functionality, and lists the prioritized tasks to be resolved in order to increase the efficiency of an ITSMS. In Section 3 we present the method elaborated for the effective configuration of ITSM-modules in a target business organization, and Section 4 reports the first version of the ITSMS ontologies used to integrate the selected modules into the enterprise architecture (EA). Section 5 briefly outlines the design perspective for combining case-based reasoning (CBR) with an ontology-based approach to advanced incident management in ITSMS. In Section 6 we present the university case study for our method of estimating the effectiveness of different ITSMS configurations and discuss the results achieved. Section 7 concludes the paper with a short summary and an outlook on the next steps in the proposed development framework.

2 Typical Functionality of ITSMS and the Complex of Intelligent Tasks to Increase its Efficiency

In order to elaborate a complex approach to increasing the efficiency of an operating ITSM-system, it is necessary to understand its typical functionality and to analyze its specific features.

2.1 Overview of existing ITSMS

We have analyzed some existing ITSMS [7-10]; the results of this study are presented in Table 1. Basically, all such systems can be divided into 3 groups, namely: (a) advanced business ITSM-products; (b) open source ITSM-solutions; (c) bespoke ITSM-systems.

Group (a) includes such systems as, e.g., HP OpenView Service Desk [7] and BMC Remedy [8]. The first of these products is the clear leader in this market segment, because most organizations that prefer ITSM business solutions from group (a) use the HP platform.
The number of running BMC Remedy installations is considerably smaller than that of HP, not least because of the higher cost of Remedy ITSM Suite.

Table 1. Results of comparison for some ITSMS

{| class="wikitable"
! Criteria / Systems !! HP Service Manager 7.10 !! BMC Remedy ITSM Suite 7.5 !! Axios Assyst 7.5 !! OMNINET OmniTracker ITSM Center 2.0
|-
| Basic functionality || 5 || 5 || 5 || 4
|-
| Maintainability || 5 || 4 || 5 || 4
|-
| Report generation || 4 || 5 || 5 || 4
|-
| Scalability || 4 || 2 || 3 || 5
|-
| Web-interface || 5 || 5 || 5 || 5
|}

ITSM-solutions from group (b) are also used in practice, but they have limited functionality and provide a lower level of IT-service management. Typical open source ITSMS are, for instance, GLPI [8], OTRS [9], and some others listed at the SourceForge Web-resource [10].

Business organizations that are not ready to buy advanced software products from group (a), and that are not satisfied with the functionality provided by ITSM-systems from group (b) because they have specific IT-needs and challenges, try to develop their own ITSM-solutions, which form group (c). A more detailed study of some existing ITSM-systems is presented in [11].

2.2 Typical ITSMS-functionality

Based on the analysis of real ITSMS given above, we have elaborated the following vision of their typical functionality (see Fig. 1). There are 5 main subsystems (or packages) of system functions, namely:

1. IT Business Alignment: this subsystem is supposed to implement an IT-strategy in a given business organization with respect to its main goals and needs, and to provide a basis for cost assessment of the whole IT-infrastructure;
2. Service Operations: this facility is responsible for managing customer requests (regarding a current incident and a related problem), and for providing ITSM-support functions;
3. Service Delivery Assurance: this functional package implements configuration and change management of all ITSM-software tools, and is therefore extremely important for a stable IT-environment;
4. Service Design and Management: this ITSMS-functionality provides detailed information about new prospective IT-services to be designed, with respect to their availability and quality for IT-customers;
5. Service Development and Deployment: this subsystem allows new ITSM-services and appropriate IT-infrastructure solutions to be created and tested, including installation of new hardware components, development of additional software applications, and training programs for ITSM-staff and for end-users as well.

As the structure presented in Fig. 1 shows, each of these 5 subsystems is built from several functional modules (depicted as UML-classes). The most important of them are the following:

Module M1 = “Incident management”: it includes organizational procedures and appropriate tools to resolve current incidents that IT-service users are facing (hardware and software errors, network connection problems, requests for consultation, etc.);
Module M2 = “Problem management”: this facility provides tools to detect and eliminate any problem situation that causes different incidents;
Module M3 = “Configuration management”: this module supports all operating sub-schemes in the IT-infrastructure of a given business organization;
Module M4 = “Change management”: it supervises and coordinates all changes that arise in the IT-infrastructure;
Module M5 = “Service level management”: this unit is responsible for the definition and implementation of an appropriate level of IT-services to be provided to customers.

Fig. 1. Typical functionality of an ITSMS

In ITIL best-practice manuals (see, e.g., [12]) the following 3 main schemes are considered for introducing these modules into the IT-infrastructure of a target organization: a classic scheme (S1); a contract scheme (S2); an infrastructure-centered scheme (S3).

The classic scheme S1 is the most widely applied solution in the ITSM-domain, and it assumes the following sequence of modules M1-M5:

$$S_1 = \langle M_1, M_3, M_4, M_2, M_5 \rangle \qquad (1)$$

This approach quickly resolves the most pressing communication problems between the IT-service department and customers on the basis of incident management (module M1), then provides tools for supporting all IT-services (modules M3 and M4), and after that introduces a platform for future IT-infrastructure development (modules M2 and M5 respectively). However, this scheme is the most expensive one for a given business organization, and it requires a lot of resources right at the initial phase of the whole ITSM-configuring framework.

The contract scheme S2 aims to formalize the communication process between the IT-service department and customers, and it has the following workflow of modules:

$$S_2 = \langle M_5, M_3, M_1, M_4, M_2 \rangle \qquad (2)$$

In this case all customer requirements for IT-services have to be collected and specified (in module M5), and appropriate IT-infrastructure sub-schemes can then be built (using module M3) in order to define the prospective IT-strategy of the target organization. Next, the operative ITSM-functionality is provided, including incident management (module M1), change management (module M4), and problem management (module M2). Obviously, the efficiency of this scheme is at risk if the initial IT-service specifications (in module M5) were not done correctly.
Finally, the infrastructure-centered scheme S3 proposes the following sequence of modules:

$$S_3 = \langle M_3, M_4, M_2, M_1, M_5 \rangle \qquad (3)$$

That is, it first provides tools for supporting all IT-services (modules M3 and M4). Secondly, this approach makes it possible to manage all typical problem situations (module M2) and, on that basis, to detect and resolve the corresponding incidents reported by IT-service customers (module M1). Thirdly, it creates an opportunity to define, in a computer-aided way, the necessary composition and level of IT-services (module M5).

It should be noted that, apart from some empirical recommendations concerning the possible ITSM-module configurations defined in (1)-(3), the relevant technical documentation offers no well-substantiated guidance on how to estimate the effectiveness of these alternative approaches quantitatively.

2.3 The complex of intelligent tasks to increase ITSM-system efficiency

Taking into account the results of the analysis performed above, and based on some modern trends in the domain of ITSMS-development (see Section 1), the following list of prioritized tasks, referred to below as tasks (I)-(III), can be composed in order to increase ITSMS-efficiency:

1. to provide an effective configuration of ITSM-modules for a target organization, taking into account its specific features and needs;
2. to elaborate an integration framework for a given ITSM-system configuration and an existing enterprise architecture (EA);
3. to support advanced incident management in the already installed ITSM-system.

In our opinion, task (I) can be resolved on the basis of expert methods for multi-criteria ranking, with respect to the specific features of the IT-infrastructure and the customer needs in the business organization concerned [13,14].
Task (II) belongs to the well-known integration issues in distributed heterogeneous information systems, and an ontology-based approach can be used for this purpose (e.g., [3,6,15]). Finally, to solve task (III), additional decision-making functionality for typical ITSM-services (see Fig. 1) has to be elaborated, e.g., based on a combination of the case-based reasoning (CBR) approach with ontologies [16,17]. Below these tasks and their possible solutions are presented and discussed in more detail.

3 The Method for Effectiveness Estimation of Alternative ITSM-Module Configurations

To formalize task (I) from the list in Section 2.3, namely providing an effective configuration of ITSM-modules for a target business organization, the following factors have to be taken into account: the problem is highly complex and only semi-formalized; its estimation criteria are of different nature and multi-valued; the information base for solving it can mainly be collected from expert data only; and the available expert data may be both quantitative and qualitative.

To solve this task we have chosen one of the multi-criteria ranking methods, presented in [14]. According to this approach the following steps have to be performed.

Step 1. A set of possible alternatives

$$X = \{x_1, x_2, \dots, x_n\} = \{x_i,\ i = \overline{1,n}\} \qquad (4)$$

and a set of global importance criteria to characterize these alternatives

$$K = \{K_1, K_2, \dots, K_m\} = \{K_j,\ j = \overline{1,m}\} \qquad (5)$$

have to be defined.

Step 2. Each global criterion $K_j$ is characterized by a subset of appropriate local criteria

$$K_j = \{k_{j1}, k_{j2}, \dots, k_{jQ}\} = \{k_{jq},\ q = \overline{1,Q}\} \qquad (6)$$

Further, a set of membership functions of the alternatives for all local criteria (denoted here by $\mu$)

$$\{\mu_{k_{j1}}(x_i), \mu_{k_{j2}}(x_i), \dots, \mu_{k_{jQ}}(x_i)\} = \{\mu_{k_{jq}}(x_i),\ q = \overline{1,Q},\ j = \overline{1,m}\} \qquad (7)$$

and the weight coefficients of relative importance of these local criteria

$$\{w_{j1}, w_{j2}, \dots, w_{jQ}\} = \{w_{jq},\ q = \overline{1,Q}\} \qquad (8)$$

have to be determined, where the following condition has to be fulfilled:

$$\sum_{q=1}^{Q} w_{jq} = 1 \qquad (9)$$

Step 3. The membership functions of the alternatives $\{x_i,\ i = \overline{1,n}\}$ for the criteria $K_j$, $j = \overline{1,m}$, are determined by an additive convolution of their local criteria:

$$\mu_{K_j}(x_i) = \sum_{q=1}^{Q} w_{jq}\, \mu_{k_{jq}}(x_i) \qquad (10)$$

Table 2. Definition of membership functions for criteria to alternatives (fragment)

{| class="wikitable"
! rowspan="2" | Alternatives $X$
! colspan="3" | $K_1$
! rowspan="2" | …
! colspan="3" | $K_M$
|-
! $k_{11}$ !! … !! $k_{1Q}$ !! $k_{M1}$ !! … !! $k_{Mm}$
|-
| $x_1$ || $\mu_{k_{11}}(x_1)$ || … || $\mu_{k_{1Q}}(x_1)$ || … || $\mu_{k_{M1}}(x_1)$ || … || $\mu_{k_{Mm}}(x_1)$
|-
| … || … || … || … || … || … || … || …
|-
| $x_n$ || $\mu_{k_{11}}(x_n)$ || … || $\mu_{k_{1Q}}(x_n)$ || … || $\mu_{k_{M1}}(x_n)$ || … || $\mu_{k_{Mm}}(x_n)$
|}

Step 4. Taking into account the membership functions $\{\mu_{K_j}(x_i),\ j = \overline{1,m}\}$ obtained for all alternatives $x_i$, $i = \overline{1,n}$, a joint membership function for a generalized criterion $K$ can be determined:

$$\mu_K(x_i) = \sum_{j=1}^{m} w_j\, \mu_{K_j}(x_i) \qquad (11)$$

where $w_j$, $j = \overline{1,m}$, are the coefficients of relative importance of the criteria $K_j$, $j = \overline{1,m}$.

Step 5. Finally, the alternative with the maximum value of the membership function for the generalized criterion $K$ can be chosen as the target solution:

$$\mu(x^*) = \max\{\mu_K(x_i),\ i = \overline{1,n}\} \qquad (12)$$

Below, in Section 6, we present the case study that was performed to validate this method, and we discuss the results achieved.

4 Ontological Specifications for ITSMS-EA Integration Framework

As already mentioned above (see Section 2), any ITSMS has to be integrated into the existing EA of a target organization. In our approach this task (II) has to be resolved for an ITSMS-configuration defined with the method presented in Section 3. This issue is already discussed intensively in many publications, whose authors consider both its conceptual and technological aspects.
E.g., an ITSMS-EA integration based on the well-known SOA framework is presented in [3], where an appropriate meta-model for IT services (in effect, a kind of domain ontology) is designed as the important conceptual input for this issue. In [5] an approach to the integration of ITSM-services and business processes in a given organization is elaborated, using ontological specifications to formalize the good practice guidance for ITSM. An ontology-based framework for integrating software development and ITSMS-functioning is proposed in [15], resulting in enhanced semantic-aware support tools for both processes. Even this brief overview allows us to conclude that an ontology-based approach is the most effective way to solve this problem.

That is why, in our opinion, to provide ITSMS-EA integration effectively it is necessary to combine the following information resources (IR): a) IR related to ITSMS-functionality; b) IR concerning the EA-domain; c) IR characterizing the target organization (TO) that faces the ITSMS-EA integration problem. Let us denote the IR (a)-(c) as Onto-ITSMS, Onto-EA, and Onto-TO respectively. Thus, the IR needed to provide an ITSMS-EA integration should be specified using an appropriate joint ontology, designated as Onto_ITSMS-EA:

$$\text{Onto\_ITSMS-EA} = \langle \text{Onto-ITSMS},\ \text{Onto-EA},\ \text{Onto-TO} \rangle \qquad (13)$$

Obviously, some existing ITIL/ITSM ontological specifications can be used for this purpose, e.g., the Onto-ITIL ontology elaborated in [5] on the basis of the OpenCyc ontology (www.opencyc.org), the Onto-SPEM (Software Process Engineering Meta-model) ontology [18], and the Onto-WF (WorkFlow) ontology [19].
Taking these resources into account, we can represent the ontological specification for Onto-ITSMS in the following way:

$$\text{Onto-ITSMS} = \langle \text{Onto-ITIL},\ \text{Onto-SPEM},\ \text{Onto-WF} \rangle \qquad (14)$$

There are also several ontologies developed to specify EA. Following one of the recent and comprehensive studies in this domain, presented in [20], we accept the following 3-level definition of the EA-ontology:

$$\text{Onto-EA} = \langle \text{Onto-BT},\ \text{Onto-AC},\ \text{Onto-RS} \rangle \qquad (15)$$

where Onto-BT is a sub-ontology of Business Terms (BT), Onto-AC is a sub-ontology of Architecture Components (AC), and Onto-RS is a sub-ontology of RelationShips (RS) among items of AC.

Finally, to define the Onto-TO ontology for the target organization in expression (13), its specific features and needs related to ITSMS-usage within the existing EA have to be taken into account. A small excerpt of such a domain-specific Onto-TO, elaborated in our University-ITSMS case study (see Section 6), is shown as a UML-class diagram in Fig. 2.

Fig. 2. Taxonomy of customers in a University-ITSMS as a part of an Onto-TO ontology

The proposed ontology-based approach to ITSMS-EA integration can also be used to elaborate the solution for task (III) from the list given in Section 2.3.
5 Adaptive Onto-CBR Approach to Advanced Incidents Management in ITSM

In order to solve task (III), namely to provide advanced incident management in ITSMS, and in line with our inter-disciplinary vision of ITSMS-development in general, we propose to combine the design principles (i)-(iv) listed below:

(i) incident management, as a weakly formalized and complex task within the ITSMS-support for its customers, can be resolved effectively using one of the intelligent decision-support methods, e.g., the CBR-method;
(ii) to enhance CBR-functionality, especially with respect to the specific needs of a target organization, an appropriate domain ontology should be elaborated and used in combination with CBR;
(iii) because of the permanent changes in the IT-infrastructure of a given organization, and of the changes arising in its environment as well, such a domain ontology has to be constructed as an adaptive ontology;
(iv) to make it possible to gather and reuse knowledge in ITSMS, some e-Learning models and technologies can be applied.

There are already approaches that combine the CBR-method with ontologies [16, 17]; they make case representation more efficient, enhance case-similarity assessment, and support the case-adaptation process for a new solution. On the other hand, an ontology-centered design for ITSM-services, and especially for Incident Management (IM), is also discussed in some recent publications in this domain. In particular, the Onto-IM ontology proposed in [21] is built according to ISO/IEC 20000 for ITIL/ITSM [2]; it includes such concepts as Incident Management, Incident Record, Incident Entity, etc., specified using OWL (Web Ontology Language), and a small example of its notation is shown in Fig. 3.

Fig. 3. The excerpt of the Onto-IM ontology elaborated in [21]

These results provide a solution for items (i)-(ii), but in our opinion, to cover item (iii) efficiently with respect to the permanent changes in the IT-infrastructure of a target organization, an appropriate ontology has to be constructed as an adaptive facility [22]. In this way the Onto-TO ontology given in Section 4 should be defined as the following tuple:

$$\text{Onto-TO}^{(adapt)} = \langle C,\ R,\ P,\ W(C),\ W(R) \rangle \qquad (16)$$

where, in addition to the basic components of any ontology, namely $C$, a set of concepts, $R$, a set of relationships among these concepts, and $P$, a set of axioms (semantic rules), the following ones have to be defined: $W(C)$, a set of weight coefficients for the concepts of $C$, and $W(R)$, a set of weight coefficients for the relationships of $R$. These weight coefficients allow us, e.g., to take into account the importance grade of different types of ITSMS-customers (see Fig. 2) when providing IM-services for them.

In order to get all the information resources needed for a complete solution of (i)-(iv), we propose to apply some e-Learning models and technologies within an ITSMS, especially for skills training and experience gathering by the ITSMS-staff, designated in the Onto-IM ontology as Incident Manager, ServiceDeskEmployee, Specialist [21]. For this purpose an e-Learning ontology (Onto-EL) can be used; e.g., in [23] the Onto-EL is elaborated to build personal paths for learners in an e-learning environment, according to the selected curriculum (Incident Management in terms of Onto-IM), syllabus (Incident Record) and subject (Incident Entity).

Summarizing the issues discussed above for (i)-(iv), the conceptual mechanism for providing advanced incident management (AIM) in ITSMS can be represented at a high architectural level as the UML-package diagram shown in Fig. 4.
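As a minimal sketch of how the weight coefficients W(C) of tuple (16) could enter case retrieval in CBR, the snippet below scores stored incident cases against a new incident by a weighted overlap of the ontology concepts they mention. The concept names, the weights, the case data, the `reinforce` adaptation step and the similarity measure itself (a weighted Jaccard overlap) are illustrative assumptions; the paper does not prescribe a particular similarity function or weight-update rule.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveOntology:
    """Tuple (16): concepts C, relationships R, axioms P, weights W(C) and W(R)."""
    concepts: set[str]
    relations: set[tuple[str, str, str]] = field(default_factory=set)   # (source, name, target)
    axioms: list[str] = field(default_factory=list)
    concept_weights: dict[str, float] = field(default_factory=dict)     # W(C)
    relation_weights: dict[tuple[str, str, str], float] = field(default_factory=dict)  # W(R)

    def weight(self, concept: str) -> float:
        # Assumption: concepts without an explicit weight count as 1.0.
        return self.concept_weights.get(concept, 1.0)

    def reinforce(self, concept: str, factor: float) -> None:
        # Hypothetical adaptation step: rescale a concept weight after feedback.
        self.concept_weights[concept] = self.weight(concept) * factor


def similarity(onto: AdaptiveOntology, new_case: set[str], old_case: set[str]) -> float:
    """Weighted Jaccard overlap of two concept sets (illustrative measure)."""
    shared = sum(onto.weight(c) for c in new_case & old_case)
    total = sum(onto.weight(c) for c in new_case | old_case)
    return shared / total if total else 0.0


def retrieve(onto: AdaptiveOntology, new_case: set[str], case_base: dict[str, set[str]]):
    """Return (case_id, score) of the stored case most similar to the new one."""
    scored = ((cid, similarity(onto, new_case, concepts)) for cid, concepts in case_base.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical data, loosely modelled on the incident types of the case study.
onto = AdaptiveOntology(
    concepts={"NoInternet", "Router", "NetworkSetup", "EmailFailure",
              "ProxyServer", "Lecturer", "Student"},
    concept_weights={"NoInternet": 3.0, "EmailFailure": 3.0, "Lecturer": 2.0},
)
case_base = {
    "case-1": {"NoInternet", "Router", "Student"},
    "case-2": {"EmailFailure", "ProxyServer", "Lecturer"},
}
new_incident = {"NoInternet", "NetworkSetup", "Lecturer"}

best_id, score = retrieve(onto, new_incident, case_base)
print(best_id, round(score, 3))
```

Raising the weight of a customer-type concept, for example with `onto.reinforce("Lecturer", 2.0)`, shifts retrieval towards cases that involve that customer type, which is one possible reading of the importance grades mentioned above.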
Below, the approach to resolving task (I) from the list given in Section 2 is illustrated using a real case study from our research and practice activities in applying ITSMS to manage the IT-infrastructure of the National Technical University “Kharkov Polytechnic Institute” (www.kharkov.ua), referred to in the following as NTU “KhPI”.

Fig. 4. The AIM architectural framework (compare with the scheme given in [24])

6 University Case-Study: Effective ITSM-Modules Configuring

It should be mentioned that university and/or campus domains are considered by many authors to be a suitable example of ITSMS-usage (see, e.g., [25-26]), because intensive research and educational activities obviously require a modern and well-organized IT-environment. That is why we also validated our approach to estimating the effectiveness of alternative ITSM-module configurations using test-case data collected at NTU “KhPI”.

6.1 Application domain description: IT-infrastructure of NTU “KhPI”

NTU “KhPI” is one of the largest technical universities of Ukraine, located in the city of Kharkiv, an important industrial and cultural center in the east of the country. The university has about 22000 students and ca. 3500 faculty members, and accordingly there is an advanced IT-infrastructure to support all educational and research tasks. The main characteristics of IT-operations at NTU “KhPI” are summarized in Table 3.

Table 3. Some technical data about the IT-infrastructure of NTU “KhPI”

{| class="wikitable"
! Parameters !! Values
|-
| PCs in the network configuration || 1525
|-
| User accounts || 2700
|-
| Buildings || 23
|-
| Servers || 60
|-
| Routers || 80
|-
| Peripheral units || 6000
|-
| IT-specialists in the central office || 11
|-
| Incidents per day (registered) || 5-7
|}

In cooperation with the IT-staff of the University IT control office we have analyzed retrospective data about typical problem situations that occurred, and about the corresponding incidents, which have been resolved daily in direct communication with IT-service customers. In this way the main types of ITSM-incidents and their underlying problem situations were identified; they are described in Table 4.

Table 4. Main types of ITSM-incidents and their related problem situations

{| class="wikitable"
! № !! Incident type !! Cause (problem situation)
|-
| 1 || No Internet-connection at Dept or on local PC || router was turned off; network cable broken or router hardware failure; incorrect network setup; problems with software on local PC
|-
| 2 || High load on the PC processor with a small number of active user programs || computer viruses; high degree of fragmentation of the PC hard drive
|-
| 3 || Installing problems for new software || computer viruses; absence of additional (middleware) software needed for installation
|-
| 4 || Failure to send email || incorrect setup of local network server (proxy); problems with central e-mail server
|-
| 5 || Troubles in the use of third-party software || lack of specific configuration; improper use of system services
|}

Based on the analysis results obtained, we can apply the elaborated method to estimate alternative ITSMS-module configurations (see Section 3).
6.2 Customizing of the elaborated estimation method: alternative configurations and criteria definition

According to Step 1 of the method presented in Section 3, the list of alternative ITSM-module configurations has to be defined. In our case they are:

$X_1$ = Service Desk subsystem (SDS) and Incident Management Module
$X_2$ = SDS, Incident Management Module and Configuration Management Module
$X_3$ = SDS, Incident Management Module and Change Management Module
$X_4$ = SDS, Incident Management Module and Problem Management Module

In Step 2, according to formulas (4)-(10), we determine the criteria for the quantitative evaluation of the proposed alternatives and their performance indicators, which are shown in Table 5. These criteria and their indicators (metrics) are taken from [27], and they are recommended for evaluating the effectiveness of the IT-infrastructure in any business organization.

Table 5. List of values for global and local criteria (fragment)

{| class="wikitable"
! Global and local criteria !! Semantics of performance measurement criteria and target values !! Insecure value !! Effective value !! Scope of values
|-
| $K_1$ || colspan="4" | Effectiveness of incident management
|-
| $k_{11}$ || Average time of incident resolution → min || >30 min. || 15 min. || 9999 min.
|-
| $k_{12}$ || Percentage of incidents resolved proactively → max || 0% || 15% || 0-100%
|-
| $k_{13}$ || Percentage of incidents resolved at the first level of support → max || <65% || 85% || 0-100%
|-
| $k_{14}$ || Percentage of incidents that have been resolved at the first attempt → max || <75% || 90% || 0-100%
|-
| $K_2$ || colspan="4" | Effectiveness of problem management
|-
| $k_{21}$ || The ratio of the number of solved problems to total problems (%) → max || <10% || 35% || 0-100%
|-
| … || … || … || … || …
|}
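The additive convolution of formulas (10)-(12) in Section 3 can be sketched in a few lines of Python. The sketch is an illustration under stated assumptions, not the authors' software tool: it uses only the K1 fragment of the case-study data reported in Section 6.3 (the expert scores of Table 6 and the K1 weight coefficients), it assumes that the scores are given on a 0-10 scale and divides them by 10 to obtain membership values, and the function names are hypothetical. Since the scores for K2 and K3 are not part of the published fragment, the final values (17) cannot be reproduced here.

```python
# Sketch of formulas (9)-(12); function names and the 0-10 score scale are assumptions.

def convolve(weights, memberships):
    """Additive convolution sum(w_q * mu_q), used in formulas (10) and (11)."""
    if abs(sum(weights) - 1.0) > 1e-6:           # normalization condition (9)
        raise ValueError("weight coefficients must sum to 1")
    return sum(w * m for w, m in zip(weights, memberships))


def best_alternative(values):
    """Formula (12): the alternative with the maximum membership value."""
    return max(values, key=values.get)


# Local weight coefficients w(k11)..w(k14) for K1, as reported in Section 6.3.
W_K1 = [0.239458, 0.239458, 0.432749, 0.088335]

# Expert scores for k11..k14 from Table 6 (fragment).
SCORES_K1 = {
    "X1": [5, 5, 5, 6],
    "X2": [6, 7, 6, 6],
    "X3": [5, 5, 5, 6],
    "X4": [7, 6, 8, 7],
}

# Membership of each alternative in K1, formula (10), with scores scaled to [0, 1].
mu_K1 = {
    name: convolve(W_K1, [score / 10 for score in scores])
    for name, scores in SCORES_K1.items()
}

for name, mu in mu_K1.items():
    print(f"{name}: mu_K1 = {mu:.3f}")
print("best by K1 alone:", best_alternative(mu_K1))
```

On the K1 fragment alone, X4 ranks first and X2 second, which agrees with the top of the ordering in (17). The same `convolve` function would be applied once more, with the global weight coefficients, to obtain the generalized criterion of formula (11).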
For example, a value of 10 for the alternative $X_3$ with respect to criterion $k_{14}$ (see Table 5) means that the implementation of the Service Desk and the Incident Management Module will help to increase the ratio of incidents that are resolved successfully to its effective value of 90%, etc. The results obtained in this way are given in Table 6.

Table 6. Estimated values for the alternatives with respect to the defined criteria (fragment)

{| class="wikitable"
! rowspan="2" | Alternative
! colspan="4" | $K_1$: Effective incident management → opt
|-
! $k_{11}$ (opt = 20 min) !! $k_{12}$ (15%) !! $k_{13}$ (85%) !! $k_{14}$ (90%)
|-
| $X_1$ || 5 || 5 || 5 || 6
|-
| $X_2$ || 6 || 7 || 6 || 6
|-
| $X_3$ || 5 || 5 || 5 || 6
|-
| $X_4$ || 7 || 6 || 8 || 7
|-
| … || … || … || … || …
|}

In order to implement the elaborated method with the customized data introduced above, a special software tool was developed.

6.3 Results of estimation and their analysis

Continuing with Step 3 and Step 4 of our method (see Section 3), the weight coefficients of relative importance (WCRI, designated as $w(k_{ij})$) of the local criteria with respect to their global ones were determined using pair-wise comparison:

The WCRI values of the local criteria for the global criterion $K_1$: $w(k_{11})$ = 0.239458, $w(k_{12})$ = 0.239458, $w(k_{13})$ = 0.432749, $w(k_{14})$ = 0.088335
The WCRI values of the local criteria for the global criterion $K_2$: $w(k_{21})$ = 0.68334, $w(k_{22})$ = 0.19981, $w(k_{23})$ = 0.11685
The WCRI values of the local criteria for the global criterion $K_3$: $w(k_{31})$ = 0.332516, $w(k_{32})$ = 0.527836, $w(k_{33})$ = 0.139648
The summarized WCRI values for the global criteria $K_i$: $K_1$ = 0.527836, $K_2$ = 0.332516, $K_3$ = 0.139648

Finally, according to Step 5 of this method (see Section 3), and using the multi-criteria ranking formulas (11)-(12), we obtain the following final results of the effectiveness assessment for the considered alternatives (see Table 5), namely

$$X_1 = 0.537, \quad X_2 = 0.671, \quad X_3 = 0.578, \quad X_4 = 0.727 \qquad (17)$$

To confirm the reliability of the results given in (17), a comparative analysis with some "best practices" in ITSMS implementation was carried out, using the data of the IDC company [28]. In particular, IDC has reviewed approx. 600 organizations worldwide that had used ITSM for over a year, and that study analyzed in particular the prioritization of the implementation of different ITSM-modules. The result of the comparison is shown in Fig. 5.

Fig. 5. Graphical representation of the obtained results

As we can see, Change Management and Configuration Management require a database (DB) of IT-configurations and a DB of problem situations within the IT-infrastructure; these facilities are rather costly for the University, and therefore the implementation of these modules is not a priority task. The most effective ITSM-module configuration for NTU "KhPI" includes an Incident Management module and a Service Desk subsystem.

7 Conclusions and Future Work

In this paper we have presented an intelligent approach to increasing the efficiency of ITSMS, which has to resolve 3 interconnected tasks for its effective usage in a target organization: 1) providing an effective configuration of ITSM-modules according to the specific features and needs of the organization; 2) elaborating an integration framework for a given ITSMS with the existing EA; 3) advanced incident management in ITSMS. To solve these tasks in a comprehensive way an interdisciplinary framework is elaborated, which includes: the expert method for multi-criteria ranking of alternative ITSM-module configurations, the ontological specifications for ITSMS-EA integration, and the approach to enhanced incident management based on the combination of adaptive ontologies and CBR-methodology. To implement the first part of this approach an appropriate software tool was developed, and its applicability was tested successfully in the case study at the NTU “Kharkov Polytechnic Institute”.
In future we are going to implement and test the appropriate software solutions for the other tasks in the proposed framework, using such technologies as OWL, BPMN, XML/XSLT, and Web-services.

References

1. Office of Government Commerce: ITIL Library. London (2003)
2. International Organization for Standardization. ISO/IEC 20000-1,2: Information Technology-Service Management, Part 1, 2. Geneva, Switzerland: ISO/IEC (2005)
3. Braun, C., Winter, R.: Integration of IT Service Management into Enterprise Architecture. In: Proceeding of SAC’07, Seoul, Korea (2007)
4. ITSM Frameworks and Processes and their Relationship to EA Frameworks. In: A White Paper by: R. Radhakrishnan, IBM Global Technology Services (2008)
5. Valiente, M.-C., Vicente-Chicote, C., Rodriguez, D.: An Ontology-based and Model-driven Approach for Designing IT Service Management Systems. In: Int. Journal of Service Science, Management, Eng. and Techn., 2(2), pp. 65–81 (2011)
6. Valiente, M.-C., Barriocanal-Garcia E., Sicilia, M.-A.: Applying an Ontology Approach to IT Service Management for Business-IT Integration. In: Knowledge-Based Systems, vol. 28, pp. 76–87 (2012)
7. Official Web-site of the Protocol, Ltd. company, http://www.protocolsoftware.com/hp-openview.php
8. Official Web-site of the BMC Software company, http://www.bmc.com/products/remedy-itsm/solutions-capabilities/it-service-management-suite.html
9. Official Web-site of the OTRS Group company, http://www.otrs.com
10. Official Web-site of the SourceForge code repository, http://sourceforge.net
11. Tkachuk M. V., Sokol V.Y.: Some Problems on IT-infrastructure Management in Enterprises: State-of the-Art and Development Perspective. East-European Journal on Advanced Technologies, 48 (6/2), 68–72 (2010) (in Russian)
12. Official Web-site of the Cleverics company, http://www.cleverics.ru/en
13. Saaty, T. L.: Fundamentals of the Analytic Hierarchy Process. RWS (2000).
14. Jabrailova Z. Q.: A Method of Multi-Criteria Ranging for Personnel Management Problem Solution. In: Artificial Intelligent, 56 (4), pp. 130–137 (2009) (in Russian)
15. Valiente, M.-C., Barriocanal-Garcia E., Sicilia, M.-A.: Applying Ontology-Based Models for Supporting Integrated Software Development and IT Service Management Processes. In: IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 42(1), 61–74 (2012)
16. Lopez-Fernandez, H., Fdez-Riverola, L., Reboiro-Jato, M.: Using CBR as Design Methodology for Developing Adaptable Decision Support Systems. University of Vigo, Spain, pp. 123–145 (2011)
17. Prentzas, J., Hatzilygeroudis, I.: Combinations of Case-Based Reasoning with Other Intelligent Methods. Int. J. of Hybrid Intelligent Systems, 55–58 (2009)
18. Rodríguez-García, D., Barriocanal, E., Alonso, S., Nuzzi, C.: Defining Software Process Model Constraints with Rules Using OWL and SWRL. J. of Soft, Eng., Knowl. Eng., 20(4), 533–548 (2010)
19. Prieto, A. E., Lozano-Tello, A.: Use of Ontologies as Representation Support of Workflows. J. Network and Systems Management, 17(3), 309–325 (2009)
20. Kang, D., Lee, J., Choi, S., Kim, K.: An Ontology-based Enterprise Architecture. J. Expert Systems with Applications, 37(2), 1456–1464 (2010)
21. Pansa, I., Reichle, M., Leist, C., Abeck, S.: A Domain Ontology for Designing Management Services. In: Proc. 3d Int. Conf. on Advanced Service Computing, pp. 11–18 (2011)
22. Litvin V.: Multi-Agent Decision Support Systems Based on Precedents that Use of Adaptive Ontology. Artificial Intelligent, 54(2) 24–33 (2009) (in Ukrainian)
23. Chung, H.-S., Kim, J.-M.: Learning Ontology Design for Supporting Adaptive Learning in e-Learning Environment. In: IPCSIT-2012, vol. 27, Singapore, pp. 148–152 (2012)
24. Suh, H., Lee, J.: Ontology-Based Case-Based Reasoning for Engineering Design.
Design Research Group Manufacturing Engineering Lab (2008) 25. Boursas, L.: Efficient Technical and Organizational Measures for Privacy-Aware Campus Identity Management and Service Integration. In: Proc. EUNIS’06, Tartu, Estonia (2006) 26. Knittl, S., Hommel, W.: SERVUS@TUM: User-Centric Service Support and Privacy Management. In: J.-F. Desnos., Y. Epelboin (eds.) EUNIS’07, Grenoble, France (2007) 27. Brooks, P.: Metrics for IT-service Management. Van Haren Publishing (2006) 28. Official Web-site of the International Data Corporation (IDC), http://www.idc.com Refining an Ontology by Learning Stakeholder Votes from their Texts Olga Tatarintseva1,2 and Vadim Ermolayev1 1 Department of IT, Zaporozhye National University, 66 Zhukovskogo st., 69063, Zaporozhye, Ukraine tatarintseva@znu.edu.ua, vadim@ermolayev.com 2 Satelliz, 158 Lenina st., P.O. Box 317, 69057, Zaporozhye, Ukraine Abstract. This paper reports on our experiments evaluating the improvement of OntoElect approach to ontology refinement in the case study with the ICTERI Scope Ontology. OntoElect is based on collecting and assessing the commit- ment of domain knowledge stakeholders for ontological refinement offerings. We report the improvement with respect to the previous results. Our first ex- periment evaluates the change in the quality of ontology due to the involvement of domain knowledge stakeholders in semantic annotation of their papers, com- pared to the previous study in which the annotations were done by knowledge engineers. Our second experiment checks if the result became better after the introduction of the automated term extraction from the full texts of ICTERI pa- pers. Extracted terms are compared to the manual annotations. The results of the experiments verify the proposed ontology changes and are further used for the ICTERI Scope ontology refinement. Keywords. ICTERI Scope ontology, ontology engineering, domain knowledge stakeholder, term mining, evaluation, refinement Key terms. 
KnowledgeEngineeringMethodology, SubjectExpert, SubjectDo- main, Metric, Ontology 1 Introduction Maintaining an ontology in its lifecycle that fits all the requirements of the subject domain stakeholders is a complicated task in ontology engineering which does not have a complete solution so far. The problem is to a large extent in devising a meth- odology for ontology refinement that enables a complete and timely account for those requirements and maps them to the updated revision of the ontology. One complica- tion is that the stakeholders who own the requirements need to be committed to pro- vide their inputs for ontology refinement. Furthermore, those inputs need to be meas- Refining an Ontology by Learning Stakeholder Votes from their Texts 65 ured and applied correspondingly to the utility of their contribution and in a harmo- nized way to ensure the consistency and validity of result. This paper reports on the improvement of our OntoElect approach for iterative on- tology refinement [1, 2]. The approach has been proposed in [1] using the allusion of elections in which different “ontology offerings” compete for the commitment of the pool of the relevant domain knowledge stakeholders being the “electorate”. OntoElect has been basically validated in an experiment reported in [2] where the approach was detailed by offering voting metrics for the ICTERI ontology built and refined itera- tively based on semantic annotation of the pool of papers of ICTERI 2011 conference. The results of our previous experiment suggested several important technical as- pects [2] for improving OntoElect methodology as a whole and the accuracy of our measurements in particular. Some of those aspects have been addressed in the work reported in this paper. Firstly, our previous experiment was based substantially on the manual annotation of papers. A knowledge engineer assigned key terms or suggested missing terms based on her personal interpretation of the abstract of a paper. 
By that we mimicked voting by paper authors while keeping them free of extra annotation effort. The lowlights of this approach to annotation were that:
* Domain knowledge stakeholders (paper authors) were in fact not involved in the workflow and therefore not motivated to be committed to the resulting ontology refinement
* The quality of the semantic annotations we obtained has been perceived as fairly low because: (a) they were done by a knowledge engineer who is not a subject expert with respect to the annotated paper; (b) the source for this work was just an abstract, not a full paper, and its meaning was interpreted by a knowledge engineer.

To overcome those shortcomings we first decided to involve the authors more actively by requesting that they themselves semantically annotate their submissions to ICTERI 2012 (footnote 1). It has also been considered promising to refine the approach by automated extraction of terms from the papers authored by our subject domain knowledge stakeholders. Here we present the results of our experiments which checked how these two refinements helped to improve the quality and adequacy of the ICTERI Scope ontology.

Further, the document corpus used in the previous experiment was fairly small for assuring reliable judgements about the opinion of the stakeholder community. To improve on that we continued the collection of ICTERI papers, which has been extended by all papers of ICTERI 2012.

We first repeat the previous experiment [2], this time based on the document corpus of ICTERI 2012 papers semantically annotated by their authors. We then focus on answering the question about annotation quality by: (i) performing automated term extraction from the full texts of our complete document corpus (ICTERI 2011 and 2012); and (ii) comparing the results of automated term extraction to the outputs of manual semantic annotation.

1 See http://isrg.kit.znu.edu.ua/icteriwiki/index.php/ICTERI-Terms

The remainder of the paper is structured as follows. Section 2 briefly reviews the related work in relevant fields. Section 3 outlines the OntoElect approach to ontology refinement and presents the case study dealing with the ICTERI Scope Ontology as well as the document corpus at our disposal. Section 4 sets up our experiments by describing the workflow, evaluation metrics, and used tools. Section 5 presents and discusses the results of our experiments. The paper is then concluded and our plans for the future work are outlined.

2 Related Work

One of the possible ways to check if a conceptualization of a domain is correct and complete is to evaluate the model against the interpretation of the meaning of a representative set of relevant documents. The document corpus will be relevant and representative if it covers the majority of the views of the domain knowledge stakeholders. Their interpretations may be collected and further analysed for refining the ontology using different techniques which may be sought in several areas of research and development. In this section we briefly outline the related work in the relevant fields of research and refer to our previous publication [1] for a more in-depth and detailed coverage.

One of the popular relevant research areas studying how interpretations are collected is collaborative or social tagging and annotation. A good survey of the field is [3], where the use of tags for different purposes and the associated shortcomings are analysed. Semantic annotation and tagging approaches further refine social tagging techniques by offering collections of terms that are taken from taxonomies, folksonomies, or thesauri [4]. Hybrid approaches for collaborative tagging and annotation aiming at the enrichment of seed knowledge representations by a user community are reported for example in [5].
One of the promising approaches focused, besides collecting interpretations or subjective conceptualizations, on motivating more people to take part in developing or refining ontologies is offering a game with a purpose to intended users. Following this approach, ontology development or refinement can be implicitly embedded in game software, where ontology elements are created, updated, and validated implicitly in the background [6]. The gaming approach has also been tried for evaluating how well ontological specifications fit the interpretations of random users (FACTory Game by Cycorp, http://game.cyc.com/). Several game scenarios have been developed [5] for ontology building and refinement, ontology matching, and annotating content using lightweight ontologies. Those are similar to our OntoElect approach. Both approaches offer possibilities to identify when users start to agree on and share commitment to certain ontological items.

Social and gaming approaches that involve the direct participation of human stakeholders are complemented by the plethora of research results in automated knowledge extraction or ontology learning. This strand of research involves the stakeholders indirectly, through making use of their professional outputs, like authored texts. A comprehensive survey of the techniques used to learn ontologies from texts is [7]. In the second experiment we present in this paper only term extraction using the TerMine tool [8] has been performed.

Yet one more important aspect in developing or refining an ontology is the re-use of other ontologies or their parts most relevant to the developed ontology. In this context the ontology meaning summarization approach [#] makes good sense for helping an ontology engineer choose the most relevant and valuable parts for re-use.
The approach is based on detecting the “key concepts” of an ontology under analysis which best characterize its meaning. The key concepts are determined using a combination of criteria from lexical statistics, taxonomy graph analysis, and popularity based on a number of hits. Especially in using the popularity and coverage metrics, this approach coincides well with our approach (OntoElect). OntoElect is however used not for summarizing but for refining an ontology based, among other things, on assessing the coverage and popularity of the Key Terms. Besides that, the mechanisms of obtaining the measures are different. On the other hand, OntoElect does not yet consider ontology re-use as one of the important mechanisms for refinement. Hence, combining some features of [9] in OntoElect may be enriching.

3 ICTERI Case Study

The idea of the OntoElect approach [1] was inspired by public election campaigns. Just as the leader in a public election campaign gets the major part of the electorate’s commitment to win, the extent of the domain knowledge stakeholders' commitment hints at the quality and completeness of the ontology. Following this allusion, the votes of the domain knowledge stakeholders for alternative ontology offerings are collected and used as the measure of their commitment. The ontology offering that collects the biggest share of votes could therefore be considered as the best and most complete.

In our case study the OntoElect approach is applied for refining the ICTERI Scope Ontology in the iterative ontology engineering experiment. Our domain knowledge stakeholders are the authors of ICTERI papers. Ontology offerings in the reported work are the structural contexts (footnote 2) in the five thematic areas of the ICTERI scope offered to the authors for choosing the appropriate key terms to annotate their papers. As this data had to be selected we decided to simulate the opinions of the electorate by annotating the papers of ICTERI 2011 manually.
For this we extracted, where possible, the terms which were specified in the list of ICTERI Key Terms. In some cases we had to add Missing Concepts (also called Missing Key Terms) for the papers, if such terms did not exist in the list. So, we received three semantic annotation types:
* KeyWord: for the key words which were selected by the authors
* KeyTerm: for the terms which were selected manually and were found in the list of the ICTERI terms
* MissingConcept: for the terms which did not exist in the list of the ICTERI terms, but were identified during annotation

2 A structural context, as suggested e.g. in [10], is composed of a central concept with all its domain and object properties and the concepts connected to the central concept by these object properties.

One use of a particular term was considered to be one vote for the selected term. The votes were normalized as frequencies of use. Such information allowed us to measure the popularity of each semantic context, circumscribe the most frequently demanded part of the ontology, and make suggestions about the completeness of the ontological offerings.

For the papers of ICTERI 2012 we requested that the authors annotate their papers not only using the freely chosen key words, but also using the terms found in the ICTERI Scope ontology. As a result the corpus for further analysis was increased. The data provided by the authors can be accepted as more authentic than the data we obtained ourselves, as the authors are the real domain experts for the field they study. The analysis of the received data is presented in Section 5. But even when we use both the information presented by the authors and the results of our manual annotation, we cannot guarantee that this information is accurate enough for applying it to the ontology refinement process.
For the experiment we needed results obtained in different ways, because we wanted to achieve an impartial assessment of the OntoElect approach. Before applying the results in the ontology refinement process we decided to check them with a freely available tool for text mining. For our experiment we chose one of the services provided by the National Centre for Text Mining (NaCTeM). As reported on the official website of NaCTeM (footnote 3), it is the first publicly-funded text mining centre in the world. It provides text mining services in response to the requirements of the UK academic community. NaCTeM is operated by the University of Manchester.

4 Experimental Set-up and Tools

To control the results of the experiment we have to understand which main questions we are going to answer after its realization and how to measure these results. Our measurable objectives for the experiment have been formulated as follows [2]:
* Does the ontology fit the requirements of the subject experts in the domain?
The fitness of the ontological offering will be measured as the ratio of the average frequency of use of the available Key Terms (positive votes) to the average frequency of use of the missing Key Terms (negative votes). Special attention will be paid to the freely chosen key words that are identical to the available Key Terms. Those will be considered as extra positive votes for the semantic context of the Key Term.
* Is there a particular part in the ontology that is the most important for the stakeholders?
The importance of an ontology fragment comprising particular concepts will be measured as the frequency of use of these concepts (positive votes). Fragments of different importance will also be presented as percentiles.

3 Official web site of National Centre for Text Mining http://www.nactem.ac.uk/

* Is there a part in the ontology that could be dropped as the stakeholders do not really require it?
Similarly to importance, these ontology fragments will be outlined using low frequency of use percentiles.
* What would be a most valuable addition to the ontology that will substantially improve stakeholders’ commitment to it?
The papers have been annotated using missing Key Terms and freely chosen keywords. Those missing Key Terms that are frequently used will form the core of this effective extension. If some of the keywords are also used frequently by the authors they may become good candidates for the inclusion in the effective extension as well. Special attention will be paid to the freely chosen key words that are identical to the missing Key Terms. Those will reinforce the votes on the addition to the ontology.

The flow of activities has been organized in three consecutive phases as presented in Fig. 1. The detailed description of each phase is presented in [2].

[Figure: workflow diagram in three phases. Phase 1: submissions data from the conference management system (EasyChair) are parsed, given Wiki mark-up, and imported to the ICTERI Wiki as paper pages. Phase 2: papers are annotated using ICTERI Scope Ontology concepts, and missing concepts and free key words are collected. Phase 3: the votes are analysed using SMW queries. The legend distinguishes activities performed using the developed tool from activities performed manually.]

Fig. 1. The workflow for processing ICTERI papers and collecting stakeholders’ votes repeated according to [2]

At Phase 1 we extracted the semi-structured information about the papers accepted for ICTERI 2012 and transformed it into the collection of paper articles in the ICTERI Wiki. At Phase 2 we extracted the freely chosen KeyWords and the KeyTerms from the ICTERI Scope ontology assigned by the authors and added these to the semantic annotations of the papers.
In several cases we detected considerable meaning gaps between the extracted key words and Key Terms when we annotated the papers manually. Therefore, we opted to add the missing Key Terms to the corresponding semantic annotations. As a result of this Phase the semantic relationships between the pages of Category:Paper and the pages of Category:Concept have been specified as semantic properties. These semantic properties allowed us to obtain all the measurements planned for the evaluation experiment. These measurements have been done using different Semantic MediaWiki queries at Phase 3.

Compared to the previous year's experiment [2], we automated the extraction of the frequency of use statistics, which made the process faster and less error-prone. For that the SMWAskAPI (footnote 4) extension of the Semantic MediaWiki has been used. This extension supports semantic #ask queries and enables the use of the corresponding API for executing Semantic MediaWiki ask queries.

Each page of the ICTERI Wiki uses semantic tagging. An example of the semantic properties specified for pages in Category:Paper is given in Fig. 2. For our analysis we used the pages from Category:Paper and Category:Workshop with the property hasPublicationYear equal to 2012, namely the values of the semantic properties hasKeyWord, hasKeyTerm, and MissingConcept.

Fig. 2. Semantic properties for the ICTERI Wiki page in the Category:Paper

The scripts for analyzing these values were coded in Python. Some steps were also implemented using shell scripting. As outputs we have received:
* The list of KeyWords, KeyTerms, and MissingConcepts for each article, if they were defined
* The overall number of the papers according to the values of the property hasPublicationYear and the selected Category
* The number of occurrences of each KeyWord, KeyTerm and MissingConcept.

The analysis and discussion of the results are given in Section 5.
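The counting step behind these outputs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual scripts: the function names `frequency_of_use` and `fitness`, the sample records, and the rule that a term counts at most once per paper are hypothetical. The dictionary keys mirror the semantic properties named above, and the fitness ratio follows the first measurable objective of this section.

```python
from collections import Counter


def frequency_of_use(papers, prop):
    """Return {term: percentage of papers whose annotation `prop` contains the term}."""
    counts = Counter()
    for paper in papers:
        # Assumption: one paper casts at most one vote per term.
        counts.update(set(paper.get(prop, [])))
    return {term: 100.0 * votes / len(papers) for term, votes in counts.items()}


def fitness(papers):
    """Ratio of the average frequency of available Key Terms to that of missing ones.

    Raises ZeroDivisionError if no paper carries a MissingConcept annotation.
    """
    positive = frequency_of_use(papers, "hasKeyTerm").values()
    negative = frequency_of_use(papers, "MissingConcept").values()
    return (sum(positive) / len(positive)) / (sum(negative) / len(negative))


# Hypothetical annotations of four papers (not data from the ICTERI Wiki).
papers = [
    {"hasKeyWord": ["Ontology", "E-learning"], "hasKeyTerm": ["SubjectExpert", "Metric"]},
    {"hasKeyWord": ["Ontology"], "hasKeyTerm": ["SubjectExpert"],
     "MissingConcept": ["OntologyEngineering"]},
    {"hasKeyWord": ["Ontology", "Verification"], "hasKeyTerm": ["Metric", "SubjectExpert"]},
    {"hasKeyWord": ["E-learning"], "hasKeyTerm": ["TeachingProcess"]},
]

keyword_frequency = frequency_of_use(papers, "hasKeyWord")
print(keyword_frequency["Ontology"])  # share of papers using the key word, in per cent
print(fitness(papers))                # values above 1 mean positive votes dominate
```

In a real run the `papers` list would be filled from the results of the Semantic MediaWiki ask queries rather than written by hand.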
To perform our second experiment we applied the Term Management System named TerMine, which identifies key phrases in text. It uses C-value [8], a domain-independent method for automatic term recognition (ATR) which combines linguistic and statistical analyses with the emphasis on the statistical part. The linguistic analysis enumerates all candidate terms in a given text by applying part-of-speech tagging, extracting word sequences based on adjectives/nouns, and applying a stop-list. The statistical analysis assigns a termhood to a candidate term by using the following four characteristics:
* The occurrence frequency of the candidate term
* The frequency of the candidate term as part of other longer candidate terms
* The number of these longer candidate terms
* The length of the candidate term

4 See the description of the SMWAskAPI on http://www.mediawiki.org/wiki/Extension:SMWAskAPI

The data corpus for this term extraction and analysis was the merge of the pools of ICTERI 2011 (footnote 5) and ICTERI 2012 (footnote 6) papers published in the respective proceedings, and consisted of 63 papers. The papers from both proceedings volumes have been merged in a single file and uploaded for processing by TerMine. The workflow for the second experiment is pictured in Fig. 3.

[Figure: the ICTERI 2011 and ICTERI 2012 proceedings are merged into a single ICTERI proceedings file, which is uploaded for term extraction using TerMine; the output is the statistical results for further analysis.]

Fig. 3. The workflow for conducting the second experiment for term mining and analysis

The data processed in the pipeline is illustrated by the example of a single paper [1] in Fig. 4. The results of term mining were provided in several forms. All the terms defined in the text were highlighted by colour markings (upper part of Fig. 4a). The information about the overall number of the terms mined from the text was also given (433 terms listed, at the bottom of Fig. 4a).
The terms were also presented in the table view, each preceded by the assigned rank number and followed by the statistical score measure (lower part of Fig. 4a). The rank of a term means its position in the table sorted by the score; the rank values of the terms with the same score are equal. The scores were computed automatically using the Term Recognition technique [8] which uses the information about the frequencies of term occurrence.

This approach is essentially a shallow bag of terms extraction technique; therefore the output needs to be post-processed, as described using our single paper data example. For this example the number of extracted terms was 433, which is obviously too many. To compare, the authors were advised to assign 3-5 KeyTerms to their papers which best describe their meaning. Among the extracted terms that we needed to sort out were also names, affiliations, cities, etc., which had no semantic relationship to the meaning of the paper. Also, it has been assumed that the terms with a low number of occurrences in text have a negligible semantic contribution and may also be filtered out, so only the higher ranked part of the term list may be considered.

5 http://ceur-ws.org/Vol-716/
6 http://ceur-ws.org/Vol-848/

{| class="wikitable"
! Rank !! Term !! Score !! TimesInText
|-
| 1 || ontology engineering || 25.85 || 27
|-
| 2 || knowledge representation || 14.90 || 16
|-
| 3 || subject expert || 11 || 12
|-
| 4 || intended user || 10 || 12
|-
| 5 || ontology element || 7 || 7
|-
| 6 || psi suite || 6 || 6
|-
| 6 || stakeholder commitment || 6 || 6
|-
| 6 || semantic technology || 6 || 7
|-
| 9 || active involvement || 5 || 5
|-
| 9 || semantic web || 5 || 5
|-
| 9 || ontological context || 5 || 5
|}

Fig. 4. An example of the data processed in the term mining experiment: (a) statistical results presented by TerMine (433 terms listed); (b) the 11 high-ranked terms listed after purifying the term list (tabulated above); (c) KeyWords and KeyTerms selected by the authors, used for the comparative analysis of the achieved results

While post-processing the list of mined terms we decided to keep only the terms which were used at least 5 times and had a score of at least 5 points. Applying this threshold returned 11 of 433 terms for the example outlined in Fig. 4, which constitutes only 2.54 per cent of the overall number of the mined terms. The manual check of the example however indicates that these 11 high ranked terms indeed contribute most significantly to describing the semantics of the corresponding paper (Fig. 4b).

The right part of Fig. 4 allows comparing the result of term extraction (Fig. 4b) with the output of manual semantic annotation (Fig. 4c) for the selected example paper. A mechanical comparison reveals a substantial difference, which however is not that big after manual mapping of the extracted terms to the concepts of the ICTERI Scope ontology. In fact there is a subset of extracted terms that could be directly mapped into the Key Terms of the ontology: subject expert; ontology engineering (as a methodology). Another group is relevant to the assigned KeyWords: ontology, stakeholder commitment, ontology engineering. Some are synonymic in the context of this paper: stakeholder and intended user. Some represent a meaning which is too fine-grained for a semantic annotation: ontology element. And, which is most important, some are new valid candidates for the inclusion into the ICTERI Scope ontology: knowledge representation, semantic technology, semantic web.

5 Results and Discussion

In this section we present the results of the experiment. The set-up of all its stages is described in Section 4. The discussion of the experiment results is structured along the measurable items.

The frequency of use diagrams (Fig. 5 and 6) are built in regard to the total amount of the papers and the number of occurrences of a particular term.
Similar work was reported in our previous publication [2], in which we described the mechanism of ranging. The current experiment is based on the previous results. These results have however changed, as the document corpus we are working with has grown and the data to work with has changed. As we consider the OntoElect approach in the case study of iterative refinement of the ICTERI Scope Ontology, such changes are highly important. The diagrams which show these changes are given:
* For the KeyWords which were selected by the authors manually (only those which were chosen by at least two authors, Fig. 5)
* For the KeyTerms which were selected from the ICTERI ontology terms (Fig. 6)

[Figure: horizontal bar chart of UsedKeyWord against FrequencyOfUse, % (axis from 0 to 10), with the series ICTERI 2011 KeyWord and ICTERI 2012 KeyWord.]

Fig. 5. The frequency of use of the freely chosen KeyWords

We did not provide a frequency of use diagram comparing the Missing Key Terms because the difference in the results of 2011 and 2012 is tiny and could be neglected. We provided the comparison analysis for the KeyWords and Missing Key Terms lists (Fig. 7). To compute the range of use of each term we divided its frequency of use by the frequency of use of the most popular term, whose range of use was taken as 100 per cent. These terms are not a part of the ontology, but are the most probable candidates.

[Figure: horizontal bar chart of UsedKeyTerm against FrequencyOfUse, % (axis from 0 to 28), with the series KeyTerm 2011 and KeyTerm 2012.]

Fig. 6. The frequency of use of available KeyTerms

[Figure: horizontal bar chart of Term against Range, % (axis from 0 to 100), with the series KeyWord and MissingConcept.]

Fig. 7. The range of use for Missing Key Terms and KeyWords

The results which we received at the previous step can be merged to obtain the new version of the potential ontology offering. But before doing this we will look into the results of the text mining experiment.

Using the TerMine tool for automating the knowledge mining process we received some interesting results. The overall number of the found terms was 8487. All of the selected terms were ranked and received a position in the rank table. Studying the results, it is obvious that the number of the terms selected by the data mining tool is too big. It was decided to leave in the rank table only those terms which have a score of 10 or more. The popularity of the terms which were picked up is evident. The total number of such terms is 157, which makes up less than 2% of all the terms proposed by the tool. After refining the list and deleting the superfluous information only 140 terms were left.

To compare the frequencies of use provided by TerMine and our own calculations we decided to use the percentage method (similar to that used for building the diagram in Fig. 7). We took the maximal value for each group of concepts as 100 per cent and divided the frequency of use value of each particular term by it. As a result each term got a value, called the range, which could be compared with the ranges of the other terms. We analyzed the three groups of mined term matches to the: (i) KeyWords; (ii) MissingConcepts; and (iii) KeyTerms. As the number of the Missing Concepts is not too big we decided to combine them with the Key Words in the diagram (Fig. 8).
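The score threshold and the percentage ("range") normalization described above amount to a few lines of Python. The sketch below is illustrative only: the function names and the sample frequencies are assumptions, and the scores reuse the five top rows of the single-paper example of Fig. 4 rather than the corpus-level results.

```python
def keep_high_scored(scored_terms, min_score=10.0):
    """Keep (term, score) pairs whose score is at least `min_score`."""
    return [(term, score) for term, score in scored_terms if score >= min_score]


def range_of_use(frequencies):
    """Rescale frequencies so that the most frequent term gets 100 per cent."""
    top = max(frequencies.values())
    return {term: 100.0 * value / top for term, value in frequencies.items()}


# Scores of the five top-ranked terms from the single-paper example (Fig. 4).
scored = [
    ("ontology engineering", 25.85),
    ("knowledge representation", 14.90),
    ("subject expert", 11.0),
    ("intended user", 10.0),
    ("ontology element", 7.0),
]
kept = keep_high_scored(scored)

# Hypothetical frequencies of use for one group of terms.
ranges = range_of_use({"Ontology": 8.0, "Verification": 4.0, "Grid": 2.0})

print(kept)
print(ranges)
```

Because every group is rescaled against its own maximum, ranges computed for tool-extracted terms and for manually assigned terms become comparable on the same 0 to 100 axis, even though the underlying counts come from different sources.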
[Figure 8: bar chart of Term against Range, % (scale 0 to 100); series: TermByTool, KeyWord, MissingConcept.]

Fig. 8. The range of use for terms detected by tool, Key Words and Missing Concepts. Range values were normalized by the frequency of use of the highest scored extracted term (100)

After the automated search for identical terms in the list of KeyTerms and the list of terms mined by the tool, we discovered that some matches were missed, because the search used neither the rules of common sense nor the relations described in the ontology. For example, according to the ICTERI Scope ontology, the term Integration is subsumed by PSI-ULO:Process. Knowing this fact, we understand that the term IntegrationProcess means just the same as the term Integration. But this match is not obvious to a simple search and will not be detected. Therefore, to find the matches between the KeyTerms and the terms mined by the tool, we undertook a more careful analysis: we manually scanned the list of KeyTerms for matches with the ToolTerms. Moreover, for this process we used the whole pool of 8487 terms mined by the tool. The result is pictured in the diagram (Fig. 9).
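The matching described above was performed manually. As a hedged illustration of how one such rule might be automated, the sketch below treats a mined term as a match for a KeyTerm when it equals the KeyTerm, or the KeyTerm followed by the local name of one of its superconcepts (as in IntegrationProcess versus Integration). The function names and the `superconcepts` dictionary are hypothetical; only the Integration example comes from the text.

```python
def local_name(concept):
    """Strip a namespace prefix such as 'PSI-ULO:' from a concept name."""
    return concept.split(":")[-1]


def match_key_term(mined_term, key_terms, superconcepts):
    """Return the KeyTerm matched by a mined term, or None if there is none.

    A mined term matches KeyTerm K if it equals K, or if it equals K followed
    by the local name of a superconcept of K (e.g. Integration + Process).
    """
    for key_term in key_terms:
        if mined_term == key_term:
            return key_term
        for parent in superconcepts.get(key_term, ()):
            if mined_term == key_term + local_name(parent):
                return key_term
    return None


# Hypothetical fragment of the ontology: Integration is subsumed by PSI-ULO:Process.
key_terms = ["Integration", "Grid"]
superconcepts = {"Integration": ["PSI-ULO:Process"]}

found = match_key_term("IntegrationProcess", key_terms, superconcepts)
```

A plain string comparison would miss this pair, while the subsumption-aware check maps IntegrationProcess to Integration. Real term lists would need further normalization (case, plural forms, word order), which this sketch does not attempt.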
[Figure 9: bar chart of Term against Range, % (scale 0 to 100); series: TermByTool, KeyTerm.]

Fig. 9. The range of use for the KeyTerms and the terms extracted by the TerMine tool

6 Concluding Remarks and Future Work

The paper reported on an experiment evaluating an improvement of the OntoElect approach to ontology engineering in the case study with the ICTERI Scope Ontology. In particular, the approach has been used to evaluate the validity of the papers' annotations by involving knowledge stakeholders in annotating their own papers and by assessing the quality of these annotations using votes learned from the full texts.

The experiment consisted of two stages. The first one was based on the comparative analysis of the papers presented at the international conferences ICTERI 2011 and ICTERI 2012.
The second one was dedicated to performing automated term extraction from the full texts and comparing its results to the outputs of manual semantic annotation.

The achieved results highlight the important parts of the ontology and those which are less popular among the authors. The comparative analysis in the first experiment shows how the situation changed over two years. The KeyWords and MissingConcepts which have high frequency of use values, especially if they are named in both lists, are the first candidates to become new parts of the ontology.

The second experiment shows which ontological offerings agree with the terms mined by the tool and which numeric characteristics these matches have. The terms extracted by the tool whose matches with the KeyWords and MissingConcepts have a range of more than 50 per cent are also good candidates to be added to the ontology. Besides, the degree to which the extracted terms match the KeyWords and KeyTerms indicates the adequacy of paper annotation. Overall, the overlap between the meanings of the extracted terms and the KeyTerms measures, so to say, the similarity between the meanings of the papers within the corpus and the ontological offerings aimed at covering these meanings.

The quantitative results of our experiments still need to be processed and analyzed more thoroughly before deciding on the implementation of changes to the ontology. Besides that, several other aspects still need to be researched in our future work.

Firstly, the document corpus used in the case study, though growing, is still not big enough to allow robust application of the majority of traditional knowledge extraction techniques. At the moment it can only be stated that the information we have now is enough to prove the concept, i.e. the validity of the approach based on assessing and accounting for the opinions of domain knowledge stakeholders, which implicitly reflect their needs.
After applying the refinements suggested by the stakeholders, the ontology still needs to be evaluated and validated using other methods.

Secondly, in this paper we reported on only a partial and shallow way of extracting knowledge from paper texts. A possible refinement of this preliminary solution could be sought in using a hybrid iterative knowledge extraction workflow that incrementally adds ontology elements to the "ontology learning layer cake" (cf. [11]).

References

1. Tatarintseva, O., Ermolayev, V., Fensel, A.: Is Your Ontology a Burden or a Gem? – Towards Xtreme Ontology Engineering. In: Ermolayev, V. et al. (eds.) Proc. 7-th Int. Conf. ICTERI 2011, Kherson, Ukraine, May 4-7, 2011, CEUR-WS.org, vol-716, ISSN 1613-0073, 65–81, online (2011)
2. Tatarintseva, O., Borue, Yu., Ermolayev, V.: OntoElect Approach for Iterative Ontology Refinement: a Case Study with ICTERI Scope Ontology. In: Ermolayev, V. et al. (eds.) Proc. 8-th Int. Conf. ICTERI 2012, Kherson, Ukraine, June 6-10, 2012, CEUR-WS.org, vol-848, ISSN 1613-0073, 244, online (2011)
3. Gupta, M., Li, R., Yin, Z., Han, J.: Survey on Social Tagging Techniques. SIGKDD Explorations 12(1), 58–72 (2010)
4. Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas-Vera, M., Motta, E., Ciravegna, F.: Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Science. Services and Agents on the World Wide Web 4(1), 14–28 (2006)
5. Hunter, J., Khan, I., Gerber, A.: HarvANA – Harvesting Community Tags to Enrich Collection Metadata. In: Paepcke, A., Borbinha, J., Naaman, M. (eds.) 8th ACM/IEEE-CS Joint Conference on Digital Libraries, 147–156. ACM New York, New York (2008)
6. Siorpaes, K., Hepp, M.: Games with a Purpose for the Semantic Web. IEEE Intelligent Systems 23(3), 50–60 (2008)
7. Wong, W., Liu, W., Bennamoun, M.: Ontology learning from text: A look back and into the future. ACM Comput.
Surv., 44(4), Article 20, 36 pages. DOI=10.1145/2333112.2333115 (September 2012)
8. Frantzi, K., Ananiadou, S., Mima, H.: Automatic recognition of multi-word terms. Int. J. of Digital Libraries 3(2), 117–132 (2000)
9. Peroni, S., Motta, E., d'Aquin, M.: Identifying Key Concepts in an Ontology, through the Integration of Cognitive Principles with Statistical and Topological Measures. In: Proc. 3rd Asian Semantic Web Conference (ASWC 2008), Dec 08-11, 2008, Bangkok, Thailand (2008)
10. Ermolayev, V., Copylov, A., Keberle, N., Jentzsch, E., Matzke, W.-E.: Using Contexts in Ontology Structural Change Analysis. In: Ermolayev, V., Gomez-Perez, J.-M., Haase, P., Warren, P. (eds.) CIAO 2010, CEUR-WS, vol. 626 (2010)
11. Wong, W., Liu, W., Bennamoun, M.: Ontology Learning from Text: a Look Back and into the Future. ACM Comput. Surv., 44(4), Article 20, 36 p., http://doi.acm.org/10.1145/2333112.2333115 (2012)

Answering Conjunctive Queries over a Temporally-Ordered Finite Sequence of ABoxes Sharing one TBox

Natalya G. Keberle *

Zaporozhye National University, Dept. of Information Technologies, 66, Zhukovskogo str., 69063, Zaporozhye, Ukraine
nkeberle@gmail.com

Abstract. Ontology-based data access (OBDA) assumes that data in a database are mediated with a conceptual layer, available for clients and hiding data storage details. Ontologies are good candidates for presenting such a conceptual layer, whereas databases are good for storing huge amounts of data. One of the interesting applications of OBDA is checking a finite set of constraints defined in some language against a temporally-ordered sequence of ABoxes sharing one TBox, where each constraint is considered as a conjunctive query. We present an algorithm for conjunctive query answering in such a language and prove its termination, soundness and completeness.

Keywords. Ontology-based data access, temporal conjunctive query language, description logic knowledge base

Key terms.
KnowledgeEvolution, KnowledgeManagementProcess, DecisionSupport

1 Introduction

Ontology-based data access (OBDA) [1] assumes that data in a database are mediated with a conceptual layer, available for clients and hiding data storage details. Ontologies are good candidates for presenting such a conceptual layer, whereas databases are good for storing huge amounts of data. The benefits of combining databases and knowledge bases are as follows:

– database management is a mature field of research, and there are a lot of commercially and freely available DBMSs showing good-to-excellent performance on large datasets. A database is an obvious place to store the assertional part of a knowledge base, i.e. an ABox;
– a TBox often requires reasoning support to deduce additional assertions and axioms and to check the consistency of a knowledge base.

* The work was done during the research visit to Dresden University of Technology, sponsored by the Ministry of Education and Science, Youth and Sport of Ukraine

At the same time, employing such an approach is rather challenging due to significant differences between relational database systems and ontology languages based on Description Logics, such as OWL. First, relational databases adopt a closed-world semantics, i.e. all facts that are not explicitly stated to be true are assumed to be false. In contrast, OWL is based on an open-world semantics, which does not require one to fix the truth value of every fact and is more similar to an incomplete database. Second, relational databases are unaware of the intensional part of a knowledge base (called a TBox).

Research done so far in the OBDA field considers only one ABox stored in a data source, that is, the current set of assertions about individuals and their pairs. However, real applications show that an ABox changes over time.
Examples of such dynamic systems can be easily found in practice: environmental conditions, air traffic load, computer system load and performance, health control for people suffering from serious diseases. Therefore, in some applications of situation awareness [2], there is a need to store an archive of ABoxes, keeping the ABoxes that were current at different time points. Temporal logics are often used as the means to formulate constraints that a dynamic system should obey during its work.

The main results of the paper are:

– for a point-based linear finite time structure, a language of unions of temporal conjunctive queries is elaborated, which allows evaluating atemporal unions of conjunctive queries at different time points;
– an algorithm for answering a union of temporal conjunctive queries is proposed, which harnesses set-theoretic operations on the answer sets of atomic queries. Its termination, soundness and completeness are proved.

The paper is organized as follows: Sections 2 and 3 introduce a language of unions of temporal conjunctive queries and present the definitions of the main reasoning tasks available for such a query language. Section 4.1 presents the algorithm for answering temporal unions of conjunctive queries and illustrates it with examples. Section 4.2 is dedicated to the proofs of the logical properties of the algorithm. Section 5 discusses the related work in the field of conjunctive query answering.

2 Conjunctive Queries: Syntax, Semantics

Assume there is a knowledge base $\mathcal{K} = (\mathcal{A}, \mathcal{T})$, where $\mathcal{T}$ is a set of concept axioms (a TBox) and $\mathcal{A}$ is a set of assertional axioms (an ABox). We fix the language of the knowledge base to $\mathcal{ALC}$ [3]. An interpretation of $\mathcal{K}$, named $\mathcal{I}$, is a pair $(\cdot^{\mathcal{I}}, \Delta)$, where $\Delta$ is a domain of individuals obeying the unique name assumption (UNA) and $\cdot^{\mathcal{I}}$ is an interpretation function, which assigns to every concept $C$ a set $C^{\mathcal{I}} \subseteq \Delta$, to every atomic role $R$ a binary relation $R^{\mathcal{I}} \subseteq \Delta \times \Delta$, and to every individual name $a$ an individual $a^{\mathcal{I}} \in \Delta$.
Assertional axioms are $C(a)$ (concept assertion) and $R(a, b)$ (role assertion).

Query answering is an extension of the well-known task of instance checking: given a knowledge base $\mathcal{K}$ and an assertion $\alpha$, check whether this assertion is entailed by an ABox of $\mathcal{K}$.

2.1 Conjunctive Queries Basics

Let $Vars(Q)$ be the set of all distinguished and non-distinguished variables which appear in a query $Q$, let $Inds(Q)$ denote the set of all individual names which appear in $Q$, and let $Terms(Q)$ denote the set of all terms in $Q$, i.e. $Vars(Q) \cup Inds(Q)$. Let us formally define conjunctive queries and Boolean conjunctive queries for the well-elaborated language $\mathcal{ALC}$ [3].

Definition 1 (Conjunctive query, Union of conjunctive queries). Let $x$, $y$, $c$ be, respectively, tuples of distinguished variables (answer variables), of non-distinguished variables and of individual names, and let $t, t_1, t_2$ be terms in $Terms(Q)$. A conjunctive query (CQ) is an expression of the form

$$conj(x, c) = \exists y.\, q_1 \wedge \ldots \wedge q_m, \quad \text{where } q_i ::= C(t) \mid r(t_1, t_2)$$

A Boolean conjunctive query is a CQ without answer variables. A union of conjunctive queries (UCQ) is a disjunction of conjunctive queries (CQs) of the form

$$Q(x) = \{x \mid conj_1(x, c) \vee \ldots \vee conj_n(x, c)\}$$

Example 1. A query asking about all students that attend some courses and take some exams could be as follows:

$$Q(x) = \{x \mid \exists y.\, takeCourse(x, y) \wedge takeExam(x, y)\}$$

This query can be turned into a Boolean query by substituting $x$ with an individual name:

$$Q = \{\ \mid \exists y.\, takeCourse(\text{"Eldora"}, y) \wedge takeExam(\text{"Eldora"}, y)\}$$

We use $|Q|$ to denote the size of $Q$, i.e. the number of symbols required to build the query. The arity of a query is the number of answer variables in the query. If all terms in $Q$ are individual names, we say $Q$ is ground. We write $Q(c)$ for a query whose answer variables $x$ are substituted by $c$, $Q(x)$ for a conjunctive query and simply $Q$ for a Boolean conjunctive query.
Sometimes we write $x_1, \ldots, x_n$ instead of $x$, and similarly for $y$ and $c$.

Given an $\mathcal{ALC}$-knowledge base $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$, an interpretation $\mathcal{I}$ satisfies a query $Q(x)$ iff the interpretation function can be extended to the variables in $Q(x)$ in such a way that $\mathcal{I}$ satisfies every atom in $Q(x)$. A query $Q(x)$ is true w.r.t. $\mathcal{K}$ (written $\mathcal{K} \models Q$) iff every interpretation that satisfies $\mathcal{K}$ also satisfies $Q$.

Definition 2 (Query answering, query entailment). Given a query $Q(x)$ with a tuple of answer variables $x$, and a knowledge base $\mathcal{K}$, a tuple of individuals $c$ of the same arity as $x$ is an answer for $Q$ in $\mathcal{K}$ if $\mathcal{I} \models Q(c)$ for every model $\mathcal{I}$ of $\mathcal{K}$.

Given a Boolean conjunctive query $Q$ and a KB $\mathcal{K}$, query entailment is the task of deciding whether $\mathcal{K} \models Q$, i.e. whether $\mathcal{I} \models Q$ for every model $\mathcal{I}$ of $\mathcal{K}$.

Given a conjunctive query $Q(x)$, a tuple of individuals $a$, and a KB $\mathcal{K}$, query answering is the task of deciding whether $a$ is an answer for $Q(x)$ in $\mathcal{K}$.

3 Temporal Conjunctive Queries

Let $\mathcal{K} = \langle \mathcal{T}, (\mathcal{A}_i)_{0 \le i \le n} \rangle$ be a knowledge base with a sequence of ABoxes sharing one TBox. Let us describe a query language extending the language of conjunctions of positive existential formulae built from query atoms. Having in mind linear temporal logic LTL (see e.g. [4]), this language allows for the following temporal operators: $\bigcirc$ (next), $\bigcirc^{-}$ (previous), $\mathcal{U}$ (until), $\mathcal{S}$ (since).

Definition 3. A temporal conjunctive query (TCQ) $\Psi$ is an expression

$$tconj(x, c) = \exists y.\, q_1 \wedge \ldots \wedge q_m, \quad \text{where}$$

$$\begin{aligned}
q_i &::= \varphi \mid \Psi \\
\varphi &::= C(t) \mid r(t_1, t_2) \\
\Psi &::= \varphi \mid \Psi_1 \wedge \Psi_2 \mid \bigcirc^{-} \Psi \mid \bigcirc \Psi \mid \Psi_1 \,\mathcal{S}\, \Psi_2 \mid \Psi_1 \,\mathcal{U}\, \Psi_2
\end{aligned}$$

and $C$ is a concept description, $r$ is a role name, and $t, t_1, t_2$ are terms in $Terms(Q)$.

Derived temporal modalities like $\Diamond^{-}$ (sometimes in the past), $\Box^{-}$ (always in the past), $\Diamond$, $\Box$ can be defined in the usual way (see, e.g. [4]).

Example 2.
A query asking about students who defended their thesis some time ago and have been ex-matriculated since then is expressed as follows:

$$Q(x) = \{x \mid \exists y.\, \Diamond^{-} Student(x) \wedge exMatriculated(x) \,\mathcal{S}\, hasDefended(x, y)\}$$

The semantics of the TCQ is defined as follows.

Definition 4. A total function $\pi : Terms(\Psi) \to \Delta$ is a binding for a query $\Psi$ in an interpretation $\mathcal{I}$ if $\pi(a) = a$ for all individuals $a \in dom(\pi)$. The validity $\mathcal{I}, \pi \models \varphi$ for an atemporal CQ $\varphi$ is defined inductively:

$$\begin{aligned}
\mathcal{I}, \pi \models C(t) &\text{ iff } \mathcal{I} \models C(\pi(t)) \\
\mathcal{I}, \pi \models r(t_1, t_2) &\text{ iff } \mathcal{I} \models r(\pi(t_1), \pi(t_2)) \\
\mathcal{I}, \pi \models \varphi_1 \wedge \varphi_2 &\text{ iff } \mathcal{I}, \pi \models \varphi_1 \text{ and } \mathcal{I}, \pi \models \varphi_2 \\
\mathcal{I}, \pi \models \exists y\, \varphi &\text{ iff } \exists e \in \Delta : \pi' = \pi[y/e] \text{ and } \mathcal{I}, \pi' \models \varphi
\end{aligned}$$

where the notation $\pi[y/e]$ represents the binding $\pi$ extended with $\pi(y) = e$ if $y$ is not in the domain of $\pi$; otherwise the original value for $y$ is replaced by $e$.

The validity for a TCQ $\Psi$ and a KB $\mathcal{K} = \langle \mathcal{T}, (\mathcal{A}_i)_{0 \le i \le n} \rangle$ is extended as follows:

$$\begin{aligned}
\mathcal{K}, i, \pi \models \varphi &\text{ iff } \forall \mathcal{I} \models_{\mathcal{T}} \mathcal{A}_i .\ \mathcal{I}, \pi \models \varphi \\
\mathcal{K}, i, \pi \models \Psi_1 \wedge \Psi_2 &\text{ iff } \mathcal{K}, i, \pi \models \Psi_1 \text{ and } \mathcal{K}, i, \pi \models \Psi_2 \\
\mathcal{K}, i, \pi \models \bigcirc \Psi &\text{ iff } i < n \text{ and } \mathcal{K}, i + 1, \pi \models \Psi \\
\mathcal{K}, i, \pi \models \bigcirc^{-} \Psi &\text{ iff } i > 0 \text{ and } \mathcal{K}, i - 1, \pi \models \Psi \\
\mathcal{K}, i, \pi \models \Psi_1 \,\mathcal{U}\, \Psi_2 &\text{ iff } \exists k,\ i \le k \le n : \mathcal{K}, k, \pi \models \Psi_2 \text{ and } \forall j,\ i \le j < k : \mathcal{K}, j, \pi \models \Psi_1 \\
\mathcal{K}, i, \pi \models \Psi_1 \,\mathcal{S}\, \Psi_2 &\text{ iff } \exists k,\ 0 \le k \le i : \mathcal{K}, k, \pi \models \Psi_2 \text{ and } \forall j,\ k < j \le i : \mathcal{K}, j, \pi \models \Psi_1
\end{aligned}$$

Consider a binding $\pi$ such that, for every $i$ and every $\mathcal{I} \models_{\mathcal{T}} \mathcal{A}_i$, $\mathcal{I} \models \mathcal{K}$ implies $\mathcal{I} \models \Psi$. If such an evaluation exists, we write $\mathcal{K} \models \Psi$ and say that $\pi$ is a match for $\Psi$ in $\mathcal{K}$. For a tuple of individuals $c_1, \ldots, c_n$ mapped to a tuple of answer variables $x_1, \ldots, x_n$, we say that $c_1, \ldots, c_n$ is a certain answer for $\Psi$ in $\mathcal{K}$ iff $\mathcal{K} \models \Psi[x_1, \ldots, x_n / c_1, \ldots, c_n]$. We denote the set of certain answers for $\Psi$ as $Ans(\Psi)$.

Definition 5. A union of temporal conjunctive queries (UTCQ) $Q(x)$ is a disjunction of temporal conjunctive queries (see Definition 3):

$$Q(x) = \{x \mid tconj_1(x, c) \vee \ldots \vee tconj_n(x, c)\}$$

4 Answering a Union of Temporal CQs Over a Sequence of ABoxes

4.1 Algorithm Answering a Union of Temporal CQs

The idea of answering a UTCQ against a set of ABoxes is to use temporal operators as the means of detecting the time points at which atemporal CQs should be evaluated. Due to the recursive nature of such temporal operators as $\mathcal{S}$ and $\mathcal{U}$, we have to store all the ABoxes and the values of particular CQs depending on the operator. Intuitively, given $\Psi = \bigcirc \phi$ at a time point $i$, $\phi$ is evaluated at the time $i + 1$, and so on.

To be able to combine certain answers obtained from different CQs of one TCQ, let us take a closer look at the nature of certain answers. A certain answer to a CQ $\phi$ is a binding $\pi$ of each $x_i \in x$ (distinguished variables) to some individual name that appears in the KB $\mathcal{K}$, such that in all models of $\mathcal{K}$, $\mathcal{K} \models \phi(\pi(x))$. There could be more than one certain answer for a CQ $\phi$, so further on we consider a set of certain answers for a query $\phi(x)$. The corresponding set of matches for $\phi$ actually produces a $k$-ary relation, where $k$ is the arity of the CQ $\phi$.

A certain answer to a UCQ $\Phi$ is a combination of answers of the CQs in $\Phi$, i.e. $c_1 \cup \ldots \cup c_n$, where $n$ is the number of CQs in $\Phi$. For such a combination there are two possible situations: (i) the disjuncts $\phi_{j_1}, \phi_{j_2}$ in the UCQ $\Phi$ use pairwise disjoint sets of distinguished variables (i.e. there are no common distinguished variables in two arbitrary disjuncts of $\Phi$); (ii) some disjuncts share (some) distinguished variables with each other. To deal with sets of certain answers (which are actually relations), we adopt two operators of relational algebra, namely $\times$ (a cross-product) and $\bowtie$ (a natural join).

The cross-product operator $\times$ [5] is used for the case (i).

Definition 6. Given two bindings $\pi_1 : (x_1, \ldots, x_n) \to \Delta$ and $\pi_2 : (y_1, \ldots, y_m) \to \Delta$, their cross-product $\pi_1 \times \pi_2$ is a binding $\pi : X \to \Delta$, where $x, y$ are free variables that do not have any variables in common, and $X = (x_1, \ldots, x_n, y_1, \ldots, y_m)$.
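The two relational operators adopted above can be sketched over finite sets of bindings as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a binding is modelled as a Python dictionary from variable names to individual names, an answer set as a list of such dictionaries, and the function names and sample individuals are hypothetical.

```python
from itertools import product


def cross_product(bindings1, bindings2):
    """Combine every pair of bindings; the variable sets must be disjoint."""
    result = []
    for b1, b2 in product(bindings1, bindings2):
        if set(b1) & set(b2):
            raise ValueError("cross-product needs disjoint variable sets")
        result.append({**b1, **b2})
    return result


def natural_join(bindings1, bindings2):
    """Combine the pairs of bindings that agree on all shared variables."""
    result = []
    for b1, b2 in product(bindings1, bindings2):
        shared = set(b1) & set(b2)
        if all(b1[v] == b2[v] for v in shared):
            result.append({**b1, **b2})
    return result


# Hypothetical answer sets; z is the variable shared by the two relations.
takes_course = [{"x": "ann", "z": "logic"}, {"x": "bob", "z": "algebra"}]
takes_exam = [{"y": "carl", "z": "logic"}]

joined = natural_join(takes_course, takes_exam)
crossed = cross_product([{"x": "ann"}], [{"y": "carl"}, {"y": "dana"}])
```

Here `joined` keeps only the combination in which `z` is bound to the same individual on both sides, while `crossed` contains one combined binding per pair, since no variable is shared.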
The join operator $\bowtie$ [5] is used for the case (ii) to join two bindings in which the common variables are mapped to the same constant.

Definition 7. Given two bindings $\pi_1 : (x_1, \ldots, x_n, z) \to \Delta$ and $\pi_2 : (y_1, \ldots, y_m, z) \to \Delta$, their join $\pi_1 \bowtie \pi_2$ is a binding $\pi : X \to \Delta$, where $x, y, z$ are free variables and $X = (x_1, \ldots, x_n, y_1, \ldots, y_m, z)$, iff every common variable $z$ is mapped to the same constant $c \in \Delta$.

The corresponding binding for $\Phi$ is: for (i) $\pi = \pi_{\phi_1} \times \ldots \times \pi_{\phi_n}$, and for (ii) $\pi = \pi_{\phi_1} \bowtie \ldots \bowtie \pi_{\phi_n}$.

The following theorems show applications of $\times$ and $\bowtie$ to bindings.

Theorem 1. Given a formula $\Phi = \phi_1 \wedge \phi_2$, where $\phi_1, \phi_2$ are CQ formulas, a binding $\pi = \pi_1 \bowtie \pi_2$ is a match for $\Phi$ iff the bindings $\pi_1, \pi_2$ are matches for $\phi_1, \phi_2$.

Proof. It is true based on the definition of the join operator.

Theorem 2. Given a formula $\Phi = \phi_1 \vee \phi_2$, where $\phi_1, \phi_2$ are CQ formulas, a binding $\pi = \pi_1 \times \pi_2$ is a match for $\Phi$ iff the binding $\pi_1$ is a match for $\phi_1$ or the binding $\pi_2$ is a match for $\phi_2$.

Proof. The $\Rightarrow$ direction is trivial. For the $\Leftarrow$ direction, assume $\pi_1 : (x_1, \ldots, x_n, z) \mapsto \Delta$, $\pi_2 : (y_1, \ldots, y_m, z) \mapsto \Delta$, and they are matches for $\phi_1$ and $\phi_2$. From the nature of disjunction, we know that the formula $\Phi$ is satisfiable if either $\phi_1$ or $\phi_2$ is satisfiable, that is, if there is a match for either $\phi_1$ or $\phi_2$. If $z$ appears in both of the CQs, renaming $z$ in one of the CQs does not change the validity. Therefore, $\pi : (x_1, \ldots, x_n, z, y_1, \ldots, y_m, z') \mapsto \Delta$, which is obtained from $\pi_1 \times \pi_2$, is indeed a match for $\Phi$.

Now, consider the structure of a certain answer to a union of temporal CQs (UTCQ). It is a combination of answers to a (set of) TCQ obtained at the proper time points, referred to by the temporal operators used in the UTCQ.

One more thing to be explicitly addressed is that the known algorithms for conjunctive query answering, such as [6], [7], are focused on query entailment, that is, on Boolean conjunctive query answering.
This means that answering an atemporal CQ requires a preprocessing step and treats a Boolean conjunctive query answering algorithm as a black box. Namely, at the preprocessing step a candidate match (a tuple of variables substituted, via some binding $\pi$, with a tuple of individuals $c$) is submitted to a Boolean conjunctive query answering engine, and that engine decides whether such a candidate match is a certain answer.

We now present the algorithm informally.

Eliminate temporal operators in a UTCQ. The important step in our algorithm is to get a normal form where the temporal operators are used to decide at which time point each CQ should be evaluated. This is done by iterative application of the expansion rules from Table 1. For every $\bigcirc$ and $\bigcirc^{-}$ operator, we just shift one point forward or backward, respectively. By doing this, we obtain a query in normal form whose atoms are UCQs, except for some recursion atom which is a TCQ.

Replace the Boolean operators with relational operators. Every conjunction is replaced with a join and every disjunction with a cross-product.

Retrieve an answer. Use an arbitrary query answering algorithm [6–9] as a black box to compute a set of answers for a given UCQ. If the original UTCQ contains $\mathcal{U}$, $\mathcal{S}$, $\Box$, $\Box^{-}$, $\Diamond$, $\Diamond^{-}$, the normal form of the transformed query might contain a recursion. In such a case, if the time point $i < 0$ or $i > n$, then return $\emptyset$, else evaluate the CQs with leading $\bigcirc$ or $\bigcirc^{-}$ for $\mathcal{U}$, $\mathcal{S}$ and for derived modalities (if any).
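The way temporal operators select the time points at which answer sets are combined can be sketched as follows. This is a simplified illustration, not the algorithm of the paper: it assumes that the answer sets of the atemporal subqueries have already been computed for every time point by some black-box engine, that all subqueries share the same answer variables (so the join degenerates to set intersection), and that a disjunction is combined by set union, which departs from the cross-product used in the text. All function names and sample data are hypothetical.

```python
def eval_next(ans):
    """Ans(next q, i) = Ans(q, i + 1); empty at the last time point."""
    return ans[1:] + [set()]


def eval_previous(ans):
    """Ans(previous q, i) = Ans(q, i - 1); empty at time point 0."""
    return [set()] + ans[:-1]


def eval_until(ans1, ans2):
    """Unfold q1 U q2 = q2 or (q1 and next(q1 U q2)) from the last point back."""
    n = len(ans1) - 1
    result = [set() for _ in ans1]
    for i in range(n, -1, -1):
        later = result[i + 1] if i < n else set()  # nothing follows point n
        result[i] = ans2[i] | (ans1[i] & later)
    return result


def eval_since(ans1, ans2):
    """Unfold q1 S q2 = q2 or (q1 and previous(q1 S q2)) from point 0 forward."""
    result = [set() for _ in ans1]
    for i in range(len(ans1)):
        earlier = result[i - 1] if i > 0 else set()  # nothing precedes point 0
        result[i] = ans2[i] | (ans1[i] & earlier)
    return result


# Hypothetical answer sets of two atomic queries at time points 0, 1, 2.
student = [{"ann", "bob"}, {"ann"}, {"ann"}]
defended = [set(), set(), {"ann", "bob"}]

until = eval_until(student, defended)
since = eval_since(student, defended)
```

In this example `bob` is not returned for `student U defended` at time point 0, because he is no longer a student at time point 1, before the defence at time point 2. Processing the sequence in a single pass avoids re-evaluating the recursive subquery at every time point.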
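Algorithm 1 Decide Q

Input: $\mathcal{K} = \{\mathcal{T}, (\mathcal{A}_i)_{0 \le i \le n}\}$: a knowledge base consisting of a TBox and a sequence of ABoxes at time points $i$, $0 \le i \le n$; $Q$: a UTCQ
Output: $Ans(Q, i)$, a set of certain answers to $Q$ at time point $i$

  $Ans(Q, i) = \emptyset$
  repeat
    $Ans' = Ans(Q, i)$
    if $Q = TCQ_1 \vee TCQ_2$ then
      $Ans(Q, i) = Ans(TCQ_1, i) \times Ans(TCQ_2, i)$
    end if
    if $Q = TCQ_1 \wedge TCQ_2$ then
      $Ans(Q, i) = Ans(TCQ_1, i) \bowtie Ans(TCQ_2, i)$
    end if
    if $Q = \bigcirc^{-} TCQ$ then
      if $i = 0$ then $Ans(Q, i) = \emptyset$
      else $Ans(Q, i) = Ans(TCQ, i - 1)$
      end if
    end if
    if $Q = \bigcirc TCQ$ then
      if $i = n$ then $Ans(Q, i) = \emptyset$
      else $Ans(Q, i) = Ans(TCQ, i + 1)$
      end if
    end if
    if $Q = TCQ_1 \,\mathcal{U}\, TCQ_2$ then
      if $i = n$ then $Ans(Q, i) = Ans(TCQ_2, i)$
      else $Ans(Q, i) = Ans(TCQ_2, i) \times (Ans(TCQ_1, i) \bowtie Ans(Q, i + 1))$
      end if
    end if
    if $Q = TCQ_1 \,\mathcal{S}\, TCQ_2$ then
      if $i = 0$ then $Ans(Q, i) = Ans(TCQ_2, i)$
      else $Ans(Q, i) = Ans(TCQ_2, i) \times (Ans(TCQ_1, i) \bowtie Ans(Q, i - 1))$
      end if
    end if
  until $Ans' = Ans(Q, i)$
  return $Ans(Q, i)$

Table 1. Equivalence rules of LTL for future operators. Taken from [4]

idempotent rule: $\Box \Psi \equiv \Box \Box \Psi$; $\Diamond \Psi \equiv \Diamond \Diamond \Psi$; $\Psi_1 \,\mathcal{U}\, (\Psi_1 \,\mathcal{U}\, \Psi_2) \equiv \Psi_1 \,\mathcal{U}\, \Psi_2$; $(\Psi_1 \,\mathcal{U}\, \Psi_2) \,\mathcal{U}\, \Psi_2 \equiv \Psi_1 \,\mathcal{U}\, \Psi_2$
commutativity rule: $\Box \bigcirc \Psi \equiv \bigcirc \Box \Psi$; $\Diamond \bigcirc \Psi \equiv \bigcirc \Diamond \Psi$; $\bigcirc (\Psi_1 \,\mathcal{U}\, \Psi_2) \equiv (\bigcirc \Psi_1 \,\mathcal{U}\, \bigcirc \Psi_2)$
distributivity rule: $\Box (\Psi_1 \wedge \Psi_2) \equiv (\Box \Psi_1 \wedge \Box \Psi_2)$; $\Diamond (\Psi_1 \vee \Psi_2) \equiv (\Diamond \Psi_1 \vee \Diamond \Psi_2)$; $\bigcirc (\Psi_1 \wedge \Psi_2) \equiv (\bigcirc \Psi_1 \wedge \bigcirc \Psi_2)$; $\bigcirc (\Psi_1 \vee \Psi_2) \equiv (\bigcirc \Psi_1 \vee \bigcirc \Psi_2)$; $((\Psi_1 \wedge \Psi_2) \,\mathcal{U}\, \Psi_3) \equiv ((\Psi_1 \,\mathcal{U}\, \Psi_3) \wedge (\Psi_2 \,\mathcal{U}\, \Psi_3))$; $(\Psi_1 \,\mathcal{U}\, (\Psi_2 \vee \Psi_3)) \equiv ((\Psi_1 \,\mathcal{U}\, \Psi_2) \vee (\Psi_1 \,\mathcal{U}\, \Psi_3))$
temporal recursion rule: $\Box \Psi \equiv \Psi \wedge \bigcirc \Box \Psi$; $\Diamond \Psi \equiv \Psi \vee \bigcirc \Diamond \Psi$; $\Psi_1 \,\mathcal{U}\, \Psi_2 \equiv \Psi_2 \vee (\Psi_1 \wedge \bigcirc (\Psi_1 \,\mathcal{U}\, \Psi_2))$
absorption rule: $\Diamond \Box \Diamond \Psi \equiv \Box \Diamond \Psi$; $\Box \Diamond \Box \Psi \equiv \Diamond \Box \Psi$

Table 1 presents some equivalence rules of LTL that are used in Algorithm 1.

To illustrate Algorithm 1, consider some examples, assuming that Algorithm 1 returns a set $Ans$ of answers to $\Psi$ at the time point $i$.

Example 3. Given a TCQ query $\Psi = \bigcirc^{-} (\Phi_1 \,\mathcal{U}\, \Phi_2)$ at a point $i$.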
$$\begin{aligned}
Ans(\Psi, i) &= Ans(\bigcirc^{-} (\Phi_1 \,\mathcal{U}\, \Phi_2), i) && \text{move back one point by } \bigcirc^{-} \\
&= Ans(\Phi_1 \,\mathcal{U}\, \Phi_2, i - 1) && \text{expansion rule for } \mathcal{U} \\
&= Ans(\Phi_2 \vee (\Phi_1 \wedge \bigcirc \Psi), i - 1) && \text{transforming } \vee \\
&= Ans(\Phi_2, i - 1) \times Ans(\Phi_1 \wedge \bigcirc \Psi, i - 1) && \text{transforming } \wedge \\
&= Ans(\Phi_2, i - 1) \times (Ans(\Phi_1, i - 1) \bowtie Ans(\bigcirc \Psi, i - 1)) && \text{move forward one point by } \bigcirc \\
&= Ans(\Phi_2, i - 1) \times (Ans(\Phi_1, i - 1) \bowtie Ans(\Psi, i))
\end{aligned}$$

If $i = 0$ in $Ans(\Phi_2, i - 1)$ and $Ans(\Phi_1, i - 1)$, then the evaluation of $\Psi$ is the empty set.

A more complex example is given below.

Example 4. Given a TCQ query $\Psi = \Diamond^{-} (\Phi_1 \,\mathcal{U}\, \Phi_2)$ at a point $i$.

$$\begin{aligned}
Ans(\Psi, i) &= Ans(\Diamond^{-} (\Phi_1 \,\mathcal{U}\, \Phi_2), i) && \text{expansion rule for } \Diamond^{-} \\
&= Ans((\Phi_1 \,\mathcal{U}\, \Phi_2) \vee \bigcirc^{-} \Psi, i) && \text{transforming } \vee \\
&= Ans(\Phi_1 \,\mathcal{U}\, \Phi_2, i) \times Ans(\bigcirc^{-} \Psi, i) && \text{move back one point by } \bigcirc^{-} \\
&= Ans(\Phi_1 \,\mathcal{U}\, \Phi_2, i) \times Ans(\Psi, i - 1) && \text{substitute } \Phi_1 \,\mathcal{U}\, \Phi_2 \text{ with } \Psi' \text{; expansion rule for } \mathcal{U} \\
&= Ans(\Phi_2 \vee (\Phi_1 \wedge \bigcirc \Psi'), i) \times Ans(\Psi, i - 1) && \text{transforming } \vee \\
&= Ans(\Phi_2, i) \times Ans(\Phi_1 \wedge \bigcirc \Psi', i) \times Ans(\Psi, i - 1) && \text{transforming } \wedge \\
&= Ans(\Phi_2, i) \times (Ans(\Phi_1, i) \bowtie Ans(\bigcirc \Psi', i)) \times Ans(\Psi, i - 1) && \text{move forward one point by } \bigcirc \\
&= Ans(\Phi_2, i) \times (Ans(\Phi_1, i) \bowtie Ans(\Psi', i + 1)) \times Ans(\Psi, i - 1)
\end{aligned}$$

If $i = n$ in $Ans(\Psi', i + 1)$, then $Ans(\Psi', i + 1)$ is evaluated to the empty set.

One thing we have to ensure is that the intersection of two sets of answers for a conjunction of CQs yields a certain answer, i.e. that there is a common answer for both CQs; otherwise the result is the empty set. One way to do this is to retrieve all answers for each CQ and then to intersect them to get the common answers. Another way is first to retrieve an answer of a UCQ and then to decide whether this answer is also an answer for the other CQs in the conjunction, and to keep retrieving and deciding until no more answers are obtained. The former way is preferred since it offers a more practical solution: we can implement it using relational algebra operators or database language operators.

4.2 Termination, Soundness, Completeness of the Algorithm

Definition 8. (UTCQ closure).
Given a temporal union of conjunctive queries $Q$, its closure set $Cl(Q)$ is a set of query atoms closed under the following rules:

if $q \in Q$ then $q \in Cl(Q)$
if $\bigcirc^{-} q \in Q$ then $q \in Cl(Q)$
if $\bigcirc q \in Q$ then $q \in Cl(Q)$
if $q_1 \wedge q_2 \in Q$ then $q_1, q_2 \in Cl(Q)$
if $q_1 \vee q_2 \in Q$ then $q_1, q_2 \in Cl(Q)$
if $q_1 \,\mathcal{U}\, q_2 \in Q$ then $q_1, q_2, \bigcirc (q_1 \,\mathcal{U}\, q_2) \in Cl(Q)$
if $q_1 \,\mathcal{S}\, q_2 \in Q$ then $q_1, q_2, \bigcirc^{-} (q_1 \,\mathcal{S}\, q_2) \in Cl(Q)$

Since the closure set of a UTCQ is finite, Algorithm 1 terminates after a finite number of steps.

Theorem 3. (Local) termination. Given a UTCQ $Q$ and a knowledge base $\mathcal{K} = \{\mathcal{T}, (\mathcal{A}_i)_{0 \le i \le n}\}$, Algorithm 1 always terminates.

Proof. We show the local termination inductively.

Base case. Any query is also contained in the closure set of itself.

Inductive case.
($C(a)$, $r(a_1, a_2)$) If the query $Q$ is atomic, then the closure set contains $C(a)$ or $r(a_1, a_2)$.
($\bigcirc^{-} TCQ$) For such a query $Cl(Q) = \{TCQ, \bigcirc^{-} TCQ\}$, i.e. two elements are evaluated, and in the case of $i = 0$ the value of $\bigcirc^{-} TCQ$ is known to be $\emptyset$, so Algorithm 1 stops after two evaluations.
($TCQ_1 \,\mathcal{U}\, TCQ_2$) For such a query $Cl(Q) = \{TCQ_2, TCQ_1, TCQ_1 \,\mathcal{U}\, TCQ_2, \bigcirc (TCQ_1 \,\mathcal{U}\, TCQ_2)\}$.
($TCQ_1 \,\mathcal{S}\, TCQ_2$) For such a query $Cl(Q) = \{TCQ_2, TCQ_1, TCQ_1 \,\mathcal{S}\, TCQ_2, \bigcirc^{-} (TCQ_1 \,\mathcal{S}\, TCQ_2)\}$.

Theorem 4. Soundness. If for a UTCQ $Q$ its answer set $Ans(Q(x), i)$, obtained with Algorithm 1, is not empty, then $Q$ has at least those certain answers that are in $Ans(Q, i)$.

Proof. We prove by induction. We start with evaluating a non-temporal query, i.e. a query containing no temporal operator.

Base case. If we have an atomic query of the form $C(a)$, then using any approach to CQ answering we obtain all the answers for the entailment of the query $Q$ over $\mathcal{K} = \{\mathcal{T}, (\mathcal{A}_i)_{0 \le i \le n}\}$. If $\mathcal{K} \models C(a)$ and $a \in Ans(Q(x), i)$, the function returns $a$ and this value is stored in $Ans(Q(x), i)$. By Definition 4, this result tells us that the individual $a$ is a certain answer to the query $C(x)$ w.r.t. the match $\pi(x) = a$. The same result is obtained if we have an atomic query of the form $r(a, b)$.
Inductive case. It can be obtained by Definition 4.

Theorem 5. Completeness. If a UTCQ $Q$ has a certain answer $ans$, then Algorithm 1 shows that this answer is in $Ans(Q, i)$.

Proof. By contradiction. Assume that (i) $Q(x)$ has a certain answer $ans$ w.r.t. $\pi$, (ii) $Ans(Q, i)$ is the set of certain answers obtained by Algorithm 1, and (iii) $ans \notin Ans(Q, i)$. By (i), we know that $\mathcal{K} \models Q(ans)$ and that for all time points $0 \le i \le n$, in all models $\mathcal{I}$ such that $\mathcal{I} \models \mathcal{K}$, $\mathcal{I} \models Q(ans)$. By (ii), there are several possible reasons for Algorithm 1 to return $Ans(Q, i)$ such that $ans \notin Ans(Q, i)$.

$Q$ is atomic. If $Q$ is atomic, i.e. of the form $C(x)$ or $r(x, y)$, then we know that $Ans(Q, i)$ does not contain $ans$. This means that there is a model $\mathcal{I}$ of the knowledge base $\mathcal{K}$ which does not entail $Q(ans)$. But this contradicts our assumption (i).

($TCQ_1 \wedge TCQ_2$). If $Ans(Q, i)$ does not contain $ans$, according to Algorithm 1 it means that $ans \notin Ans(TCQ_1, i) \bowtie Ans(TCQ_2, i)$. This, in turn, leads to the existence of a model $\mathcal{I}$ of the knowledge base $\mathcal{K}$ such that $\mathcal{I} \models TCQ_1(ans)$ and $\mathcal{I} \not\models TCQ_2(ans)$ or vice versa, which contradicts (i).

($TCQ_1 \vee TCQ_2$). If $Ans(Q, i)$ does not contain $ans$, according to Algorithm 1 it means that $ans \notin Ans(TCQ_1, i) \times Ans(TCQ_2, i)$. This, in turn, leads to the existence of a model $\mathcal{I}$ of the knowledge base $\mathcal{K}$ such that either $\mathcal{I} \not\models TCQ_1(ans)$ or $\mathcal{I} \not\models TCQ_2(ans)$, which contradicts (i).

($\bigcirc^{-} TCQ$). If $Ans(Q, i)$ does not contain $ans$, according to Algorithm 1 it means that $ans \notin Ans(Q, i - 1)$. This, in turn, leads to the existence of a model $\mathcal{I}$ of the knowledge base $\mathcal{K}$ such that $\mathcal{I}, i - 1 \not\models Q(ans)$, which contradicts (i).

($TCQ_1 \,\mathcal{U}\, TCQ_2$). If $Ans(Q, i)$ does not contain $ans$, according to Algorithm 1 it means that $ans \notin Ans(TCQ_2, i) \times (Ans(TCQ_1, i) \bowtie Ans(Q, i + 1))$.
This, in turn, leads to the existence of a model I of the knowledge base K such that either I, i ⊭ TCQ2(ans) or I, i ⊭ TCQ1(ans) and I, i + 1 ⊭ Q, which contradicts (i).

(TCQ1 S TCQ2). If Ans(Q, i) does not contain ans, according to Algorithm 1 it means that ans ∉ Ans(TCQ2, i) × (Ans(TCQ1, i) ⋈ Ans(Q, i − 1)). This, in turn, leads to the existence of a model I of the knowledge base K such that either I, i ⊭ TCQ2(ans) or I, i ⊭ TCQ1(ans) and I, i − 1 ⊭ Q, which contradicts (i).

The proof for the temporal operator ○ acting in the direction of the future can be completed in the same manner.

5 Related Work and Conclusions

Transition graphs for answering temporal queries over a finite set of versions of a database were investigated in [10]. The expressivity of the temporal query languages presented there is, however, restricted either to the past [11] or to the future [10], [12] direction of time.

Several algorithms are known for answering unions of conjunctive queries over knowledge bases with a static TBox and ABox; for example, the works of Ortiz [6], Glimm [7], Tessaris [9] and Motik [8] should be mentioned. Any of those algorithms could serve as a basis for finding answers to atemporal CQs at particular time points, whereas possible extensions of those algorithms to a sequence of ABoxes remain an open question. A language of temporal conjunctive queries with negation, together with the computational and combined computational complexity, is introduced in [13].

Summing up, keeping a large evolving ABox of a knowledge base in a database and applying the TBox of that knowledge base to obtain missing assertional axioms is one of the ways of dealing with complex evolving domains. Given the high computational complexity of temporal conjunctive query answering in general, it is of interest to find a balance between the expressivity of a query language and its practical applicability.
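As an illustration of the closure set Cl(Q) used in the termination argument of Theorem 3, the closure rules can be sketched in Python. This is only a minimal sketch and not the paper's Algorithm 1: the nested-tuple encoding of queries and the names `closure`, `"next"`, `"prev"`, `"until"` and `"since"` are assumptions made for this example.

```python
from typing import Set, Tuple

# Hypothetical encoding: ("atom", "C(x)"), ("next", q), ("prev", q),
# ("and", q1, q2), ("or", q1, q2), ("until", q1, q2), ("since", q1, q2).
Query = Tuple


def closure(query: Query) -> Set[Query]:
    """Collect the closure set of a query by applying the closure rules."""
    closed: Set[Query] = set()
    todo = [query]
    while todo:
        q = todo.pop()
        if q in closed:
            continue  # already processed, guarantees termination
        closed.add(q)
        op = q[0]
        if op in ("next", "prev"):
            todo.append(q[1])
        elif op in ("and", "or"):
            todo.extend([q[1], q[2]])
        elif op == "until":
            # q1 U q2 contributes q1, q2 and next(q1 U q2)
            todo.extend([q[1], q[2], ("next", q)])
        elif op == "since":
            # q1 S q2 contributes q1, q2 and prev(q1 S q2)
            todo.extend([q[1], q[2], ("prev", q)])
        # atoms contribute only themselves
    return closed


a = ("atom", "C(x)")
b = ("atom", "r(x, y)")
print(len(closure(("until", a, b))))
```

Because every rule only adds subqueries of the input or a single wrapped copy of an until/since query, the work list empties after finitely many steps, which mirrors the finiteness argument in the text.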
Acknowledgements

The presented results were obtained during the research visit of the author to the Chair of Automata Theory at Dresden University of Technology. The author is grateful to the group of Prof. Franz Baader, and in particular, Eldora, Marcel Lippmann and Anni-Yasmin Turhan for the fruitful discussions and ideas at the stage of early drafts of the paper.

References

1. Poggi, A., Lembo, D., Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R. Linking Data to Ontologies. J. on Data Semantics, X, 133–173 (2008)
2. Baader, F., Bauer, A., Baumgartner, P., Cregan, A., Gabaldon, A., Ji, K., Lee, K., Rajaratnam, D., Schwitter, R. A novel architecture for situation awareness systems. In: Giese, M. and Waaler, A. (eds.) Proc. 18th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (Tableaux 2009). LNCS, vol. 5607, pp. 77–92. Springer, Berlin/Heidelberg (2009)
3. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.). The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press (2003)
4. Baier, C., Katoen, J.-P. Principles of Model Checking. The MIT Press, Cambridge, Massachusetts, USA (2008)
5. Abiteboul, S., Hull, R., Vianu, V. Foundations of Databases. Addison-Wesley (1995)
6. Ortiz de la Fuente, M.M. Query Answering in Expressive Description Logics: Techniques and Complexity Results. PhD Thesis. Technische Universität Wien, Fakultät für Informatik (2010)
7. Glimm, B. Querying Description Logic Knowledge Bases. PhD Thesis. The University of Manchester (2007)
8. Motik, B. Reasoning in Description Logics using Resolution and Deductive Databases. Universität Karlsruhe (2006)
9. Tessaris, S. Questions and answers: reasoning and querying in Description Logic. The University of Manchester (2001)
10. Lipeck, U.W. Transformation of Dynamic Integrity Constraints into Transaction Specifications. Theor. Comput. Sci., 76(1), pp.
115–143 (1990)
11. Schwiderski, S., Hartmann, T., Saake, G. Monitoring Temporal Preconditions in a Behaviour Oriented Object Model. Data Knowl. Eng., 14(2), pp. 143–186 (1994)
12. Lipeck, U.W., Feng, D. Construction of Deterministic Transition Graphs from Dynamic Integrity Constraints. LNCS, vol. 344, pp. 166–179. Springer Verlag (1989)
13. Baader, F., Borgwardt, S., Lippmann, M. On the Complexity of Temporal Query Answering. Technical report LTCS-Report 13-01. Available at http://lat.inf.tu-dresden.de/research/reports/2013/BaBoLi-LTCS-13-01.pdf (2013)

An Adaptive Forecasting of Nonlinear Nonstationary Time Series under Short Learning Samples

Elena Mantula¹ and Vladimir Mashtalir¹

¹ Kharkiv National University of Radio Electronics, Informatics Department, Lenin ave., 14, 61166, Kharkiv, Ukraine

Mashtalir@kture.kharkov.ua, ElenaMantula@gmail.com

Abstract. Methods of nonstationary nonlinear time series forecasting under bounded a priori information form an interdisciplinary application area concerned with learning and adaptation of solutions from a traditional artificial intelligence point of view. Problems of this type are extremely difficult to solve in general form; therefore, an approach based on the additive nonlinear autoregressive model with exogenous inputs, implemented as a set of parallel adalines, is proposed. To find the optimal combination of forecasts, an improvement of global random search is suggested.

Keywords. Neural networks, forecasting model, combination of forecasts

Key terms. Environment, MathematicalModel

1 Introduction

'Conscious' decision making, in all possible varieties, is perhaps the principal goal of artificial intelligence systems. The necessary 'creativity' implies the ability to produce novel solutions which are better than previous ones.
The computational tools that assist in decision making should take into account all aspects of the dissimilarity between a priori and a posteriori uncertainty. Uncertainty is, per se, a manifestation of information deficiency, and relevant information is, on the contrary, a capacity to reduce uncertainty. Eliminating such content-rich gaps provides the groundwork for knowledge engineering and management. In machine intelligence, manifold forecasts can be used for producing knowledge. The goal of the paper is a reasonable (ideally optimal) combination of forecasts that provides a reliable semantic interpretation of the achieved results for the purpose of knowledge generation.

Nowadays mathematical forecasting models of the behavior of objects, systems and phenomena in a wide variety of applications are well understood, and there is a wealth of publications on this subject. It should be noted that the behavior of objects is often given in the form of time series. Thus, to forecast this behavior a variety of approaches to time series analysis can be used. Such approaches can be traditional statistical methods (regression, correlation, spectral, Box-Jenkins), adaptive methods based on exponential smoothing and on tuning or learning forecasting models, or intellectual methods using various neural networks.

At present there are many objects (financial, economical, biomedical, etc.) described by time series containing unknown behavior trends, seasonal components, and stochastic and random components, which significantly complicate the synthesis of an effective predictive model. This complexity is especially pronounced in environmental monitoring problems [1], where the analyzed time series exhibit stochastic and chaotic changes in equal measure, are apparently nonstationary and are subject to abrupt changes. In these conditions artificial neural networks have proved to be the most useful tools [2-13].
As a rule, they implement the so-called NARX model [14], which has the form

$$\hat{y}(k) = f\big(y(k-1), \dots, y(k-n_A), x(k-1), \dots, x(k-n_B)\big) \tag{1}$$

where $\hat{y}(k)$ is an estimate of the forecasted variable $y(k)$ at discrete time $k = 1, 2, \dots$; $f(\cdot)$ denotes a certain nonlinear transform implemented by a neural network; $x(k)$ is the observed exogenous factor that influences the behavior of $y(k)$; $n_A$, $n_B$ are the observation memory parameters.

Moreover, the difficulty is not merely an insufficiency of available observations: the properties of the time series (e.g. an indicator such as air pollution in ecological forecasting) change so often that a neural network does not have time to detect separate stationary parts. Hence there is a need for simplified predictive models, built on the neural network approach, whose training requires only a small data set.

2 Synthesis of a forecasting model

When input data are scarce, instead of the NARX model (1) it is appropriate to use the so-called ANARX model introduced in [15, 16] and fully investigated in [17, 18]. In general the ANARX model can be written as

$$\hat{y}(k) = f_1\big(y(k-1), x(k-1)\big) + f_2\big(y(k-2), x(k-2)\big) + \dots + f_{\max\{n_A, n_B\}}\big(y(k-n_A), x(k-n_B)\big) = \sum_{l=1}^{\max\{n_A, n_B\}} f_l\big(y(k-l), x(k-l)\big) \tag{2}$$

where the original task is decomposed into many local ones with two input variables $y(k-l)$, $x(k-l)$, $l = 1, 2, \dots, \max\{n_A, n_B\}$. For such nonlinear transforms it is quite convenient to use the so-called N-adaline (adaline abbreviates adaptive linear element) [19-21], which provides a quadratic approximation of the data sequence. Fig. 1(a) shows the architecture of the N-adaline and Fig. 1(b) illustrates the architecture of the ANARX model constructed from N-adalines.
As we can see, an N-adaline is a conventional two-input adaline with a nonlinear preprocessor formed by three product blocks, which evaluates the quadratic combination

$$f_l\big(y(k-l), x(k-l)\big) = w_{l0} + w_{l1} y(k-l) + w_{l2} y^2(k-l) + w_{l3} y(k-l)x(k-l) + w_{l4} x^2(k-l) + w_{l5} x(k-l)$$

where each N-adaline contains 6 synaptic weights $w_{lp}$, $l = 1, 2, \dots, \max\{n_A, n_B\}$, $p = 0, 1, \dots, 5$. In effect, the ANARX model is formed by two lines of delay elements $z^{-1}$ and $\max\{n_A, n_B\}$ N-adalines trained in parallel.

Fig. 1. N-adaline and ANARX models based on N-adalines: (a) N-adaline; (b) ANARX model.

Each N-adaline can be trained with any of the linear learning algorithms [22]; however, it is clear that a limited amount of a priori information requires the use of time-optimal procedures. One example is the adaptive-multiplicative modification of the Kaczmarz adaptive algorithm [23], which in this case takes the form

$$w_l(k) = w_l(k-1) + \eta\,\frac{y(k) - w_l^T(k-1)\varphi_l(k)}{\gamma + \|\varphi_l(k)\|^2}\,\varphi_l(k) \tag{3}$$

where $w_l = (w_{l0}, w_{l1}, w_{l2}, w_{l3}, w_{l4}, w_{l5})^T$; $\varphi_l(k) = \big(1, y(k-l), y^2(k-l), y(k-l)x(k-l), x^2(k-l), x(k-l)\big)^T$; $0 < \eta < 2$, $\gamma \ge 0$ are algorithm parameters selected empirically.

If the data sequences are 'contaminated' by perturbations, instead of the one-step algorithm (3) it is advantageous to apply procedures that filter the perturbations and at the same time remain suitable for nonstationary conditions. A modification of the recursive least squares method on a sliding window can be used [24]. The traditional least squares estimate on a window of $s$ observations has the form

$$w_l(k) = \Big(\sum_{\tau=k-s+1}^{k} \varphi_l(\tau)\varphi_l^T(\tau)\Big)^{-1} \sum_{\tau=k-s+1}^{k} \varphi_l(\tau)y(\tau)$$

and the recurrent one can be presented as
$$\begin{aligned}
P(k) &= P_s(k-1) - \frac{P_s(k-1)\varphi_l(k)\varphi_l^T(k)P_s(k-1)}{1 + \varphi_l^T(k)P_s(k-1)\varphi_l(k)}, \\
P_s(k) &= P(k) + \frac{P(k)\varphi_l(k-s)\varphi_l^T(k-s)P(k)}{1 - \varphi_l^T(k-s)P(k)\varphi_l(k-s)}, \\
p_s(k) &= p_s(k-1) + \varphi_l(k)y(k) - \varphi_l(k-s)y(k-s), \\
w_l(k) &= P_s(k)p_s(k).
\end{aligned} \tag{4}$$

We also note that while algorithm (3) is in fact a time-optimal gradient procedure, algorithm (4) is produced by the Gauss-Newton optimization procedure.

3 Optimal combination of forecasts

In real conditions the choice of the forecasting model structure is not a trivial task, especially since the same time series can be described effectively by a variety of different models. Also, the values of the lag orders $n_A$, $n_B$ remain unknown, which makes it necessary to consider a set of competing models, and the nonstationarity of the analyzed series necessitates the use of various learning algorithms (in this case, (3), (4)) with different values of $\eta$, $\gamma$, $s$. Thus, there arises a set of forecasts of the same process, from which we have to select the best.

To find the best forecast it is possible to use a sufficiently effective approach based on the optimal combination of forecasts [25], in which a linear combination, optimal in the sense of a given criterion $J^c$, is sought for a set of existing forecasts of the same series $\hat{y}_j(k)$, $j = 1, 2, \dots, m$:

$$\hat{y}(k) = \sum_{j=1}^{m} c_j \hat{y}_j(k) \tag{5}$$

where the parameters of the combination satisfy the unbiasedness condition

$$\sum_{j=1}^{m} c_j = 1. \tag{6}$$

In [25], an analytical approach to finding the weights $c_j$ in (5) by optimizing the sum of squared forecast errors under the constraint (6) is proposed. The use of the one-step squared forecast error criterion leads to the estimate

$$c_j(k) = \frac{\hat{y}_j(k)}{\sum_{j=1}^{m} \hat{y}_j(k)}.$$

However, analytical estimates of the combination parameters can be obtained only for the standard quadratic criterion $J^c$, because its derivatives are linear, so that the problem reduces to solving a system of linear equations.
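The remark that a quadratic criterion reduces to a system of linear equations can be illustrated with a small sketch. It assumes the criterion is the sum of squared errors of combination (5) under constraint (6); by the method of Lagrange multipliers the weights are then proportional to the solution $z$ of $Rz = \mathbf{1}$, where $R$ is the matrix of cross-products of the individual forecast errors. The function names `solve_linear` and `combination_weights` are hypothetical, and this is the classical constrained least squares solution, not necessarily the exact procedure of [25].

```python
from typing import List


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: forecasts may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def combination_weights(actual: List[float],
                        forecasts: List[List[float]]) -> List[float]:
    """Weights c_j minimising the squared error of sum_j c_j * forecast_j
    subject to sum_j c_j = 1; forecasts[j][k] is forecast j at time k."""
    m = len(forecasts)
    errors = [[y - f for y, f in zip(actual, fj)] for fj in forecasts]
    r = [[sum(ei * ej for ei, ej in zip(errors[i], errors[j]))
          for j in range(m)] for i in range(m)]
    z = solve_linear(r, [1.0] * m)   # z = R^{-1} 1
    total = sum(z)
    return [zi / total for zi in z]  # normalise so the weights sum to one


actual = [1.0, 2.0, 3.0, 4.0]
f1 = [1.1, 2.1, 2.9, 4.2]
f2 = [0.8, 2.2, 3.1, 3.9]
print(combination_weights(actual, [f1, f2]))
```

Since each individual forecast is itself a feasible combination (one weight equal to one, the rest zero), the combined forecast can never have a larger sum of squared errors on the fitting window than the best individual forecast.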
At the same time, practitioners as a rule find an assessment of forecasting quality by the residual variance unconvincing, and therefore characteristics that estimate the accuracy in percent are generally used, such as the criterion of minimum absolute percentage error

$$\mathrm{MAPE} = \sum_{k=1}^{N} \left|\frac{y(k) - \hat{y}(k)}{y(k)}\right| 100\% \tag{7}$$

or maximum of the determination coefficient

$$R^2 = \left(1 - \frac{\sum_{k=1}^{N}\big(y(k) - \hat{y}(k)\big)^2}{\sum_{k=1}^{N}\Big(y(k) - \frac{1}{N}\sum_{i=1}^{N} y(i)\Big)^2}\right) 100\%. \tag{8}$$

It is obvious that in this case analytical estimates cannot be obtained, and the use of gradient optimization procedures becomes more complicated because of the rather complex properties of functions (7), (8). For this reason the use of genetic algorithms is proposed in [26, 27]. Although such algorithms can find the global extremum, they are numerically cumbersome, have a set of free parameters that must be defined by the user, and converge slowly. Therefore, attention should be paid to a more efficient approach based on random search [28] and its adaptive modifications.

The simplest procedure that allows searching for a global extremum is the walking random global search [28]. In general, this procedure is a statistical extension of the regular gradient search: to provide the global search, a random disturbance $\xi(k)$ is superimposed on the gradient movement, which creates a stochastic walking mode.

In the continuous case, the gradient method of minimization (maximization) of the goal function $J^c(t)$ reduces to the motion of a point $c(t) = \big(c_1(t), \dots, c_j(t), \dots, c_m(t)\big)$ in the $m$-dimensional space of adjustable parameters under a force directed along the antigradient. The antigradient trajectory $c(t)$ leads the tuning process to a singular point.
If the starting point $c(0)$ belongs to the attraction region of the global extremum, then the corresponding trajectory leads to the global minimum of the function $J^c(t)$. But if the point $c(0)$ does not belong to this region, the movement in the direction of the antigradient ends in a local minimum, from which it is impossible to escape under forces directed along the antigradient. For exactly this reason it is helpful to use a random mechanism. Random shocks may help the point $c(t)$ to overcome the barrier that separates the local minimum reached by the learning process from the area in which the objective function $J^c(t)$ can decrease further. Under the influence of the 'skew' toward the antigradient and of random shocks, such movement is described by the differential equation

$$\frac{dc(t)}{dt} = -\eta\nabla_c J^c(t) + \xi(t)$$

where $\xi(t)$ is an $m$-dimensional normal random process with zero mathematical expectation, a delta-like autocorrelation function and component variance $\sigma^2$; $\eta$ is the step parameter; $\nabla_c$ denotes the gradient vector. It should be emphasized that for function (7) the components of the gradient can take the value $+1$ or $-1$. Generally, this algorithm provides a search for the global extremum [29].

The search for the global extremum can be sped up by a reasonable selection of $\sigma^2$, and adaptation during this process can be introduced in two ways. First, by introducing inertia into the learning process, it is possible to obtain a search similar to movement by the 'heavy ball' method [30]. Such movement is described by the differential equation

$$\frac{d^2c(t)}{dt^2} + b\frac{dc(t)}{dt} = -\eta\nabla_c J^c(t) + \xi(t) \tag{9}$$

where $b$ is the damping coefficient (the larger $b$, the less pronounced the introduced inertia). In time series processing, i.e. in discrete time, procedure (9) corresponds to the learning algorithm described by the second-order difference equation [31]

$$c(k) = c(k-1) + b\,c(k-2) - \eta(k)\nabla_c J^c(k) + \xi(k) \tag{10}$$

which coincides for $b = 0$ with the walking random search.
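The walking random search described above, i.e. the case $b = 0$ of (10), can be sketched as noisy gradient descent that remembers the best point visited. This is only an illustrative sketch under stated assumptions: the scalar setting, the function names, the parameter values, and the one-dimensional Rastrigin-type test function are choices made for this example and are not taken from the paper.

```python
import math
import random
from typing import Callable, Tuple


def walking_random_search(
    objective: Callable[[float], float],
    gradient: Callable[[float], float],
    start: float,
    step: float = 0.002,
    sigma: float = 0.15,
    iterations: int = 5000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Noisy gradient descent; returns the best point visited and its value."""
    rng = random.Random(seed)
    c = start
    best_c, best_j = c, objective(c)
    for _ in range(iterations):
        # Antigradient step plus a zero-mean Gaussian shock (std. dev. sigma).
        c = c - step * gradient(c) + rng.gauss(0.0, sigma)
        j = objective(c)
        if j < best_j:
            best_c, best_j = c, j
    return best_c, best_j


def rastrigin(c: float) -> float:
    """Multimodal test function with many local minima (assumed example)."""
    return c * c + 10.0 * (1.0 - math.cos(2.0 * math.pi * c))


def rastrigin_grad(c: float) -> float:
    return 2.0 * c + 20.0 * math.pi * math.sin(2.0 * math.pi * c)


best_c, best_j = walking_random_search(rastrigin, rastrigin_grad, start=3.0)
print(best_c, best_j)
```

With `sigma = 0` the procedure degenerates to plain gradient descent and stays in the basin of the starting point; the random shocks are what give the point a chance to cross the barriers between neighbouring minima. Because the best visited point is tracked separately, the returned value is never worse than the value at the start.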
It is interesting to note that (10) is none other than a second-order ARX model.

Second, adaptation in the process of global search can be introduced by controlling the random process $\xi(t)$, for example,

$$\frac{d\xi(t)}{dt} = -\alpha\xi(t) - \eta\frac{dJ^c(t)}{dt} + \sigma^2 H(t) \tag{11}$$

where $\alpha \ge 0$ is the autocorrelation parameter of the random process $\xi(t)$ and $H(t)$ is a vector of flat random noise. We introduce a modification of (11) in discrete form

$$\xi(k) = (1 - \alpha)\xi(k-1) - \eta(k)\Delta J^c(k) + \sigma^2 H(k) \tag{12}$$

where $\Delta$ is the symbol of the first difference (the discrete analogue of the derivative).

As is easily seen from (11), (12), the search process can be optimized by an appropriate selection of the parameters $\alpha$, $\eta$ and $\sigma^2$, since each of them acts on certain properties of the search. Indeed, variation of the autocorrelation parameter $\alpha$ determines the rate of decay of the process $\xi(k)$, which regulates its relation to the past. Thus, the search can be made more or less dependent on its previous history as necessary.

A few words are in order on the interaction of the parameters $\eta$ and $\alpha$. While the search step $\eta$ determines the intensity of accumulation of learning experience, $\alpha$ characterizes the level of forgetting of this experience during the search. In this sense, these parameters are antagonistic. If $\alpha = 0$ and there is no forgetting, the vector $\xi(k)$ grows in the direction of the antigradient. The variance of the process $\xi(k)$ is determined by the value $\sigma^2$ and the intensity of the flat random noise disturbance $H(k)$. If $\sigma^2$ is sufficiently large the search may become unstable, while at low values its global properties deteriorate.

Thus, the use of a modified global random search significantly simplifies the tuning of the linear combination weights $c_j(k)$, $j = 1, 2, \dots, m$.

4 Conclusion

The problem of nonstationary nonlinear time series forecasting under bounded a priori information has been considered.
An approach based on the additive nonlinear autoregressive model with exogenous inputs, implemented as a set of parallel adalines, has been proposed. To find the optimal combination of forecasts, an improvement of global random search has been suggested. A distinctive feature of the approach is its computational simplicity and high performance, attained by significantly reducing the number of adjustable parameters.

References

1. Zanetti, P.: Air Pollution Modelling. Van Nostrand Reinhold, New York (1990)
2. Reich, S.L., Gomez, D.R., Dawidowski, L.E.: Artificial Neural Network for the Identification of Unknown Air Pollution Sources. Atmosphere Environment, Vol. 33, pp. 3045–3052 (1999)
3. Perez, P., Trier, A., Reyes, J.: Prediction of PM2.5 Concentration Several Hours in Advance Using Neural Networks in Santiago, Chile. Atmospheric Environmental, Vol. 34, pp. 1189–1196 (2000)
4. Niska, N., Hiltunen, T., Karppinen, A., Ruuskanen, J., Kolehmanen, M.: Evolving the Neural Network Model for Forecasting Air Pollution Time Series. Engineering Application of Artificial Intelligence, Vol. 17, pp. 159–167 (2004)
5. Corani, G.: Air Quality Prediction in Milan: Feed-Forward Neural Networks, Pruned Neural Networks and Lazy Learning. Ecological Modeling, Vol. 185, pp. 513–529 (2005)
6. Athanasiadis, I.N., Karatzas, K.D., Mitkas, P.A.: Classification Techniques for Air Quality Forecasting. In: Brewka, G., Coradeschi, S., Perini, A. and Traverso, P. (eds.): Proc. 17th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 4.1–4.7 (2006)
7. Perez, P., Reyes, J.: An Integrated Neural Network Model for PM10 Forecasting. Atmospheric Environment, Vol. 40, pp. 2845–2857 (2006)
8. Lira, T.S., Barrozo, M.A.S., Assis, A.J.: Air Quality Prediction in Uberlandia, Brasil, Using Linear Models and Neural Networks. In: Plesu, V., Agachi, P. (eds.): Proc. 17th European Symp. on Computer Aided Process Engineering, Elsevier, Amsterdam, pp. 1–6 (2007)
9.
Kurt, A., Gulbagci, B., Karaca, F., Alagha, O.: An Online Air Pollution Forecasting System Using Neural Networks. Environmental International, Vol. 34, pp. 592–598 (2008)
10. Carnevale, C., Finzi, G., Pisoni, E., Volta, M.: Neuro-Fuzzy and Neural Network Systems for Air Quality Control. Atmospheric Environmental, Vol. 43, pp. 4811–4821 (2009)
11. Nagendra, S.M., Shiva, Khare M.: Modelling Urban Air Quality Using Artificial Neural Network. Clean Technical Environmental Policy, Vol. 7, pp. 116–126 (2005)
12. Aktan, M., Bayraktar, H.: The Neural Network Modeling of Suspended Particulate Matter with Autoregressive Structure. Ekoloji, Vol. 19, No. 74, pp. 32–37 (2010)
13. Esau, I.: On Application of Artificial Neural Network Methods in Large-Eddy Simulations with Unresolved Urban Surfaces. Modern Applied Science, Vol. 4, No. 8, pp. 3–11 (2010)
14. Nelles, O.: Nonlinear System Identification: From Classical Approaches to Neural Networks and Fuzzy Models. Springer, Berlin (2001)
15. Chowdhury, F.N.: Input-Output Modeling of Nonlinear Systems with Time-Varying Linear Models. IEEE Trans. on Automatic Control, Vol. 45, No. 7, pp. 1355–1358 (2000)
16. Kotta, Ü., Sadegh, N.: Two Approaches for State Space Realization of NARMA Models: Bridging the Gap. Mathematical and Computer Modeling of Dynamical Systems, Vol. 8, No. 1, pp. 21–32 (2002)
17. Belikov, J., Vassiljeva, K., Petlenkov, E., Nomm, S.: A Novel Taylor Series Based Approach for Control Computation in NN–ANARX Structure Based Control of Nonlinear Systems. In: Proc. 27th Chinese Control Conference, Beihang University Press, Kunming, pp. 474–478 (2008)
18. Vassiljeva, K., Petlenkov, E., Belikov, J.: State-Space Control of Nonlinear Systems Identified by ANARX and Neural Network Based SANARX Models. In: Proc. WCCI 2010 IEEE World Congress on Computational Intelligence, IEEE CSS, Piscataway, pp. 3816–3823 (2010)
19. Pham, D.T.
Liu, X.: Modeling and Prediction Using GMDH Networks of Adalines with Nonlinear Preprocessors. Int. J. System Science, Vol. 25, No. 11, pp. 1743–1759 (1994)
20. Pham, D.T., Liu, X.: Neural Networks for Identification, Prediction and Control. Springer, London (1995)
21. Rudenko, O.G., Bodyanskiy, Ie.V.: Artificial Neural Networks. SMIT, Kharkov (2005) (in Russian)
22. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall, Inc., New York (1999)
23. Raybman, N.S., Chadeev, V.M.: Creating of Manufacture Process Models. Energiya, Moscow (1975) (in Russian)
24. Perelman, I.I.: Operative Identification of Control Objects. Energoatomizdat, Moscow (1982) (in Russian)
25. Sharkey, A.J.C.: On Combining Artificial Neural Nets. Connection Science, Vol. 8, No. 3, pp. 299–314 (1996)
26. Zagoryjko, N.G.: Empirical Prediction. Nauka, Novosibirsk (1979) (in Russian)
27. Zagoryjko, N.G.: Applied Approach of Data Analysis, p. 264 (1999) (in Russian)
28. Rastrigin, L.A.: Statistical Search Technology. Nauka, Moscow (1968) (in Russian)
29. Rastrigin, L.A.: Systems of Extremal Control. Nauka, Moscow (1974) (in Russian)
30. Polyak, B.T.: Introduction into Optimization. Nauka, Moscow (1983) (in Russian)
31. Bodyanskiy, Ie.V., Rudenko, O.G.: Artificial Neural Networks: Architectures, Learning, Applications. TELETEX, Kharkov (2004) (in Russian)

Application of an Instance Migration Solution to Industrial Ontologies

Maxim Davidovsky¹, Vadim Ermolayev¹ and Vyacheslav Tolok²

¹ Department of IT, Zaporozhye National University, 66 Zhukovskogo st., 69600 Zaporozhye, Ukraine

m.davidovsky@gmail.com, vadim@ermolayev.com

² Department of Mathematical Modeling, Zaporozhye National University, 66 Zhukovskogo st., 69600 Zaporozhye, Ukraine

vyacheslav-tolok@yandex.ru

Abstract.
The paper presents the results of evaluating a software solution for the ontology instance migration problem in a use case involving two ontologies used in the construction industry, freeClass and eClassOWL, with the BauDataWeb dataset representing the individuals. The ontology instance migration problem is understood as a sub-problem of ontology alignment. Our methodology assumes a (semi-)automated iterative process, possibly involving a human for validating the results. The process consists of two steps: (1) schema-based mapping discovery done by the agent-based matcher software; and (2) ontology instance transformation and migration according to the discovered mappings, done by the ontology instance migration engine software. The evaluation experiment has been conducted in two phases and yielded results of acceptable quality in terms of precision, recall, and f-measure.

Keywords. Ontology alignment, industrial application, ontology instance migration, evaluation experiment.

Key terms. Industry, Integration, Interoperability, KnowledgeManagementProcess, AgentBasedSystem

1 Introduction

Ontologies are being widely adopted today in the academic world and increasingly attract the attention of researchers and practitioners in information technology and in knowledge-based system development and applications. Many authors, e.g. [1], argue that ontologies constitute the substance of the advanced technologies for solving the problems of interoperability, communication, and cooperation between different applications within the same environment. Indeed, ontologies conceptualize the semantics of the domains within a discourse that are common for interoperating systems. Thus, ontologies serve as a bridge for "understanding" between the systems or their parts. Despite that, the application of ontologies in industry still faces several problems.
The first group of problems concerns the inertia that is typical for the adoption of advanced technologies in industry, e.g. [2]. The cited paper reflects the views of practitioners who have witnessed incomprehension and opposition when trying to solve customer problems using ontologies. Attempts to resolve these problems involve establishing closer contact with domain knowledge stakeholders and involving them more actively in the development of ontologies, e.g. [3]. Another complementary and important activity is lowering the effort of developing ontologies, which could be done by providing tool support for the domain experts taking part in ontology development.

The other important stratum of problems in the application of ontologies in industry is related to the re-use of existing large industrial knowledge bases, collections, or ontologies and the exploitation of those knowledge assets within large enterprise information systems (IS). Obviously, stable interoperation of ISs in industrial settings is obligatory to prevent substantial errors in maintenance, production, and sales. However, the use of ontologies per se does not completely solve interoperability issues, as it essentially raises heterogeneity problems to a higher level [4]. So, methods for aligning ontologies need to be provided in order to understand and explicitly specify semantic mappings between these different conceptualisations. Industrial ontologies as a rule contain a large quantity of individuals (or instances). Hence, an important and typical sub-problem of ontology alignment in industrial settings is ontology instance migration, that is, the process of transferring instances between aligned ontologies. The number of individuals in industrial knowledge bases is very often high, so their manual alignment is not feasible.
Therefore it is important to provide tools that at least partially automate the process of alignment and do so with a quality acceptable for industry. Another important aspect of the use of ontologies in industrial settings is that industrial ISs are often distributed and belong to autonomous business entities. In such settings, using intelligent software agents for ontology alignment, and ontology instance migration in particular, becomes an attractive implementation pathway.

The remainder of the paper is organized as follows. In Section 2 we provide a classification of industrial applications of ontology alignment problems and describe some typical use cases. Based on this classification, we analyze industrial requirements to ontology alignment solutions. Section 3 outlines our software solution for the ontology instance migration problem. Section 4 reports on the setup and results of our evaluation experiments. Finally, the conclusion is given and the plans for future work are outlined.

2 Related Work, Applications, and Use Cases

Surveys of ontology alignment for a wide range of applications can be found in [5], [6], [7]. Applications of agent-based ontology alignment and the respective requirements are analyzed in [8]. This paper focuses on industrial applications of ontology alignment in general and of ontology instance migration as its sub-problem [8]. The following industrial application categories that require ontology alignment and instance migration solutions may be outlined.

1. Industrial knowledge-driven simulation models. Simulation models are widely used in industry ([9], [10], [11]). The complexity of modern simulation systems requires the use of knowledge-based models. This knowledge may be related to various branches of science and engineering disciplines, and may contain different models satisfying different demands.
This requires the use of ontologies and related activities such as ontology merging and alignment.

2. Industrial information systems in the context of Semantic Web and eCommerce. eCommerce is a type of industry where the buying and selling of a product or service is conducted over electronic systems such as the Internet and other computer networks. In order to perform such an exchange of business information, this information must contain product (or service) descriptions. As a rule such information is presented in the form of a product or service ontology [12]. Good examples of such ontologies are [13], [14], [15]. When a business process involves more than one party, or when more than one source is used, the respective ontologies obviously have to be aligned. This situation is also typical for the Semantic Web, where ontologies along with intelligent software agents are the main pillars [16].

3. Integration and interoperability of heterogeneous enterprise ISs. Today the information ecosystem of a modern enterprise as a rule contains a number of applications from different vendors, used for different purposes. In order to use these heterogeneous applications effectively together with distributed data and knowledge repositories, they must be integrated into a single system. Likewise, the implementation and deployment of new software solutions must be reconciled and integrated with legacy software systems. Here ontologies may be used not only as domain knowledge representation models, but also as mediators for the integration of heterogeneous applications. Enterprise integration attracts substantial interest from the research community and a number of solutions have been proposed (e.g., [17], [18]).

4. Knowledge sharing and migration between enterprise ISs. Interaction and cooperation of modern enterprises often implies knowledge sharing and migration. In this way enterprises may enrich and harmonize their knowledge assets.
In this case knowledge models must be reconciled and aligned. This issue is not widely addressed in the literature (though some early efforts, e.g. [19], are described) as it usually requires some (combination of) typical ontology management activities (such as ontology evolution and knowledge sharing; see the details above).

Each of the application categories sets some requirements for the specific alignment methods used within the category. Due to the wide variety of ontologies used in industry, it is difficult to set up a detailed set of requirements for ontology matching methods. These requirements may vary substantially depending on ontology size and structure, so we outline only the most general observations. We analyze the requirements for ontology alignment regardless of industrial application in [8].

Run-time. The 1st and 2nd categories assume that the matching process is performed at run-time. In that case the maximum level of automation must be reached. In the 3rd and 4th categories matching and related activities may be performed beforehand and separately. This allows active involvement of experts in the matching process (for alignment validation, relevance verification, etc.).

Completeness. Completeness is of the most importance in the 1st and 3rd categories, where it is important not to miss knowledge. At the same time, in the 2nd category the response time of the method implementation to a system query is more critical, as in that case matching is usually performed at run-time.

Relevance. In the 4th category, the relevance of knowledge is the most critical (particularly during migration from an older system to a newer one). Here it is first of all important to preserve current knowledge, while some obsolete knowledge may be discarded.

3 Solution Overview

The main focus of the paper is the evaluation of an ontology alignment and instance migration methodology in industrial settings.
The methodology assumes a (semi-)automated iterative process of ontology alignment and instance migration, with possible human intervention for checking correctness and setting up the process. The overall methodology consists of two steps: (1) mapping discovery and determination of structural differences between ontologies and (2) ontology instance transformation and migration according to the determined differences.

The first step is essentially the process of ontology matching, with the only difference that it results not only in an ontology alignment but also in a set of transformation rules that further drive the process of ontology instance migration. The solution for the first step is based on the implementation of meaning negotiation between intelligent agents (we call this agent-based solution the ABOA matcher [8]). The matching process embodies the strategy that originates from [20] and is described in detail in [21]. Negotiations among the agents are conducted iteratively, with the aim of reducing the semantic distance between the negotiated structural contexts of the respective ontology schemas. A negotiation stops when the distance reaches a commonly accepted threshold or the parties exhaust their propositions and arguments.

At the second step the agents use the Instance Migration Engine to transfer instances between ontologies based on the transformation rules generated at the first step. Instance migration results in the transfer of all the assertions that do not require the resolution of problem cases by the ontology engineer. The cases that caused problems are recorded in the migration log. The details of the second step of the methodology are described in [22].
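The second step can be illustrated with a minimal sketch. It assumes, purely for illustration, that ABox assertions are simple (subject, predicate, object) triples and that the transformation rules produced at the first step are plain renaming tables for classes and properties. The names `migrate`, `class_rules` and `property_rules`, as well as all identifiers in the sample data, are hypothetical and are not part of the Instance Migration Engine described in [22]. Assertions covered by a rule are transferred; everything else is recorded in a migration log for the ontology engineer.

```python
RDF_TYPE = "rdf:type"


def migrate(assertions, class_rules, property_rules):
    """Transfer assertions covered by a rule; log the rest as problem cases.

    assertions     -- iterable of (subject, predicate, object) triples
    class_rules    -- dict: source class -> target class (assumed rule format)
    property_rules -- dict: source property -> target property (assumed rule format)
    """
    migrated, log = [], []
    for subject, predicate, obj in assertions:
        if predicate == RDF_TYPE:
            # Class membership assertion: rename the class if a rule exists.
            if obj in class_rules:
                migrated.append((subject, RDF_TYPE, class_rules[obj]))
            else:
                log.append((subject, predicate, obj, "no rule for class"))
        elif predicate in property_rules:
            # Property assertion: rename the property, keep the value.
            migrated.append((subject, property_rules[predicate], obj))
        else:
            log.append((subject, predicate, obj, "no rule for property"))
    return migrated, log


# Hypothetical source ABox and rules (made up for illustration).
source_abox = [
    ("item42", "rdf:type", "src:Pump"),
    ("item42", "src:hasWeight", "12.5"),
    ("item42", "src:legacyCode", "X-17"),
]
class_rules = {"src:Pump": "tgt:PumpProduct"}
property_rules = {"src:hasWeight": "tgt:weight"}

migrated, log = migrate(source_abox, class_rules, property_rules)
print(migrated)
print(log)
```

In this sketch the third assertion has no matching rule, so it ends up in the log instead of the target ABox; a real engine would also have to handle structural changes that go beyond renaming.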
4 Evaluation Experiment

To test our methodology and solution for ontology alignment and instance migration we chose real industrial ontologies: the freeClass<sup>1</sup> ontology for construction and building materials and services, and eClassOWL<sup>2</sup> [14], the web ontology for products and services. The dataset of the European building and construction materials market for the Semantic Web (BauDataWeb<sup>3</sup>) was selected as the set of assertions for migration. Structural parameters of the ontologies are presented in Table 1. The general experimental set-up, specified in the ISO/IEC 24744 notation for describing methodologies [23], is pictured in Figure 1.

Table 1. Structural parameters of industrial ontologies used in the second experiment

{| class="wikitable"
! Ontology !! Total number of axioms !! Number of logical axioms !! Number of classes !! Number of object properties !! Number of data properties !! Number of individuals
|-
| freeClass || 78414 || 9622 || 5231 || 168 || 3 || 1335
|-
| eClassOWL || 360243 || 117090 || 60662 || 4900 || 2453 || 4766
|-
| BauDataWeb || - || - || - || - || - || Over 60 million instances
|}

[Figure 1: diagram of the iterative experimental workflow. Steps: I Discover Mappings, II Compare and Analyze, III Generate Transformation Rules, IV Migrate Instances, V Check and Correct, VI Migrate Problem Cases. Actors: Agents, Domain Expert, Ontology Engineer, Instance Migration Engine, Ontology Editor. Artifacts: Ontology 1 and Ontology 2 TBoxes and ABoxes (OWL), Ontology Mappings and Reference Alignment (Alignment Format), Instance Transformation Rules, Migration Log.]

Fig. 1. The set-up of the evaluation experiment

The test case does not contain any reference alignment. Hence, we had to determine reference mappings manually in order to judge the obtained results objectively. For convenience, both the freeClass and eClassOWL ontologies may be divided into 2 parts.
The first parts are the sets of entities directly inherited from the GoodRelations ontology<sup>4</sup> [13], together with some concepts from other common-sense vocabularies. The schemas of those parts of the ontologies are almost identical, so the difference is mostly in the sets of individual assertions. Further, those parts do not cause any problems in the discovery of the reference mappings, as the entities mainly have human-understandable names and labels. Based on the analysis of these parts of the ontologies we constructed the set of reference mappings (further referred to as Alignment 1). The second parts of the ontologies consist of internal entities that do not have understandable names (the names are identifiers composed of numbers and characters), though some of them still have labels with descriptions. Due to the large number of those entities we did not analyze the whole sets and chose 20 entities that are semantically similar. Then we discovered the respective mappings for those chosen entities (further referred to as Alignment 2). The parameters of both alignments are presented in Table 2, where for brevity we include only the information about the classes and properties.

<sup>1</sup> http://www.freeclass.eu/ – the ontology for construction and building materials and services

<sup>2</sup> http://www.heppnetz.de/projects/eclassowl/ – the web ontology for products and services

<sup>3</sup> http://semantic.eurobau.com/ – BauDataWeb: the European Building and Construction Materials Database for the Semantic Web

Table 2. Parameters of reference alignment for the experiment with the BauDataWeb dataset

{| class="wikitable"
! Alignment !! Mapped classes !! Mapped object properties !! Mapped datatype properties
|-
| Alignment 1 || 53 || 55 || 53
|-
| Alignment 2 || 20 || 11 || 0
|}

Thus, the experiment with the BauDataWeb dataset was performed in two phases.
Within the first phase we constructed the reference alignment (Alignment 1) and started the matching process using the ABOA matcher. Then we found the mappings that correspond to Alignment 1 and compared them to the reference ones. The alignment quality values for the results of this step are very high (Table 3, row 1), as these parts are almost identical.

<sup>4</sup> http://www.heppnetz.de/projects/goodrelations/ – the web vocabulary for e-commerce

Table 3. Matching results (alignment quality measures)

{| class="wikitable"
! Experiment step !! Precision !! Recall !! F-Measure
|-
| 1 || 0.99999 || 0.99999 || 0.99999
|-
| 2 || 0.69552 || 0.42384 || 0.52671
|}

It might be argued that Alignment 1 in our experiment is not a particularly interesting case, as the semantic differences are tiny and could easily be discovered manually. However, this experimental phase is a good case for validating the generated instance transformation rules and the instance migration quality. In this phase all of the generated transformation rules were correct. More details on the transformation rules can be found in [22]. Within the second phase we determined Alignment 2 and tried to find the respective mappings within the alignment discovered by the matcher. The alignment quality measures for the second phase are lower than for phase 1, which is due to the relatively weak semantic similarity between the structural contexts [20] that correspond to these parts of the ontologies. It is also worth noting that within phase 2 the Precision value is noticeably higher than the Recall value. This is because string-based structural similarity measurement methods yield high values on labels. Labels can contain parts (e.g. words) that are common to many of them, while the respective entities in general are not semantically similar. For example, the comparison of the labels “construction technology” and “pump technology” will give noticeably high similarity values.
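Both observations can be checked with a few lines of code. The sketch below makes two assumptions: the F-Measure in Table 3 is taken to be the usual harmonic mean of Precision and Recall (the text does not spell the formula out), and `difflib.SequenceMatcher` from the Python standard library is used as a stand-in character-based similarity. It is not the measure implemented in the ABOA matcher, and the helper names are hypothetical.

```python
from difflib import SequenceMatcher


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def label_similarity(a, b):
    """Character-based similarity in [0, 1]; a stand-in, not the ABOA measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Phase 2 values from Table 3.
print(round(f_measure(0.69552, 0.42384), 5))  # 0.52671

# The two labels share the word "technology", so the score is well above zero
# although the entities describe different things.
print(label_similarity("construction technology", "pump technology"))
```

The reported F-Measure of phase 2 is reproduced from the reported Precision and Recall, and the shared suffix alone pushes the label similarity above one half.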
However, those labels belong to entities that are clearly not that similar semantically.

5 Concluding Remarks and Future Work

The paper presented an experiment evaluating our methodology and software solution for ontology instance migration on real-world industrial ontologies. The experiment shows acceptable results that allow a positive judgement about the applicability of our methodology in industrial settings. The results also suggest some directions for future work. The experiment with large ontologies (the BauDataWeb dataset) shows that the ontology instance migration engine can migrate on the order of several million instances using a conventional desktop computer. Hence, a technique to overcome this upper limit is needed for scaling the tool up to the volumes characteristic of Big Semantic Data. Looking for such a technique is on our research and development agenda. In the future we also plan to conduct a series of experiments with ontologies specified in OWL sublanguages<sup>5</sup> and OWL 2 profiles<sup>6</sup>. Another important direction for future research is evaluating our approach on ontologies having different structural patterns, such as a taxonomy (tree-type) structure, a network structure (ontologies rich in object properties), OWL graphs with high and low vertex degrees, etc.

<sup>5</sup> http://www.w3.org/TR/2004/REC-owl-features-20040210/#s1.3

<sup>6</sup> http://www.w3.org/TR/owl2-profiles/

Acknowledgments

The authors are grateful to the colleagues who provided the industrial ontologies for our experiments: Univ. Prof. Dr. Martin Hepp and Dipl. Ing. Andreas Radinger from the E-Business and Web Science Research Group of the Universität der Bundeswehr, München.

References

1. Bittner, T., Donnelly, M., Winter, S.: Ontology and Semantic Interoperability. In: Prosperi, D., Zlatanova, S. (eds.) Large-scale 3D Data Integration: Problems and Challenges. CRC Press, London (2005)
2. Malzahn, D.: Industrial Application of Ontologies.
In: eKNOW 2011, The Third International Conference on Information, Process, and Knowledge Management (2011)
3. Tatarintseva, O., Ermolayev, V., Fensel, A.: Is Your Ontology a Burden or a Gem? – Towards Xtreme Ontology Engineering. In: Ermolayev, V. et al. (eds.) Proc. ICTERI 2011, CEUR-WS.org/Vol-716, 65–81 (2011)
4. Euzenat, J., Shvaiko, P.: Ontology Matching. Springer-Verlag, Berlin Heidelberg (2007)
5. Corcho, O.: Methodologies, tools and languages for building ontologies. Where is their meeting point? Data & Knowledge Engineering 46, 41–64 (2003)
6. Ehrig, M.: Ontology Alignment: Bridging the Semantic Gap (Semantic Web and Beyond). Springer (2006)
7. Zhdanova, A. V., de Bruijn, J., Zimmermann, K., Scharffe, F.: Ontology Alignment Solution. Deliverable D14 v2.0 (2004)
8. Ermolayev, V., Davidovsky, M.: Agent-Based Ontology Alignment: Basics, Applications, Theoretical Foundations, and Demonstration. Tutorial Paper. In: Dan Burdescu, D., Akerkar, R., Badica, C. (eds.) Proc. WIMS 2012, 11–22, ACM (2012)
9. Silver, G., Hassan, O.H., Miller, J.: From domain ontologies to modeling ontologies to executable simulation models. In: Proc. of the 2007 Winter Simulation Conference (2007)
10. Novák, P., Šindelář, R.: Applications of ontologies for assembling simulation models of industrial systems. In: Proc. of the 2011th Confederated international conference on the move to meaningful internet systems (OTM'11), pp. 148–157, Springer-Verlag Berlin, Heidelberg (2011)
11. Ermolayev, V., Keberle, N., Matzke, W.-E.: An Upper Level Ontological Model for Engineering Design Performance Domain, LNBIP, vol. 20, pp. 127–141. Springer, Heidelberg (2008)
12. Ding, Y., Fensel, D., Klein, M., Omelayenko, B., Schulten, E.: The Role of Ontologies in eCommerce. In: Staab, S., Studer, R. (eds.) Handbook on Ontologies. International Handbooks on Information Systems, pp. 593–616, ISBN 3-540-40834-7, Springer (2004)
13. Hepp, M.:
GoodRelations: An Ontology for Describing Products and Services Offers on the Web, LNCS, vol. 5268, pp. 332–347. Springer Berlin Heidelberg (2008)
14. Hepp, M.: Products and Services Ontologies: A Methodology for Deriving OWL Ontologies from Industrial Categorization Standards. Int'l Journal on Semantic Web & Information Systems 2(1), pp. 72–99 (2006)
15. Morgenstern, L., Riecken, D.: SNAP: An Action-Based Ontology for E-commerce Reasoning. In: Proc. Formal Ontologies Meet Industry, Verona, Italy (2005)
16. Berners-Lee, T., Hendler, J., Lassila, O.: The Semantic Web. Scientific American 284, 28–37 (2001)
17. Izza, S., Vincent, L., Burlat, P.: A Unified Framework for Enterprise Integration: An Ontology-Driven Service-Oriented Approach. In: Pre-proc. of the First International Conference on Interoperability of Enterprise Software and Applications (INTEROP-ESA’2005), pp. 78–89. Geneva, Switzerland, February 23–25 (2005)
18. Stoutenburg, S. et al.: Ontologies in OWL for Rapid Enterprise Integration. Time 122, 82–89 (1994)
19. Lochovsky, F. H., Woo, C. C., Williams, L. J.: A micro-organizational model for supporting knowledge migration. In: Proc. of the ACM SIGOIS and IEEE CS TC-OA conference on Office information systems, pp. 194–204. Cambridge, Massachusetts, US (1990)
20. Ermolayev, V., Keberle, N., Matzke, W.-E., Vladimirov, V.: A Strategy for Automated Meaning Negotiation in Distributed Information Retrieval. In: Gil, Y. et al. (eds.) ISWC 2005, Proc. 4th Int. Semantic Web Conference (ISWC'05), 6–10 November, Galway, Ireland. LNCS 3729, pp. 201–215 (2005)
21. Davidovsky, M., Ermolayev, V., Tolok, V.: Agent-based implementation for the discovery of structural difference in OWL DL ontologies. In: Mayr, H. C., Ginige, A., Liddle, S. (eds.) Proc. Fourth Int United Information Systems Conference (UNISCON 2012).
LNBIP 137, Berlin, Heidelberg: Springer-Verlag (2013)
22. Davidovsky, M., Ermolayev, V., Tolok, V.: Instance Migration between Ontologies having Structural Differences. Int. J. on Art. Int. Tools 20(6), 1127–1156 (2011)
23. Henderson-Sellers, B., Gonzalez-Perez, C.: Standardizing Methodology Metamodelling and Notation: An ISO Exemplar. In: Kaschek, R., Kop, C., Steinberger, C., Fliedl, G. (eds.) UNISCON 2008. LNBIP, vol. 5, pp. 1–12. Springer, Berlin/Heidelberg (2008)

Extracting Knowledge Tokens from Text Streams

Eugene Alferov<sup>1,2</sup> and Vadim Ermolayev<sup>1</sup>

<sup>1</sup> Department of IT, Zaporozhye National University, 66 Zhukovskogo st., 69063, Zaporozhye, Ukraine
alferov.evgeniy@gmail.com, vadim@ermolayev.com

<sup>2</sup> Kherson State University, 27, 40 Rokiv Zhovnya ave., 73000, Ukraine
alferov_jk@ksu.ks.ua

Abstract. This problem analysis paper presents our position on how a solution could be sought to the problem of extracting semantically rich fragments from a stream of plain text posts. We first present our understanding of the problem context and explain the focus of our research. Further, in the problem setting section we elaborate the workflow for knowledge extraction from incoming information tokens. This workflow is then used as a key to structure our review of the literature on the relevant component techniques which may be exploited in combination to achieve the desired outcome. We finally outline our plan for conducting experiments that aim to validate the workflow and to find a proper combination of the component techniques for all steps which may solve our specific research problem.

Keywords. Workflow, knowledge extraction, text streams, processing, ontology learning, component techniques

Key terms. Data, Process, Knowledge, Approach, Methodology

1 Introduction

The dramatic growth of data volumes we face today is accelerated by the increase of social networking applications that allow non-specialist users to create a huge amount of content easily and freely.
Equipped with rapidly evolving mobile devices, a user is becoming a nomadic gateway boosting the generation of additional real-time sensor data. The emerging Internet of Things makes each and every thing a source of data or content, adding billions of additional artificial and autonomic sources of data to the overall landscape. Smart spaces, where people, devices, and their infrastructures are all loosely connected, also generate data of unprecedented volumes and with velocities rarely observed before. Noticeably, the major part of the new data comes in streams. The expectation is that valuable information will be extracted out of all these data to help improve the quality of life and make our world a better place, for humans.

Humans are, however, left bewildered about how to use, analyze, and understand all these data while giving proper account to its dynamics. A recent estimate of the need for data-savvy managers in the United States is 1.5 million [1]. This manpower is needed to extract and use valuable information and knowledge for further decision making. The critical steps in this work are (i) extracting information and knowledge; and (ii) bringing the descriptions of the reflections of the world or domain into a refined state, accounting for the changes brought in by new data, at scale.

In this paper we focus on step (i), extraction. In Section 2 we present the problem statement by giving basic definitions and providing our view on what a processing workflow could look like. A plethora of approaches, techniques, technologies, and software tools already exists for solving different parts of the overall problem. Hence we analyze the related work in Section 3 and structure this analysis using the workflow as the key. Finally, we conclude the paper and present our plans for the future proof-of-concept experimental work in Section 4.
2 Problem Statement

An ontology is a complex artifact that comprises structural components of several types. In what follows we use the structural denotation of an ontology adopted in Description Logics [2]: an ontology <math>O</math> comprises its schema <math>S</math> and the set of individuals <math>I</math>: <math>O = (S, I)</math>. The ontology schema is also referred to as the terminological component (TBox). It contains the statements describing the concepts of <math>O</math>, the properties of those concepts, and the axioms over the schema constituents. If a finer grained look at an ontology schema is taken, one may consider <math>S</math> as comprising the following interrelated constituents: <math>S = \{S_C, S_O, S_D, S_A\}</math>, where <math>S_C</math> is the set of statements describing concepts, <math>S_O</math> is the set of statements describing object properties, <math>S_D</math> is the set of statements describing datatype properties, and <math>S_A</math> is the set of axioms specifying constraints over <math>S_C</math>, <math>S_O</math>, and <math>S_D</math> (c.f. [3]). One may notice that these constituents correspond to the types of the schema specification statements of an ontology representation language <math>L</math> which is used for specifying <math>O</math>. The set of individuals, also referred to as the assertional component (ABox), is the set of the ground statements about the individuals and their attribution to the constituents of the schema.

Ontology Learning is the process of extracting the abovementioned constituents of <math>O</math> from a text stream source. More specifically, the problem approached in this research work is twofold:

For every individual plain text document (further referred to as an information token) arriving in the stream window DO:

(i) Extract the ontological fragment (further referred to as a knowledge token) specifying the semantics of the information token.

(ii) Refine the ontology <math>O</math> by incorporating the changes brought in by the knowledge token.
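The twofold problem statement above can be read as a simple processing loop. The following sketch is only meant to make that reading concrete: the `Ontology` container mirrors the pair of schema and individuals, while the extractor and the refinement function are deliberately naive placeholders (every word of four or more letters becomes a concept statement, and refinement is a set union). All names are hypothetical, and neither placeholder is a technique proposed in this paper.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """O = (S, I): schema statements (TBox) and statements about individuals (ABox)."""
    schema: set = field(default_factory=set)
    individuals: set = field(default_factory=set)


def extract_knowledge_token(information_token):
    """Step (i), placeholder only: each word of 4+ letters becomes a concept statement."""
    words = re.findall(r"[A-Za-z]{4,}", information_token)
    return Ontology(schema={("Concept", w.lower()) for w in words})


def refine(ontology, knowledge_token):
    """Step (ii), placeholder only: merge the fragment into the ontology by set union."""
    ontology.schema |= knowledge_token.schema
    ontology.individuals |= knowledge_token.individuals


def process_window(window, ontology):
    """Run steps (i) and (ii) for every information token in the stream window."""
    for information_token in window:
        knowledge_token = extract_knowledge_token(information_token)
        refine(ontology, knowledge_token)
    return ontology


ontology = process_window(
    ["Agents align ontologies.", "Ontologies evolve."], Ontology()
)
print(sorted(ontology.schema))
```

A realistic refinement step would have to detect and resolve conflicts between the knowledge token and the existing ontology rather than simply merging sets.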
The focus of this paper is the first part of the problem: the extraction of knowledge tokens from information tokens of plain text in a particular professional domain coming in a stream. The texts of ICTERI paper abstracts have been chosen as the domain and source text corpus for our initial experiments (see also Section 4).

As an ontology is a complex artifact, the extraction of knowledge tokens from texts is also a complex process. It comprises several steps and, possibly, iterations for extracting the different structural constituents of <math>S</math> and <math>I</math>. These steps produce several types of outputs in a particular sequence, sometimes referred to as the ontology learning layer cake (c.f. [4]). Those outputs are: terms; concepts and their instances; datatype properties; taxonomic relationships and object properties; axioms. Based on [5], we present in Fig. 1 a workflow that puts together the extraction steps, inputs, outputs, and required component technology types.

The overall workflow contains two consecutive phases: Text Pre-processing and Ontology Extraction. The Text Pre-processing phase takes the information token as a plain text input and produces its structured representation as a set of terms by applying several statistical and linguistic techniques. All the tasks of the Ontology Extraction phase use the output of Phase 1 as their input and incrementally build up the knowledge token by adding different ABox and TBox constituents. For that, statistical, linguistic, semantic, and logical techniques are employed in combinations. Fig. 1 lists all relevant component techniques per task. No implementation ever uses all of them together. Therefore our initial research objective is to find out which combination of component techniques works best for our specific data, i.e. copes well with (a) texts of small size that belong to a particular domain; and (b) limited processing time constrained by a stream window lifetime parameter.
Further, after this constellation of component techniques is chosen, the objective would be to refine those which do not provide results of a satisfactory quality in our problem settings.

3 Related Research and Available Component Techniques

In this section we describe the component techniques, outlined in Fig. 1, which we found relevant to our work. Those component techniques can overall be categorized as linguistic, statistical, semantic and logical (c.f. [5]). As pictured in Fig. 1, they can be applied at different steps and for different purposes. Though not explicitly shown in Fig. 1, the steps may undergo iterations for refining their results. Therefore, the workflow proposed in this paper can be considered hybrid and iterative.

De-noising (statistical, linguistic). This is a method that extracts the de-noised text, comprising the content-rich sentences, from full texts [6]. Processing of noisy text becomes important because the quality of texts in the form of blogs, emails and chat logs can be extremely poor. The sentences in dirty texts are typically full of spelling errors, ad-hoc abbreviations and improper casing [7].

Tokenization. Tokenization is splitting the text into a set of tokens, usually words. This process is unsupervised and can be performed automatically by a parser program.

Part of speech detection/tagging (linguistic). Part of speech tagging (POST) is the process of assigning one of the parts of speech to a given word. POST provides the syntactic structures and dependency information required for further linguistic analysis in order to uncover terms and relations. POST is a semi-supervised or even unsupervised process.
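The tokenization and de-noising steps described above can be sketched with the Python standard library alone. The sentence splitter, the token pattern and the "at least three word tokens" noise filter below are illustrative assumptions, not the methods of [6] or [7]; the function names are hypothetical.

```python
import re

# A word token: letters, optionally joined by hyphens or apostrophes.
WORD = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Split a sentence into lower-cased word tokens."""
    return [t.lower() for t in WORD.findall(sentence)]


def denoise(text, min_tokens=3):
    """Keep only sentences with enough word tokens (assumed noise heuristic)."""
    kept = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        if len(tokens) >= min_tokens:
            kept.append(tokens)
    return kept


post = (
    "Ontology learning extracts terms from text. LOL!!! "
    "Knowledge tokens are small ontology fragments."
)
for tokens in denoise(post):
    print(tokens)
```

Here the interjection is dropped as noise and the two content-rich sentences are kept as token lists; part of speech tagging would then operate on these lists.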
[Figure 1: two-phase workflow diagram leading from an Information Token through a Bag of Terms to a Knowledge Token. The component techniques listed per task are:
PHASE 1 (T1): Text Pre-processing.
T1 Extract Terms: de-noising; sentence parsing; part of speech detection; syntactic structure analysis; relevance analysis; co-occurrence analysis.
PHASE 2 (T2 – T6): Ontology Extraction.
T2 Form Concepts and Concept Instances: co-occurrence analysis; clustering; latent semantic analysis; sub-categorization frames; use of a semantic lexicon.
T3 Extract Datatype Properties: syntactic structure analysis; dependency analysis; association rule mining; use of lexico-syntactic patterns; use of semantic templates; logical inference.
T4 Extract / Discover Concept Hierarchies: clustering; term subsumption; use of a semantic lexicon; syntactic structure analysis; dependency analysis; use of semantic templates; use of lexico-syntactic patterns; logical inference.
T5 Extract Object Properties: association rule mining; syntactic structure analysis; dependency analysis; use of semantic templates; use of lexico-syntactic patterns; logical inference.
T6 Extract Axioms: use of axiom templates; inductive logic programming.
Legend: control flow; information flow; information flow in case of a (semi-)supervised approach; ontology / resource; workflow step (task).]

Fig. 1. A workflow for knowledge token extraction

Lemmatization (linguistic). Lemmatization is the reduction of morphological variants of the tokens to their base form, which can be performed in an unsupervised way. To achieve this, the word form must be known, i.e. the part of speech of every word in the text document has to be assigned. This process is usually time-consuming and error-prone.

Chunking (linguistic). Chunking is the unsupervised splitting of a text into syntactically correlated parts.

Sentence parsing. Sentence parsing is identifying the syntactic structure of a sentence, for example in the form of a parse tree.
Syntactic structure analysis (linguistic). In syntactic structure analysis, words and modifiers in syntactic structures (e.g., noun phrases, verb phrases, and prepositional phrases) are analyzed to discover potential terms and relations. It can be done in an unsupervised way.

Relevance analysis (statistical). The extent of occurrence of terms in individual documents and in text corpora is employed for relevance analysis. This is a semi-supervised or even unsupervised technique.

Co-occurrence analysis (statistical). Co-occurrence analysis identifies lexical units that tend to occur together, for purposes ranging from extracting related terms to discovering implicit relations between concepts [5]. This technique is unsupervised.

Clustering (statistical). Grouping together variants of terms to form concepts and separating unrelated ones is known as term clustering. It is usually an unsupervised technique. In this approach some measure of similarity is employed to assign terms to groups for discovering concepts or constructing a hierarchy [8]. Some of the major issues in clustering are working with high-dimensional data, and feature extraction and preparation for similarity measurement. This gave rise to a class of featureless similarity measures based solely on the co-occurrence of words in large text corpora. It is known that clustering results are of acceptable quality only if a statistically representative (i.e. large) text corpus is processed. This fact limits the applicability of this technique in our settings (texts of small size). However, used in combination with other techniques, clustering may yield a valuable addition to the result and thus needs to be tried.

Latent semantic analysis (statistical). Latent semantic analysis (LSA) is a theoretical approach and mathematical method for determining the similarity of meaning of words and passages by analysis of large text corpora.
The main idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other [9]. LSA can be useful in our investigation because it is a fully automatic mathematical and statistical technique for extracting and inferring meaningful relations from the contextual usage of words in text.

Sub-categorization (linguistic, semantic). Sub-categorization, or extracting sub-categorization frames, is an approach to extracting one type of lexical information of particular importance for Natural Language Processing (NLP). Access to an accurate and comprehensive sub-categorization lexicon is vital for the development of successful parsing technology, important for many NLP tasks (e.g. automatic verb classification), and useful for any application which can benefit from information about predicate-argument structure (e.g. Information Extraction) [10].

Using a semantic lexicon (linguistic, semantic). A semantic lexicon is a dictionary or thesaurus of words/terms labeled with semantic classes (e.g., “ongoing effort” is an Activity), so that associations can be drawn between words that have not previously been encountered [11]. Semantic lexicons are a popular resource in ontology learning and play an important role in many NLP tasks.

Dependency analysis (linguistic). Syntactic structure consists of lexical items linked by dependencies. These are binary asymmetric relations that hold between a head and its dependents. Dependency analysis examines dependency information to uncover relations at the sentence level. In this analysis, grammatical relations, such as subject, object, adjunct, and complement, are used for determining more complex relations. Dependency analysis is usually an unsupervised approach.

Association rule mining (statistical).
Association rule mining aims to extract correlations, frequent patterns, associations or causal structures among sets of items in data repositories [12]. It is an unsupervised component technique which works well for considerably big data corpora. Association rules highlight correlations between features in the texts, e.g. keywords. Association rules can be easily interpreted and are understandable for an analyst or even for a normal user.

Use of lexico-syntactic patterns (linguistic). Lexico-syntactic patterns (LSPs) are generalized linguistic structures or schemas that indicate semantic relationships among terms and can be applied to the identification of formalized concepts and conceptual relations in natural language text [13]. Lexico-syntactic patterns are suitable for automatic ontology building, since they model semantic relations. These display exactly the kind of relation between their parts that makes them easily translatable into an ontology representation.

Use of semantic templates (semantic, linguistic). Semantic templates are similar to lexico-syntactic patterns in terms of their purpose. However, semantic templates offer more detailed rules and conditions for extracting not only taxonomic relations but also complex non-taxonomic relations [5].

Logical inference (logical, semantic). In logical inference, implicit relations are derived from existing ones using rules such as transitivity and inheritance [5]. However, invalid or conflicting relations may also be introduced if the inference rule set is incomplete or underspecified, for example because the validity of transitivity or mutual disjointness axioms is not properly accounted for.

Term subsumption (statistical, semantic). In the subsumption method, a given term subsumes another term if the documents in which the latter term occurs are a subset of the documents in which the given term occurs [14].
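The subset definition just given translates directly into code. The sketch below assumes a toy corpus represented as a mapping from document identifiers to term lists; the function names are hypothetical. `subsumption_degree` is an added, graded variant (the share of the documents containing y that also contain x) and is an assumption of this sketch rather than a measure taken from [14].

```python
def document_sets(corpus):
    """Map every term to the set of ids of the documents it occurs in."""
    index = {}
    for doc_id, terms in corpus.items():
        for term in set(terms):
            index.setdefault(term, set()).add(doc_id)
    return index


def subsumes(index, x, y):
    """True if every document containing y also contains x (and y occurs at all)."""
    docs_x, docs_y = index.get(x, set()), index.get(y, set())
    return x != y and bool(docs_y) and docs_y <= docs_x


def subsumption_degree(index, x, y):
    """Share of the documents containing y that also contain x (assumed measure)."""
    docs_y = index.get(y, set())
    if not docs_y:
        return 0.0
    return len(index.get(x, set()) & docs_y) / len(docs_y)


# Toy corpus, made up for illustration.
corpus = {
    1: ["ontology", "alignment"],
    2: ["ontology", "matching"],
    3: ["ontology", "alignment", "agent"],
}
index = document_sets(corpus)
print(subsumes(index, "ontology", "alignment"))
print(subsumes(index, "alignment", "ontology"))
print(subsumption_degree(index, "alignment", "ontology"))
```

With only three documents the result is fragile: removing a single document can flip a subsumption, which illustrates why the method needs a large data set to work reliably.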
A term subsumption measure is used to quantify the extent to which a term x is more general than another term y. This technique can be applied in both semi-supervised and unsupervised settings. The term subsumption technique is easy to implement, and it makes labeling concepts an easy task. However, with this method it is difficult to classify terms that do not co-occur frequently, and it requires a large data set to work reliably.

Use of axiom templates (semantic, linguistic). Axioms are useful for describing the relationships between the concepts of an ontology. They can be written in different ways depending on the relations that exist among the concepts.

Inductive logic programming (logical, semantic). Inductive logic programming (ILP) is a research area at the intersection of inductive machine learning and logic programming. ILP generalizes the inductive and the deductive approaches by aiming to develop theories, techniques and applications of inductive learning from observations and background knowledge represented in a first-order logical framework.

An overview of the applicability of the presented component techniques and their interrelationship with respect to the tasks in our workflow is presented in Table 1.

4 Summary and Future Work

Our literature search has revealed that extracting knowledge, or more specifically learning ontologies, from plain text corpora is a well-developed research field that continues to produce new results. However, to the best of our knowledge, extracting ontologies from text streams, with a constraint on the lifetime of an input information token, is a recently emerged research problem. The reasons for adding this specific problem to the research agenda are the phenomenon of Big Data, in particular its velocity dimension, as well as the need for better, more reliable, semantically rich solutions for automating Big Data analytics.
One more complication introduced by our problem setting is the small size of an individual information token, which hinders obtaining good-quality results with the majority of traditional statistical and linguistic techniques for ontology extraction from text corpora.

We argued in this paper that applying a combination of the relevant existing component techniques in a structured and iterative way may overall produce such a result, as an incremental collection of ontology elements in a knowledge token provided by individual techniques at different stages in our proposed workflow.

Table 1. Relevance of component techniques to the tasks within the workflow for extracting knowledge tokens from information tokens

Component technology | Task (Fig. 1): T1, T2, T3, T4, T5, T6
De-noising | st, li
Part of speech detection/tagging | li
Lemmatization | li
Chunking | li
Syntactic structure analysis | li | li | li | li
Relevance Analysis | st
Co-occurrence analysis | st | st
Clustering | st | st
Latent semantic analysis | st
Sub-categorization | se, li
Using semantic lexicon | se, li | se, li | se, li
Dependency analysis | li | li | li
Association rule mining | st | st
Use of lexico-syntactic patterns | li | li | li
Use of semantic templates | se, li | se, li
Logical inference | lo, se | lo, se | lo, se
Term subsumption | st, se
Use of axiom templates | se, li
Inductive logic programming | lo, se

Legend: li – linguistic; lo – logical; se – semantic; st – statistical.

As this research is in an early phase, we do not yet have proof of this hypothesis. However, a plan is in place for conducting an initial series of proof-of-concept experiments in which the component technologies will be exploited in a semi-supervised or supervised fashion. For that we plan to use a small but semantically well-annotated corpus of the abstracts (information tokens) and full texts of ICTERI papers collected in the ICTERIWiki portal (see footnote 1).
This document corpus is incrementally extended by adding the papers and their semantic annotations for each new ICTERI conference instance. The annotations are made using the ICTERI Scope Ontology by Tatarintseva et al. [15]. These annotations will be used as a “Gold Standard” for evaluating the results of automated knowledge token extraction using the workflow proposed in this paper.

After the concept is proven and the constellation of the component techniques is circumscribed, we plan to test the approach on one of the professional news portals. Further, we plan to extend the proposed knowledge extraction procedure to sensor stream data processing.

References

1. Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Hung Byers, A.: Big data: the Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute (2011), http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation
2. Nardi, D., Brachman, R.J.: An Introduction to Description Logics. In: Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.) The Description Logic Handbook. Cambridge University Press, New York, NY, USA (2007)
3. Davidovsky, M., Ermolayev, V., Tolok, V.: Instance Migration between Ontologies Having Structural Differences. Int. J. on Artificial Intelligence Tools, vol. 20(6), pp. 1127–1156 (2011)
4. Buitelaar, P., Cimiano, P., Magnini, B.: Ontology Learning from Text: an Overview. In: Buitelaar, P., Cimiano, P., Magnini, B. (eds.) Ontology Learning from Text: Methods, Evaluation and Applications. IOS Press, Amsterdam (2005)
5. Wong, W., Liu, W., Bennamoun, M.: Ontology Learning from Text: a Look Back and into the Future. ACM Comput. Surv., 44(4), Article 20, 36 pages (2012), http://doi.acm.org/10.1145/2333112.2333115
6. Shams, R., Mercer, R.E.: Investigating Keyphrase Indexing with Text Denoising.
In: Proceedings of the 12th ACM/IEEE-CS Joint Conf. on Digital Libraries, pp. 263–266, ACM (2012)
7. Wong, W., Liu, W., Bennamoun, M.: Enhanced Integrated Scoring for Cleaning Dirty Texts. arXiv preprint arXiv:0810.0332 (2008)
8. Cimiano, P., Hotho, A., Staab, S.: Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis. Journal of Artificial Intelligence Research, 24(1), 305–339 (2005)
9. Landauer, T.K., Foltz, P.W., Laham, D.: Introduction to Latent Semantic Analysis. Discourse Processes, 25(2-3), 259–284 (1998)
10. Preiss, J., Briscoe, T., Korhonen, A.: A System for Large-Scale Acquisition of Verbal, Nominal and Adjectival Subcategorization Frames from Corpora. In: Annual Meeting. Association for Computational Linguistics, 45(1), 912 (2007)
11. Thelen, M., Riloff, E.: A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts. In: Proc. ACL-02 Conf. on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, vol. 10, pp. 214–221 (2002)

Footnote 1: http://isrg.kit.znu.edu.ua/icteriwiki/

12. Kotsiantis, S., Kanellopoulos, D.: Association Rules Mining: a Recent Overview. GESTS International Transactions on Computer Science and Engineering, 32(1), 71–82 (2006)
13. Summary on Requirements on Lexico-Syntactic Patterns (Synthesis by PC), http://www.w3.org/community/ontolex/wiki/Specification_of_Requirements/Lexico-Syntactic_Patterns
14. De Knijff, J., Frasincar, F., Hogenboom, F.: Domain Taxonomy Learning from Text: the Subsumption Method versus Hierarchical Clustering. Data & Knowledge Engineering (2012)
15. Tatarintseva, O., Borue, Yu., Ermolayev, V.: Validating OntoElect Methodology in Refining ICTERI Scope Ontology. In: H.C. Mayr et al. (Eds.): UNISCON 2012, LNBIP 137, pp.
128–139 (2013)

1.3 Model-Based Software System Development

Use of Neural Networks for Monitoring Beam Spectrum of Industrial Electron Accelerators

Oleksandr Baiev, Valentine Lazurik and Ievgen Didenko

School of Computer Science, V. N. Karazin Kharkiv National University, 4, Svobody Sqr., 61022, Kharkiv, Ukraine
oleksandr.baiev@gmail.com, lazurik@hotmail.com, ievgen.v.didenko@gmail.com

Abstract. This paper investigates a neural-network technique for solving the spectrometry inverse problem, namely reconstructing the electron beam spectrum from a depth-charge curve. The inverse problem is turned into a multivariable optimization, and the form of the spectrum is based on a proposed three-parameter model. A radial basis function network calculates the parameters of this model. We developed a computational experiment using the Monte-Carlo technique to evaluate the strengths and weaknesses of the proposed approach and to compare neural networks with conventional data evaluation methods.

Keywords. Neural nets, Inverse problems, Monte Carlo, Radiation technologies, Depth-charge curve

Key terms. ComputerSimulation, Methodology, MachineIntelligence

1 Introduction

One of the main characteristics of irradiation processes is the energy of the beam. This parameter influences the absorbed dose in the target. Therefore, standards for radiation technologies [1, 2] predetermine the upper bound of beam energy to prevent ionization of the object under irradiation. Because of accelerator features, the electrons in a beam have different energies. Thus, the beam energy is represented by a function that shows the relation between the number of particles and their energies. This function is called the beam spectrum. In practice at least three parameters define the spectrum: the average ($E_{av}$) and most probable ($E_p$) energies and the full width at half maximum ($E_w$). To measure beam energy, the dosimetric wedge and stack are widely used in centers of radiation technologies.
These devices make it possible to determine only the average and most probable energies of the beam [1–6]. Of course, these two parameters do not allow the full energy distribution to be reconstructed. Therefore, developing new instruments and methods for dosimetric measurements is a topical problem.

The devices mentioned are intended to measure distributions of absorbed dose or charge [5, 6]. The measured depth-dose (depth-charge) curves relate to the beam spectrum through a Fredholm integral equation, and finding the exact spectrum is an ill-posed inverse problem [7]. This means that the spectrum evaluated by conventional mathematical methods, for example the method of least squares (MLS) or the method of Tikhonov regularization (MTR), can differ from the true energy distribution. Above all, an important disadvantage of MLS and MTR is that additional solution conditions, such as correlations between parameters or positivity, cannot be included. This shortcoming can lead to violation of conditions imposed by physical laws.

It should be mentioned that, in the general case, neural networks (NN) solve approximation tasks and find solutions based on existing precedents after supervised training [8–12]. So one way of improving dosimetry effectiveness is to develop methods for evaluating measurement results based on neural networks. To apply NN to dosimetric data processing, the following problems must be solved: selecting the network topology, obtaining data for NN training, developing methods for data preprocessing and interpretation, and building a system for evaluating network effectiveness.

The current research therefore addresses the feasibility of using neural networks to develop a system for evaluating measurement results for beam spectrum monitoring of industrial electron accelerators. We discuss the mathematical model of the measurement process, which was built in order to compile the training set for the network learning procedure (Section 2).
Section 3 describes the methods under investigation. Section 4 presents the approach to evaluating the methods, which comprises a computational experiment and comparison criteria. Section 5 gives the results of comparing neural networks with conventional methods.

2 Physical process and mathematical model

To calculate radiation energy, it is common practice in the field of radiation technologies to measure the depth-dose curve with a dosimetric wedge. However, recent works propose new devices based on measurement of the depth-charge curve that can realize on-line energy monitoring [3–6]. In this work, we consider a mathematical abstraction of these devices and build a method for controlling the beam spectrum using the depth-charge curve.

2.1 Devices

The device in [5] consists of only two plates and is intended to calculate the probable energy as a value that depends linearly on the ratio of the charge in the first plate to the total charge. The measurer in [6] contains 10 absorbers, but to simplify the average energy calculation the plates were combined, and the authors use a dependency similar to that of [5].

Fig. 1 shows the principal scheme of the measurer. The dosimetric stack consists of a set of plates (absorbers). The absorber material is often aluminum because of its radiation resistance. The electron beam falls on the sequence of plates. Electrons stop at different depths depending on their energy. Thus, the absorbers collect charge, which can be measured by current integrators connected to the corresponding plates. The set of measured values represents the depth-charge curve.

Fig. 1. Common schema of stack for depth-charge measurement (absorber charges $f_1, f_2, \ldots, f_n$)

The mathematical model of the measurement process is based on a semi-empirical model of the depth-charge distribution for monoenergetic electrons and a model of charge measurement uncertainty.
The direct problem describes the relation between a known beam spectrum and the depth-charge curve through the equation

$$f(x) = \int_{E_L}^{E_R} Q(x, E)\, y(E)\, dE, \quad x \in [0, x_R], \tag{1}$$

where $y(E)$ describes the relation between the number of particles and their energy (the electron spectrum), $f(x)$ describes the depth distribution of charge, $x_R$ is the full width of the measurer, $[E_L, E_R]$ is the operating energy range of the accelerator, and the integral kernel $Q(x, E)$ corresponds to the radiation type ($\alpha$, $\beta$, $\gamma$) and the internal characteristics of the measurer (including the absorber material). Works [13, 14] describe the appropriate relations between a monoenergetic beam and the depth-charge curve.

In this research we neglect charge leakage and suppose that the distance between absorbers is negligibly small. This means that each particle of the initial beam either stops in an absorber and passes through a current integrator, or passes through the whole device without contributing to the depth-charge curve.

The measured charge distribution in the absorbers is the set $f = \{f_1, f_2, \ldots, f_n\}$ (see Fig. 1), where $n$ is the number of absorbers and $f_i$ is the integral of $f(x)$ over the depth of the $i$-th absorber:

$$f_i = \int_{x_i}^{x_i + \Delta x} \int_{E_L}^{E_R} Q(x, E)\, y(E)\, dE\, dx, \tag{2}$$

where $\Delta x$ is the absorber width. Equation (2) can be approximated as

$$f_i = \frac{\Delta x}{2} \sum_{j} p_j^E\, y_j \left[ Q(x_k + (i-1)\Delta x, E_j) + Q(x_k + i\Delta x, E_j) \right], \tag{3}$$

where $i = 1, \ldots, n$, $j = 0, \ldots, m$, $m = (E_R - E_L)/\Delta E$ is the number of discretization steps of the function $y(E)$ over the energy axis, $\Delta E$ is the step of the spectrum energy discretization, $y_j$ is the value of $y(E)$ at the approximation nodes, and the coefficient $p_j^E$ defines the method and step of the approximation of $y(E)$. The measurement process can then be written as a system of linear equations:

$$Ay = f \;\Leftrightarrow\; \begin{pmatrix} a_{1,0} & a_{1,1} & \cdots & a_{1,m} \\ a_{2,0} & a_{2,1} & \cdots & a_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,0} & a_{n,1} & \cdots & a_{n,m} \end{pmatrix} \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}, \tag{4}$$

where the elements of matrix $A$ are

$$a_{i,j} = \frac{\Delta x}{2}\, p_j^E \left[ Q(x_k + (i-1)\Delta x, E_j) + Q(x_k + i\Delta x, E_j) \right]. \tag{5}$$
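The discretization in (3)-(5) can be illustrated with a short sketch. This is not the authors' MATLAB code: the function names, the offset argument `x0` (standing in for $x_k$), and the kernel `toy_kernel` are assumptions made for illustration, and the trapezoid weights are taken as $\Delta E/2$ at the two end nodes and $\Delta E$ elsewhere. The real kernel $Q(x, E)$ is the semi-empirical model of [13, 14], which is not reproduced here.

```python
import math

def trapezoid_weights(m, dE):
    """Weights p_j^E for j = 0..m: dE/2 at both end nodes, dE elsewhere."""
    return [dE / 2 if j in (0, m) else dE for j in range(m + 1)]

def build_matrix(Q, n, dx, E_L, E_R, dE, x0=0.0):
    """Assemble A with n rows (absorbers i = 1..n) and m+1 columns (energy nodes j = 0..m).

    Q is any callable Q(x, E); x0 plays the role of the offset x_k in eq. (5).
    """
    m = round((E_R - E_L) / dE)
    p = trapezoid_weights(m, dE)
    energies = [E_L + j * dE for j in range(m + 1)]
    A = []
    for i in range(1, n + 1):
        left, right = x0 + (i - 1) * dx, x0 + i * dx
        A.append([dx / 2 * p[j] * (Q(left, E) + Q(right, E))
                  for j, E in enumerate(energies)])
    return A, energies

def forward(A, y):
    """Direct problem f = A y: depth-charge values for a discretized spectrum y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]

def toy_kernel(x, E):
    """Hypothetical placeholder for Q(x, E); NOT the model of [13, 14]."""
    depth = 0.5 * E  # assumed penetration depth, for illustration only
    return math.exp(-((x - depth) ** 2) / 0.5)

# Example with the stack geometry quoted later in the paper (15 absorbers of 0.4 g/cm2).
A, energies = build_matrix(toy_kernel, n=15, dx=0.4, E_L=0.0, E_R=10.2, dE=0.05)
f = forward(A, [1.0 / 10.2] * len(energies))  # flat spectrum as a dummy input
```

Keeping `Q` as a callable means the same assembly routine can be reused once a realistic kernel is available.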
To approximate $y(E)$ by the trapezoid method, the coefficients $p_j^E$ are:

$$p_j^E = \begin{cases} \Delta E / 2, & j = 0 \lor j = m, \\ \Delta E, & \text{otherwise}. \end{cases} \tag{6}$$

Obviously, the complexity of spectrum reconstruction grows with $m$ (the dimension of vector $y$). To reduce the problem dimension, we used a parameterization of $y(E)$. As mentioned above, the general practice is to describe the spectrum by the parameters $E_p$, $E_{av}$, $E_w$. Therefore, it is reasonable to build a model of the beam spectrum that uses three parameters.

2.2 Model of electrons spectrum

Fig. 2 shows the geometrical interpretation of the electron spectrum model considered in the present work. The graph of the spectrum consists of two parts: a left exponential slope and a right linear slope. The parameters of this model are:

– $E_{max}$: maximal particle energy in the beam,
– $E_p$: most probable energy,
– $E_s$: the energy on the left slope at which the intensity is 10 times lower than at $E_p$.

In what follows, $\Pi$ denotes the set of spectrum parameters, i.e. $\Pi = \{E_s, E_p, E_{max}\}$.

Fig. 2. Model of electron beam spectrum (intensity levels $h$, $0.5h$, $0.1h$; energies $E_s$, $E_p$, $E_{max}$)

The parameters of the model correspond to the beam characteristics used in practice according to:

$$\begin{aligned} E_p &= E_p, \\ E_w &= \frac{\ln 0.5}{\ln 0.1}(E_p - E_s) + \frac{E_{max} - E_p}{2}, \\ E_{av} &= E_s + \frac{E_{max} - E_p}{\ln 4} + \frac{0.45\,(E_s - E_p)}{\ln 0.1}, \end{aligned} \tag{7}$$

and the mathematical expression for the spectrum is:

$$y(E) = \begin{cases} h\, e^{\mu (E - E_p)}, & 0 < E \le E_p, \\ k_1 E + k_2, & E_p < E \le E_{max}, \\ 0, & E_{max} < E, \end{cases} \tag{8}$$

$$\mu = \frac{\ln(0.1)}{E_s - E_p}, \quad k_1 = \frac{h}{E_p - E_{max}}, \quad k_2 = \frac{h E_{max}}{E_{max} - E_p}, \tag{9}$$

where $E \in [0, \infty)$ and $h = y(E_p)$ is the maximum of the function $y(E)$, obtained under the assumption that

$$\int_{E_s}^{E_{max}} y(E)\, dE = 1. \tag{10}$$

Therefore, the maximum of the energy distribution is:

$$h = y(E_p) = \left[ 0.9\, \frac{E_s - E_p}{\ln(0.1)} + 0.5\,(E_{max} - E_p) \right]^{-1}. \tag{11}$$

It should be mentioned that, in accordance with physical laws, the function $y(E)$ is positive or zero for all admissible $E$, and the parameters are related as:

$$0 < E_s < E_p \le E_{max}. \tag{12}$$

2.3 Model of measurement

In a real experiment, the measured $f_i$ differ from their true values.
This error arises from imperfections of the measurer and from external influences. We denote the set of true values of $f(x)$ by $f$, and use $\tilde f$ for the set of values that include measurement uncertainty:

$$\tilde f = (1 + \varepsilon \xi) f, \tag{13}$$

where $\varepsilon$ is the standard deviation of the measurement error and $\xi$ is a random variable with standard normal distribution:

$$\xi = \cos(2\pi r_1) \sqrt{-2 \ln(r_2)}, \tag{14}$$

where $r_1$, $r_2$ are random variables with standard uniform distribution. We use the same notation for the evaluated parameters $\tilde\Pi$, $\tilde E_s$, $\tilde E_p$, $\tilde E_{max}$ and the reconstructed spectrum $\tilde y$, as opposed to their true values without the tilde.

3 Methods for spectrum reconstruction

3.1 Neural networks

To apply NN to the spectrometry inverse problem, reconstruction of the spectrum can be represented as multivariable function fitting. Suppose that the function $\varphi$ implements the measurement process of the depth-charge curve, i.e. $\tilde f = \varphi(\Pi)$. Then the inverse function $\tilde\Pi = \varphi^{-1}(\tilde f)$ realizes the transformation from the depth-charge curve to the beam spectrum, so an approximation of $\varphi^{-1}$ can be used to obtain the spectrum from the depth-charge curve. In this work we used a general regression neural network (GRNN) [15] to fit $\varphi^{-1}$. This network needs a set of precedents for supervised learning.

Consider the algorithm for creating the training set. The implemented measurement models allow pairs $s = (\tilde f, \Pi)$ to be created, where $\tilde f$ is calculated from the parameter set $\Pi$. The collection of $s$ is based on different $\Pi$ and represents reference points for fitting $\varphi^{-1}$:

$$(s_1 \cdots s_N) = \begin{pmatrix} \tilde f_1 & \tilde f_2 & \cdots & \tilde f_N \\ \Pi_1 & \Pi_2 & \cdots & \Pi_N \end{pmatrix}, \tag{15}$$

where $N$ is the number of elements in the training set. In the following discussion, we refer to each unique $\Pi$ in the training and testing sets as a reference spectrum. Note that the parameter values of every $\Pi$ in the training set were normalized according to $[E_L; E_R] \to [0; 1]$. The outputs of the network were scaled back during testing.
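The noise model (13)-(14) and the GRNN fit of $\varphi^{-1}$ can be sketched as follows. This is a minimal illustration, not the MATLAB `newgrnn` setup used in the paper: the function names are hypothetical, an independent $\xi$ is drawn for each absorber (an assumption, since (13) does not say whether $\xi$ is shared), and `sigma` is a plain Gaussian width that may not correspond one-to-one to the spread argument of `newgrnn`. The prediction rule is the kernel-weighted average of stored targets described by Specht [15].

```python
import math
import random

def standard_normal(rng):
    """One draw of xi as in eq. (14): cos(2*pi*r1) * sqrt(-2 ln r2)."""
    r1 = rng.random()
    r2 = 1.0 - rng.random()  # lies in (0, 1], so log(r2) is always defined
    return math.cos(2 * math.pi * r1) * math.sqrt(-2 * math.log(r2))

def add_noise(f, eps, rng):
    """Eq. (13) per absorber: f~_i = (1 + eps * xi_i) * f_i (independent xi_i assumed)."""
    return [(1 + eps * standard_normal(rng)) * fi for fi in f]

def normalize(params, E_L, E_R):
    """Map spectrum parameters from [E_L; E_R] to [0; 1]."""
    return [(p - E_L) / (E_R - E_L) for p in params]

def denormalize(params, E_L, E_R):
    """Scale network outputs back from [0; 1] to [E_L; E_R]."""
    return [E_L + p * (E_R - E_L) for p in params]

def grnn_predict(train_inputs, train_targets, query, sigma):
    """GRNN output: Gaussian-kernel-weighted average of the stored targets."""
    weights = []
    for x in train_inputs:
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-d2 / (2 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from every training point for this sigma")
    dim = len(train_targets[0])
    return [sum(w * t[k] for w, t in zip(weights, train_targets)) / total
            for k in range(dim)]

# Usage: train_inputs are noisy curves f~, train_targets are normalized parameter sets Pi.
rng = random.Random(0)
noisy_curve = add_noise([0.1, 0.4, 0.3, 0.2], eps=0.05, rng=rng)
```

Because the GRNN output is a convex combination of training targets, every predicted parameter stays within the range spanned by the training set. This does not by itself enforce the ordering (12), which also depends on how the reference spectra are sampled.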
3.2 Conventional methods

Consider the methods that are traditionally used for evaluating measurement results. The data obtained by these methods serve as a baseline for determining NN effectiveness in solving the spectrum reconstruction problem. The method of least squares calculates the parameters $\Pi$ as:

$$\tilde\Pi_{MLS} = \arg\min_{\Pi} \left\| A \tilde y_\Pi - \tilde f \right\|, \tag{16}$$

where $\|\cdot\|$ is the Euclidean norm. The method of Tikhonov regularization extends MLS with an additional stabilizer term:

$$\tilde\Pi_{MTR} = \arg\min_{\Pi} \left( \left\| A \tilde y_\Pi - \tilde f \right\| + \alpha \left\| \tilde y_\Pi \right\| \right), \tag{17}$$

where $\alpha > 0$ is the regularization parameter. Recall that the mathematical model of the measurement process provides the true values of the electron spectrum, so $\alpha$ can be calculated from [7]:

$$\alpha^* = \arg\min_{\alpha} \frac{\| y - \tilde y_\alpha \|}{\| y \|}. \tag{18}$$

In this work we applied the Nelder-Mead simplex method for the numerical solution of (16), (17) and (18).

4 Algorithm for preparing and testing the evaluation methods

4.1 Comparison approach

The implemented models of the spectrum and the measurement process, together with the data evaluation methods, compose the computational experiment (Fig. 3 shows the sequence diagram). The aim of the experiment is to compare methods for spectrum reconstruction. The experiment is built on the Monte-Carlo technique: the system generates measurement results, each method reconstructs spectra from samples of the depth-charge curve, and the system calculates statistical characteristics of the reconstruction error. The computational experiment consists of three steps: preparation, the main part (loop Common) and interpretation of results.

Preparation of an experiment includes setting the parameters of the models and methods. The main part is a series of subexperiments with varied measurement uncertainty $\varepsilon$. Each of them contains two steps: training of the NN and comparison of the selected methods. Both processes include generation of pairs $s = (\tilde f, \Pi)$ based on a predefined set of $\Pi$. But these sets of reference spectra are different.
The testing procedure (loop Data Evaluation) repeats sampling of $\tilde f$, evaluates the corresponding $\Pi$ by each method, and collects the reconstruction error between the true and calculated spectra using the proposed set of indicators. The results processing step aims to build relationships that show how the accuracy of spectrum reconstruction depends on the varied measurement error.

The software for executing the experiment was implemented in MATLAB with the Neural Network Toolbox (function newgrnn as the NN) and the Optimization Toolbox (function fminsearch for MLS and MTR). To speed up the computational experiment, the software was executed on a high-performance cluster [16] with the Distributed Computing Toolbox.

4.2 Comparison indicators

To assess the effectiveness of methods for reconstructing beam energy characteristics, we suggest a set of indicators. The set consists of standard statistical estimates of the data evaluation error and an indicator of method reliability. There are two types of indicators: mismatch along the energy axis (an estimate of the shift of the reconstructed spectrum along the horizontal axis) and a common indicator. Consider the details of each indicator.

Fig. 3. Sequential diagram of computer experiment

1. Mismatch along energy axis. The average $M(r)$ and standard deviation $\sigma_r$ of the distance along the intensity axis between the reconstructed and true spectra are based on:

$$r = \frac{1}{n}(y - \tilde y)^2; \tag{19}$$

2. Common characteristics. Probability of method failure $P$. We consider a method failure to be a case in which applying a mathematical method leads to a solution that is impossible under physical laws, i.e. the solution violates condition (12). Obviously, the value $1 - P$ characterizes method reliability.

5 Results and discussions

5.1 Parameters of computation experiment

To evaluate the effectiveness of the methods with the suggested indicators, we carried out a computational experiment with the parameters shown in Table 1.
The training and testing sets include reference spectra with parameters:

$$\begin{aligned} E_p &= r_1, & r_1 &\sim U[E_L, E_R], \\ E_{max} &= E_p (1 + 2 r_2), & r_2 &\sim U[0.01, 0.02], \\ E_s &= E_p - \frac{\ln 0.1}{\ln 0.5}\, r_3, & r_3 &\sim U[0.01, 0.08]. \end{aligned} \tag{20}$$

Table 1. Common experiment parameters

Parameter | Value
Characteristic for measurement | depth-charge curve
Absorbers material | Aluminum (Z = 13, Am = 27)
Absorber’s width ($\Delta x$) | 0.4 g/cm²
Device total width ($x_R$) | 6 g/cm²
Uncertainty ($\varepsilon$) | Varied from 0% to 30%, step 1%
$[E_L; E_R]$ | [0 MeV; 10.2 MeV]
$E_w$ of reference spectra | Randomly from 2% to 10% of $E_p$
$y(E)$ discretization step ($\Delta E$) | 0.05 MeV
Number of reference spectra | 9000 (training) and 41000 (testing)

Fig. 4 shows examples of sampled reference spectra. In the figure the number of spectra in the training and testing sets is reduced, but the proportion is preserved. As shown in Fig. 4 and in Table 1, the testing set is larger than the training set. This is necessary to obtain a sound assessment of the NN-based method that accounts for overfitting.

Fig. 4. Reference spectra for a) NN training and b) methods testing (axes: E, energy; y(E), intensity)

As shown in Table 1, the device consists of 15 absorbers. This configuration was chosen on the basis of previous research [18], which aimed to find the optimal discretization step of the depth-charge curve for spectrum reconstruction by NN. It should be mentioned that in works [17, 18] the sets for testing and preparing the methods were based on reference spectra with a fixed $E_w$ parameter and the same maximum $h = 1$. Therefore, the search for the optimal absorber width remains open for future research.

Fig. 5. Results of methods comparison: a) $M(r)$, b) $\sigma_r$, c) probability of failure $P$, each plotted against $\varepsilon$ for MLS, MTR and GRNN

5.2 Results and discussion

Fig.
5 contains the obtained dependencies, which relate the evaluation error of the methods to the measurement uncertainty. Charts 5a and 5b are based on indicator (19). Chart 5c shows the probability of method failure.

For MLS and MTR the experiment confirms the expected results. The methods are sensitive to uncertainty in the input data. Fig. 5a and 5b show that the error of the MLS and MTR solutions grows rapidly with increasing $\varepsilon$. With respect to the probability of failure, both methods demonstrated almost equal inefficiency. This may mean that the stabilizing term in MTR does not affect the reliability of the method. It should be mentioned that the MLS and MTR error at $\varepsilon = 0\%$ is caused by the discretization inaccuracy that appears when the integral (1) is transformed into system (4).

In contrast to the conventional methods, the solutions obtained by the NN show a weaker dependency between evaluation error and input data uncertainty. Furthermore, as shown in Fig. 5a and 5b, the GRNN evaluates spectra more accurately than MLS and MTR when the measurement uncertainty exceeds 5-7%. The main advantage of the NN method is that the GRNN reconstructs the beam spectrum parameters with no failures (see Fig. 5c), i.e. all obtained solutions comply with physical laws.

6 Conclusion

This work shows the effectiveness of the GRNN method for solving the inverse dosimetry problem of electron spectrum reconstruction using the depth-charge curve. The main advantage of the proposed technique over conventional methods is that it allows additional solution conditions to be applied, which leads to a robust evaluation method. As shown in this work, NN-based methods can be used for building on-line energy monitoring systems in centers of radiation technologies.

Furthermore, we proposed a comparison approach based on the Monte-Carlo technique and a set of effectiveness indicators. The approach allows different types of evaluation methods to be tested and can be used for optimizing methods in order to select or apply a technique for solving industrial problems.

References

1. Standard ISO/ASTM 51649-2005(E). Practice for dosimetry in an electron beam facility for radiation processing at energies between 300 keV and 25 MeV. United States, 30 p. (2005)
2. ICRU Report 35. Electron beams with energies between 1 and 50 MeV. United States, 160 p. (1984)
3. Fuochi P.G., Lavalle M., Martelli A., Corda U., Kovacs A., Hargittai P., Mehta K., Electron energy device for process control, Radiation Physics and Chemistry, Volume 67, pp. 593–598 (2003)
4. Fuochi P.G., Lavalle M., Martelli A., Corda U., Kovacs A., Hargittai P., Mehta K., Energy device for monitoring 4-10 MeV industrial electron accelerators, Nuclear Instruments and Methods in Physics Research A, Volume 546, pp. 385–390 (2005)
5. Lavalle M., Fuochi P.G., Martelli A., Corda U., Kovacs A., Mehta K., Kuntz F., Energy Monitoring Device for Electron Beam Facilities, International Topical Meeting on Nuclear Research Applications and Utilization of Accelerators, Conference proceedings, Vienna (2009)
6. Vanzha S.A., Nikiforov V.I., Pomatsalyuk R.I., Tenishev A.Eh., Uvarov V.L., Shevchenko V.A., Shlyakhov I.N., Development “radiation shadow” technique for regime monitoring of product sterilization by electron beam, Problems of Atomic Science & Technology. Series “Nuclear Physics Investigations”, Volume 2(53), pp. 150–153 (2010)
7. Petrov Yu.P., Sizikov V.S., Well-Posed, Ill-Posed, and Intermediate Problems with Applications, V.S.P. Intl Science, Leiden, Netherlands, 234 p. (2005)
8. Haykin S., Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall, Englewood Cliffs, United States, 936 p. (1999)
9. Michael M. Li, Brijesh Verma, Xiaolong Fan, Kevin Tickle, RBF neural networks for solving the inverse problem of backscattering spectra, Neural Computing and Applications, Volume 17, pp. 391–397 (2008)
10. Michael M.
Li, William Guo, Brijesh Verma, Kevin Tickle, John OConnor, Intelligent methods for solving inverse problems of backscattering spectra with noise: a comparison between neural networks and simulated annealing, Neural Computing and Applications, Volume 18, pp. 423–430 (2009)
11. Barradas N.P., Vieira A., Artificial neural network algorithm for analysis of Rutherford backscattering data, Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 62, pp. 5818–5829 (2000)
12. Barradas N.P., Patricio R.N., Pinho H.F.R., Vieira A., A general artificial neural network for analysis of RBS data of any element with Z between 18 and 83 implanted into any lighter one- or two-element target, Nuclear Instruments and Methods in Physics Research B, Volumes 219-220, pp. 105–109 (2004)
13. Adadurov A., Lazurik V., Rogov Yu., Tokarevskii V., Shechenko S., Spectrometry of Intense Fluxes of Gamma Radiation by Means of the Method of Capsule-Absorbers, IEEE Nuclear Science Symposium and Medical Imaging Conference, Conference Publications, Anaheim, United States, pp. 17 (1996)
14. Lazurik V.T., Lazurik V.M., Popov G., Rogov Yu., Zimek Z., Information System and Software for Quality Control of Radiation Processing. IAEA: Collaborating Center for Radiation Processing and Industrial Dosimetry, Warsaw, Poland, 220 p. (2011)
15. Specht Donald F., A General regression neural network, IEEE Transactions on Neural Networks, Volume 2(6), pp. 568–576 (1991)
16. Baiev O., Didenko I., Lazurik V., Mishchenko V., Towards the questions on planing the development of the department compute cluster, Proceedings of ICTERI-2011, Kherson, Ukraine, pp. 27–28 (2011)
17. Baiev O., Lazurik V., Advantages of neural networks for deriving an electrons spectrum from depth-charge curve, “IEEE Nuclear Science Symposium and Medical Imaging Conference”, Conference Publications, Valencia, Spain, pp. 1395–1397 (2011)
18.
Baiev O.U., Lazurik V.T., Discretization grid of depth-charge curve selecting for electrons beam spectrum reconstruction problem, Bulletin Kherson National Technical University, Volume 3(42), pp. 62–66 (2011)

Lazy Parallel Synchronous Composition of Infinite Transition Systems

Yuliia Romenska and Frédéric Mallet

Université Nice Sophia-Antipolis, Aoste Team Project (INRIA/I3S), Sophia Antipolis, France
Yuliia.Romenska@inria.fr, Frederic.Mallet@unice.fr

Abstract. Embedded System Design is becoming a field of choice for Model-Driven Engineering techniques. On the engineering side, models bring an abstraction of the code that can then be generated (and regenerated) at will. On the semantic side, they bring a reasoning framework to guarantee or verify properties on the generated code. We focus here on the Clock Constraint Specification Language, initially defined as a companion language of the UML Profile for MARTE. More specifically, we define a state-based representation of CCSL operators. To deal with unbounded operators, we propose to use lazy evaluation to represent intentionally infinite transition systems. We provide an algorithm to build the synchronized product of such transition systems and we study its complexity. Even though the transition systems are infinite, the result of the composition may become finite, in which case the (semi)algorithm terminates and exhaustive analysis becomes possible.

Keywords. Multiform logical time, synchronized product, lazy evaluation, MARTE CCSL.

Key terms. FormalMethod, VerificationProcess, MathematicalModel, SpecificationProcess.

1 Introduction. Context and Goal of the Project

In the model-driven approach to embedded system engineering, application and architecture models are developed and refined concurrently, and then associated by allocation relationships.
The representation of requirements and constraints in this context becomes itself an important issue, as they guide the search for optimal solutions inside the range of possible allocations. One of the important aspects of embedded system modeling is to capture the functional and non-functional requirements as well as the constraints (functional and non-functional) imposed by the execution platform through the allocation.

Multiform logical time is a flexible notion of time suitable for both functional and extra-functional properties that supports an iterative refinement process. Logical time considers time bases that can be generated from sequences of events not necessarily regular in physical time (as the usual meaning suggests). Some of the essence of multiform logical time was captured and encapsulated into a dedicated language called the Clock Constraint Specification Language (CCSL) [1, 2]. CCSL was initially defined as a companion language of the UML profile for Modeling and Analysis of Real-Time and Embedded systems (MARTE) [3].

CCSL has arisen from different models in an attempt to abstract away the data and the algorithm and to focus on events and control. Even though CCSL was initially defined as the time model of the UML profile for MARTE, it has now become a full-fledged domain-specific modeling language for capturing chronological, causal and timed relationships and is now developed independently. It combines constructs from the general net theory and from the synchronous languages [4]. It is based on the notion of clocks, a general name denoting a totally ordered sequence of event occurrences. It defines a set of clock relations and expressions. Some CCSL operators are bounded, others are unbounded. The bounded operators can be represented with finite Boolean transition systems. Unbounded operators require a specific symbolic representation.
Until now, the clock calculus on a CCSL specification was performed step by step up to a predefined number of steps. This work is an attempt to support exhaustive analysis of CCSL specifications. When CCSL operators are represented as transition systems, their composition is the synchronized product of the transition systems. However, this causes termination problems when the transition systems have an infinite number of states. In this paper, an algorithm for the parallel execution of automata representing CCSL operators is proposed. It has been implemented in a prototype tool. This algorithm supports CCSL unbounded operators. The infinite data structure is unfolded on demand using a lazy evaluation technique. This is a significant evolution on previous verification techniques for CCSL [5,6], which only considered a subset of operators that are bounded a priori.

2 Contribution

The main contribution is to propose an encoding based on lazy evaluation to represent CCSL unbounded operators. The second contribution is to propose an algorithm to build the synchronized product of such automata. The (semi)algorithm terminates when the composition of unbounded automata becomes bounded.

In this work, the main operators of the clock constraint language were considered. For each basic expression and each relation of the kernel language, a transition system is proposed. Each transition is labeled by the set of clocks that must tick for the transition to be taken. Clocks can stall, tick or be dead. When a clock is dead, it cannot tick anymore. A path in the automaton is then an infinite word on the alphabet of powersets of clock names.

The automata representing the unbounded CCSL operators consist of an infinite number of states and therefore transitions (even though each state has a finite number of outgoing transitions). For those operators, the lazy evaluation technique was applied.
It allows postponing the construction of a state of an unbounded automaton to the moment when it is actually needed. In very frequent cases, the specification becomes bounded and the intentional infinite representation is never actually expanded.

On these transition systems, we apply the classical synchronized product of transition systems [7]. In the worst case (when automata are independent) the composition is very costly, exponential in the number of clocks. In some (frequent) cases, the cost of composition is much better, even though we have not yet identified the exact cases where the composition is tractable.

3 Related work and inspirations

3.1 Timed automata

The formalism of timed automata [8] has been designed to allow the specification and verification of real-time systems. It extends the concept of finite ω-automata by establishing time constraints. One of the main notions of the timed automata theory is a clock (not to be confused with the notion of clocks introduced in CCSL). In a timed transition table the selection of the next state depends on an input symbol and the time reading of the symbol. For this purpose each transition table is associated with a set of real-valued clocks. A clock can be set to zero simultaneously with the execution of one of the transitions. At any instant the values of such clocks are equal to the time elapsed since the last time they were reset. A transition can be executed only if the value of the clock satisfies the constraint associated with this transition. Therefore, in the theory of timed automata a clock is an entity that measures the time elapsed since the last execution of the transition that reset the clock to zero.

To answer the question of whether this formalism can be applied to represent the basic constructions of the CCSL language, the definition of a clock in the CCSL time model must be considered.
In terms of the CCSL time model a clock is a set of ordered instants. Each clock has a lifetime limited by birth and death instants. Formally, a clock $c$ is a tuple $(I_c, \prec_c, c^{\uparrow}, c^{\downarrow}, \equiv_{c^{\downarrow}})$ where $I_c$ is a sequence of instants (it can be infinite), $c^{\uparrow}$ and $c^{\downarrow}$ are the birth and death instants respectively such that $I_c \cap \{c^{\uparrow}, c^{\downarrow}\} = \emptyset$, $\equiv_{c^{\downarrow}}$ is a coincidence relation and $\prec_c$ is an order relation on $I_c \cup \{c^{\uparrow}, c^{\downarrow}\}$. All instants of a clock are strictly ordered, the birth instant precedes all the other instants of the clock and every instant precedes the death. If the set of instants $I_c$ is infinite then the death instant is not necessary. $I_c$ represents the occurrences or ticks of the clock $c$.

Thus we can see that the notions of clock in the timed automata formal model and in the Clock Constraint Specification Language are radically different. In timed automata, clocks capture physical real-valued time properties, whose value is within a dense time interval. All time clocks evolve at the same rate (without drift). In CCSL, clocks represent logical time properties. The unbounded nature precisely comes from the relative (unbounded) drifts between the clocks that evolve at their own independent rhythm.

3.2 Synchronous Data Flow, Marked Graphs

Synchronous Data Flow (SDF) [9, 10] is a special case of the dataflow model of computation. Its main characteristic is that the flow of control is completely predictable at compile time. The main components of SDF are the actors, tokens, and arcs. Production and consumption of tokens by actors allow modeling of relative rates of events. In synchronous dataflow the numbers of consumed and produced tokens are constant throughout the execution. To avoid the overflow of resources and to maintain a balanced system, the scheduler must fire the source and destination components at different rates. Such systems can then be used to capture the relative drift between CCSL clocks.
Safety analysis on SDF graphs is a way to determine whether the system remains bounded or not. Such techniques could be used to study boundedness issues for CCSL specifications. However, this is not the concern of this paper. We assume that the composition is bounded and propose an algorithm to build the synchronized product.

3.3 Synchronized Product of Transition Systems

When CCSL operators are expressed as transition systems, their parallel composition simply is the synchronized product of the transition systems [7, 11, 12]. Synchronization vectors are used to decide which transition systems must synchronize on which transitions. Synchronization vectors allow the specification of anything from purely asynchronous compositions (where only one single system is fired at each step) to purely synchronous compositions (where all the automata must fire one transition at each step), and all the intermediate synchronization schemes.

The main difference here is that the number of states may be infinite and we use lazy evaluation to dynamically expand the states whenever they are required to build a new composite state. The composition algorithm terminates only when the synchronized product becomes finite. In [13], there was an initial attempt to build the synchronized product of unbounded CCSL operators. In that work, the automata were folded using extended automata (with unbounded integer variables) rather than lazy evaluation. Therefore, the algorithm to compute the synchronized product was always guaranteed to terminate. However, deciding whether the result was finite or not would then require using integer linear programming techniques.

4 The Clock Constraint Specification Language

This section briefly introduces the logical time model of the Clock Constraint Specification Language (CCSL). A technical report [1] describes the syntax and the semantics of a kernel set of CCSL constraints.

A clock $c$ is a totally ordered set of instants, $I_c$. In the following, $i$ and $j$ are instants.
A time structure is a set of clocks $C$ and a set of relations on instants $I = \bigcup_{c \in C} I_c$. CCSL considers two kinds of relations: causal and temporal ones. The basic causal relation is causality/dependency, a binary relation on $I$: $\preccurlyeq \subset I \times I$. $i \preccurlyeq j$ means $i$ causes $j$ or $j$ depends on $i$. $\preccurlyeq$ is a pre-order on $I$, i.e., it is reflexive and transitive. The basic temporal relations are precedence ($\prec$), coincidence ($\equiv$), and exclusion ($\#$), three binary relations on $I$. For any pair of instants $(i, j) \in I \times I$ in a time structure, $i \prec j$ means that the only acceptable execution traces are those where $i$ occurs strictly before $j$ ($i$ precedes $j$). $\prec$ is transitive and asymmetric (irreflexive and antisymmetric). $i \equiv j$ imposes instants $i$ and $j$ to be coincident, i.e., they must occur at the same execution step, both of them or none of them. $\equiv$ is an equivalence relation, i.e., it is reflexive, symmetric and transitive. $i \# j$ forbids the coincidence of the two instants, i.e., they cannot occur at the same execution step. $\#$ is irreflexive and symmetric. A consistency rule is enforced between causal and temporal relations: $i \preccurlyeq j$ can be refined either as $i \prec j$ or $i \equiv j$, but $j$ can never precede $i$.

We consider here discrete sets of instants only, so that the instants of a clock can be indexed by natural numbers. For a clock $c \in C$, and for any $k \in \mathbb{N}_{>0}$, $c[k]$ denotes the $k$-th instant of $c$.

Specifying a full time structure using only instant relations is not realistic since clocks are usually infinite sets of instants. Thus, an enumerative specification of instant relations is forbidden. The Clock Constraint Specification Language (CCSL) defines a set of time patterns between clocks that apply to infinitely many instant relations.

4.1 The kernel relations

Table 1 gives a full list of the basic clock relations provided in the CCSL kernel. For each of them the automaton was built.
It is supposed that the automaton can fire only if one of the clocks participating in the CCSL operator ticks.

Table 1. Basic relations defined in the CCSL kernel (a and b are clocks, not instants).

{| class="wikitable"
! Ref !! Name !! Kind of relation !! Notation
|-
| R1 || Subclocking || Synchronous || $a \subset b$
|-
| R2 || Coincidence || Synchronous || $a = b$
|-
| R3 || Precedence || Asynchronous, unbounded || $a \preccurlyeq b$
|-
| R4 || Strict Precedence || Asynchronous, unbounded || $a \prec b$
|-
| R5 || Exclusion || Asynchronous || $a \# b$
|}

Coincidence. According to this relation the clocks $a$ and $b$ always tick simultaneously ($a = b$). It is defined as $(\forall k \in \mathbb{N}^{\star})(a[k] \equiv b[k])$.

Subclocking. $a \subset b$ defines $a$ as being a subclock of its superclock $b$. Every instant of the subclock occurs synchronously with one of the instants of the superclock: $(\forall k \in \mathbb{N}^{\star})(\exists i \in \mathbb{N}^{\star})(a[k] \equiv b[i])$.

Exclusion $a \# b$. The clocks connected with this relation cannot have coincident instants: $(\forall k \in \mathbb{N}^{\star})(\forall i \in \mathbb{N}^{\star})(a[k] \# b[i])$.

Precedence $a \preccurlyeq b$ is an index-dependent relation. This operator is unbounded. Every instant of clock $a$ has to precede (or coincide with) the instant of clock $b$ with the same index: $(\forall k \in \mathbb{N}^{\star})(a[k] \preccurlyeq b[k])$.

Strict Precedence $a \prec b$. This relation is a stricter version of the previous one in the sense that the instants of the clocks $a$, $b$ with the same indices cannot be coincident: $(\forall k \in \mathbb{N}^{\star})(a[k] \prec b[k])$.

4.2 The kernel expressions

A CCSL specification consists of clock declarations and conjunctions of clock relations between clock expressions. A clock expression defines a set of new clocks from existing ones. Most expressions deterministically define one single clock. Table 2 gives a list of the CCSL kernel expressions.

Table 2. Basic expressions defined in the CCSL kernel.
{| class="wikitable"
! Ref !! Name !! Kind of expression !! Notation !! Textual form
|-
| E1 || Inf || Mixed, unbounded || $a \wedge b$ || a glb b
|-
| E2 || Sup || Mixed, unbounded || $a \vee b$ || a lub b
|-
| E3 || Defer || Mixed || a ( ns ) b || a deferred b for ns
|-
| E4 || Sampling || Mixed || $a \mapsto b$ || a sampling b
|-
| E5 || Strict sampling || Mixed || $a \to b$ || a strictlySampled b
|-
| E6 || Intersection || Synchronous || $a \ast b$ || a clockInter b
|-
| E7 || Union || Synchronous || $a + b$ || a clockUnion b
|-
| E8 || Concatenation || Synchronous || $a \bullet b$ || a followedBy b
|-
| E9 || Waiting || Synchronous || a fn b || a wait n b
|-
| E10 || Preemption (UpTo) || Synchronous || a b || a upto b
|}

Sampling $a \mapsto b$. The sampling expression ticks in coincidence with the tick of the base clock immediately following a tick of the trigger clock and after it dies. In the considered case, the trigger clock is $b$ and the base clock is $a$. The textual syntax of this expression is c = a sampling b. In Figure 1 the automaton is given, where input symbol $c$ is equal to the result clock of the expression. The notation $\{a, b, c\}$ denotes that the automaton remains in state $s_2$ if $a$, $b$ and $c$ tick all together simultaneously. If $b$ and $c$ tick simultaneously without $a$ then the automaton goes back to state $s_1$. If $a$ ticks alone, it stays in $s_2$. All other cases are forbidden by the semantics of the operator.

[Figure 1: automaton with states $s_1$ and $s_2$ and transition labels {a}, {b}, {a, b}, {b, c}, {a, b, c}]

Fig. 1. The automaton for sampling expression

Strict Sampling $a \to b$. The expression is a strict version of the previous one where $c$ is emitted when the automaton is in state $s_1$, and $a$ and $b$ tick simultaneously.

Waiting a fn b. The resulting clock ticks only once, after a given number of ticks of the base clock (the number is a parameter), and then the resulting clock dies. c = a wait n b, where $n$ is a given parameter (it is a natural number).

Preemption (UpTo) a b. The resulting clock ticks in coincidence with $a$; it dies as soon as $b$ starts to tick: c = a upto b.

Union. This expression is non-terminating and index-independent.
Its result is a clock whose set of instants is the union of the instant sets of the parameter clocks that participate in the expression: $c = a + b$.

Intersection. The result of this index-independent expression is the clock which ticks each time the parameter clocks tick simultaneously: $c = a \ast b$.

Concatenation $a \bullet b$. The expression is terminating. The resulting clock ticks in coincidence with the first parameter clock $a$. After the death of $a$ it starts to tick in coincidence with the second parameter clock $b$. It should be noted that this expression is valid only if the first clock eventually dies, i.e. $a^{\downarrow}$ is specified.

Defer (Delay) a ( ns ) b. The parameters of the expression are $a$ (the base clock), $b$ (the delay clock) and $ns$, which represents a sequence of elements from $\mathbb{N}_{>0}$. The sequence of natural numbers can have an infinite part. Let $ns[i]$ be the $i$-th element of the sequence. We assume that if $i > 0$ then $ns$ has an infinite part, if $0 \le i < p$ there are $p$ elements in the sequence. Every tick of the base clock starts up the counter for the respective element of the sequence. For every tick of the delay clock the relative counter is decreased. When the counter reaches 1 the respective instant of clock $b$ occurs. The textual form of the expression is c = a deferred b for ns.

Sup $a \vee b$. The expression is index-dependent. The expression $a \vee b$ defines a clock that is slower than both $a$ and $b$ and whose $k$-th tick is coincident with the later of the $k$-th ticks of $a$ and $b$. The formal definition is: $(a \preccurlyeq (a \vee b))\,(b \preccurlyeq (a \vee b))\,(\forall c \in C) : (a \preccurlyeq c) \,\&\, (b \preccurlyeq c) \Rightarrow ((a \vee b) \preccurlyeq c)$. This is a typical example of an unbounded transition system (with an infinite number of states).

[Figure 2: automaton with states ..., $s_{-1}$, $s_0$, $s_1$, ... and transition labels {a}, {b}, {a, b}, {a, c}, {b, c}, {a, b, c}]

Fig. 2. The automaton for sup expression

Inf $a \wedge b$. This is index-dependent and unbounded.
It is the dual of the previous one. The result of the expression is the slowest of faster clocks. It means that the defined clock is faster than both $a$ and $b$; the $k$-th tick of this clock occurs in coincidence with the earlier of the $k$-th ticks of both $a$ and $b$. The formal definition is: $((a \wedge b) \preccurlyeq a)\,((a \wedge b) \preccurlyeq b)\,(\forall c \in C) : (c \preccurlyeq a) \,\&\, (c \preccurlyeq b) \Rightarrow (c \preccurlyeq (a \wedge b))$.

4.3 Unbounded CCSL operators

Lazy evaluation or call-by-need is an evaluation strategy that delays the evaluation of an expression until its value is actually needed. This approach allows construction of potentially infinite data structures and avoids repeated evaluations. To construct the algorithm of the parallel execution of several automata, it is necessary to have the possibility to work with infinite data structures (transition systems with an infinite number of states). Lazy evaluation provides the apparatus for this task. CCSL has four basic unbounded operators which can be represented as infinite automata: the precedence relation, the strict precedence relation, the inf expression (the slowest of faster clocks) and the sup expression (the fastest of slower clocks).

Example 1. Let us consider as an example the automaton for the strict precedence relation (Fig. 3).

Fig. 3. The automaton for strict precedence relation

If we only consider the possible outgoing transitions of each state, we can distinguish the initial state (0 on Fig. 3) from the other ones. In the initial state, only $a$ can tick and must tick alone. In the other states, if $a$ ticks alone, we must progress to the right (increasing an unbounded counter); if $b$ ticks alone, we must go back to the left (decreasing the counter). If $a$ and $b$ tick simultaneously, we remain in the same state. We assume that each state has a default transition to itself, when nothing happens.
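The behaviour described in Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' prototype: the state is taken to be the unbounded counter itself, clock death is ignored, and the names `strict_precedence`, `alternation`, `subsets` and `lazy_product` are hypothetical. The second automaton (`alternation`) is an invented finite constraint, added only to show a case in which the product with an unbounded automaton has finitely many reachable states, so that a worklist exploration terminates.

```python
from collections import deque
from itertools import combinations


def strict_precedence(state):
    """Transitions of one state of the automaton for `a` strictly precedes `b`.

    `state` is the unbounded counter of Example 1 (ticks of `a` not yet
    matched by `b`). Moves are computed on demand, so the infinite automaton
    is never built as a whole.
    """
    moves = {frozenset(): state, frozenset({"a"}): state + 1}
    if state > 0:  # b may tick only while a is strictly ahead
        moves[frozenset({"b"})] = state - 1
        moves[frozenset({"a", "b"})] = state
    return moves


def alternation(state):
    """Hypothetical two-state automaton: a and b must alternate, a first."""
    if state == 0:
        return {frozenset(): 0, frozenset({"a"}): 1}
    return {frozenset(): 1, frozenset({"b"}): 0}


def subsets(clocks):
    """Enumerate every candidate label, i.e. every subset of the clocks."""
    items = sorted(clocks)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def lazy_product(automata, clocks, initial, max_states=1000):
    """Explore the reachable part of the synchronized product with a worklist.

    `automata` is a list of (transition_function, own_clocks) pairs. A label
    is accepted when every automaton accepts its projection on its own clocks.
    """
    seen = {initial}
    todo = deque([initial])
    transitions = {}
    while todo:
        current = todo.popleft()
        for label in subsets(clocks):
            target = []
            for (moves_of, own_clocks), local_state in zip(automata, current):
                moves = moves_of(local_state)  # state expanded only now
                projection = label & own_clocks
                if projection not in moves:
                    break  # this label is not consistent for all automata
                target.append(moves[projection])
            else:
                target = tuple(target)
                transitions[(current, label)] = target
                if target not in seen:
                    if len(seen) >= max_states:
                        raise RuntimeError("product looks unbounded; giving up")
                    seen.add(target)
                    todo.append(target)
    return seen, transitions


AB = frozenset({"a", "b"})
states, transitions = lazy_product(
    [(strict_precedence, AB), (alternation, AB)], AB, (0, 0)
)
print(sorted(states))  # [(0, 0), (1, 1)]
```

Explored alone, `strict_precedence` never reaches a fixed point, which is why the sketch carries an explicit `max_states` guard; this mirrors the semi-algorithm character of the composition.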
5 The synchronized product: an algorithm

We assume that the input for the algorithm is a given finite set of automata; each of them can be either finite or infinite. The output is the automaton that is the composition of the input ones. We denote the resulting automaton by the tuple $R$. Let us introduce the set $St$, which is the set of all unconsidered states of the resulting automaton.

At the beginning of the algorithm execution, the initial states of the input automata are considered as the current ones. The first unconsidered state of the resulting automaton is the state formed from the current ones. For each considered current state of the input automata the set of available transitions from the given state is calculated. For all input symbols of each of the calculated transitions the consistent solution is computed, which is the set such that every original symbol is its subset. If the solution exists, the new state is calculated from the states of the input automata to which we can go on the appropriate input symbols that are subsets of the found solution. If the computed new state does not belong to $St$, it is added to the set of unconsidered states. The resulting automaton is completed by the new state and the appropriate transition, which has the input symbol that is equal to the respective solution. If the solution cannot be found, then we take a state from $St$, and in the input automata the corresponding states are considered as current. The process is repeated while the set $St$ is not empty.

5.1 The Formal Definitions

We introduce the formal definitions that are used to describe the algorithm.

$C$ is the (finite) set of clocks and $|C|$ is its cardinality;
$state : C \to StateDomain = \{STALL, TICK, DEAD\}$ denotes the state of a clock.
$A = \{A_i\}$: the ordered set of input automata for the composition.
$A_i = (C_i; S_i; I_i; moveOn_i; s_0^i \in S_i; cur_i \in S_i; availTrans_i)$, where
$C_i \subset C$: the ordered set of clocks of the $i$-th automaton;
$S_i$: the set of states of the $i$-th automaton;
$s_0^i \in S_i$: the initial state of the automaton;
$moveOn_i : S_i \times 2^{C_i} \to S_i$ is the transition function;
$cur_i \in S_i$: the current considered state;
$availTrans_i : S_i \to 2^{2^{C_i}}$: gives a set of available configurations for a given state.

The resulting composite automaton can be defined as follows: $R = (C; S_R; moveOn_R; s_{0R} \in S_R; cur_R \in S_R; availTrans_R)$, such that:
$C$: the set of clocks of the composite automaton is equal to the set of global clocks;
$S_R = \{s_R : s_R = (s_1, ..., s_{|A|}), s_i \in S_i, \forall i = 1...|A|\}$: each element of the set of states of the resulting automaton consists of the states of the input automata; the $i$-th element of a composite state corresponds to the automaton with the same number;
$moveOn_R : S_R \times 2^{C} \to S_R$ is the transition function for the resulting automaton;
$s_{0R} \in S_R$: the initial state;
$cur_R \in S_R$: the current considered state;
$availTrans_R : S_R \to 2^{2^{C}}$: this function returns the set of input symbols for the available transitions of the composite automaton.

The set of found solutions is represented by the set $SOL = \{sol : sol = (st_1, \ldots, st_{|C|}), st_i \in \{STALL, TICK, DEAD\}, \forall i = 1...|C|\}$.
$St = \{s_R : s_R \in S_R\}$ is the set of unconsidered states of the resulting automaton.
$Get : Set \to element$, $element \in Set$: the function that returns an element of the given set.
$Register : C \times \{1..|A|\} \to 2^{C}$. This function returns an input symbol for the $i$-th automaton based on the states of the global set of clocks.
$index : 2^{C} \times C \to \mathbb{N}$, the position of a clock in an ordered set of clocks.

5.2 The Composition Algorithm

Global variables.

    indexTable:=0: indexTable ∈ N;
    array StateDomain temp[]: ∀i ∈ {0, ..., |C|-1}, temp[i] ∈ StateDomain;
    array StateDomain cur_sol[]: ∀i ∈ {0, ..., |C|-1}, cur_sol[i] ∈ StateDomain;
    array StateDomain OptionsTable[][]: ∀j ∈ {0, ..., |C|-1}, ∀i ∈ {0, ..., 3^|C|-1}, OptionsTable[i][j] ∈ StateDomain;
    array Boolean SolutionsTable[][]: ∀j ∈ {0, ..., |A|-1}, ∀i ∈ {0, ..., 3^|C|-1}, SolutionsTable[i][j] ∈ {true, false};
    cur_solution ∈ SOL
    new_stR ∈ SR

The main purpose of the following function is to build the composite automaton from the given set of the input ones. It uses three other functions: buildSolutions(), getSolutions() and buildOptionsTable().

    function composition(){
        // the initial state is the cartesian product of all the initial states
    1.  s0R := s0^0 × ... × s0^(|A|−1);
    2.  St := {s0R};
    3.  while(St ≠ ∅){
    4.      curR := GetElement(N);
    5.      SOL := buildSolutions(curR);
    6.      while(SOL ≠ ∅){ // for each solution
    7.          cur_solution := Get(SOL);
                // set clocks to the appropriate states
    8.          for(k:=0; k<|C|; k:=k+1){ C[k] := cur_solution[k]; }
                // form a composite state from the states of the input automata in which
                // it is possible to go with the respective input sets
    9.          for(i:=0; i<|A|; i:=i+1){ new_st_i := moveOn(cur_i, Register(C,i)); }
                // include the new created state if it is not yet included
    10.         if(new_stR ∉ SR){
    11.             SR := SR ∪ new_stR;
    12.             IR := IR ∪ cur_solution;
    13.         }
                // include the created state in the set of unconsidered states
    14.         if(new_stR ∉ St){ St := St ∪ new_stR; }
                // set the current states to the previous positions
    15.         for(i:=0; i<|A|; i:=i+1){ cur_i := curR[i]; }
    16.         SOL := SOL \ cur_solution;
    17.     }
    18.     N := N \ curR;
    19. }
    }

The following function calculates the set of possible solutions. It builds a table of all options (OptionsTable) for the states of the clocks. The number of rows of this table is equal to 3^|C|. The base is equal to three because all three possible states are considered (STALL, TICK, DEAD).
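As a rough illustration of the size of this table, the enumeration of all clock-state options can be sketched in Python as follows. This is an assumption-laden sketch, not the authors' buildOptionsTable(): the helper name `build_options_table` and the representation of a row as a dictionary are invented here for readability.

```python
from itertools import product

# The three clock states named in the text.
STATE_DOMAIN = ("STALL", "TICK", "DEAD")


def build_options_table(clocks):
    """Return every assignment of a state to each clock, one row per assignment."""
    return [
        dict(zip(clocks, row))
        for row in product(STATE_DOMAIN, repeat=len(clocks))
    ]


table = build_options_table(["a", "b"])
print(len(table))  # 9, i.e. 3 ** 2 rows for two clocks
```

The exponential growth of the row count with the number of clocks is what makes the worst case of the composition costly.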
The number of columns is equal to the number of the clocks in the global set. Also, the table for finding solutions is defined (SolutionsTable). The number of rows is equal to 3^|C|, the number of columns corresponds to the number of input automata. If the input set defined in a row of OptionsTable is available for the input automaton, then the element of SolutionsTable situated in the same row as the input set is set to true. When the SolutionsTable is completed (all available input symbols of the input automata have been considered), the function getSolutions() is invoked to find the set of solutions SOL using the data of SolutionsTable.

    function buildSolutions(sR){
    20. buildOptionsTable(0);
        // initially there are no solutions
    21. for(r:=0; r<3^|C|; r:=r+1){
    22.     for(i:=0; i<|A|; i:=i+1){ SolutionsTable[r][i]:=false; }
    23. }
    24. for(i:=0; i<|A|; i:=i+1){
            // receiving the set of input sets for all available transitions
            // of the current state of the appropriate i-th automaton;
            // n is a number of available transitions
    25.     I_i^n := availTrans_i(sR[i]);
            // assignment of the value true to an element of SolutionsTable
    26.     for(k:=0; k

[…] 0, let us denote the ratio between $B_{n,k}(m_2)$ and $B_{n,k}(m_1)$ by a new function:

$$t_{n,k}(m_1, m_2) = \frac{B_{n,k}(m_2)}{B_{n,k}(m_1)}.$$

Thus, by direct calculations we obtain the following formula for mutual information:

$$I(X^{(n)}; \Theta) = \frac{2\ln 2 - 1}{2\ln 2} - \frac{2\ln 2 - 1}{2\ln 2} \sum_{k \in A_n} B_{n,k}(m_1) - \frac{1}{2} \sum_{k \notin A_n} B_{n,k}(m_1) \cdot \frac{t_{n,k}(m_1, m_2)^2 \cdot \log\dfrac{t_{n,k}(m_1, m_2)}{t_{n,k}(m_1, m_2) + 1} + \log\left(t_{n,k}(m_1, m_2) + 1\right)}{1 - t_{n,k}(m_1, m_2)}. \tag{5}$$

Let $f(t)$ be a function defined as follows:

$$f(t) = \frac{t^2 \cdot \log\dfrac{t}{t+1} + \log(t + 1)}{1 - t}.$$

Now it is easy to see that formula (5) can be written as

$$I(X^{(n)}; \Theta) = \frac{2\ln 2 - 1}{2\ln 2} - \left( \frac{2\ln 2 - 1}{2\ln 2} \sum_{k \in A_n} B_{n,k}(m_1) + \frac{1}{2} \sum_{k \notin A_n} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) \right). \tag{6}$$

We claim that in the considered case ($0 < m_1 < m_2 < 1$) mutual information is less than $\frac{2\ln 2 - 1}{2\ln 2}$.

To prove this fact it is enough to notice that the variable part of expression (6) is always negative. Indeed, we know that $B_{n,k}(m_1) > 0$ and $t_{n,k}(m_1, m_2)$ is a ratio of two positive values, so it is also positive. In addition, it is easy to show in the classical way that for all $t \in (0, 1) \cup (1, \infty)$ the function $f(t)$ is greater than zero.

Asymptotical Information Bound of Consecutive Qubit Binary Testing

Now we need to consider the special case in which $0 < m_1 < 1$, $m_2 = 1$. It is evident that the case when $m_1 = 0$, $0 < m_2 < 1$ is similar to the latter.

Suppose that $0 < m_1 < 1$, $m_2 = 1$. Thus, on the one hand, for all $k$ we have $B_{n,k}(m_1) > 0$. On the other hand, for all $k < n$ we have $B_{n,k}(m_2) = 0$, and only for $k = n$ we have $B_{n,k}(m_2) = 1$. It is easy to see that in this case we obtain

$$I(X^{(n)}; \Theta) = \frac{2\ln 2 - 1}{2\ln 2} + \frac{1}{2} \cdot \frac{B_{n,n}(m_2)^2 \log B_{n,n}(m_2) - (B_{n,n}(m_2)^2 - 1) \cdot \log\left(1 + B_{n,n}(m_2)\right)}{B_{n,n}(m_2) - 1} = \frac{2\ln 2 - 1}{2\ln 2} - \frac{1}{2} \cdot f(B_{n,n}(m_2)) < \frac{2\ln 2 - 1}{2\ln 2}.$$

Thus, now we know that

$$I(X^{(n)}; \Theta) \le \frac{2\ln 2 - 1}{2\ln 2},$$

and, in particular, equality holds if and only if the considered binary test is projective.

4 Asymptotic Properties of Consecutive Measurements

In the previous section we have considered the case when the given qubit binary test is a projective measurement. We have proved that only this type of measurement allows achieving the maximum of information about the initial state. Since the measurement is projective, repeating the measuring procedure does not provide any extra information. In addition, we have found the maximum value of the accessible information:

$$\max_{T \in M,\ n \in \mathbb{N}} \{I(X^{(n)}; \Theta)\} = \frac{2\ln 2 - 1}{2\ln 2}.$$

In this section we return to the general form of a qubit test, and we work with consecutive qubit testing. So, this time we investigate the dependence of the amount of information on $n$, the number of iterations.
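The positivity of $f$ and the value of the bound can be checked numerically. The following Python sketch is only a sanity check on a finite grid, not a proof; it assumes that $\log$ denotes the base-2 logarithm (which agrees with the factor $\ln(2)$ that appears when $f$ is rewritten with natural logarithms later in the text), and the names `f`, `BOUND` and `grid` are introduced here for illustration.

```python
from math import log, log2


def f(t):
    """f(t) = (t^2 * log2(t / (t + 1)) + log2(t + 1)) / (1 - t), for t > 0, t != 1."""
    return (t * t * log2(t / (t + 1)) + log2(t + 1)) / (1 - t)


# The claimed maximum of the mutual information, (2 ln 2 - 1) / (2 ln 2).
BOUND = (2 * log(2) - 1) / (2 * log(2))

# Sample points on both sides of the excluded value t = 1.
grid = [k / 100 for k in range(1, 100)] + [1 + k / 10 for k in range(1, 1000)]

print(all(0 < f(t) < 2 for t in grid))
print(round(BOUND, 6))
```

On this grid $f$ stays strictly between 0 and 2, and $f(t)/2$ approaches the bound as $t \to 1$, which is consistent with the coefficient of the sum over $A_n$ in (6).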
The objective of this section is to prove that the maximum of accessible information can be reached asymptotically by performing consecutive measurements using an arbitrary qubit binary test. More strictly, our aim is to prove the next theorem:

Theorem 1. Suppose we have a pure qubit state and we perform consecutive qubit binary testing using the given test $T = \{M_1, M_2\}$. Then for arbitrary $\varepsilon > 0$ there exists a corresponding number of iterations $n(\varepsilon)$ such that for all subsequent iterations ($n > n(\varepsilon)$) the following inequality holds:

$$\max_{T \in M,\ m \in \mathbb{N}} \{I(X^{(m)}; \Theta)\} - I(X^{(n)}; \Theta) < \varepsilon.$$

A. Varava and G. Zholtkevych

In other words, since the mutual information can be written as

$$I(X^{(n)}; \Theta) = \frac{2\ln 2 - 1}{2\ln 2} - \left( \frac{2\ln 2 - 1}{2\ln 2} \sum_{k \in A_n} B_{n,k}(m_1) + \frac{1}{2} \sum_{k \notin A_n} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) \right),$$

we need to find $n_1(\varepsilon)$ such that for all $n > n_1(\varepsilon)$

$$\sum_{k \notin A_n} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) < \frac{\varepsilon}{2}, \tag{7}$$

and $n_2(\varepsilon)$ such that for all $n > n_2(\varepsilon)$

$$\sum_{k \in A_n} B_{n,k}(m_1) < \frac{\varepsilon}{2}. \tag{8}$$

Therefore, for $n > n(\varepsilon) = \max\{n_1(\varepsilon), n_2(\varepsilon)\}$ both of these inequalities hold.

Let us fix a certain positive value of $\varepsilon$. At first we consider the left side of inequality (7). Let us divide the set of indices $k \notin A_n$ into two non-intersecting subsets:

$$\Gamma_n(m_1) = \left\{ k \notin A_n : \left| \frac{k}{n} - m_1 \right| < \tilde{\delta}(m_1, m_2) \right\}, \qquad \Delta_n(m_1) = \left\{ k \notin A_n : \left| \frac{k}{n} - m_1 \right| \ge \tilde{\delta}(m_1, m_2) \right\},$$

where $\tilde{\delta}(m_1, m_2)$ is a certain positive function.

It was demonstrated in [7] that for $0 < m_1 < 1$ and $\delta > 0$:

$$\sum_{k \in \Delta_n(m_1)} B_{n,k}(m_1) \le \frac{1}{4 n \delta^2}. \tag{9}$$

On the one hand, it is easy to see that $f(t)$ is a bounded function. Suppose that for all $t \in (0, 1) \cup (1, \infty)$ we have $f(t) < C$, where $C$ is a certain positive constant. Thus we have

$$\sum_{k \in \Delta_n(m_1)} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) \le \frac{C}{4 n \tilde{\delta}^2(m_1, m_2)}.$$

So, we can choose a value $n_{1,1}(\varepsilon)$ such that for all $n > n_{1,1}(\varepsilon)$

$$\sum_{k \in \Delta_n(m_1)} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) < \frac{\varepsilon}{4}.$$

On the other hand, we can see that when $k$ is close to $n \cdot m_1$, the value of $t_{n,k}(m_1, m_2)$ goes to zero as $n$ goes to infinity. Since $\lim_{t \to 0} f(t) = 0$, there exists a value $n_{1,2}(\varepsilon)$ such that for all $n > n_{1,2}(\varepsilon)$

$$\sum_{k \in \Gamma_n(m_1)} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) < \frac{\varepsilon}{4}.$$

Now let $n_1(\varepsilon)$ be the maximum of the values $n_{1,1}(\varepsilon)$ and $n_{1,2}(\varepsilon)$. Thus, for all $n > n_1(\varepsilon)$ inequality (7) holds.

Finally, let us consider inequality (8). Note that in the case $m_1 < m_2$ the set $A_n$ contains at most one element. Indeed, by construction $k \in A_n$ if and only if $B_{n,k}(m_1) = B_{n,k}(m_2)$. Solving this equation for the variable $k$, we have

$$k_0 = n \cdot \frac{\ln\left((1 - m_1)/(1 - m_2)\right)}{\ln\left(((1 - m_1) \cdot m_2)/((1 - m_2) \cdot m_1)\right)}.$$

If $n$, $m_1$ and $m_2$ are such that $k_0$ is an integer then $A_n = \{k_0\}$. If not, the set $A_n$ is empty. It is easy to show that $\lim_{n \to \infty} B_{n,k}(m_1) = 0$, so we can easily find $n_2(\varepsilon)$ such that for all $n > n_2(\varepsilon)$ inequality (8) holds.

Now let us build a rigorous proof of the statement under consideration using this heuristic argument. To do it, we first need several trivial propositions.

Proposition 1. The following statements are correct:
1. for all $x \in (0, 1)$ the inequality $x > \ln(x + 1)$ is true;
2. for all $x > 0$ the inequality $\ln(x/(x + 1)) < -1/(x + 1)$ is true too.

This proposition can be easily proved using the Taylor series expansion of $\ln(x)$ and Euler's transform applied to this expansion.

Proposition 2. Let $x, y \in (0, 1)$ and $x \ne y$. Then the following inequality holds:

$$\left(\frac{x}{y}\right)^{y} \left(\frac{1 - x}{1 - y}\right)^{1 - y} < 1.$$

The proof is omitted.

Now we can prove the theorem formulated above.

Proof (of Theorem 1). As we already know, the mutual information can be presented as

$$I(X^{(n)}; \Theta) = \frac{2\ln 2 - 1}{2\ln 2} - \left( \frac{2\ln 2 - 1}{2\ln 2} \sum_{k \in A_n} B_{n,k}(m_1) + \frac{1}{2} \sum_{k \notin A_n} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) \right).$$

We also know that

$$\max_{T \in M,\ n \in \mathbb{N}} \{I(X^{(n)}; \Theta)\} = \frac{2\ln 2 - 1}{2\ln 2}.$$

To prove the theorem it is enough to show that there exists $n(\varepsilon)$ such that for all $n > n(\varepsilon)$

$$\sum_{k \notin A_n} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) + \sum_{k \in A_n} B_{n,k}(m_1) < \varepsilon.$$

Consider an arbitrary $\varepsilon > 0$. Let us divide the set of indices $k \notin A_n$ into subsets $\Gamma_n(m_1)$ and $\Delta_n(m_1)$ in the following way:

$$\Gamma_n(m_1) = \left\{ k \notin A_n : \left| \frac{k}{n} - m_1 \right| < \tilde{\delta}(m_1, m_2) \right\}, \qquad \Delta_n(m_1) = \left\{ k \notin A_n : \left| \frac{k}{n} - m_1 \right| \ge \tilde{\delta}(m_1, m_2) \right\},$$

where $\tilde{\delta}(m_1, m_2)$ is a certain positive function of $m_1$ and $m_2$ defined further. Our aim is to prove that there exists $n(\varepsilon)$ such that for all $n > n(\varepsilon)$:

$$\sum_{k \in \Delta_n(m_1)} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) < \frac{\varepsilon}{4}, \tag{10}$$

$$\sum_{k \in \Gamma_n(m_1)} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) < \frac{\varepsilon}{4}, \tag{11}$$

$$\sum_{k \in A_n} B_{n,k}(m_1) < \frac{\varepsilon}{2}. \tag{12}$$

Firstly, let us consider inequality (10). We have already mentioned that for all $t \in (0, 1) \cup (1, \infty)$ we have $f(t) \ge 0$, and it is easy to see that for the considered values of $t$ we have $f(t) < 2$. So, using the above mentioned property (9) of Bernstein basis polynomials, we have:

$$\sum_{k \in \Delta_n(m_1)} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) \le \frac{1}{2 n \tilde{\delta}^2(m_1, m_2)}.$$

Let $n_{1,1}(\varepsilon)$ be sufficiently large. Then for all $n > n_{1,1}(\varepsilon)$ inequality (10) holds.

Secondly, let us consider inequality (11). On the one hand, it is not hard to find $\delta(\varepsilon)$ such that

$$\forall t \in (0, \delta(\varepsilon)) : \quad f(t) = \frac{t^2 \cdot \log\dfrac{t}{t+1} + \log(t + 1)}{1 - t} < \frac{\varepsilon}{4}.$$

On the other hand, for sufficiently large values of $n$, for all $k \in \Gamma_n(m_1)$ we have $t_{n,k}(m_1, m_2) < \delta(\varepsilon)$. It follows that for large values of $n$ we obtain

$$\sum_{k \in \Gamma_n(m_1)} B_{n,k}(m_1) \cdot f(t_{n,k}(m_1, m_2)) < \frac{\varepsilon}{4} \cdot \sum_{k=0}^{n} B_{n,k}(m_1) = \frac{\varepsilon}{4}.$$

Let

$$\delta(\varepsilon) = \min\left\{ \frac{\sqrt{1 + (\varepsilon/2 \cdot \ln(2))^2} - 1}{\varepsilon/2 \cdot \ln(2)} ;\ \frac{1}{2} \right\}.$$

Then for $t \in (0, \delta(\varepsilon))$ we have $t < \frac{\varepsilon \cdot \ln(2)}{4} (1 - t^2)$. Since $0 < t \le \frac{1}{2} < 1$, we obtain

$$\frac{t - \dfrac{t^2}{t+1}}{(1 - t) \cdot \ln(2)} = \frac{t}{(1 - t^2) \cdot \ln(2)} < \frac{\varepsilon}{4}.$$

If we combine this with Proposition 1, then for all $t \in (0, \delta(\varepsilon))$ we have

$$f(t) = \frac{t^2 \cdot \ln\dfrac{t}{t+1} + \ln(t + 1)}{(1 - t) \cdot \ln(2)} < \frac{t + t^2 \cdot \left(-\dfrac{1}{t+1}\right)}{(1 - t) \cdot \ln(2)} = \frac{t - \dfrac{t^2}{t+1}}{(1 - t) \cdot \ln(2)} < \frac{\varepsilon}{4}.$$

Now we need to find $n_{1,2}(\varepsilon)$ such that for all $k \in \Gamma_n(m_1)$: $t_{n,k}(m_1, m_2) < \delta(\varepsilon)$. By definition,

$$t_{n,k}(m_1, m_2) = \left(\frac{m_2}{m_1}\right)^{k} \cdot \left(\frac{1 - m_2}{1 - m_1}\right)^{n-k}.$$

Since $\frac{m_2}{m_1} > 1$, $t_{n,k}(m_1, m_2)$ strictly increases with respect to $k$. So, if $\left| \frac{k}{n} - m_1 \right| < \tilde{\delta}(m_1, m_2)$ then

$$t_{n,k}(m_1, m_2) < \left( \left(\frac{m_2}{m_1}\right)^{m_1 + \tilde{\delta}(m_1, m_2)} \cdot \left(\frac{1 - m_2}{1 - m_1}\right)^{1 - m_1 - \tilde{\delta}(m_1, m_2)} \right)^{n}.$$

Consider the right side of this inequality. Note that

$$\left(\frac{m_2}{m_1}\right)^{m_1 + \tilde{\delta}(m_1, m_2)} \cdot \left(\frac{1 - m_2}{1 - m_1}\right)^{1 - m_1 - \tilde{\delta}(m_1, m_2)}$$

strictly increases with respect to $\tilde{\delta}(m_1, m_2)$. It is equal to 1 when $\tilde{\delta}(m_1, m_2) = \tilde{\delta}^{*}(m_1, m_2)$, where

$$\tilde{\delta}^{*}(m_1, m_2) = \frac{\ln\left((1 - m_2)/(1 - m_1)\right)}{\ln\left((1 - m_2)/(1 - m_1) \cdot m_1/m_2\right)} - m_1.$$

According to Proposition 2,

$$\left(\frac{m_2}{m_1}\right)^{m_1} \cdot \left(\frac{1 - m_2}{1 - m_1}\right)^{1 - m_1} < 1,$$

so we see that for $\tilde{\delta}(m_1, m_2) \in (0, \tilde{\delta}^{*}(m_1, m_2))$ we have

$$\left(\frac{m_2}{m_1}\right)^{m_1 + \tilde{\delta}(m_1, m_2)} \cdot \left(\frac{1 - m_2}{1 - m_1}\right)^{1 - m_1 - \tilde{\delta}(m_1, m_2)} < 1.$$

Let $\tilde{\delta}(m_1, m_2) = \frac{\tilde{\delta}^{*}(m_1, m_2)}{2}$. Now it is easy to choose $n_{1,2}(\varepsilon)$ such that for all $n > n_{1,2}(\varepsilon)$ inequality (11) holds.

Finally, let us find $n_2(\varepsilon)$ such that for $n > n_2(\varepsilon)$ condition (12) is satisfied. We have already seen that $|A_n| \le 1$, and equality holds when

$$k_0 = n \cdot \frac{\ln\left((1 - m_1)/(1 - m_2)\right)}{\ln\left(((1 - m_1) \cdot m_2)/((1 - m_2) \cdot m_1)\right)}$$

is an integer. Let us denote the right side as $n \cdot c(m_1, m_2)$ and write $k_0$ as $k_0 = n \cdot c(m_1, m_2)$.

Referring to the standard way of proving Stirling's formula, we can write the following inequality:

$$\sqrt{2 \pi n} \cdot \left(\frac{n}{e}\right)^{n} \le n! \le e \cdot \sqrt{n} \cdot \left(\frac{n}{e}\right)^{n}.$$

It follows that

$$\binom{n}{k} \le \frac{e}{2\pi} \cdot \sqrt{\frac{n}{k(n - k)}} \cdot \left(\frac{n}{k}\right)^{k} \cdot \left(\frac{n}{n - k}\right)^{n-k}.$$

So, it is now easy to see that

$$B_{n,k}(m_1) \le \frac{e}{2\pi} \cdot \sqrt{\frac{n}{k(n - k)}} \cdot \left(\frac{n \cdot m_1}{k}\right)^{k} \cdot \left(\frac{n \cdot (1 - m_1)}{n - k}\right)^{n-k}.$$

Substituting $k_0 = n \cdot c(m_1, m_2)$ for $k$ in the last inequality, we get

$$B_{n,k_0}(m_1) \le \frac{e}{2\pi} \cdot \sqrt{\frac{1}{n \cdot c(m_1, m_2)(1 - c(m_1, m_2))}} \cdot \left( \left(\frac{m_1}{c(m_1, m_2)}\right)^{c(m_1, m_2)} \cdot \left(\frac{1 - m_1}{1 - c(m_1, m_2)}\right)^{1 - c(m_1, m_2)} \right)^{n}. \tag{13}$$

It follows from Proposition 2 that

$$\left(\frac{m_1}{c(m_1, m_2)}\right)^{c(m_1, m_2)} \cdot \left(\frac{1 - m_1}{1 - c(m_1, m_2)}\right)^{1 - c(m_1, m_2)} < 1.$$
c(m1 , m2 ) 1 − c(m1 , m2 ) Now using (13) it is not hard to put n2 (ε) such great that for n > n2 (ε) inequality (12) is held. Finally, let n(ε) = max{n1,1 (ε), n1,2 (ε), n2 (ε)} . Now for all n > n(ε) max {I(X(m) ; Θ)} − I(X(n) ; Θ) < ε . T∈M, m∈N The theorem is proved. t u Asymptotical Information Bound of Consecutive Qubit Binary Testing 177 5 Conclusions In the paper the problem of obtaining classical information about the pure qubit state using a single qubit binary test has been considered. It has been demon- strated that the maximum of information is reached if and only if the using measurement is projective. The maximum value of information has been calcu- lated: n o 2 ln 2 − 1 max I(X(n) ; Θ) = . T∈M, n∈N 2 ln 2 It follows, in particular, that to distinguish two arbitrary pure qubit states using a single binary test it is necessary to have at least four pairs of qubits prepared in the considered states. It has been shown that the maximum of reachable information can be at- tained asymptotically using an arbitrary consecutive qubit binary test. Thus, if we have a certain measuring instrument performing a qubit binary test, we can obtain an amount of information arbitrary close to the maximum. As known [3, 6], Yu. Manin and R. Feynman proposed to use quantum sys- tems to simulate others quantum systems. The results obtained in the paper show that this idea should be refined: one should take into account all dependences between an input data, a behaviour of a simulating system, and a structure of an output data. Our further research will deal with generalizing results of the paper for the case of an n-level quantum system and a measurement with m outcomes. References 1. Cover, T.M., Thomas, J.A.: Elements of Information Theory. John Wiley & Sons, Inc. (1991) 2. Davies, E.B.: Information and Quantum Measurement. IEEE Trans. Inf. Theory IT-24, 596–599 (1978) 3. Feynman, R.P.: Simulating Physics with Computer. Int. J. Theor. Phys., 21, 467– 488 (1982) 4. 
Holevo, A.S.: Bounds for the Quantity of Information Transmitted by a Quantum Communication Channel (in Russian). Problemy Peredachi Informatsii, vol. 9, 3, 3–11 (1973)
5. Holevo, A.S.: Statistical Structure of Quantum Theory. Springer-Verlag, Berlin (2001)
6. Manin, Yu.I.: Mathematics as Metaphor: Selected Essays of Yuri I. Manin. AMS (2007)
7. Natanson, I.P.: Constructive Function Theory. Vol. 1, Ungar, New York (1964)
8. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information, 10th Anniversary edition. Cambridge University Press, Cambridge (2010)
9. Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana (1949)
10. Thawi, M., Zholtkevych, G.M.: About One Model of Consecutive Qubit Binary Testing. Bull. Khark. Nat. Univ., vol. 890, 13, 71–81 (2010)

A Data Transfer Model of Computer-Aided Vehicle Traffic Coordination System for the Rail Transport in Ukraine

Denis B. Arkatov

National Technical University "Kharkov Polytechnic Institute", Kharkov, Ukraine

denarkatov@gmail.com

Abstract. This paper gives a general layout of the operation of the subsystem used for rolling stock traffic coordination. The principle for selecting the information technologies used to implement the communication and data transfer system is described. The GSM and GPRS technologies used to transmit navigation information on vehicle location and to exchange information within the system are described. To provide the required level of transmission capacity through frequency reuse, the guard interval between the base transceiver stations has been calculated. To calculate the capacity of the GPRS channel and of information exchange via the Internet GPRS network, an algorithm for the simulation modeling of packet arrivals within a prescribed time interval has been constructed. Using this algorithm, an experiment was carried out for different packet arrival intensities.

Keywords. Traffic coordination, information technologies, transmission capacity, algorithm, simulation model, communication channel

Key terms. Development, Model, MathematicalModel

1 Introduction

Traffic control automation is a topical problem for the rail transport in Ukraine. An important part of this problem is the development of algorithms for the coordination and control of a large amount of rolling stock situated in the zone of railway traffic control points. Such algorithms are required to provide safe and regular traffic for the whole collection of trains and to make decisions that are optimal from the economic standpoint.

To provide safe traffic for rolling stock at the stage of departure or arrival, a certain (minimum permissible) time interval should be maintained; it is always taken into consideration when a train timetable is made. Nevertheless, the timetable is sometimes disturbed for many reasons, and situations arise in which rolling stock units are not separated by the safe time interval and groups of conflicting trains are formed. In the case of high traffic intensity the groups of conflicting trains at stations or at train overtaking points can be rather large. In addition, a disruption of the train timetable can result in subsequent traffic disturbances (train delays, train overtaking conflicts, etc.).

The above problems can be resolved through the automation of operative traffic control of rail transport. The experience gained by foreign countries [1,2] shows that an efficient solution of rail traffic control is only possible if the latest information technologies are used.
Using the potential of railway information systems for the benefit of the entire transport system of the country makes it possible to reduce the control costs required for the management and realization of domestic and international traffic. It also provides a considerable improvement in the quality of transport and logistics-related services and in traffic safety. Examples of computer-aided systems that can fully or partially solve the above problems are the European Train Control System (ETCS) [3], the American Positive Train Control (PTC) [4] and the ILSD-U system [5] used in Russia.

A specific feature of the developed computer-aided system for rolling stock (RS) traffic coordination is the introduction of contemporary satellite technologies and communication and data transfer systems into the routine work of rail transport in Ukraine. The information exchange technology can be described as follows. A GPS/GPRS modem receives the navigation information from satellites; then, using the GPRS technology, the coordinates, current speed and other information are transmitted via the mobile operator's server to the database server. The RS location data are transmitted to the workstation of the railway station dispatcher and to the workstation of the railway dispatcher.

The data should be processed in real time. An important aspect is that the time interval during which these data remain current should be taken into consideration. This considerably constrains not only the algorithm used for the solution of the coordination problem but also the data transfer technologies.

This paper consists of several sections. First, the technologies used for data collection, processing and transfer are described. Then we estimate the data transfer capacity of the channel and carry out the computations that support the obtained results.
2 Data Transfer, Processing, and Collection Technology

At present the railway is adopting a computer-aided system (CAS), which includes the following basic subsystems: an onboard intelligent system, which provides positioning, control and information support for the rolling stock; a surface intelligent system (SIS), which provides control and coordination in real time; a communication and data transfer system based on GSM mobile communication; and navigation charts that reflect the real railway infrastructure. The structures of the SIS and of the CAS as a whole are given in Fig. 1.

A special place in the SIS is occupied by the information system designed for rolling stock traffic coordination, which is the subject of this research. The navigation data collection system is used for the automatic identification of rolling stock location and also for securing the safety of traffic. Information about rolling stock location is required for the optimal use of the traffic and carrying capacity of railways and for the elimination of dangerous situations (dangerous passing approach, passing traffic lights with forbidden signals, siding motion, exceeding the allowable speed in places where it is restricted, etc.). The location of each RS is transmitted to the traffic control service via the navigation system for further data processing and traffic control. The basic principles of monitoring are traffic safety, timely fulfillment of the transportation schedule and cost minimization.

Fig. 1. CAS and SIS structures (recoverable block labels: system of connection and data transmission; special navigation maps)

In order to create an appropriate computer-aided communication and data transfer system for the rail transport, a communication standard has to be adopted that meets the operating requirements of the entire system, in addition to the use of satellite navigation.
The required communication system should provide high safety and reliability, rely on solutions proved in practice, and at the same time be an innovative, high-performance system. The costs required for the total system deployment should be reduced to a minimum.

Taking into consideration the European experience gained in the realization of similar data transfer systems, telecommunication technologies should be developed to meet the GSM-R standard (Global System for Mobile Communications – Railway) [6,7], whose engineering and economic efficiency has been proved not only by tests but also by real application in different railway systems of developed European countries.

However, at the present stage of development of information technologies in Ukraine we can conclude that the formulated problem can be solved using the GSM mobile communication standard, which is highly recommended. The GPRS technology is recommended for data transfer.

GPRS networks split the transmitted information into individual packets that are delivered from a transmitter to a receiver. If errors are detected, the affected packets can be transmitted once again. The original message is reassembled by the receiving party from the obtained packets. Data transfer in packet-switched networks differs from data transfer in circuit-switched networks in that the required channel resource is allocated only for the time of transmission of the respective information packets. The rest of the time it is at the disposal of the network. In the case of GSM/GPRS networks this allows one physical channel to be used for the transmission of packets to several subscribers, and several physical channels to be allocated simultaneously for the transmission of packets to one subscriber. The packets are transmitted independently of each other in different directions.
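The packet mechanism described above (split, detect errors, retransmit, reassemble) can be illustrated with a minimal sketch. It is not the GPRS protocol stack: the `Packet` structure, the function names and the use of CRC-32 as the error check are all assumptions made for illustration only.

```python
import zlib
from typing import List, NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet: sequence number, payload and a CRC-32 checksum."""
    seq: int
    payload: bytes
    crc: int


def split_into_packets(message: bytes, payload_size: int) -> List[Packet]:
    """Split a message into numbered packets of at most payload_size bytes."""
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    chunks = [message[i:i + payload_size]
              for i in range(0, len(message), payload_size)]
    return [Packet(seq, chunk, zlib.crc32(chunk))
            for seq, chunk in enumerate(chunks)]


def damaged(packets: List[Packet]) -> List[int]:
    """Return sequence numbers whose checksum fails (to be retransmitted)."""
    return [p.seq for p in packets if zlib.crc32(p.payload) != p.crc]


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the message; packets may arrive in any order."""
    if damaged(packets):
        raise ValueError("damaged packets must be retransmitted first")
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)
```

For example, a short navigation report can be split into 8-byte packets, delivered in reverse order and still be reassembled, while a packet with a flipped bit is reported by `damaged` so that it can be requested again.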
GPRS makes effective use of the channel resource. The same physical channel is provided to a group of subscribers to receive messages sent from the base transceiver station (BTS) to a mobile station (MS), and packets are transmitted as soon as they arrive, depending on the volume of information and the priority of subscribers. Each packet contains an identifier or address, which is used for delivery. A subscriber is continuously connected to the packet network, which provides a virtual channel for him. It becomes a real (physical) radio channel during packet transmission. The rest of the time this physical channel is used for the transmission of packets of other users.

Because the same channel resource is used by several subscribers and packets of different users can arrive simultaneously during a communication session, a queue of packets waiting for transmission can form, which results in a communication delay. The allowable packet delay is one of the attributes defining the quality of subscriber service.

The paper [8] defines three classes of delay depending on the delay norms and packet length. The top priority is assigned to Class 1, normal priority is given to Class 2, and Class 3 has the lowest priority. Delays have not been defined for Class 4, because packets of this class are served according to the "best effort" principle.

The packet arrival intensities can be selected on the basis of statistical research. Statistics show that the number of packets arriving per unit of time at the input of a GPRS switch can vary in a wide range, from hundreds of packets/s at input sites to several thousand packets/s at backbone sites.

To provide the appropriate level of data throughput we use the basic principle of the construction of cellular communication networks, which is based on frequency reuse [8].
Its essence is that neighboring (adjacent) cells of a mobile communication system use different frequency sets, while in non-adjacent cells located at a sufficient distance from each other the frequency bands are repeated. In practice, cities and regions with continuous cellular coverage use clusters in which each cell is divided into three sectors, using directional antennas with a directional pattern width of 120°. The base stations that allow frequency reuse are located at a distance D from each other. This distance D is measured between the centers of hexagonal cells and is called the guard interval. From geometrical considerations the parameter D can be defined as follows:

:<math>D = R\sqrt{3N},</math>

where <math>R</math> is the radius of the circle circumscribed around the regular hexagon and <math>N</math> is the frequency reuse coefficient. The ratio <math>D/R</math> is defined as the reduction factor of channel noises.

Thus, for optimal frequency reuse and an increase in the GPRS channel capacity in Ukraine, the guard interval should be:

:<math>D = 15\sqrt{3 \cdot 3} = 45 \text{ (km)}.</math>

3 Case-Study for Capacity Evaluation

While transmitting the navigation information to the mobile operator server, the developed computer-aided system for rolling stock traffic coordination should take into consideration not only the capacity of the GPRS channels but also that of the cable or fiber-optic channel of the Internet network, which is used for the transmission of data to the database server.

By analogy with mobile communication, it is necessary to consider the problem of the average duration of delays in the packet switch. The term "packet switch" here means a concentrator (statistical multiplexer), a virtual packet switch (X.25 network, Frame Relay, or ATM network) or a router (IP network). The packet switch can be represented as an element with many input and output channels (switch/router).
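The guard interval computation given earlier in this section can be checked with a few lines of code. This is a minimal sketch: the function name `guard_interval` and the parameter name `cluster_size` (standing for the frequency reuse coefficient) are hypothetical, and the formula is taken as D = R·√(3·coefficient), which reproduces the 45 km value stated in the text for a radius of 15 km and a coefficient of 3.

```python
import math


def guard_interval(radius_km: float, cluster_size: float) -> float:
    """Distance between centers of co-channel hexagonal cells, in km.

    radius_km    -- radius of the circle circumscribed around the hexagon
    cluster_size -- frequency reuse coefficient (hypothetical parameter name)
    """
    if radius_km <= 0 or cluster_size <= 0:
        raise ValueError("radius and reuse coefficient must be positive")
    return radius_km * math.sqrt(3 * cluster_size)


if __name__ == "__main__":
    d = guard_interval(15, 3)
    # The ratio D/R is the reduction factor mentioned in the text.
    print(f"D = {d:.1f} km, D/R = {d / 15:.1f}")
```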
Using the Kendall notation, such network elements can be represented by queuing systems of the G/G/1 or G/G/n type (arbitrary probability distributions describe the incoming stream of customers, in our case packets or protocol blocks, and their service time). Note that models with one server, i.e. G/G/1, are often used for the analysis of packet switches.

Let us assume that in the general case requests arrive at the system input as a Poisson stream and that their service time has an arbitrary distribution. Then the average queue length in the system with an infinite buffer size (M/G/1) is calculated using the classic Khintchine-Pollaczek formula [9]:

:<math>t_q = \frac{q}{\lambda} = t_s \left( 1 + \frac{\rho (1 + C_s^2)}{2(1 - \rho)} \right), \qquad (1)</math>

where <math>q</math> is the average queue length in the considered system (including protocol blocks, PB); <math>\rho</math> is the M/G/1 system load intensity, <math>\rho < 1</math>; <math>\lambda</math>, <math>\mu</math> are the intensities of PB arrival and service in the system, respectively; <math>t_s</math> is the average time of PB service in the system; <math>C_s^2 = \frac{D[t_s]}{t_s^2}</math> is the quadratic coefficient of service time variation, equal to the ratio of the service time variance to the squared expectation value.

The values required for computations using formula (1) were obtained through the simulation modeling of data transfer in the Internet network, whose algorithm is given in [1].

A one-channel queuing system (QS) of the M/G/1 type has been taken as an example. The arrival of a Poisson stream of requests with the constant service time j has been simulated. The numerical experiment that was carried out allows us to conclude that there is no loss of data packets.
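Formula (1) can be evaluated directly. The sketch below assumes it is read as t_q = t_s·(1 + ρ(1 + C_s²)/(2(1 − ρ))), i.e. the mean time a protocol block spends in the system; the function and parameter names are hypothetical. The load `rho` is passed explicitly rather than derived from arrival and service rates.

```python
def mean_time_in_system(rho: float, mean_service_time: float,
                        cs2: float) -> float:
    """Khintchine-Pollaczek mean delay for an M/G/1 queue (assumed reading).

    rho               -- load intensity, must satisfy 0 <= rho < 1
    mean_service_time -- average service time t_s
    cs2               -- squared coefficient of variation of service time
                         (0 for constant service, 1 for exponential service)
    """
    if not 0 <= rho < 1:
        raise ValueError("the queue is stable only for 0 <= rho < 1")
    if mean_service_time <= 0 or cs2 < 0:
        raise ValueError("service time must be positive and cs2 non-negative")
    return mean_service_time * (1 + rho * (1 + cs2) / (2 * (1 - rho)))


if __name__ == "__main__":
    # Constant service time (M/D/1 special case, cs2 = 0).
    print(mean_time_in_system(0.5, 0.05, 0.0))
    # Exponential service time (M/M/1 special case, cs2 = 1).
    print(mean_time_in_system(0.5, 0.05, 1.0))
```

With `cs2 = 0` the expression reduces to the constant-service (M/D/1) case, and with `cs2 = 1` it reduces to the familiar M/M/1 result t_s/(1 − ρ), which is a quick sanity check on the reading of the formula.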
The constructed graphs show the time of arrival of customers in the queue and the time at which the channel is released from the previous request, for request arrival intensities of 1/10 (see Fig. 2a) and 1/20 (see Fig. 2b).

Fig. 2. QS simulation with an intensity of 1/10 (a) and 1/20 (b). Array 1: the estimated time of the channel clearing; Array 2: the actual time of the channel clearing

The given figures show that an increase in the intensity of arriving customers does not result in a lack of channel capacity. This confirms the correct choice of information technologies for the realization of the computer-aided system used for rolling stock traffic coordination.

To define the data transfer channel capacity in the Internet network, the network duration and delay should be calculated.

Since the data transfer switching center services packets, it can be simulated using a system with constant service time of the M/D/1 type:

:<math>t_q = t_s \left( 1 + \frac{\rho}{2(1 - \rho)} \right).</math>

At <math>\lambda = \frac{1}{10}</math> and <math>\mu = \frac{1}{5}</math> we have <math>\rho = \frac{1}{2}</math>. Let <math>t_s</math> in the given example be equal to 0.05. As a result we get:

:<math>t_q = 0.05 \left( 1 + \frac{1}{2} \right) = 0.075,</math>

which corresponds to the first service class.

At <math>\lambda = \frac{1}{20}</math> and <math>\mu = \frac{1}{15}</math> we have <math>\rho = \frac{3}{4}</math>. Let <math>t_s</math> in the given example be equal to 0.05. As a result we get:

:<math>t_q = 0.05 \left( 1 + \frac{3}{2} \right) = 0.125,</math>

which corresponds to the first service class.

4 Conclusions

Today the railways are a major branch of the economy of Ukraine and serve as the basis of the Ukrainian transportation system. Due to the rapidly changing demand for freight services and carriage of passengers, continuous control over the required amount of rolling stock is needed. The proper amount of such vehicles can be defined through traffic analysis.

The information on the railway traffic timetable and on the actual amount of rolling stock involved in traffic serves as the source information for further computations of the capacity of the communication and data transfer system of the computer-aided system.
This will ensure that the obtained data remain current and that the traffic coordination problem is solved in real time.

To provide the functioning of the computer-aided system as a whole, the following algorithms should be realized. The algorithm for determining the amount of conflicting trains is the first step towards the problem solution. It allows for the detection of those rolling stock units for which conflict-free conditions are not observed for different reasons (timetable violation, technical malfunction, etc.). A decomposition algorithm is required to reduce the dimension of the problem being solved, which in turn reduces the time interval of data acquisition. The obtained data are used for the generation of control actions and for the reduction of the load on the transmission channel, thus providing the observance of the principles of real-time control. An algorithm for the coordination of rolling stock traffic allows for the elaboration of control actions that provide not only the elimination of a definite conflict but also the fulfillment of conflict-free conditions for the given rolling stock in the future. An algorithm of real-time data transfer is the basis of information exchange between the rolling stock and the database server.

Our further research is targeted at the development of algorithmic support for the communication and data transfer system to solve the problem of rolling stock traffic coordination.

This paper gives a description of the GSM and GPRS technologies that are used for the transmission of navigation information on the location of rolling stock and also for information exchange in the system. It has been noted that in order to provide the required capacity level we applied the main principle used for the construction of cellular communication networks, i.e. frequency reuse. A cluster structure with a template of 3/9 has been constructed and the guard interval between the BSs used for mobile communication in Ukraine has been calculated.

To calculate the capacity of GPRS channels and that of the information exchange channel in the GPRS Internet network, an algorithm for the simulation modeling of arriving packets during a prescribed time interval has been constructed. On the basis of this algorithm we carried out an experiment for different packet arrival intensities. From the analysis data we can conclude that an increase in the intensity of arriving customers causes no lack of channel capacity.

Our further research is targeted at the development of an algorithm for real-time data processing, and also at the development and testing of the information system designed for rolling stock traffic coordination. We will also compute efficiency estimates to evaluate the introduction of information technologies into rail transport operation.

References

1. Arkatov, D. B.: Models and methods for automation of dispatching management for railways of Ukraine. Modeli i metodi avtomatizacii dispetcherskogo upravleniya dlya zheleznodorozhnogo transporta Ukraini, Vostochno-Evropeyskiy zhurnal peredovikh tekhnologiy, Eastern European journal of Enterprise Technologies, N 1/10 (61), pp. 61–63 (2013)
2. Arkatov, D. B.: Synthesis models of coordination of movement of mobile railway transport of Ukraine. Sintez modeley koordinacii dvizheniya podvizhnikh sredstv zheleznodorozhnogo transporta Ukraini, Vostochno-Evropeyskiy zhurnal peredovikh tekhnologiy, Eastern European journal of Enterprise Technologies, N 4/3 (58), pp. 58–60 (2012)
3. ETCS requirements specification and validation: the methodology, http://www.era.europa.eu/Document-Register/Documents/ETCS_methodology_v_1_2.pdf (accessed 20 February 2013)
4. Positive train control (2013), http://en.wikipedia.org/wiki/Positive_train_control (accessed 20 February 2013)
5. Integrated locomotive safety device – unified, http://www.irz.ru/products/20/70.htm (2012) (accessed 20 February 2013)
6. GSM-R (2013), http://en.wikipedia.org/wiki/GSM-R (accessed 20 February 2013)
7. GSM technical specification, http://www.ttfn.net/techno/smartcards/gsm11-11.pdf (1995) (accessed 20 February 2013)
8. QoS in GPRS, http://doc.utwente.nl/18117/1/00000039.pdf (1999) (accessed 20 February 2013)
9. Tijms, H. C.: A First Course in Stochastic Models.

Quantitative Estimation of Competency as a Fuzzy Set

Leonid Vasylevych and Ivan Iurtyn

Borys Grinchenko Kyiv University, Department of Information Technology and Mathematical Sciences

lvasilevich@mail.ru, yurtyn@ukr.net

Abstract. The authors of this paper treat competence as a fuzzy discrete set consisting of essential capacities. A procedure for the quantitative estimation of competence is proposed on the basis of the discrimination index of discrete fuzzy sets defined on one universal set. A linguistic variable "Competency coefficient" is used for making appropriate decisions on the grounds of the quantitative estimation of competency. The assessment of a person's competency is proposed as a fuzzy discrete set whose values are the necessary abilities. Using such a competency assessment makes it possible to estimate persons' competency quantitatively and to compare them.

Keywords. Competency-oriented education, competence, capacities, fuzzy discrete set, linguistic variable, membership functions, fuzzification, scalar capacity of any fuzzy discrete set

Key terms. MathematicalModelling, MathematicalModel, FormalMethod

1 Introduction

An analysis of world education development tendencies demonstrates [1,3] a growing trend towards competency-oriented education.
Moreover, competency, which is defined not only by knowledge, abilities and skills but also by a considerably greater number of factors (coefficients), becomes a major category both in the education system and in the job market. Competency also includes the ability to obtain, analyze and revise information; to learn throughout one's lifetime; and to change in compliance with job market demands [1].

Thus, the quantitative estimation of competency necessary for making appropriate decisions is a multicriterial problem, and therefore we need to derive an integral estimation of competency. Since there is no methodology for solving this problem, the article is topical: in it an integral index of a person's competency coefficient is estimated on the basis of a new assessment of competency as a fuzzy discrete set whose values are essential capacities.

Published works analysis. In the work [1] the concept of key competencies has been considered and three key competencies, specified by representatives of the Organization for Economic Cooperation and Development (OECD), have been analyzed: autonomous activity; interactive use of tools; ability to work in socially heterogeneous groups. The Federal Statistics Department of Switzerland and the National Center of Education Statistics of the USA and Canada, within the program named "Definition and Selection of Competencies: Theoretical and Conceptual Principles" ("DeSeCo"), summarized the respective scientific results and the practices of different countries. The work [3] gives a review of works on the topic. But in all those works only a qualitative approach to the subject is used, and methods of quantitative competency estimation have never been given.

Thus, the aim of this work is to develop methods of quantitative estimation of competency on the grounds of its assessment as a fuzzy discrete set [2] consisting of essential capacities.

Main results.
Competency is defined in UNESCO publications as a combination of knowledge, abilities, values and attitudes used in everyday life. Therefore a qualitative assessment of a person's competency means his or her ability to perform professional duties or certain functions efficiently. But this definition does not make it possible to estimate an expert's competency on a quantitative basis. That is why we propose to use the following definition of a person's competency.

Definition 1. A person's competency is a finite discrete fuzzy set consisting of abilities necessary for a job position or functions necessary for a respective position. Membership functions of the set elements characterize the degree to which this competency is inherent in the person.

Definition 2. Abilities are the necessary features, characteristics, faculties, qualities, knowledge, techniques, skills and other traits which a person needs to perform duties or functions at a respective position efficiently.

Thereby, in the beginning, we need to define on the discrete set of abilities <math>Y = \{ y_j : j = \overline{1, m} \}</math> the membership functions <math>\mu_D(y_i) \in [0; 1]</math> of the fuzzy set D "Requirements necessary to perform duties or functions at a respective position efficiently". These membership functions characterize the credibility, priority and importance of a respective ability for a respective position or function. Further we will use the notation of the discrete fuzzy set D in the form [2]:

Table 3. Notation of the discrete fuzzy set D

{| class="wikitable"
! <math>y_i</math> !! <math>y_1</math> !! <math>y_2</math> !! <math>y_3</math> !! … !! <math>y_n</math>
|-
| <math>D = \mu_D(y_i)</math> || <math>\mu_D(y_1)</math> || <math>\mu_D(y_2)</math> || <math>\mu_D(y_3)</math> || … || <math>\mu_D(y_n)</math>
|}

or

:<math>D = \{ y_1/\mu_D(y_1);\ y_2/\mu_D(y_2);\ y_3/\mu_D(y_3);\ \ldots;\ y_n/\mu_D(y_n) \}.</math>

The set of abilities and the respective membership functions will be different for each position. When we specify the set Y we need to apply the Pareto principle, which states that 20% of factors define 80% of the result. In practice, the implementation of this principle means that abilities with membership functions less than 0.5 will not be included into the set D.
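The requirement set D and the 0.5 cut-off described above can be sketched in code. This is only an illustration under stated assumptions: the fuzzy set is stored as a plain dictionary mapping ability labels to membership values, and the function name `build_requirement_set` and the sample labels are hypothetical.

```python
from typing import Dict


def build_requirement_set(candidates: Dict[str, float],
                          threshold: float = 0.5) -> Dict[str, float]:
    """Keep only abilities whose membership value reaches the threshold.

    candidates -- ability label -> membership value in [0, 1]
    threshold  -- cut-off below which an ability is left out of D
    """
    for ability, mu in candidates.items():
        if not 0.0 <= mu <= 1.0:
            raise ValueError(f"membership of {ability!r} is outside [0, 1]")
    return {ability: mu for ability, mu in candidates.items()
            if mu >= threshold}


if __name__ == "__main__":
    # Hypothetical expert ratings of candidate abilities for one position.
    ratings = {"y1": 1.0, "y2": 0.9, "y3": 0.4, "y4": 0.8}
    print(build_requirement_set(ratings))
```

A dictionary keeps each ability paired with its membership value, which mirrors the y/μ notation used for D in the text.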
The task of specifying the set of abilities and the respective membership functions is a task of knowledge estimation by experts and requires the creation of appropriate questionnaires.

As an example, let us specify an IT teacher's information technology competency in the form of a fuzzy set $D$:

{| class="wikitable"
|+ Table 4. Example representation of the discrete fuzzy set D
! rowspan="2" | D =
! $y_i$
| $y_1$ || $y_2$ || $y_3$ || $y_4$
|-
! $\mu_D(y_i)$
| 1 || 0.9 || 0.7 || 0.8
|}

in which $y_1$ is the ability to work in the Word environment; $y_2$ is the technique of working in the Excel environment; $y_3$ is the technique of working in the Access environment; $y_4$ is special software skills (e.g., solving optimization tasks).

To perform a quantitative estimation of a particular teacher's competency, it is necessary to estimate his or her abilities $y_i$. Tests, interviews, exams, monitoring of lessons and other means can be recommended for this. Competency grades (estimates of their membership functions) can be expressed on a scale from 0 to 1. This process is called fuzzification. In this way, for each person (teacher) we can define, in the form of a fuzzy set, a personal fuzzy vector of abilities, which defines his or her competency.

To define $\mu_A(y_i)$, a group of experts can be used who, after analyzing the person, answer the question: "Is ability $y_i$ attributable to the person?" If $L_D$ of the $L$ experts give a positive answer, then

$$\mu_A(y_i) = \frac{L_D}{L}. \qquad (2)$$

As a rule, this question does not have a single-valued answer, so experts can use both binary logic ($\mu_{A\gamma}(y_i)$ is either 0 or 1, where $\gamma$ is the expert's number) and fuzzy logic (a multiple-valued truth scale). In the latter case they specify a value $\mu_{A\gamma}(y_i) \in [0;1]$ (a subjective estimate). If the number of experts is $L$, then as $\mu_A(y_i)$ we take the weighted arithmetic mean of these estimates:

$$\mu_A(y_i) = \frac{\sum_{\gamma=1}^{L} k_\gamma\, \mu_{A\gamma}(y_i)}{\sum_{\gamma=1}^{L} k_\gamma}, \qquad (3)$$

where $k_\gamma$ is the competency estimate of expert $\gamma$.
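The two ways of aggregating expert opinions, formulae (2) and (3), can be sketched in Python as follows. The function names and the sample answers are hypothetical, and the weighted mean is assumed here to be normalized by the sum of the expert weights $k_\gamma$.

```python
def share_of_positive_answers(answers):
    """Formula (2): fraction of experts who answered 'yes' (binary answers)."""
    if not answers:
        raise ValueError("at least one expert answer is required")
    return sum(1 for a in answers if a) / len(answers)


def weighted_membership(estimates, weights):
    """Formula (3): weighted arithmetic mean of the experts' estimates.

    estimates -- subjective values in [0, 1], one per expert
    weights   -- competency estimates k of the experts themselves
    """
    if len(estimates) != len(weights) or not estimates:
        raise ValueError("need one weight per estimate and at least one expert")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the sum of expert weights must be positive")
    return sum(k * mu for k, mu in zip(weights, estimates)) / total_weight


# Hypothetical example: four experts answer yes/no, three experts give graded estimates.
print(share_of_positive_answers([True, True, False, True]))                 # 0.75
print(round(weighted_membership([0.8, 0.6, 1.0], [1.0, 0.5, 0.5]), 3))      # 0.8
```

With equal weights the second function reduces to an ordinary arithmetic mean of the experts' estimates.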
For a quantitative comparison of different persons' competencies we first need to compare in pairs the finite discrete fuzzy sets $D$ "Demands necessary to perform duties or functions at a particular position efficiently" and $A_j$ "the competency of the $j$-th person", which are defined on the same universal set $Y$.

To compare these finite discrete fuzzy sets in pairs, one can use the estimate $P(D, A_j)$ of the difference between $D$ and $A_j$, which reduces to estimating $D \cap A_j$ or $\overline{D} \cup A_j$ [2]:

$$P(D, A_j) = \frac{|\overline{D} \cup A_j| - |\overline{D}|}{|D|}, \qquad (4)$$

where $|\cdot|$ denotes the scalar cardinality of a fuzzy discrete set $B$ [2]:

$$|B| = \sum_{y_i \in Y} \mu_B(y_i); \qquad (5)$$

the operation $\overline{B}$ of complementing the fuzzy set $B$ is defined by the membership function [2]

$$\mu_{\overline{B}}(y) = 1 - \mu_B(y), \quad \forall y \in Y; \qquad (6)$$

and the union of two fuzzy sets ($C = B \cup K$) has the membership function [2]

$$\mu_C(y) = \max\{\mu_B(y);\ \mu_K(y)\}, \quad \forall y \in Y. \qquad (7)$$

As a rule, $P(D, A_j)$ is not equal to $P(A_j, D)$. This property is used to compare fuzzy sets defined on the same universal set: if $P(D, A_j) > P(A_j, D)$, then the fuzzy set $D < A_j$, and vice versa.

Abilities of a person whose membership values exceed the values of the respective abilities in the set $D$ must not compensate for small membership values in the set $A_j$. To avoid this, the set $A_j$ has to be normalized: the membership values of $A_j$ that exceed the respective values in $D$ are set equal to the respective membership values of $D$. The normalized fuzzy set $A_j^{\mathrm{n}}$ is then substituted into formula (4).

Let us compare the competencies of two persons. Let one person's competency be defined by the fuzzy set $A_1 = \{(y_1/0.6);\ (y_2/0.9);\ (y_3/0.7);\ (y_4/0.9)\}$ and the other person's competency by the fuzzy set $A_2 = \{(y_1/0.8);\ (y_2/1);\ (y_3/0.5);\ (y_4/0.9)\}$. After normalization we have: $A_1^{\mathrm{n}} = A_1 = \{(y_1/0.6);\ (y_2/0.9);\ (y_3/0.7);\ (y_4/0.9)\}$; $A_2^{\mathrm{n}} = \{(y_1/0.8);\ (y_2/0.9);\ (y_3/0.5);\ (y_4/0.8)\}$.
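The operations (5) to (7) and the difference estimate (4) can be sketched in Python as follows. The sketch assumes that formula (4) reads $P(D, A) = (|\overline{D} \cup A| - |\overline{D}|)/|D|$ and represents a fuzzy set as a dictionary; the function names are illustrative, while the sets $D$ and $A_1$ are the ones given in the text.

```python
def cardinality(fs):
    """Formula (5): scalar cardinality, the sum of membership values."""
    return sum(fs.values())


def complement(fs):
    """Formula (6): membership of the complement is 1 - mu."""
    return {y: 1.0 - mu for y, mu in fs.items()}


def union(b, k):
    """Formula (7): element-wise maximum of two sets on the same universe."""
    if b.keys() != k.keys():
        raise ValueError("both fuzzy sets must be defined on the same universe")
    return {y: max(b[y], k[y]) for y in b}


def difference_estimate(d, a):
    """Assumed form of formula (4): P(D, A) = (|not-D U A| - |not-D|) / |D|."""
    not_d = complement(d)
    return (cardinality(union(not_d, a)) - cardinality(not_d)) / cardinality(d)


D = {"y1": 1.0, "y2": 0.9, "y3": 0.7, "y4": 0.8}
A1 = {"y1": 0.6, "y2": 0.9, "y3": 0.7, "y4": 0.9}

print(round(difference_estimate(D, A1), 3))  # 0.735
print(round(difference_estimate(A1, D), 3))  # 0.806
```

The estimate is not symmetric: swapping the arguments changes both the complemented set and the normalizing cardinality, which is why the two printed values differ.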
The estimate of the difference $P(D, A_1)$ according to (4) is

$$P(D, A_1) = \frac{0.6 + 0.9 + 0.7 + 0.9 - 0.6}{1 + 0.9 + 0.7 + 0.8} = 0.735.$$

The estimate of the difference $P(A_1, D)$ is

$$P(A_1, D) = \frac{1 + 0.9 + 0.7 + 0.8 - 0.9}{0.6 + 0.9 + 0.7 + 0.9} = 0.806.$$

We propose to calculate the competency coefficient $K$ as the normalized estimate of the differences:

$$K = \frac{\min\{P(A, D);\ P(D, A)\}}{P(A, D)}. \qquad (8)$$

This coefficient always belongs to the interval $[0;1]$. If $P(A, D) > P(D, A)$ then $K < 1$, and if $P(A, D) < P(D, A)$ then $K = 1$. Substituting the computed estimates into formula (8), we have:

$$K = \frac{\min\{0.806;\ 0.735\}}{0.806} = 0.912.$$

Calculating the competency coefficient $K$ for the second person gives the following values:

$$P(D, A_2) = \frac{3.2 - 0.6}{3.4} = 0.765; \qquad P(A_2, D) = \frac{3.4 - 0.8}{3.2} = 0.813;$$

$$K = \frac{\min\{P(A, D);\ P(D, A)\}}{P(A, D)} = \frac{\min\{0.813;\ 0.765\}}{0.813} = 0.941.$$

Thus we can conclude that the second person's competency is greater than the first one's.

To determine a person's competency level from the value of the competency coefficient, it is necessary to specify a linguistic variable (LV) [2] "A person's competency coefficient", which we define by the tuple

$$\langle E,\ \{E_j\},\ j = \overline{1,5};\ \mu_{E_j}(x) \in [0;1];\ x = K \in [0;1] \rangle.$$

The terms of the "Competency" LV can be: $E_1$, very low competency; $E_2$, low competency; $E_3$, medium competency; $E_4$, high competency; $E_5$, very high competency.

Trapezoidal membership functions of the terms can be defined by experts by means of four numbers $(a : b : c : d)$, which define each term. Using trapezoidal membership functions of the terms and taking Harrington's scale into account, the "Competency" LV can be specified as follows: $E_1 = (0 : 0 : 0.1 : 0.2)$; $E_2 = (0.1 : 0.2 : 0.3 : 0.4)$; $E_3 = (0.3 : 0.4 : 0.6 : 0.7)$; $E_4 = (0.6 : 0.7 : 0.8 : 0.9)$; $E_5 = (0.8 : 0.9 : 1 : 1)$.

Let the estimate of the term $E_j$ given by expert $\gamma$ be $E_{j\gamma} = (a_{j\gamma};\ b_{j\gamma};\ c_{j\gamma};\ d_{j\gamma})$; then as the membership function of the term $E_j$ we take the fuzzy quantity

$$E_j = \left( \frac{1}{L}\sum_{\gamma=1}^{L} a_{j\gamma};\ \frac{1}{L}\sum_{\gamma=1}^{L} b_{j\gamma};\ \frac{1}{L}\sum_{\gamma=1}^{L} c_{j\gamma};\ \frac{1}{L}\sum_{\gamma=1}^{L} d_{j\gamma} \right). \qquad (9)$$

To specify the terms more appropriately, the Delphi technique can be applied.

Specifying the lateral branches of the membership functions by straight line segments does not reduce the generality of the competency estimate but considerably simplifies mathematical operations on fuzzy quantities [4]. The left branch $\mu_l(x)$ and the right branch $\mu_r(x)$ of the linear membership function then have the analytical forms

$$\mu_l(x) = \frac{x - a}{b - a}, \quad x \in [a; b], \qquad (10)$$

$$\mu_r(x) = \frac{d - x}{d - c}, \quad x \in [c; d]. \qquad (11)$$

For the example just considered, we have found that the competency coefficient $K_1 = x = 0.912$ belongs to the term $E_5$ (very high competency) with membership function (truth value) one, and the competency coefficient $K_2 = x = 0.88$ belongs to the term $E_4$ (high competency) with membership function 0.2 and to the term $E_5$ with membership function 0.8.

The algorithm of a person's competency estimation consists of six stages: the preparatory (1 to 4) and the operational (5 and 6) ones.

1. Specifying the set of abilities $Y = \{y_j : j = \overline{1,m}\}$ for a position or functions.
2. Assessing the membership functions of the abilities, i.e. specifying the fuzzy set $D$ (formulae (1) and (2) are applied).
3. Assessing the fuzzy set $A_i$ "A person's competency".
4. Assessing the LV "A person's competency": $\langle E,\ \{E_j\},\ j = \overline{1,5};\ \mu_{E_j}(x) \in [0;1];\ x \in [0;1] \rangle$.
5. Computing the person's competency coefficient (formulae (4) to (8) are applied).
6. Computing the membership functions of the person's competency coefficient in the respective terms of the LV "A person's competency" (formulae (10) and (11)).

At the preparatory stage experts are involved, who define the notion "Competency" as a discrete fuzzy set whose values are the abilities necessary for a particular position or functions.

Point 3 requires creating appropriate techniques, tests, problems and tasks that make it possible to estimate the various abilities of a person (to find the membership function of each ability).
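The operational stages 5 and 6 can be sketched in Python as follows: the competency coefficient of formula (8) is computed from the two difference estimates and then mapped onto the trapezoidal terms through the linear branches (10) and (11). The term boundaries are the ones listed in the text; the function names and dictionary layout are illustrative assumptions.

```python
# Trapezoidal terms (a, b, c, d) of the "Competency" linguistic variable.
TERMS = {
    "E1 (very low)": (0.0, 0.0, 0.1, 0.2),
    "E2 (low)": (0.1, 0.2, 0.3, 0.4),
    "E3 (medium)": (0.3, 0.4, 0.6, 0.7),
    "E4 (high)": (0.6, 0.7, 0.8, 0.9),
    "E5 (very high)": (0.8, 0.9, 1.0, 1.0),
}


def competency_coefficient(p_ad, p_da):
    """Formula (8): K = min(P(A,D), P(D,A)) / P(A,D)."""
    return min(p_ad, p_da) / p_ad


def trapezoid(x, a, b, c, d):
    """Membership of x in the trapezoidal term (a : b : c : d)."""
    if b <= x <= c:
        return 1.0                    # flat top of the trapezoid
    if a < x < b:
        return (x - a) / (b - a)      # left branch, formula (10)
    if c < x < d:
        return (d - x) / (d - c)      # right branch, formula (11)
    return 0.0


def classify(x):
    """Return the terms with non-zero membership for coefficient x."""
    result = {}
    for name, abcd in TERMS.items():
        mu = round(trapezoid(x, *abcd), 3)
        if mu > 0:
            result[name] = mu
    return result


print(round(competency_coefficient(0.806, 0.735), 3))  # 0.912
print(classify(0.912))  # {'E5 (very high)': 1.0}
print(classify(0.88))   # {'E4 (high)': 0.2, 'E5 (very high)': 0.8}
```

A coefficient that falls in the overlap of two neighbouring terms, such as 0.88, belongs to both of them, and with these term boundaries the two membership values sum to one.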
At the stage of obtaining a quantitative estimate of a person's competency, points 5 and 6 are performed.

To specify the fuzzy set A of an expert's antecedent characteristics, the expert's questionnaire data, tests and interviews can be used.

The competency estimation methodology examined here, based on fuzzy sets and a linguistic variable, makes it possible to resolve several problems: the conversion from current qualitative competency assessments to quantitative estimation; the multicriteria nature of the competency estimation problem; the impossibility of quantitatively measuring certain particular indexes of competency; and the impossibility of carrying out real experiments to estimate different persons' competency.

2 Conclusions

1. An assessment of a person's competency as a fuzzy discrete set whose values are the necessary abilities is proposed. Such an assessment makes it possible to estimate persons' competencies quantitatively and to compare them.
2. A methodology for the quantitative estimation of a person's competency is proposed.
3. It is proposed to estimate persons' competency quantitatively on the basis of the difference coefficient of the finite discrete fuzzy sets D "Demands necessary to perform duties or functions at a particular position efficiently".
4. It is proposed to specify an expert's competency coefficient in the form of the linguistic variable "Competency".

References

1. Key Competencies: A Developing Concept in General Compulsory Education. EURYDICE, The Information Network on Education in Europe, p. 224 (2002)
2. Pospelov, B.A. (ed.): Fuzzy Sets in Management and Artificial Intelligence Models. Science, Moscow (1986)
3. Sysoyeva, S.O.: Education and Personality in Post-Industrial World. Monograph. KSPA, Khmelnytsky (2008)
4. Vasylevych, L.F., Malovik, K.N., Smirnov, S.B.: Quantitative Methods of Making Decisions in Terms of Risk. SNUNEP, Sevastopol (2007)

1.4 Methodological and Didactical Aspects of Teaching ICT and Using ICT in Education

New Approaches of Teaching ICT to Meet Educational Needs of Net Students Generation

Nataliya Kushnir<sup>1</sup>, Anna Manzhula<sup>1</sup> and Nataliya Valko<sup>1</sup>

<sup>1</sup> Kherson State University, 27, 40 rokiv Zhovtnya St., 73000 Kherson, Ukraine

kushnir@ksu.ks.ua ilovetrees@mail.ru valko@ksu.ks.ua

Abstract. The paper describes the educational needs of modern students as representatives of the Net generation and highlights the contradiction between their characteristics and the traditional ways of teaching ICT disciplines. The paper reports three years of teaching experience and poll results at Kherson State University, Ukraine. The purpose of the paper is to offer new teaching approaches to cope with the generation gap and ways to improve the quality of ICT teaching.

Keywords. Generation gap, Generation Net, teaching approaches, ICT discipline

Key terms. ICTTool, TeachingPattern, TeachingMethodology, Capability

1 Introduction

The UNESCO report on Information Technology in Education [1] states that Ukraine is on the way to "the rapid advancement (progress) of ICT in education that leads to the constant updating of the educational content and the quality of ICT training". However, there are a great many problems. Primarily, this is connected with the fact that educational institutions, and teachers in particular, are not ready for the transition to the information society: "increased demands for flexibility, mobility and adaptability to the education management system, educational institutions and teachers in the context of rapid changes make it difficult to maintain and improve the quality of educational services."

2 Related Work

Using our teaching experience, we identified a common trend in learning styles among the students. Further research is required to investigate the formation of digital competence among future teachers.
Our attempt to find an unknown factor that has a significant impact on the pedagogical process, and to understand the nature of this phenomenon, was based on considering modern students as representatives of a new generation.

To date, about 40 books and scores of articles and papers have been written on this generation; they report the results of international surveys and other research and describe the generation's characteristics. Its impact on education at all levels has been of major interest to researchers and educators. There are about 10 terms used to describe the current generation of students [2]: Millennials (Howe and Strauss), Generation Y or Gen Y (Nader), Echo Boomers (Tapscott), Net Generation (Tapscott), Digital Aboriginals (Tarlow and Tarlow), Digital Natives (Prensky), Nexters (Raines and Filipczak), Dot Com Generation (Stein and Craig).

The representatives of this generation were born in 1982-2003. Today's students and post-graduates are aged from 10 to 30. This means that all teachers and institutions are involved in the education of students who have grown up in a world of new digital, mobile and high-tech technologies. Technology itself has had a profound effect on this new generation, unlike on any previous one.

In the classroom, students can chat on Skype or send SMS messages to their friends, take notes on an iPad, surf the Internet and read a book on the ReadBook. This behavior cannot be fully appreciated by their teachers: it is considered that electronic instruments and digital devices distract students from "real" study [2]. The majority of today's teachers are representatives of previous generations. They use learning models suited to the teachers themselves but not to the new generation of students.

It was found that representatives of the new generation have a wide range of inherent characteristics that predispose them to becoming successful educators.
Realizing their significance for education and considering themselves an instrument of world change, they will be strongly motivated to improve the quality of life of society. Some researchers study and develop strategies for retaining them in a particular professional sphere, including education.

Moreover, the representatives of the Net generation have solid moral values connected with family and society. They are highly motivated to create an open and tolerant society. It is important for educators of the new generation to consolidate and maintain young people's system of values [15].

The results of polling 3rd-year students (Preschool and Elementary School faculty) showed a low level of professional awareness. Only 50% of the students intend to become teachers [16]. The lack of professional focus among students sets educators the hard task of making teaching a valued and desired profession.

Among the characteristics of the generation, it should be noted that Gen Y workers are usually educationally focused and attribute their success to their educational capabilities. They want to have successful careers. They do not like dress codes, demand ICT-equipped workplaces and want a flexible work schedule.

Ronald A. Berk synthesized pertinent research evidence based on ten national and international surveys: EDUCAUSE [4], College Students' Perceptions of Libraries and Information Resources Survey, Greenberg Millennials Study [5], Education Research Institute (UCLA) [3] American Freshman Survey [11], National Center for Education Statistics [9], Net Generation Survey [8], The Net Generation: A Strategic Investigation [13], Nielsen Net View Audience Measurement Survey [2, 10], Pew Internet and American Life Project [6, 7] and Technological Preparedness among Entering Freshman [12].
The research results from the surveys and the aforementioned books have yielded twenty learner characteristics typical of most Net Geners: technology savvy, relies on search engines for information, interested in multimedia, creates Internet content, operates at "twitch speed", learns by inductive discovery, learns by trial and error, multitasks on everything, has a short attention span, communicates visually, craves social face-to-face interaction, is emotionally open, embraces diversity and multiculturalism, prefers teamwork and collaboration, strives for lifestyle fit, feels pressure to succeed, constantly seeks feedback, thrives on instant gratification, responds quickly and expects rapid responses in return, prefers typing to handwriting.

Fig. 1. Improvement discipline background

We have identified some teaching approaches that contradict contemporary students' needs. This gap is especially obvious in teaching computer-related disciplines. Complete, comprehensive step-by-step instructions and exclusively individual learning are no longer efficient. This context has led educators to revise present teaching strategies.

3 Setting up the Pedagogical Experiment

The only discipline that concerns the formation of ICT skills, "New Information Technologies and Technical Facilities of Education", is taught to future teachers of all specialties at Kherson State University. This year the discipline was renamed "Information Technology" in most curricula.

The experience of teaching the discipline "New Information Technologies and Technical Facilities of Education" in 2011 (109 students) and in 2012 (112 students) at the Faculty of Pre-School and Elementary Education (FPEE) allowed us to identify problems that are mainly related to the mentioned contradictions [17].

{| class="wikitable"
|+ Table 1. The contradictions between the present ICT teaching approaches and the characteristics of generation Net
! ICT teaching approach !! Generation Net characteristic !! Poll results
|-
| ICT teaching "from scratch": disregard (neglect) of the actual level of the student's ICT skills. As a result, school ICT discipline assignments are duplicated, and lab manuals are detailed and tend to be comprehensive.
| Tech savvy
| 84% of respondents started to use the computer for learning 7 years ago or earlier.
|-
| Teaching materials are not interactive and not updated.
| Relying on search engines for information
| About 26% of the students named search engines among the most frequently used sites on the Internet.
|-
| Weak level of visualization of teaching materials, lack of interactivity, including hypertext.
| Interest in multimedia, "visual" communication
| Movies and computer games hold the second position among the purposes of using a computer as ranked by students.
|-
| Step-by-step manuals that presuppose learning by copying a sample; the absence of an original product as a result of the students' work.
| Creation of Internet content
| Social nets, wiki sites and forums hold the third position among the most frequently visited sites on the Internet. All of them are platforms for creating one's own content, expressing opinions and sharing self-made things. 13% of respondents mentioned creative activity as a major purpose for using the computer.
|-
| Students are constrained by one plotline (storyline) in learning; the absence of immersion, of problem-solving and decision-making tasks, and of enough freedom of action in realizing the students' learning trajectory.
| Multitasks on everything
| The hours reported by students for their different everyday activities add up to about 28 hours a day.
|-
| Weakly realized person-centered approach.
| Emotionally open
| Social nets were ranked in the second position by students.
|-
| Individual fulfillment and presentation of learning results.
| Teamwork and cooperation
| Students place using the computer for communication in the third position, after learning and entertainment purposes.
|}

We interviewed students of the Faculty of Pre-School and Elementary Education in the 2012-2013 academic year. The results confirm last year's statistics: the quality of the teaching materials on KsuOnline was highly appreciated by students; the average score was 9.29 in 2011 and 9.16 in 2012, out of 10.

The results of the entrance poll at other faculties in the 2011-2013 academic years showed the following trends:

In 2011, 89% of the respondents owned a computer or a laptop. This academic year, 100% of students own a computer or a similar device, regardless of the discipline.

70% (2011), 68% (2012) and 69% (2013) of students have access to the Internet outside the university; thus, this rate stays the same. However, the number of students who recognize themselves as addicted to the Internet has grown from 24% in 2011 to 32% in 2013.

An interesting result concerns visits to the University website by students of different disciplines. At the departments where teachers hardly use ICT in teaching, 17% of the third-year students said they had never visited the university website. On the other hand, the rate has been 0 (from 2011 to date) at the departments where most of the teachers regularly use ICT.

The number of students who have a positive attitude towards the use of ICT in education (inter alia, at lectures) has increased (see Fig. 2).

Fig. 2. Students' attitude towards the use of ICT during lectures

The purposes for which students use the computer have almost the same rating, regardless of the year and the faculty (see Fig. 3).

Fig. 3. The purpose of using a computer and the Internet by students. [Bar chart; vertical axis from 0% to 35%; categories: to do my lessons; to communicate by the Internet; to learn something new (distance learning courses, video tutorials, workshops, etc.); to entertain themselves (games, watching movies, etc.); I have no computer. Series: FFP, FT, FPHS (2012-2013); FPSEE (2012-2013); FPSEE (2011-2012).]

Social nets are becoming more popular among communication services (see Fig. 4); IME and email are also frequently used.

Fig. 4. Student communication services

According to Berk, modern students overestimate their online searching skills. The results of our research have confirmed this fact: the results of the entrance poll and of the test in Informatics are contradictory. The majority of students estimated their level of ICT proficiency as excellent or good. The rate is depicted below.

Fig. 5. Results of students' self-appraisal as computer users (FFP, FT, FPHS, 2013)

The entrance test contains 45 comprehension questions and is designed by analogy with the ECDL test. The aim of the testing is to check residual knowledge of the school computer discipline. The results showed that only 3.28% of students had a score higher than 4.5, 82% had a score higher than 3 but less than 4.42 points, 62% had from 2 to 3 points and 1.64% scored less than 2 points.

During the recitations, the following problems related to organizational issues and students' attitudes were identified:

Students lack the skills to manage their time and work efficiently without strict teacher control, regardless of deadlines. Only a few students uploaded their work on time.

Not all students use email regularly. Many of them chronically forget their email user names and passwords. Students are confused by different browsers or different versions of applications. As a result, logging in to email or to the distance learning system regularly takes up to 7 minutes.
There is no clear understanding of terms among students. Although they are able to work with some programs, they often cannot deal with similar ones.

A teacher needs to take the different working speeds of students into account while planning lessons. Some students do not use computers regularly, so typing, mouse control and searching for a command cause delays.

Plagiarism. Some students prefer not "to waste" their time on doing lessons. They easily copy their colleagues' results or download similar ones from the net. It is important at the first lesson to highlight the value of students' own creative work and to penalize students who attempt to copy someone else's work. Under strict control, the number of cases of plagiarism decreases but does not disappear. In fact, we notice that plagiarism grows with the task level. The reason is that the students want to get a mark, not knowledge.

The students do not like reading long instructions. They usually prefer to ask a teacher or colleagues rather than read a step-by-step manual. This is also associated with the different prevailing perception systems among students. A feature of the course is that most of the teaching materials are stored in digital form. Creating an e-learning course, as opposed to hard copy, does not require additional financial costs, and the course is faster and easier to edit and update. However, reading from the screen does not give effective results: only about 40% of the information is perceived.

Lack of collaborative activities: work in pairs, groups and teams, social assessment, problem-solving and decision-making tasks. A student presents the results of individual work to only one person, the teacher, and feels subordinated under such conditions. In addition, this is the reason for the absence of a critical view of his or her own work. This results in the student's frustration and dissatisfaction with any criticism. The teacher's comments are perceived in a negative or indifferent way.
Educators should pay special attention to students' feedback about the effectiveness of teaching activities and their level of satisfaction. For example, monitoring and evaluation allow a teacher to obtain information about teaching results and quality, which would improve his or her work. Evaluation should reflect the level of the students' knowledge. In practice it is the other way round: evaluations become a sign of knowledge or ignorance of the material and of the student's rating in the group.

The requirements for the discipline formulated in our previous work on digital literacy formation [17] have been adapted to Berk's pedagogical strategies. As a result, we have formulated new approaches that consider the educational needs of the Net generation of students (see Table 2).

{| class="wikitable"
|+ Table 2. The approaches of teaching the ICT discipline to meet the educational needs of generation Net students
! The teaching element !! Requirements (Description)
|-
| Discipline content
| The discipline content should reflect current research, encourage students to use new approaches and technologies, and highlight current trends in ICT development. Tasks presuppose creative activity and form skills of self-learning and further development. Examples are inspiring. The result of a task is a useful, valuable product applicable in professional practice. All elements of the discipline are focused on future professional activities.
|-
| Students' motivation
| All elements of the course (assignments, surveys, etc.) help students to see themselves in their future profession, particularly in the teaching profession. It is important to show interest in the students' opinion and to make it possible for them to contribute to collective projects, for example, to create a bank of teacher's materials. A student wants to get feedback on his or her work from colleagues. Any result of a creative task should pass the following stages: creation, publication and social assessment with definite criteria. A student wants to evaluate the teacher's work too, so a teacher should organize ways for students to influence the development of the discipline and to express their wishes.
|-
| Organizational issues
| Tasks presuppose collaboration and facilitate communication and interaction in order to develop the personal aspect of the student and help to realize his or her individuality. Clear structure, planning, to-do list and deadline system. Active use of formative assessment techniques. Ice-breaking, team-building and communicative exercises at the beginning and at the end of the lesson to form communicative skills, values and relationships. Use no more than two new environments per lesson. When selecting services for work at the lesson, a teacher should consider: the absence of registration or its simplicity, ease of use, the necessity to install additional software, and opportunities for use in the profession.
|}

The main aim of updating the discipline was to help future teachers to create and organize their own learning space on the Internet. Therefore, online interactive services that can be used for communication and for teaching pupils were included (creating word clouds, mind maps, open online documents, sites, etc.). We also considered Berk's strategies while designing the discipline. The updated version of the "Information Technology" discipline was taught at the Faculty of Foreign Philology (FFP, 104 students), the Faculty of Translation (FT, 61 students) and the Faculty of Psychology, History and Sociology (FPHS, 28 students).

{| class="wikitable"
|+ Table 3. The implementation of Berk's teaching strategies in the course "Information Technology"
! Characteristic of Net Generation !! The implementation of educational strategy
|-
| Tech savvy
| The virtual discipline environment was created with the distance learning system ksuonline.ksu.ks.ua, located on a MOODLE platform. This system allows developing a course with such elements as glossaries, wikis, multimedia clips, presentations, tests, blogs, forums, etc.
|-
| Teamwork and cooperation
| Some tasks are complex and presuppose collaboration; during their execution the cognitive interpersonal communication and interaction of all participants progress. An important stage is to prepare a group of students to work together. To ensure the students' readiness for cooperation, the teacher should arrange ice-breaking, team-building and communicative exercises (up to 5 minutes) at the beginning and at the end of the lesson. A teacher who works in the computer classroom usually recognizes his or her students primarily "from the back". Therefore, such exercises facilitate personal contact with the group, which is especially important when teaching short disciplines.
|-
| Interested in multimedia, "visual" communication
| The discipline content was developed and designed considering students' interest in "visual communication" and multimedia. Video clips and presentations were added to each theme. The tasks and manuals contain a minimum of text information and a maximum of graphics and illustrations. Teachers included such elements as blogs and forums.
|-
| Rely on search engines. Multitasks on everything
| In addition to using search engines, during the course students actively work with Google services such as Google Drive, Google Sites, etc. This satisfies the interest in creating online content and allows working with several documents simultaneously. It is also an efficient tool for organizing collective activity.
|-
| Retention in the profession
| The organization of students' activity is a process of consistent modeling of the professional activity of specialists under learning conditions. The students of teaching specialties create educational games, tests, documents and tables that are useful in their professional practice. The results of the work are evaluated not only in their technical aspect but also in their educational one. Such tasks as creating a presentation "My choice" (students describe the situation of their professional choice and analyze their present achievements and capacities, goals and dreams, and ways to attain them) and the collective writing of a mind map "Modern teacher" were also added.
|-
| Emotionally open
| Tasks are person-centered and presuppose the creation of unique results that describe the student's own lifestyle, attitude, etc. This approach helps to decrease the amount of plagiarism and contributes to the development of students' creative abilities. Each student's work passes through formative assessment: self-appraisal, social assessment and the teacher's assessment. Such tasks as making word clouds and mind maps and playing learning games are relevant to the professional and educational orientation of the discipline. In addition, they presuppose creative activity and can be easily adapted to group work.
|}

The poll results of the 2012-2013 academic year at the Faculty of Pre-School and Elementary Education are comparable with the poll results published one year earlier. The quality of the teaching materials on KsuOnline was highly appreciated by students: the average score was 9.29 in 2011 and 9.16 in 2012-2013, out of 10. The novelty of the lectures was assessed as 7.59 in 2011 and 6.84 in 2012-2013; the novelty of the practical tasks scored 7.64 and 7.39 respectively.

The students' preferences regarding the most interesting tasks and the tasks most useful for their future profession have changed. In both of these categories the task "Creating a didactic game" leads. Making a poster "It's interesting to know", creating a "Class site" and creating an online poll follow in the rating. We preserved these tasks in the updated version of the discipline "Information Technology".

As a result, the new version of the "Information Technology" discipline has the following structure:

Lesson 1. Registration on KSUOnline, taking an entrance poll and test.
Presentation "My choice" with the following structure:
1. My strengths and weaknesses (academic, creative, personal, individual aspects).
2. My goals, objectives and deadlines (up to the year 2020).
3. My completed tasks.
4. A letter from yourself in 2020.
Lesson 2. Creating a class site on Google Sites (students add and delete pages, edit them, insert pictures and text, and change the site's design). Taking the poll "NET Gener profile Scale".
Lesson 3. Creating a poll on Google Drive and inserting it into the site.
Lesson 4. Creating a word cloud (Tagxedo service), publishing pictures with the word cloud and writing an assignment for the pupils on the "Homework" site page. Exchanging the word-cloud assignments and completing them.
Lesson 5. Creating educational games with triggers or hyperlinks in MS PowerPoint.
Lesson 6. Social expertise of the games (organized with a spreadsheet on Google Drive). Watching the movie "The image of the modern student" by Michael Wesch (Kansas State University, 2007). Creating a collective essay "How I like to learn" in an online document on Google Drive.
Lesson 7. Co-creation of the mind map "Modern teacher". Taking the final poll and test.
Making the poster "It's interesting to know" and contributing to the wiki page "Communication in the network" are left for independent work.

Returning to the problem of assessment, it should be noted that in the presentation "My Choice" 68 students described their school successes and achievements, considering them the next step in their professional development, but some students evaluate these achievements as formal, not real: "To the moment of leaving school I had about 60 letters of honor in my "collection". But I understand they only have formed my ability to learn and now mean nothing". It is necessary to change the assessment system, shifting the emphasis from the control function to the formative one in order to strengthen motivation.
In addition, it is necessary to combine the teacher's assessment with self-assessment and social assessment based on the teacher's criteria. Students can determine the quality of a work according to the criteria and assess its technical realization, structure, relevance to the pupils' age, subject, design, quality of illustrations, literacy, etc. This forces students to think about the quality of their work while creating it. Social expertise has stimulating, diagnostic and formative functions. The wish to gain social appreciation motivates students to create high-quality, bright, individual works. This way of assessment forms a critical perception of information. A student with low self-esteem is at first afraid that his group mates will assess not his work but his person, but this has never happened: all students aim to be objective and independent judges.

The social aspect is a priority for the new generation. They tend to build relationships regardless of the field of interaction. Education is not an exception. If the teacher does not build the relationship consciously, it usually takes the worst form. A teacher should know that a grade is no longer a sufficient motivation for a positive attitude of the student to the discipline, especially in the pass-fail system. It is clear that students often associate the discipline with the teacher's personality. High-quality completed tasks show not only an attempt to get a high grade but also sympathy for the teacher. In computer disciplines the social aspect is often nearly absent or poorly developed: most of the time students work individually at the computers.

Fig. 6. Increasing students' achievements by the method of formative assessment

The times of the detached teacher have passed. Today's successful teacher can easily establish contact with the audience in a short time and acts as a partner, a friend and a leader. During the course, we found that most third-year students lack communication.
For example, at the first lesson, after the game "Introduction", we asked all students to tell us about three of their small victories. Such "victories" were often met with utter surprise by the group. We found that the students' ability to learn is directly connected with their sense of self-confidence and positive mood.

Obviously, the success of a training approach strongly depends on the teacher's personality and his skills in training and team building. Therefore, the authors of this paper have selected games and exercises that do not require special skills and knowledge from the teacher or any additional devices. Training elements met with different reactions in student groups of different teaching specialties and years of study. Some students expressed astonishment and rejection: "Why should I do that? It is not serious." Although they had completed the discipline "Innovative Teaching Methods and Technologies", a training exercise was an extraordinary situation for them. One of the teachers said, describing her experience: "Third-year students look so serious, that sometimes I feel scared to come to the classroom, as if teachers and students have switched their roles...".

We find it particularly important to use the elements of training and communicative techniques when working with students of teaching specialties. Present students are future teachers who are taught traditionally. In several years, they will copy one of the teaching styles they have seen. It is important to implement innovative techniques and give students a chance to test them in practice.

Another important improvement in the organization of the discipline was the use of an open online document on Google Drive shared among teachers. Each of the teachers contributed to planning, the selection of training exercises and other notes.
After each lesson, teachers made a record in "a discipline online diary" to describe in free form the results of the lesson, such as the most common student mistakes, teaching and technical problems, positive situations, questions from the students, etc. Keeping a discipline diary was a great success. Teaching has become more effective and convenient, and the level of awareness and collaboration among teachers has risen, so teachers now create and use such diaries in other disciplines.

4 Results and Discussion

The developed approaches allowed us to improve a range of disciplines including "Information Technology", which is taught to students of all teaching specialties. We also applied them to the following disciplines: "Introduction to Information Technology" (for future teachers of elementary school and Computer Science, the 1st year of study), "Fundamentals of Computer Science and Applied Linguistics" (translators, the 2nd year of study), and "Office Computer Technology" (programmers, the 1st year of study). Students expressed their positive attitude verbally in the classroom, and several students sent e-mails with gratitude after finishing the discipline. The results of the final poll confirmed their appreciation.

Fig. 7. Word cloud composed of students' comments about IT discipline

5 Conclusions and Outlook

Thus we have obtained the following results:
1. The characteristics of Net Generation students were discussed and analyzed;
2. The contradiction between the characteristics of today's students and traditional ways of ICT teaching was highlighted;
3. Approaches to resolve this contradiction were found and implemented in teaching practice.
In the future we plan to expand the list of pedagogical specialties to which the updated version of the course "Information Technology" will be taught and to improve other disciplines according to the proposed approaches.

References

1. ICT in Higher Education in CIS and Baltic States: State-of-the-Art, Challenges and Prospects for Development. Analytical Survey, 6, GUAP, St Petersburg (2009)
2. Berk, R. A.: Teaching Strategies for the Net Generation. Transformative Dialogues: Teaching & Learning Journal, 3(2), 1–23 (2009)
3. Cashmore, P.: Stats Confirm it: Teens don't Tweet. Nielsen NetView Audience Measurement Survey, July 2009, http://mashable.com/2009/08/05/teens-dont-tweet (2009)
4. DeAngelo, L., Hurtado, S. H., Pryor, J. H., Kelly, K. R., Santos, J. L., Korn, W. S.: The American College Teacher: National Norms for the 2007–2008 HERI Faculty Survey. Higher Education Research Institute, UCLA, Los Angeles (2009)
5. Frand, J. L.: The Information Age Mindset: Changes in Students and Implications for Higher Education. EDUCAUSE Review, 35, 15–24 (2000)
6. Greenberg, E. H., Weber, K.: Generation We: How Millennial Youth are Taking over America and Changing our World Forever. Pachatusan, Emeryville, CA (2008)
7. Horrigan, J. B.: Home Broadband Adoption. Washington, DC: Pew Internet and American Life Project (2006)
8. Horrigan, J. B., Rainie, L.: Internet: the Mainstreaming of Online Life. Washington, DC: Pew Internet and American Life Project (2005)
9. Junco, R., Mastrodicasa, J.: Connecting to the .Net Generation: What Higher Education Professionals Need to Know about Today's Students. Washington, DC: Student Affairs Administrators in Higher Education (NASPA) (2007)
10. Kridl, B.: The condition of education. National Center for Education Statistics (NCES), Washington, DC, U.S. Department of Education, Office of Educational Research and Improvement, National Center for Education Statistics (2002)
11. Ostrow, A.: Stats: Facebook Traffic up 117%, Veoh Soars 346%. Nielsen Net Ratings, August 2007, http://mashable.com/2007/09/13/nielsen-august (2009)
12. Pryor, J. H., Hurtado, S., DeAngelo, L., Sharkness, J., Romero, L. C., Korn, W. S., Tran, S.: The American Freshman: National Norms for fall 2008. Higher Education Research Institute, UCLA, Los Angeles (2009)
13. Sax, L. J., Ceja, M., Terenishi, R. T.: Technological Preparedness among Entering Freshman: the Role of Race, Class, and Gender. Journal of Educational Computing Research, 24(4), 363–383 (2001)
14. Tapscott, D.: Growing up Digital: How the Net Generation is Changing your World. McGraw-Hill, NY (2009)
15. Behrstok, E., Clifford, M.: Leading Gen Y Teachers: Emerging Strategies for School Leaders. TQ Research&Policy BRIEF, Washington, DC, USA (2009)
16. Petuhova, L.E.: Theoretical and Methodological Background of Informational Competencies Formation of Elementary School Teachers. Doctoral Dissertation, Specialty 13.00.04 - Theory and Methods of Professional Education. The South Ukrainian National Pedagogical University named after K.D. Ushynsky, Odesa (2009)
17. Kushnir, N., Manzhula, A.: Formation of Digital Competence of Future Teachers of Elementary School. In: Ermolayev, V. et al. (eds.) ICT in Education, Research, and Industrial Applications. Revised Extended Papers of ICTERI 2012, CCIS 347, pp. 230-243, Springer Verlag, Berlin Heidelberg (2013)

Pedagogical Diagnostics with Use of Computer Technologies

Lyudmyla Bilousova1, Oleksandr Kolgatin1 and Larisa Kolgatina1
1 Kharkiv National Pedagogical University named after G.S.Skovoroda, Kharkiv, Ukraine
lib215@list.ru, kolgatin@ukr.net, larakl@ukr.net

Abstract. The technology of automated pedagogical diagnostics is analysed. A testing strategy oriented to the purposes of pedagogical diagnostics and a grading algorithm that corresponds to Ukrainian school grading standards are suggested. The "Expert 3.05" software for automated pedagogical testing is designed. Methods of administering the database of test items are proposed. Some tests on mathematical topics are prepared with "Expert 3.05".
The approbation of these tests in the educational process of Kharkov National Pedagogical University named after G.S.Skovoroda is analysed.

Keywords. E-learning, Diagnostics, Test
Key terms. InformationCommunicationTechnology, Teaching Process

1 Introduction

Pedagogical diagnostics is an integral part of adaptive e-learning courses. The unconditional merit of testing is its high informative capacity. However, in practice a large part of the test information is often not used. Computer technologies make it possible to organize high-quality pedagogical diagnostics at a new level. Modern automated systems, which can be qualified as expert systems, are capable of supporting comprehensive algorithms of testing and of analysis of the test results. Testing with the use of computers allows a teacher to obtain summary characteristics of the knowledge and skills of a group of pupils and to use this information to choose teaching methods. The study of such algorithms is a wide field of scientific work. Therefore, the aim of our paper is to design methods of pedagogical diagnostics that satisfy the following demands:
• Different forms of the examinee's intellectual activity are involved in the process of testing;
• The automated system of pedagogical diagnostics retains its diagnostic abilities across wide differences in the examinees' mastering;
• Processing of the test results provides maximum information for the examinee and the teacher to correct the educational process.

2 Objectives

The first stage of organising pedagogical diagnostics is the construction of an idealised pedagogical model, that is, the identification of the basic elements of knowledge and skills and of the levels of their mastering.
The second stage is the creation of a system of problems that covers all elements of knowledge and skills and all levels of their mastering.
We cannot design a test as a system of test items of equal difficulty, in spite of the recommendation of classical test theory. Such an approach gives the best tests for discriminating examinees into several groups. However, a test with items of equal difficulty has low validity for examinees with poor mastering because of guessing. The validity of such a test is also low for examinees with good mastering because of lapses of attention. Therefore, it is necessary to include problems of different difficulty in the test.

How should a test item of advanced difficulty be designed? What is difficulty? Why do most examinees fail to solve some problems?

We cannot use problems of the reproductive level as items of advanced difficulty. There are no difficult facts and easy facts. Our educational process should be organised to provide steady knowledge of all compulsory facts. If most examinees do not know some compulsory facts, it means that we should correct our teaching. We are against using items that correspond to facts which are studied fragmentarily and do not form the basis of the tested topic. Therefore, all problems corresponding to reproductive knowledge must have equal difficulty.

One can increase the difficulty of an item by combining several operations in it. Such an approach increases the influence of lapses of attention on the test results, makes weight coefficients necessary and decreases the measuring accuracy of the test.

In our opinion, an item of advanced difficulty should involve more difficult, non-reproductive kinds of intellectual activity [1], [2]. Full and high-quality pedagogical diagnostics should be built on a system of test items of all levels: reproductive and productive.
By analogy with the levels of educational achievements [2], which are standardised by the Ukrainian Ministry of Education and Science [3], we propose the following levels of test items:
1. Initial level: very simple problems that assume a reproductive character of the student's activity, mainly distinguishing. The difficulty index of these test items is about 1; most of the examined students complete these items correctly.
2. Average level: problems that assume reproductive activity; these problems cover all basic facts and unary skills according to the curriculum. A database of items of this level is the most natural to design. According to the Ukrainian standards [3], a student can continue education if he (she) knows not less than 50% of the compulsory facts determined by the curriculum. Therefore, by linear estimation (midway between this 50% minimum and complete mastering), the average index of difficulty must be near 75% for the reproductive items.
3. Sufficient level: these items assume that the examinee applies his knowledge and skills to solve problems in a standard situation.
4. High level: these test items are practical problems that assume executing a new algorithm, transferring knowledge to a new, non-standard situation, etc. These items can lose their creative nature if the method of solving them was explained in the process of learning. Therefore, the database of level 4 items requires continuous analysis and modernisation.

We propose vector processing of the test results, that is, separate calculation of the score for the items of every level. It allows us to avoid artificial weight coefficients and to provide a comprehensive algorithm for the adaptive strategy of testing and grading. We also propose separate processing of the results for the test items according to the elements of knowledge and skills.
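The vector processing described above can be illustrated with a minimal sketch. It is not the code of the "Expert" system: the record layout `(level, element, is_correct)`, the function name `vector_scores` and the sample data are assumptions made only for this example. The score for each level and for each content element is simply the share of correct answers, so no weight coefficients are needed.

```python
from collections import defaultdict


def vector_scores(answers):
    """Compute separate scores per item level and per content element.

    `answers` is an iterable of (level, element, is_correct) tuples; this
    record layout is an assumption made for the example.
    """
    by_level = defaultdict(lambda: [0, 0])    # level -> [correct, total]
    by_element = defaultdict(lambda: [0, 0])  # element -> [correct, total]
    for level, element, is_correct in answers:
        for counter in (by_level[level], by_element[element]):
            counter[0] += int(is_correct)
            counter[1] += 1
    level_scores = {k: c / n for k, (c, n) in by_level.items()}
    element_scores = {k: c / n for k, (c, n) in by_element.items()}
    return level_scores, element_scores


# Hypothetical answers of one examinee: (item level, content element, correct?)
answers = [
    (2, "scales", True),
    (2, "scales", False),
    (2, "mean", True),
    (2, "mean", True),
    (3, "mean", True),
    (3, "scales", False),
]
level_scores, element_scores = vector_scores(answers)
print(level_scores)    # share of correct answers on each level (S2, S3, ...)
print(element_scores)  # share of correct answers on each content element
```

Keeping the two dictionaries separate mirrors the idea that the level scores drive grading, while the element scores describe the structure of the examinee's knowledge.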
Using a computer for test administration allows us to analyse the examinee's results directly in the process of testing and to offer the examinee the items that best correspond to his (her) level of educational achievements. Such an approach is often called adaptive or quasi-adaptive testing [4].

3 Model and Algorithm

The choice of the item level at which testing starts is an important question of the adaptive strategy. Testing usually starts from the simplest items. Such an approach decreases psychological discomfort and creates an atmosphere of competition and a feeling of growth as the problems become more complicated. Taking this into account, we propose to start testing from the items of level 2, which are the simplest for the examinees who will obtain a positive grade. There is an additional argument for choosing level 2 as the start level of testing. The test items of level 2 reflect the compulsory facts of the study topic; these problems cannot be excluded from the test process. It is not worthwhile to start the test from the items of level 3, because productive and, especially, creative problems are based on a sufficiently wide spectrum of knowledge, and it is not always possible to detect exactly which element of the curriculum is not mastered by an examinee. The items of level 1 are intended for students whose mastering is not satisfactory; therefore, there is no need to offer these items to all examinees.

Our testing strategy and grading algorithm are presented in fig. 1. Here are some comments on fig. 1. The testing starts with the items of level 2. An examinee solves the compulsory minimum of items of level 2; the automated system calculates S2, his (her) score on level 2, and estimates the error of the score. If the accuracy is sufficient, the automated system chooses a grade or increases the level of the items being offered to the examinee.
Otherwise, items of level 2 continue to be offered to the examinee until the accuracy becomes satisfactory. It should be underlined that the accuracy depends not only on the number of items but also on the individual test score [5]. The necessary accuracy is also conditioned by the difference between the test score and the key points for the decision about grading or raising the level.

Fig. 1. Grading algorithm (flowchart: starting from the items of level 2, the scores S1-S4 are compared with threshold values to choose a grade from 1 to 12 or to move to the items of another level)

Testing on levels 3, 4 and 1 is carried out by analogy with level 2, with scores S3, S4 and S1 respectively.

Such an algorithm of testing improves the correspondence between the level of the items being offered and the level of the examinee's mastering. The influence of lapses of attention on the test score for examinees with excellent mastering is decreased. Examinees with poor mastering solve easy problems, which correspond to the most important parts of the educational content. The psychological discomfort connected with constantly incorrect answers is excluded, while such easy items make it possible to determine the structure of the knowledge and skills, as well as to distinguish examinees who have not mastered the compulsory minimum according to the curriculum.

In any case the result of the pedagogical diagnostics will be more precise in comparison with testing without special selection of items.
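The rule "continue on the same level until the accuracy is sufficient" can be sketched as follows. This is only an illustration under stated assumptions: the error of the score is approximated here by a binomial standard error with z = 1.96, whereas the paper refers to [5] for the actual estimate; the function names and the limits `min_items` and `max_items` are hypothetical. The sketch shows why the required number of items depends both on the score itself and on its distance from a key point such as 0.5.

```python
import math


def score_and_error(correct, total, z=1.96):
    """Return the share of correct answers and the half-width of its interval.

    The binomial (Wald) approximation is an assumption of this sketch.
    """
    score = correct / total
    half_width = z * math.sqrt(score * (1.0 - score) / total)
    return score, half_width


def decide(correct, total, threshold, min_items=5, max_items=30):
    """Return 'above', 'below' or 'continue' for one key point (threshold).

    'continue' means that more items of the same level should be offered.
    """
    if total < min_items:  # compulsory minimum of items is not yet solved
        return "continue"
    score, err = score_and_error(correct, total)
    if score - err >= threshold:  # reliably not lower than the key point
        return "above"
    if score + err < threshold:   # reliably lower than the key point
        return "below"
    if total >= max_items:        # stop anyway and use the point estimate
        return "above" if score >= threshold else "below"
    return "continue"


# A score close to the key point 0.5 needs more items than a clear-cut one.
print(decide(6, 10, threshold=0.5))
print(decide(10, 10, threshold=0.5))
```

In a full strategy the same check would be repeated for every key point of the current level, and the outcome would select either a grade or the next level of items.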
4 Software

The computer support of the proposed technology is provided by the information system "Expert 3.05", which we designed as a distributed database in the Microsoft Access 2003 environment. An important advantage of our information system, in our opinion, is the modular principle of its construction, which allows the author of the test items (possibly with the help of a programmer) to create new forms of test items and add them to the database.

The central database of the information system is the test items database. The test items are grouped by topics for convenience of viewing. The elements of educational content are singled out in each topic. For each element the author provides a comment for a student who has not mastered this content. Several blocks of test items of different levels of difficulty are offered to verify the student's mastering of each element of the educational content. The author specifies the following parameters for every block: the test item level (0-4), a weight factor and the maximum exposition time of one item. All items of a block should be of one type, that is, have an identical dialogue form. In the process of testing, the student is offered one or several items chosen at random from each block. The number of item blocks and their content are determined by the required quality of diagnostics.

The database that contains information on the answers of each examinee to each item is formed from the results of testing. This database includes the following fields: the code of the test item, the correctness of the given answer, the level of the item, the probability of guessing a correct answer by chance, and the time taken to solve the item. Additional service information (time and date of testing, the examinee's grade, etc.) is also stored.

The examinee receives (fig. 2) the diagnostic data on each element of knowledge, a chart that reconstructs the structure of his (her) knowledge, and recommendations for independent work.
The author receives the statistical analysis of each item's difficulty and discrimination and its correlation with the test score (grade). The diagram that shows the dependence of item difficulty on the examinee's test score (grade) is very useful (fig. 3). The author has an opportunity to generate database queries by means of the Access environment and to pass the data to spreadsheets for further analysis.

Fig. 2. Diagnostic data for the examinee

Fig. 3. Statistical analysis of a test item for the author

5 Experience of Diagnostics with "Expert 3.05"

We have prepared systems of test items with "Expert 3.05" for several courses: "Mathematical methods in psychology"; "Theoretical basis of informatics"; "Architecture of a personal computer". Now we are able to start the third stage of the preparation of the system of pedagogical diagnostics: approbation and verification of the test items.

Our technology of verification is based on the requirements of the Standard of the Ukrainian Ministry of Education and Science [6] and takes into account the features of automated pedagogical diagnostics.

The analysis of a test begins with determining the level of educational achievements of the students on the basis of an expert rating; for example, it may be a traditional examination. The complete verification procedure assumes that the experts determine such a rating independently of the test being verified. Approbation should be organised with a number of examinees large enough to guarantee a sufficient number of answers for every test item.

Determination of the students' rating by experts cannot be organised as often as is necessary for constant updating of the test items database.
Therefore, for current verification it is possible to determine the level of the student's educational achievements with the help of the same automated system of pedagogical testing and to check the correlation of each separate item with the test score. In such a case, the approbation data are accumulated continuously, including during the students' independent work with the automated system. The validity of the automated system of testing as a whole is checked by comparing the integrated results of testing with the results of other kinds of control: interviews, examinations, practical works, etc.

To maintain the reliability of current verification, the automated system excludes from the analysis the answers of unregistered examinees: teachers and other users whose names are not in the lists of the student groups. The answers from a test session that has not been finished are not analysed either. An answer is not taken into account if the time taken to give it is shorter than is necessary to read the text of the problem. It is possible to specify additional conditions for selecting valid answers using the Access environment (for example, date and time of testing, educational group, variant of the test, etc.).

After the students are distributed according to their educational achievements, the conformity of the level of each test item to its empirical index of difficulty is checked. According to the requirements of the Ukrainian educational standards, a student with an average level of educational achievement (the grade of 4 on the 12-grade scale) "... knows about half of educational content, is capable to reproduce it, to repeat after the model the certain operation..." [3]. It is exactly the items of level 2 in our classification that have such content.
Thus, the difficulty index of the items of level 2 cannot be less than 0.5 for such students, and we consider it to be within the range 0.5-0.9. It is necessary to note that such a range of difficulty is not convenient from the viewpoint of improving the statistical parameters of the test. However, the items of level 2 represent a set of facts of the educational content that are obligatory for mastering. Therefore, the author of the problems cannot change their difficulty without changing the curriculum.

Thus, for the items of level 2 on the sample of students with an average level of educational achievements we have the following algorithm for analysing item quality:
• An item does not require correction if its difficulty index is within 0.5-0.9 and its discrimination index is higher than 0.25, that is, the discrimination index satisfies the requirements of the standard [6] (fig. 4).
• The item difficulty index is more than 0.9. This item should be analysed with the help of the diagram of the dependence of difficulty on the educational achievements of the students. It should be determined whether this item has discrimination ability for the students with the initial level of educational achievements; if so, the level of this item should be changed (fig. 5). Otherwise, this item should be removed from the test.
• The difficulty index is less than 0.5. It is necessary to analyse the content of the item; the following situations are possible:
─ The item is not reproductive, and its discrimination ability satisfies the requirements of the standard. It is necessary to increase the item level (fig. 6).
─ The item has low discrimination ability for all students; this signifies that there are mistakes in the formulation of the item. This item should be removed or corrected (fig. 7).
─ There is a possibility that all experts agree that the problem is correctly designed and corresponds to the curriculum; in such a case the students' mastering should be checked by some other method and, possibly, the quality of the educational process should be analysed.

The best range of the difficulty index for items of levels 1, 3 and 4 is 0.5-0.6 (allowable 0.3-0.7) on the sample of examinees of the appropriate level of mastering. The analysis of the difficulty of these items is carried out by analogy with level 2.

Fig. 4. A typical item of the level 2

Fig. 5. A typical item of the level 1

Fig. 6. A typical item of the level 3. It is not a reproductive problem if the students do not learn the table of the binary codes for numbers up to 15.

After the three stages of preparation are finished, the system of pedagogical diagnostics is ready for practical use. The stage of practical application of the system combines the procedures of testing and statistical processing of the obtained results, including the interpretation of the results for students, teachers and authors of the test problems.

The expert system of pedagogical diagnostics needs continuous modernisation of its database. Naturally, this requires returning to the previous stages of work with the system.

Fig. 7. An unsuccessful item

The "Expert 1.01-3.05" software has been used in Kharkiv National Pedagogical University named after G.S.Skovoroda since 2001. Here are some of the latest approbation results. A test on mathematical methods of statistical analysis of pedagogical diagnostics data was offered to future teachers of informatics, mathematics and chemistry as an element of the courseware "Information systems in pedagogical activities" in the 2012-2013 academic year. The purpose of testing was the students' self-diagnostics.
Thus, the students could take the test many times, studying the problematic elements of the learning content and improving their results. We took into account the best results of testing and compared them with the examination results. The Pearson correlation was 0.7 on a sample of 51 students, and we can consider it as the test validity.

The test results made it possible to study the structure of the students' knowledge of basic questions of statistical analysis of pedagogical diagnostics data (fig. 8). The error for the data in fig. 8 was evaluated as half of the 95% confidence interval,

$\Delta y = \dfrac{1.95\,s}{\sqrt{n}}$,

where $s$ is the estimate of the standard deviation and $n$ is the number of test items of the given learning element that were passed by students. The errors are different for every point in fig. 8, because the number of test items on various elements is different, so we show the ranges of errors in table 1.

The results (fig. 8) show that problems of choosing scales for pedagogical evaluations are the most difficult for students. The problems of the reproductive level, where a student should choose the method or formula for estimating some parameters of a statistical distribution, are the easiest. But at the productive level, where students should explain the influence of the values and the number of variants in a sample on the estimated parameters, such problems are the most difficult.

Fig. 8. Difficulty index (probability of correct answer) as a function of the student's grade for problems of various elements of the learning content

Table 1. Errors of difficulty index estimation at different student's grades
Student's grade | Estimation error
1 | 0.01-0.04
2 | 0.04-0.1
3 | 0.02-0.05
4-5 | 0.01-0.07
6-7 | 0.02-0.16
8 | 0.03-0.18
9 | 0-0.4
10-12 | 0-0.6

6 Conclusions

1. A new comprehensive algorithm of testing and grading is suggested.
This algorithm takes into account the possibilities of computer technologies and the requirements of Ukrainian standards.
2. The automated system of pedagogical diagnostics is designed.
3. A method of administering the item database is proposed and used in practice in the educational process of the Kharkov National Pedagogical University.

References

1. Bloom, B.S.: Taxonomy of Educational Objectives. Book 1: Cognitive Domain. Longman, Inc., New York (1956)
2. Bespalko, V.P.: Basis of the Pedagogical Systems Theory: Problems and Methods of Psychological and Pedagogical Providing of Technical Teaching Systems, Voronezh University Press, Voronezh (1977)
3. Criterions of Grading of the Educational Achievements of Pupils in the System of Secondary Education. Education of Ukraine, 6 (2001)
4. Zaitseva, L.V., Prokofyeva, N.O.: Models and Methods of Adaptive Knowledge Diagnostics. Educational Technology & Society, 7, http://ifets.ieee.org/russian/depository/v7_i4/pdf/1.pdf (2003)
5. Kolgatin, O.G.: The Statistical Analysis of the Test with Different Forms of Items. Means of Teaching and Research Work, 20, KhSPU, Kharkiv (2003)
6. Means of Diagnostics of a Level of Educational and Professional Training. The Tests of the Objective Assessment of a Level of Educational and Professional Training. Order of the Ministry of Education and Science of Ukraine, № 285, 31 July 1998 (1998)

The Use of Distributed Version Control Systems in Advanced Programming Courses

Michael Cochez, Ville Isomöttönen, Ville Tirronen and Jonne Itkonen

Department of Mathematical Information Technology, University of Jyväskylä, P.O. Box 35 (Agora), 40014, Jyväskylä, Finland
{michael.cochez, ville.isomottonen, ville.tirronen, jonne.itkonen}@jyu.fi

Abstract. Version Control Systems are essential tools in software development.
Educational institutions offering education to future computer scientists should embed the use of such systems in their curricula in order to prepare the student for real life situations. The use of a version control system also has several potential benefits for the teacher. The teacher might, for instance, use the tool to monitor students’ progress and to give feedback efficiently. This study analyzes how students used the distributed version control system Git in advanced programming related courses. We also have data from a second year course, which enables us to compare introductory level and master’s level students. We found that students do not use the system in an optimal way; they do not commit changes often enough and regard the version control system as file storage. They also often write meaningless commit messages. Further, it seems that in group work settings there is usually one dominant user of the system.

Keywords. Programming Education, Version Control System, Git

Key terms. ICTTool, TeachingProcess, ICTEnvironment, Technology

Introduction

Version control systems (VCSs) have a decades-long history in professional software engineering, with early systems like Source Code Control System (SCCS) and Revision Control System (RCS) developed in the seventies and eighties respectively. These pioneering systems only supported storage of the versions on the file system, while later systems also allowed for remote and mostly centralized storage of the versions. The most well-known centralized systems are Concurrent Versions System (CVS) and Subversion (SVN). Currently, there is a trend towards the use of distributed version control systems (DVCS), where each user has a local copy of the repository which can be synchronized with other repositories. Systems such as Git and Mercurial exemplify this type of present-day decentralized technology. These DVCSs enable flexible change tracking, reversibility, and manageable collaborative work, which are valuable for both small and large projects.

There are many arguments for incorporating VCSs into an educational setting. From a teacher’s point of view, using VCSs increases the possibility of monitoring how students make progress with their assignments and eases the feedback process. The teacher could, for instance, include corrections and suggestions directly in the students’ program code [1]. More generally, educators acknowledge that the use of VCSs relates to effective team work and that it is a crucial skill to be taught to prepare a competent workforce for present-day distributed workplaces [2].

An educational concern of interest to us is how students actually use VCSs. This has been previously studied by Mierle et al. [3], who investigated VCS usage patterns in a second-year course, hoping to find a correlation between an effective use of VCSs and study success. No clear patterns could be identified in the data, which the authors attributed to the fact that beginner students climb their learning curve at different rates; see [3, 4]. These authors call for more research on VCS usage patterns, in particular in upper-year courses [4], which motivates the present study.

We have collected data about students’ use of the distributed version control system Git from three different courses: Introduction to Software Engineering (second-year bachelor), Functional Programming (master’s level), and Service oriented architectures and cloud computing for developers (master’s level). A hypothesis arising from teacher observations during these courses is that students use VCS principally as a submission system rather than what it is intended to be.
By this we mean that
– students commit at the end of the class sessions or right before the deadline, or there is only one commit per week/task,
– only one group member commits everything,
– students do not consider what file types to commit,
– overall, with no specific training, students do not use VCS efficiently.

We study these issues quantitatively, exploring version control commit frequencies, commit sizes and the activity of individual students. Our specific research interest is the potential usage patterns identifiable in the commit log data of Git repositories.

1 Version control systems in education

Clifton et al. [5] summarize that in educational settings VCSs have been adopted to enable more realistic software development experiences for students [6], as a tool to monitor or visualize team and individual contributions [7], and for non-code artifacts such as creative writing [8]. Clifton et al. themselves, as well as many others, use a VCS for course management purposes. Further, some authors regard VCSs as a valuable tool to monitor and understand how students develop code [9]. Unsurprisingly, one of the most usual educational targets appears to be courses with project work, where VCSs both foster team work and facilitate course management tasks such as assessment and grading [10]. Milentijevic et al. [11] go as far as to propose a generalized model for the adoption of VCSs as support in a variety of project-based learning scenarios. All in all, we find that there is a general consensus on the benefits of VCSs as an integral part of computing curricula, one key argument being that they measure up to the requirements of globally distributed workplaces [2].

There are also challenges in the educational use of VCSs. Reid and Wilson [4], who used the CVS system, report on the confusion in judging which of the students’ assignment versions was the final one.
Glassy [9] found that students tend to put off working on assignments for as long as possible, even though a VCS is proposed to them with the hope of iterative work processes. Issues of this kind relate to inefficient use of VCSs. Furthermore, Reid and Wilson [4] noticed that some students mixed up the functionalities of the CVS check out and update commands, and that teaching assistants also encountered problems if they had not properly familiarized themselves with the tool. These issues were considered to be due to the lack of a mental model of the VCS used. Yet another challenge Reid and Wilson [4] raise is increased teacher workload when repositories are initiated and managed by teachers. In a more recent study, Xu [10] points out that there can be a long and rough learning curve before students feel comfortable using Git. Accordingly, Milentijevic et al. [11], who used CVS, report that students find a VCS to be a useful tool after they are sufficiently familiar with it. In the papers by Glassy [9] and Xu [10], the value of informative commit log messages is raised as a topic to be emphasized to the students. Rocco and Lloyd [12] in turn observed that some students have difficulties in understanding what constitutes “a significant change” to be committed.

It is much more difficult to find systematic empirical studies on issues such as how frequently students make commits and how they share the work. Rocco and Lloyd [12] found in their data that over 80% of a CS1 course population could adopt an iterative work process with the Mercurial system (50.0% did 7–21 commits and 33.3% more than 21 commits). On another course the authors defined a minimum commit frequency for one assignment and no requirements for the assignment that followed.
With the first assignment, 75% of the students obtained a reasonable commit frequency, while with the latter this was 81%, altogether indicating that informing students of proper VCS usage can have a positive effect on their work processes. The authors note that not only were the students able to grasp the basics of the VCS (Mercurial), but they tended to continue to take advantage of the tool later on.

The present study, focusing on the students’ usage patterns with the Git system in both a second-year course and master’s level courses, complements studies such as the ones by Rocco and Lloyd [12], Mierle et al. [3], and Reid and Wilson [4].

2 The courses

Introduction to Software Engineering (SE) is a 3-credit course consisting of lectures, a course assignment, and an end-of-course exam. The lectures introduce students to the basic concepts of software engineering, while the mandatory course assignment is the preparation of a project plan. The assignment is done in small groups and consists of four larger phases that need to be accepted by the lecturer. Mandatory supervision sessions on version control were arranged at the beginning of the course in order to encourage all the students to use the distributed version control system Git for the group assignment. The course had altogether 72 students in 33 groups (2.18 ± 0.76 students per group).

Functional programming (FP) is a 6-credit course implemented without traditional lectures and exams. The course is run in week-long cycles such that each week a new set of exercises is announced for the students. Students work in small groups and all of their study time is devoted to programming the weekly exercises. Two contact sessions are held each week. The first one is devoted to supporting the students’ work and answering their questions. During the second weekly contact session there is a review of the student-written code.
Overall, the course emphasizes self-direction on the part of students, similar to recently discussed course models such as the flipped classroom; see more details in [13–15]. Git was proposed to students as their primary group work tool and all of the exercises had to be returned via it. Thirty-six students were active in the course, divided over 13 groups (2.77 ± 0.89 students per group).

The last master’s level course studied, Service oriented architectures and cloud computing for developers (SOA&CC), introduces students to the use of digital services and the concept of cloud computing. A format similar to the FP course is used during the first (5 credits) part of the course. During that part of the course students undertake independent group work on a set of assignments each week. Two weekly sessions are arranged for the group work and one mandatory contact session focusing on reflective program review is arranged at the end of each week. An analysis of how the course model used in this course attempts to motivate students can be found in [16]. During the course Git is not only used as a version control system; it is also used as a tool to deploy code to Platform as a Service (PaaS) providers. Nine groups of students were formed with altogether 36 students (4 ± 0.82 students per group).

All three courses utilize the Faculty’s YouSource¹ system. Similar to staff members, students can use their university credentials to log in to this system and create projects and Git repositories to manage collaborative work. The projects and repositories can be defined to be either private or public, and collaborators can be added to them with a variety of permissions. This system has been in use at the department since mid-2010 and has been used in many courses and research projects.
It should be noted that in the remainder of this paper we are specifically concerned with the Git version control system, which belongs to the third generation of version control systems (DVCSs). Students are free in their choice of environment for interacting with the version control system. Students can, for instance, use the git command line tool, tools with a graphical user interface, or tools included in their integrated development environments.

¹ https://yousource.it.jyu.fi/

3 Data analysis

The Git repositories which students or course teachers created for the respective courses on the above-mentioned YouSource system are the source of all data analysis in this paper. One limitation of this data source is that we cannot see any data related to branches which a student did not push to the central Git server. However, if the work of one of these so-called local branches got merged into a branch which is synchronized with the central Git server, we are able to see its history as well. Further, this limitation is of minor importance since we are mainly interested in how students use the version control system in group work settings. Another limitation, which is inherent to the Git DVCS, is that we cannot know for sure whether time stamps on commits are truthful. It is technically possible to tamper with the date of the commits, but since there is no benefit for students to do so, we make the assumption that the time stamps are correct.

To study our research hypothesis, we will perform five different analyses, the first four of which are based on commits to the repository and the last one on the content of the repositories. For each commit we extracted the number of insertions and the number of deletions in accordance with the short status log of each commit². We added these two numbers together to form what we will call the number of changes of that commit.
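As an illustration of how the number of changes can be derived, the following Python sketch parses one summary line of the kind printed by git log --shortstat and adds the insertions and deletions it reports. It is not the authors' tool; the function name changes_from_shortstat and the regular expressions are assumptions made for this example.

```python
import re

# Patterns for the two counters on a "--shortstat" summary line, e.g.
# " 3 files changed, 10 insertions(+), 2 deletions(-)".
# Git omits a counter when it is zero and uses the singular for 1.
_INSERTIONS = re.compile(r"(\d+) insertions?\(\+\)")
_DELETIONS = re.compile(r"(\d+) deletions?\(-\)")


def changes_from_shortstat(line):
    """Return insertions + deletions for one shortstat summary line."""
    insertions = _INSERTIONS.search(line)
    deletions = _DELETIONS.search(line)
    total = 0
    if insertions:
        total += int(insertions.group(1))
    if deletions:
        total += int(deletions.group(1))
    return total


if __name__ == "__main__":
    print(changes_from_shortstat(" 3 files changed, 10 insertions(+), 2 deletions(-)"))
```

A summary line that reports only insertions or only deletions contributes zero for the missing part, and a line that is not a summary line yields zero.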
The tools used in the analysis have been developed by the authors of the paper and consume output produced by various git commands.

² http://www.kernel.org/pub/software/scm/git/docs/git-log.html

For the first analysis we will, for each course, look at the commit activity over the whole course. To be concrete, we will visualize the commit activity by plotting the estimated probability density function of the total number of changes, i.e. for all students, over the span of the course. The density is estimated via the standard kernel density estimator, using a Gaussian kernel with a bandwidth of 6 hours [17]. The height of the plot then shows the relative likelihood of a commit at a specific point in time.

The second analysis focuses on students’ activity during the implementation sessions. This is done only for the FP and SOA&CC courses, since the SE course does not have distinct sessions during which students get time to implement their work. We use a similar method as in the first part, but accumulate all commits that were made during the implementation sessions in the same plot. This plot shows when the students commit their code during the contact sessions. In the figure the far left of the x-axis represents the start of the session and the far right 15 minutes after the end. This is done in order to account for commits right after the sessions. In this case we use a bandwidth which is one tenth of the total length of the session.

In the third part we perform an analysis of the commit messages in the different courses by classifying them in three categories: useful, trivial, and nonsense. A message is placed in the nonsense category if its content is in no way related to what is being committed. Examples of this type are messages which contain only a couple of random letters, entered because the git system does not allow for empty commit messages.
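The density estimate used in the first two analyses can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes that commit times are given in hours since the start of the course and that each commit is weighted by its number of changes, which is only one possible reading of the description above. The names weighted_gaussian_kde, times_h and weights are hypothetical.

```python
import math


def weighted_gaussian_kde(times_h, weights, bandwidth_h=6.0):
    """Return a function t -> estimated density at time t (in hours).

    Each commit at time times_h[i] contributes a Gaussian bump with
    standard deviation bandwidth_h, scaled by weights[i] (assumed here
    to be the commit's number of changes). The result integrates to 1.
    """
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("the weights must sum to a positive number")
    norm = 1.0 / (bandwidth_h * math.sqrt(2.0 * math.pi) * total)

    def density(t):
        return norm * sum(
            w * math.exp(-0.5 * ((t - ti) / bandwidth_h) ** 2)
            for ti, w in zip(times_h, weights)
        )

    return density


if __name__ == "__main__":
    # Three hypothetical commits: two small ones early, a large one later.
    density = weighted_gaussian_kde([1.0, 2.0, 30.0], [4, 6, 50])
    for hour in (0, 12, 30):
        print(hour, density(hour))
```

For the session plots the same function could be reused with bandwidth_h set to one tenth of the session length, as stated above.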
A trivial message is one which has no information beyond what is immediately visible from the commit meta-data. This category includes, for instance, a message consisting of a list of changed files or one saying that a given commit is a merge of two branches. All other messages are classified as useful. It should be noted that being in the useful class does not directly imply that the message is of high quality. It only means that the message is not trivial or nonsense. The classification was done manually by the respective teachers of the courses. We do not try to make a comparison between the courses, because the bias caused by having different raters is difficult to estimate.

In the fourth part we try to measure whether the version control system is used equally among the students in the group. If the system were used equally by all students in a group, we would expect that the most active student in a group of n students performs (1/n) · 100% of the commits. To represent this number for all groups in the different courses we first find the students with the highest number of commits in their respective groups. Then we calculate their individual share in the total number of commits of their group. We then create an overview of the obtained percentages where we show different graphs for different group sizes, since comparison among unequal group sizes would lead to biased results. It only makes sense to measure this for groups with more than one person. The SE course had a few single-person groups, hence only 28 groups from that course are included.

For the fifth and last part we investigate the types of files which students put under version control. First, teachers of each of the courses listed the file types and limits which they would expect a normal repository to contain. We started out from the files included in the HEAD of the master branch. For the FP and SOA&CC courses we determined the type of each file using the BSD file³ command.
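The file-type step just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' tooling: it assumes that the git and file programs are installed and that the working tree of the repository is checked out at the master branch. The helper names tracked_files, file_type and count_file_types are hypothetical.

```python
import os
import subprocess
from collections import Counter


def tracked_files(repo_dir, branch="master"):
    """List the paths of all files stored at the tip of the given branch."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "ls-tree", "-r", "--name-only", "-z", branch],
        capture_output=True, text=True, check=True,
    ).stdout
    # With -z the names are NUL-terminated, so unusual names stay intact.
    return [os.path.join(repo_dir, name) for name in out.split("\0") if name]


def file_type(path):
    """Ask the file command for a short description of one file."""
    out = subprocess.run(
        ["file", "-b", path], capture_output=True, text=True, check=True
    ).stdout
    return out.strip()


def count_file_types(paths, classify=file_type):
    """Count how many of the given files fall into each reported type."""
    return Counter(classify(path) for path in paths)


if __name__ == "__main__":
    # Example: count the types in the repository in the current directory.
    print(count_file_types(tracked_files(".")))
```

The classify parameter makes it possible to substitute a manual classification, which the SE repositories would need.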
The SE repositories required a manual analysis to decide the type of the files because the file command is unable to distinguish between the file types in use in the course. Then we counted the number of files of each file type. For each count, we compared it to the number of files expected by the teacher, and any surplus was counted as garbage. The final number which we calculated for each group is the fraction of garbage in the total number of files.

³ http://www.openbsd.org/cgi-bin/man.cgi?query=file

4 Results

This section describes the results of our analyses. The first subsection shows the results for the analysis of the student activity during the whole course. In the second subsection, we focus on the implementation sessions only. The results of the analysis of the commit messages are shown in subsection three. Then we consider the activity distribution among students in the fourth subsection. Lastly, we look at the types of files which students submit to the version control system.

4.1 Commit activity over the whole course

The student activity in the SE, FP and SOA&CC courses is presented in Figures 1, 2, and 3, respectively. In the SE course, we draw thick vertical lines to indicate the end of each of the four phases of the course assignment. As can be seen in that figure, there seems to be no correspondence between these deadlines and the student activity. In this course, where VCS training was provided, students appeared to commit rather evenly throughout the course. We attribute the activity spike at the start of this course to the students trying out and getting familiar with the version control system at the time of the training sessions.

In the graphs of the FP and the SOA&CC courses (Figures 2 and 3), we indicated with thin vertical lines the sessions during which students get time in the classroom to work on the assignments.
The thicker vertical lines indicate deadlines for the weekly assignments. The dates of the sessions are displayed on the x-axis in a month/day format. In contrast to the SE course, these graphs show a closer correlation between student activity and the implementation sessions and the deadlines. The graphs suggest that most of the work was committed during the contact sessions, which in turn suggests that students bring their work to the sessions to be committed there. This prompts us to study the student behaviour during the sessions separately in the next section. It is also clearly visible that the students have a very low activity during the weekends.

Fig. 1. Commit activity during the SE course

Fig. 2. Commit activity during the FP course

Fig. 3. Commit activity during the SOA&CC course

4.2 Commit activity during the sessions

In the FP and SOA&CC courses students were more active during sessions than at other times. The graphs in Figures 4 and 5 show the students’ activity during the sessions and 15 minutes after the session. We normalized the duration of the session (90 minutes) and the 15 minutes overtime between zero and one.

Interestingly, we notice a similar behavior in both courses. There seem to be three periods of higher activity. The first period of higher activity is at the beginning of the session, after about 10 minutes. The second one, which lasts longer, is between 20 and 40 minutes after the start of the session, and lastly, the activity peaks shortly after the session.

The first period of activity is most likely because individual students have been implementing parts at home. These students then decide to commit only after receiving consent from other group members.
This is an indication that students do not know how to use the version control system efficiently, as in principle they could have used a separate branch for their local development and merged their changes to their shared version of the exercises. Also, speculating based on student dialogue, some students might have feared ’losing face’ by making their preliminary versions visible to others, including the teacher.

During the second period of activity students are using the system as they are supposed to, committing changes regularly. Then the activity drops for quite some time before reviving shortly after the session. We attribute this last peak to those groups who have been working during the whole session without committing many changes. At the end of the session they want to store their work for later continuation and decide to put all their work in the system.

Fig. 4. Commit activity during the sessions of the FP course

Fig. 5. Commit activity during the sessions of the SOA&CC course

4.3 Commit message analysis

The classification of the commit messages was performed for the SOA&CC and FP courses and yielded the results shown in table 2.

Table 2. Categorization of the commit messages per course

          useful        trivial      nonsense
SE        996 (67%)     430 (29%)    59 (4%)
FP        1422 (78%)    276 (15%)    129 (7%)
SOA&CC    289 (74%)     65 (17%)     37 (9%)

In an ideal repository we would not find any trivial and nonsense messages. What we see from the table, however, is that there is a significant amount of these types of messages.

It is not visible from the table, but the teachers classifying the messages shared the opinion that the messages in the useful category were not all that descriptive. Some commit messages could be regarded as ‘locally sensible’, meaning that they could be useful for communication during a short time span, but offer little for later inspection.
Many of the commit messages are clear indicators that the students regard the system as an answer submission system. Examples include “Answer for week 12” and “exercise 4a”. We also noticed some messages related to problems in using the git system. The amount was, however, not as significant as the teacher had expected. It is also observed that the quality of the messages depends on the group, indicating that some groups use the messages for communicating, while others do not.

4.4 Differences in student activity

To show the differences in activity among students we assembled the charts in figure 6. The figure contains one pie chart for each course and each group size, or none if there are no groups of the given size in the course. Each pie chart illustrates the fraction of the groups which have a given percentage of commits for their most active committer. The last column shows the expected fraction, i.e. the chart which would be obtained if all students in the groups made an equal number of commits.

What we see from the charts is that the most active committer in a group, most of the time, commits significantly more than the expected percentage. Put another way, the most active committer in each group is very often much more active than the average which one might expect. This can be due to that student having a dominating role in the group. In the FP and SOA&CC courses we tried to mitigate this effect by grouping students according to their skill level [13, 15, 16]. We think, however, that the main differences are caused by a different level of familiarity with the version control system between the group members. The person with the most experience will commit more frequently or is given the task of submitting the work of others to the system.

4.5 Which files did the students commit to VCS?
The presence of files which do not belong in the version control system, such as executable programs, temporary compilation files and copied documentation, suggests that the students used the VCS as plain file storage. Figure 7 shows the fraction of the groups with a given percentage of redundant files in their final repository. We see a big difference between the figures for the respective courses. In the graph for the SE course we see that most groups did not include many compiled files in their repositories, as was explicitly instructed in the course.

During the functional programming course, students can often test their code without actually compiling it, which could explain the low amount of garbage in the repositories. The garbage that is committed consists entirely of compiled binaries and other compiler-generated files.

During the SOA&CC course many students use integrated development environments (IDEs) which do the compilation automatically for the user. It seems that many students have included all files which the IDE produced in the version control system. It seems that if students are not made aware of the fact that they should not include this kind of file in the VCS, they tend to include everything that happens to be present in their local directory. We should do further research to see whether this behavior changes if students are made aware of the best practices.

Conclusion

In this article, we focused on students’ usage patterns in advanced courses related to programming while using the distributed version control system Git. We first looked at when students commit their work during the course and, in more detail, at their committing pattern during the implementation sessions. We concluded that most students commit changes regularly during the implementation sessions, but do not commit changes of work which they have been doing before the session itself.
Some groups commit rarely during the session and make a big commit at the end of the session. We made an effort to classify commit messages and noticed that students often write messages which are either trivial or even sheer nonsense. Further, we looked at how the usage of the system is divided inside groups and found that the activity of the most active user in a group is significantly higher than what would be expected if each group member used the system equally. As the last part of our analysis we considered the types of files which students put under version control. We concluded that if students are not told explicitly that they should not include certain types of files, they will just do so.

Fig. 6. Fractions of the groups with a given percentage of commits performed by its most active student

Fig. 7. Fractions of the groups with a given percentage of ’non-versionable’ files in their repositories

With regard to the hypothesis put forward in the introduction, our findings suggest that using a VCS as a submission management tool may result in students adopting the tool as “just a required manner to return the assignments” instead of a professional tool by which collaborative and distributed work is managed. Indications for this were that across all three courses studied the version control system was not evenly used by the team members and that the quality of the commit messages was quite low. While VCSs are pronounced as useful course management tools in the literature, we would like to note that professional use of VCSs requires support and demonstration of their usefulness.

In our future undertakings, we could make a distinction between submission returns and use of VCS, and add VCS training to the beginnings of the courses.
Further, the existing classroom setup, where there is sometimes only a single computer for the whole group, could be replaced with a setting where each student uses a separate computer. Making these practical changes could reveal whether a more intense and shared use of a VCS can be prompted among the students. Promisingly, in our second-year course, training sessions were provided, and there were no observable commit peaks near the deadlines of assignment phases but a rather constant commit curve. Further research could also show how effectively the students can use the system and how the organization of the group work influences its use. It might be that some students know very well how to use the system but do not see any reason to use their skills to the full extent in the given settings.

Acknowledgments: We would like to thank the Department of Mathematical Information Technology of the University of Jyväskylä for both financial and material support, without which this research would not have been possible. We would also like to thank the reviewers for their useful corrections and suggestions, which greatly improved the quality of this paper.

References

1. Laadan, O., Nieh, J., Viennot, N.: Teaching operating systems using virtual appliances and distributed version control. In: Proceedings of the 41st ACM technical symposium on Computer science education. SIGCSE '10, New York, NY, ACM, 480–484 (2010)

2. Meneely, A., Williams, L.: On preparing students for distributed software development with a synchronous, collaborative development platform. In: Proceedings of the 40th ACM technical symposium on Computer science education. SIGCSE '09, New York, NY, ACM, 529–533 (2009)

3. Mierle, K.B., Roweis, S.T., Wilson, G.V.: CVS data extraction and analysis: A case study. Technical Report UTML TR 2004-002 (2004)

4. Reid, K.L., Wilson, G.V.: Learning by doing: Introducing version control as a way to manage student assignments. In: Proceedings of the 36th SIGCSE technical symposium on Computer science education. SIGCSE '05, New York, NY, ACM, 272–276 (2005)

5. Clifton, C., Kaczmarczyk, L.C., Mrozek, M.: Subverting the fundamentals sequence: Using version control to enhance course management. SIGCSE Bull. 39(1), 86–90 (March 2007)

6. Hartness, K.T.N.: Eclipse and CVS for group projects. J. Comput. Sci. Coll. 21(4), 217–222 (April 2006)

7. Liu, Y., Stroulia, E., Wong, K., German, D.: Using CVS historical information to understand how students develop software. In: MSR 2004: International Workshop on Mining Software Repositories (2004)

8. Lee, B.G., Chang, K.H., Narayanan, N.H.: An integrated approach to version control management in computer supported collaborative writing. In: Proceedings of the 36th annual Southeast regional conference. ACM-SE 36, New York, NY, ACM, 34–43 (1998)

9. Glassy, L.: Using version control to observe student software development processes. J. Comput. Sci. Coll. 21(3), 99–106 (February 2006)

10. Xu, Z.: Using git to manage capstone software projects. 159–164 (2012)

11. Milentijevic, I., Ciric, V., Vojinovic, O.: Version control in project-based learning. Computers & Education 50(4), 1331–1338 (2008)

12. Rocco, D., Lloyd, W.: Distributed version control in the classroom. In: Proceedings of the 42nd ACM technical symposium on Computer science education. SIGCSE '11, New York, NY, ACM, 637–642 (2011)

13. Tirronen, V., Isomöttönen, V.: Making teaching of programming learning-oriented and learner-directed. In: Proceedings of the 11th Koli Calling International Conference on Computing Education Research. Koli Calling '11, New York, NY, ACM, 60–65 (2011)

14. Tirronen, V., Isomöttönen, V.: On the design of effective learning materials for supporting self-directed learning of programming. In: Proceedings of the 12th Koli Calling International Conference on Computing Education Research. Koli Calling '12, New York, NY, ACM, 74–82 (2012)

15. Isomöttönen, V., Tirronen, V.: Teaching programming by emphasizing self-direction: How did students react to active role required of them? ACM Transactions on Computing Education Research (accepted)

16. Isomöttönen, V., Tirronen, V., Cochez, M.: Issues with a course that emphasizes self-direction. Submitted (2013)

17. Parzen, E.: On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33(3), 1065–1076 (1962)

Comparative Analysis of Learning in Three-Subjective Didactic Model

Aleksander Spivakovskiy, Lyubov Petukhova, Evgeniya Spivakovska, Vera Kotkova and Hennadiy Kravtsov

Kherson State University, 27, 40 Rokiv Zhovtnya St., Kherson, 73000, Ukraine

{spivakovsky, petuhova, spivakovska, veras, kgm}@ksu.ks.ua

Abstract. The article shows theoretically how the modern didactic model is transformed into a three-subjective one (Student - Teacher - Information and communication pedagogical environment). The active components of the new subject that are most evident in the learning process are analyzed. The block of requirements that the information and communication environment must meet as a subject of the educational process is described. A comparative characterization of the main components of traditional and innovative teaching systems is presented. The authors give a comparative description of the main forms of university studies in different didactic models: object-subject, subject-subject and three-subject training. The cogency of each of these three subjects of study and their significance in the major educational operations (collection, processing, storage, transmission) are measured for various forms of training: lectures, practical classes and individual work.

Keywords.
Didactics, information society, information and communication pedagogical environment, three-subjective didactics, forms of training organization at university

Key terms. KnowledgeEvolution, KnowledgeManagementMethodology, Didactics, KnowledgeManagementProcess, ICTInfrastructure

1 Introduction

Education is an institution for transmitting social experience and for socializing a person in society. Naturally, it depends on the level of social development and on labor market needs.

Modern university education is in crisis. According to UNESCO specialists, the helplessness and failure of modern education may lead to global problems for humanity. Among these are the uneven development of different countries in the context of globalization and the inertia of education, caused by the relative conservatism of human resources in the face of constantly accelerating knowledge renewal.

The weak sides of university education are the following: training students instead of fostering their general cultural development, low professional motivation and responsibility, strict regulation of students' activities, which leads to graduates' passivity, insufficient attention to the levels of training, etc. For this reason, one speaks of a global educational crisis and of a paradigm shift in pedagogical thinking [1].

To find ways of overcoming the crisis of modern university education, we trace how professional education changed at different stages of the development of society.

For ages, human language existed primarily in the form of spoken speech. Its main limitation was in space and time: the spoken word reached only as far as the physical laws of sound allow, and it existed as a material reality only while being pronounced, passing into history and vanishing immediately afterwards.
The era of the word was characterized by a certain deficiency in knowledge acquisition and in the institution of its transmission, since the main carrier in passing the word from one generation to another was a human being. The growing amount of information became the background for the emergence of writing, as it was difficult to keep information in mind without losing its content. Unlike spoken speech, writing became a technology of knowledge transfer.

The invention of writing (i.e. the possibility of fixing speech using a specially developed system of graphic symbols) made it possible to transmit spoken information over an unlimited distance and greatly extended its existence in time. Beyond dispute, the appearance of writing created new conditions and opportunities for realizing the potential of human culture. At the same time, writing limits and narrows the informational content of speech. The issue is that writing is a sign system: it stands for the signified and therefore reproduces only some of the properties and meanings of what it denotes. The written word thus transfers only a part of the properties and meanings contained in "live" speech. In particular, written language almost completely loses the so-called prosodic information contained in "live" spoken speech: graphical notation loses the information that is expressed and transmitted in direct speech by phonetic means, which play a distinguishing role.

The year 1450 AD (more than 500 years ago) is marked by the appearance of a new information technology, the third in a row: printing, which we consider a knowledge distribution technology. We call this phase the era of books. The appearance of books made it possible to create an effective system of mass education, to organize public libraries, and to ensure the development of universities.
The appearance of books as a means of transmitting knowledge helped humankind reach the heights it has attained today.

An important consequence of social development was the understanding of the purposeful transfer of social skills from one generation to another as a connection between two organized activities, teaching and learning, and of their concrete reflection in the learning process. Humanity's cumulative experience of the learning process has found expression in didactics, the branch of pedagogy that examines the general theory of education and training. It is believed that the term was introduced by the German pedagogue Wolfgang Ratke (Ratichius) in his lectures "Ratichius's summary of didactics and the art of teaching", meaning a scientific discipline which studies the theory and practice of teaching [2].

Efforts to make the educational process intelligently organized and purposeful are presented in many works of Jan Komenskiy (Comenius), especially in the "Great Didactic", which covers almost all the issues that form the subject of modern pedagogy. Jan Komenskiy was the first to develop didactics as a system of scientific knowledge, giving a reasoned exposition of principles and rules for children's education. He examined the most important questions of learning theory: educational content, the principle of visual teaching, the sequence of education, the organization of the class-and-lesson system, etc.

The object-subject teaching relations between teachers and students of that time were the longest-lasting in pedagogy. The subject is the teacher, who works actively to educate students as the objects of his influence through the informative-educational environment (word, book, equipment). Schematically, this didactic relationship is depicted in Fig. 1.

Fig. 1. Schematic model of the object-subject relations (elements: Teacher, Student, Information and communication environment)

The accumulated public experience of mankind gradually grew to such an extent that a person using only natural abilities was no longer capable of learning and operating with informational resources. As a result, people began to use technological tools to optimize their work with information. Accordingly, "conveyor" training of specialists no longer corresponds to the labor market: memorized reference knowledge quickly becomes obsolete, and an employee must be able to make decisions in unusual situations. Naturally, the student becomes an equal subject of the educational process. The transformation of the object into a subject of the educational process is the result of the democratization of education and of the spread of differentiation and individualization in education. Schematically, the new relationship is depicted in Fig. 2.

Fig. 2. Schematic model of the subject-subject relations (elements: Teacher, Student, Information and communication environment)

The intensification of information processes in science, economics and production requires the development of new models of education and a variety of information and communication environments in which people can fully reveal their creativity, develop skills, and cultivate a need for self-improvement and responsibility for their own education and development.

The traditional paradigm considered education as the training of the younger generation for work and life, consuming material values created in other areas. The new paradigm sees education as a value in its own right.

2 Innovative methodical system

The purpose of creating a new education paradigm is to provide conditions for the education, training and development of an independent, intelligent person who meets the requirements of a market economy, is capable of continuously improving his or her own level of knowledge and culture, and is integrated into the global information space.
Thus, today we speak of an innovative methodical system which, unlike the traditional one, corresponds to the demands placed on professional education in an information society. A comparison of the main components of these two systems is presented in Table 1 [3].

Table 1. Comparative characterization of traditional and innovative teaching systems.

{| class="wikitable"
! Name of component !! Traditional methodical system !! Innovative methodical system
|-
| Learning objectives
| Seizure and adoption of educational material. Provide students with knowledge, skills and practice.
| Provide students with knowledge, skills and practice. Creation of a modern information and communication teaching environment. Purposeful development of a creative, self-sufficient person. Formation of professional competence, leadership skills, ability to work in a group.
|-
| Principles of learning
| The scientific character principle. The principle of systematicity and consistency. The principle of visibility. The principle of directing study in accordance with the issues of education, training and development.
| The principle of the activity learning environment. The principle of organic unity between the changing requirements of the labor market and the conserved features of the educational system. The principle of necessity for continual self-study.
|-
| Contents of training
| Classical, technocratic learning.
| An integrated approach to the fundamental and applied aspects of the activity of a specialist-to-be.
|-
| Study methods
| Reproductive, explanatory, illustrative.
| Problem-search, research.
|-
| Study means
| Visual tools. The teacher's word for knowledge transfer, books, movies, tape, training devices, pictures, maps, tables, machines, devices, models, collections, tools, historical schemes, charts, diagrams, etc. Technology. Video recordings, radio and television, filmstrips, slides, transparencies, projectors, televisions.
| Facilities. Information and communication technologies. Hypertext, multimedia training materials. Databases for educational purposes. Networking means for videoconferencing and video lectures. An effective system of monitoring training activities. Remote devices for self-work. Computer testing in on- and off-line modes.
|-
| Study forms
| Lectures, seminars and practical lessons.
| Dispute, seminars, conferences, "round table", symposium, debates, colloquium, distance learning, teaching and business games, role-play game.
|-
| Control forms
| External control of process operations within strictly defined rules dominates. The teacher's assessment (current and final control) dominates. Lack of balance between control and self-control. Lack of effective control over the individual learning methods of each student.
| Strict current control of the individual learning of each student by means of testing in on- and off-line modes. Rating control of knowledge. Creating an effective environment, according to Jean Piaget, for easy and convenient self-organization that motivates students in learning activities.
|}

In addition, society today faces phenomena which require answers:

1. The teacher has lost the monopoly on knowledge;

2. Students have unlimited access to information resources;

3. The phenomenon of "red shift" in the expanding informative and communicative space;

4. Qualitatively and quantitatively different ICT competencies of the young and older generations.
As a consequence, the educational paradigm is being transformed. The new paradigm is characterized by the following principles:

* Globalization of knowledge, free access to educational resources
* Integration of learning resources
* Organization of global educational audiences
* WEB-multimedia presentation of educational resources
* Multilingual educational space
* Asynchrony of modern models for learning management
* Harmonization of social and educational environment
* Formation of social identity of information system
* Divergence in the implementation of one's own educational path

Thus, the evolution of modern education, information studies, the mass computerization of educational establishments, the constant upgrading of hardware, the development of computer networks, the expanding personal computerization of society, and the growing number of software products designed for use in the educational process are the conditions that create a new information and communication pedagogical environment (ICPE). This environment constantly and aggressively increases students' motivation to consume the content that circulates in it, creating a new didactic model: three-subjective relations, which include three subjects of study, namely students, teachers and the environment. However, is it legitimate to consider the ICPE a subject of learning with rights equal to those of a teacher and a student?

3 Model of three-subjective relations

In our opinion, the information and communication teaching environment can be considered a subject because its components are not only technology but also human resources, which update it continuously and at a constantly growing speed. In this sense, it is necessary to point out that today's learning environment is qualitatively new compared with the one that existed 15-20 years ago. The question is whether today's educational environment obtains the status of an equal partner.
Sir Ken Robinson in The Third Teacher (2010) says, "The physical environment of the building is critically important in terms of curriculum" [4].

Within this approach, we implement an important target triangle: a natural integration of teaching, research and labor market needs. After all, if we ignore the environment as a subject of education, we will prepare specialists for a reality that does not match the actual one.

The inevitability of the transition of the education system to three-subjective relationships is reflected in the following three stages of didactic change:

* Stage I: subject-object instruction (a teacher provides students with knowledge). It is characterized by a one-dimensional linear model; the volume of processed data is measured in megabytes;
* Stage II: subject-subject didactics (a teacher and a student are equal, competent partners in training). It is characterized by a two-dimensional polylinear model; the volume of processed data is measured in gigabytes;
* Stage III: three-subjective pedagogy (a teacher, a student and the ICPE). The interaction of all subjects of the learning process serves the common goal of forming a competitive specialist and is characterized by a three-dimensional nonlinear model; the volume of processed data is measured in terabytes.

Thus, we have the right to speak of three-subjective didactics as an area of pedagogical science concerned with the most general regularities, principles and means of organizing study that provide a firm and conscious assimilation of knowledge and skills within equal relations among the learner (pupil or student), the teacher, and the information and communication teaching environment.

It is important to underline that in this process the status and general condition of those who learn, of those who teach, and of the ICPE are constantly changing. In this context, we understand learning as the assimilation of knowledge and skills, and teaching as the communication of knowledge (or of sources of knowledge) to students, as well as instruction in ways and methods of work, coordination of training activities, particularly the organization of active forms (discussion, round table, project activities, etc.), and monitoring of how students master knowledge, skills and experience. Unlike traditional views, we consider it necessary to include in teaching the changes that the teacher introduces into the ICPE (for example, by publishing educational materials on the Internet). We should also mention that new, innovative forms of teaching activity involve remote (distance) management of learning activities, in both time and space.

Within this definition, three-subject relations arise naturally. We understand them as the continuous and constant (in both space and time) interactions between students, teachers and the information and communication pedagogical environment, directed at satisfying students' educational needs (Fig. 3).

Fig. 3. Schematic model of three-subjective relations. Student: has some educational needs; needs to reveal and develop personal abilities; strives for information culture. Teacher: forms relevant competence; coordinates students' self-development; coordinates students' thorough development. Both interact with the information and communication pedagogical environment.

As we are examining the environment as a separate element, it is important to mention the operating characteristics that are most evident in the learning process:

* the environment constantly and ever more aggressively increases the motivation of the younger generation to use the content that circulates in it
* the environment provides access to resources at any convenient time
* the environment has a comfortable, flexible, friendly, intelligent service that helps people find the informational resources, data or knowledge they need
* the environment is not emotionally negative; it meets people's demands to the extent needed
* the environment is permanently filled with information, data and knowledge at a constantly increasing speed
* the environment offers an opportunity to organize practically free contacts, at a convenient time, between any number of people, providing suitable and flexible information exchange (in any form) between them
* the environment, step by step, standardizes and then integrates the functionality of all previous, so-called traditional ways of receiving, storing, processing and presenting the information, data and knowledge that mankind requires
* the environment undertakes more and more routine operations connected with human operating activities (which is one of the greatest challenges humans should expect in the future: "the more commissions, the more responsibility, the greater the risk of remaining without resources")
* the environment receives more and more control over the data and over mankind's operational activities [3]

With three-subjective didactics, we can answer the following vital questions connected with the modern educational system:

* the teacher's role and place in the new didactic model
* the correlation between virtual and visual forms of the subjects' relationships in the didactic system
* the development of technological management that gives the subjects of the didactic system access rights to informational resources
* the organization of modern systems of control over learning activities
* assurance of the organic unity between the changing requirements of the labor market and the potential of the conservative educational system
* the organization of a modern and, most importantly, systematic and constantly active system of re-training and professional qualification upgrade of teachers

Step by step, it is becoming clear that the technology produced by modern industry not only affects the technology of knowledge transfer but in fact determines qualitatively new forms of organizing its mastery. At this stage, we can see the following problems:

1. Heterogeneity in the distribution of computer and communication facilities

2. Huge differences in the training and constant re-training of staff, both academic and administrative

3. Inertia of the education system

4. Constantly growing volume of technological renewal of the learning environment, which includes all the tools for both teachers and learners

5. Overlapping of different learning paradigms, which causes substantial confusion in teachers' understanding of their new role in the process of knowledge transmission and the development of abilities and skills

6. The stereotype of a philistine attitude to pedagogy as a whole, as a descriptive branch of human knowledge in which every citizen is a knowledgeable expert

7. Absence of formal systems which describe different models of learning [3]

An active learning environment contains procedural, substantive and control units. The environment begins to play a more important role and assumes some of the teacher's functions. Undoubtedly, certain requirements must be met when setting up a learning environment if it is to support active learning. Working with a program, both the student and the teacher are limited by the system of actions laid out in it, which is why the development of system requirements for the ICPE is very important. According to our research, the information and communication learning environment can serve as a subject of the educational process if it meets the following groups of requirements:

1. Hardware requirements: multimedia computers in classrooms are networked and have obligatory access to Internet resources. In addition, it is important to create opportunities for students to access educational electronic resources (Wi-Fi technology) in any convenient place, for example, the library, dormitory, canteen, etc.

2. Software requirements: the software environment should resolve security issues (registration, personalization, delineation of access rights to resources); be integrated (all educational components should be present in their natural form); be easy to operate, fill and modify; provide opportunities for interaction, communication and monitoring of the learning process; include a mode for getting out of difficult situations (expert); and offer opportunities for distance learning (on- and off-line modes).

3. Academic requirements refer to the methods of filling the information and communication teaching environment.

4. Social demands. Special attention should be paid to this group of requirements, which, in our opinion, covers cultural, ethical and legal aspects, because the users of the information and communication pedagogical environment form a community. First of all, this concerns the rules of communication in the network and the use of other authors' works.

5. Requirements for human resources. Building the educational process on the basis of information and communication technologies implies specialist programmers and correspondingly well-trained teachers.

The ICPE's compliance with these requirements can be achieved by using a quality management system for educational information resources [5].
The introduction of a new subject of the learning process naturally transforms the existing elements of training, including the forms of teaching at higher educational establishments. Today, educational resources are open, and distance forms of study are actively developing and being integrated into the traditional forms of teaching: lectures, workshops, laboratory classes, independent and individual work of students, and forms of control. Let us analyze the basic traditional forms of training organization in different didactic models (Table 2).

Table 2. Forms of learning in didactic models

{| class="wikitable"
! !! Subject-object study !! Subject-subject study !! Three-subject study
|-
| Teacher
| The source of educational information is a teacher; students are forced to put down a limited amount of information; static visual aids are used additionally.
| A teacher presents difficult educational material; students selectively put down the information that each of them personally needs and use additional sources, including the Internet. Dynamic visual aids dominate.
| A teacher and students discuss problematic issues in the form of a debate, thanks to free access to open lectures and other information sources. Students write down the required information at will.
|-
| Practice
| Reproductive methods of working through the teaching material are used.
| Part-search training methods prevail.
| Search and creative methods are aimed at forming experience with the training material, particularly under unusual circumstances.
|-
| Independent work
| It consists of working through lectures and doing practical exercises.
| Studying the teaching material that has not yet been worked through.
| The main part of the teaching material is studied individually.
|-
| Forms of control
| Control requires the presence of a teacher, who relates a student's knowledge to the volume of lecture material.
| Students' readiness to use the knowledge received in real-life situations is also under control.
| Monitoring can be conducted without the teacher's presence; students' unconventional approach and creative thinking are also assessed.
|}

4 Measurement of the importance of training subjects

However, is the ICPE a significant, important subject of learning in the practice of university operation? To confirm or refute this theoretical conjecture, a study was conducted at the Faculty of Pre-school and Primary Education of Kherson State University. The research involved a questionnaire for future primary school teachers, since their training integrates philological, humanitarian, exact, natural and artistic sciences, which, in our opinion, reduces the risk of obtaining results that reflect only one cycle of training.

The main task of the questionnaire is to evaluate the significance of the subjects of the modern educational process, including the ICPE. The cogency of each of the three subjects of the educational process was determined by the expert evaluation method. 27 qualified experts (university teachers, graduate students, methodologists) joined the independent expert committee. To define the score for each subject, the Delphi method was used (conditions for independent individual work were created for the members of the expert committee) [6]. The maximum and minimum scores depended on the number of subjects, of which there are three. Thus, the minimum score for one of the three components is 1 point, the average score is 2 points, and the maximum is 3 points. The results were then processed statistically and presented to the experts for final approval. The cycle of expertise was repeated three times.

The results of the independent expert committee are given below (Table 3).

Table 3. Determination of cogency of training subjects (V)

{| class="wikitable"
! Subjects of the educational process !! Experts giving 1 point !! Experts giving 2 points !! Experts giving 3 points !! ∑ !! V
|-
| Teacher || 12 || 8 || 7 || 49 || 0.30
|-
| Student || 2 || 10 || 15 || 67 || 0.41
|-
| ICPE || 12 || 11 || 4 || 46 || 0.29
|}

According to the results of the expert reviews, the cogency V (in fractions of a unit) of the three subjects is approximately the same, with a slight advantage for the "student" (0.12 higher than the "ICPE" and 0.11 higher than the "teacher"). The summarized data are shown in Fig. 4.

Fig. 4. The importance of the subjects of the educational process (Teacher 0.30, Student 0.41, ICPE 0.29)

214 students, as the most significant subject of the didactic system according to the experts, were asked to rate on a 10-point scale the importance of the three subjects of the educational process (students, a teacher and the ICPE) in operating with information (collecting, processing, storing, transmission) in various forms of training organization: lectures, practical classes and independent work. To do this, students were asked to determine the importance of each component of the didactic model Student - Teacher - ICPE on a five-point scale (1, ..., 5). The results of the student questionnaire are shown in Table 4.

The assessment E<sub>ij</sub> (i, j = 1, 2, 3) of the i-th component of the didactic system for the j-th form of training organization is given by (1):

<math>E_{ij} = K_{ij1} + K_{ij2} + K_{ij3} + K_{ij4}, \qquad (1)</math>

where E<sub>ij</sub> is the total score over the operations, and K<sub>ijk</sub> is the weight (in %) of the i-th component of the didactic system for the j-th form of training and the k-th operation (collecting, processing, storing, transmission).

The overall assessment V<sub>i</sub> (i = 1, 2, 3) of each component of the didactic system is given by (2):

<math>V_i = (E_{i1} + E_{i2} + E_{i3}) / 3. \qquad (2)</math>

Table 4. The results of the student's questionnaire

{| class="wikitable"
! Form of training organization !! Indicator !! Student, points !! Student, % !! Teacher, points !! Teacher, % !! ICPE, points !! ICPE, %
|-
| Lecture || Collecting || 1182 || 5.5 || 2187 || 10.1 || 334 || 1.5
|-
| Lecture || Processing || 2151 || 10.0 || 2630 || 12.2 || 2411 || 11.2
|-
| Lecture || Storing || 2625 || 12.2 || 1455 || 6.7 || 2402 || 11.1
|-
| Lecture || Transmission || 830 || 3.9 || 2004 || 9.3 || 1355 || 6.3
|-
| Lecture || ∑ || 6788 || || 8276 || || 6502 ||
|-
| Lecture || E || || 31.6 || || 38.3 || || 30.1
|-
| Practice || Collecting || 1674 || 7.7 || 2006 || 9.2 || 1235 || 5.7
|-
| Practice || Processing || 1885 || 8.7 || 3121 || 14.3 || 815 || 3.8
|-
| Practice || Storing || 1241 || 5.7 || 1663 || 7.6 || 2421 || 11.1
|-
| Practice || Transmission || 2178 || 10.0 || 1198 || 5.5 || 2322 || 10.7
|-
| Practice || ∑ || 6978 || || 7988 || || 6793 ||
|-
| Practice || E || || 32.1 || || 36.6 || || 31.3
|-
| Independent work || Collecting || 2214 || 9.7 || 2119 || 9.3 || 2033 || 8.9
|-
| Independent work || Processing || 2366 || 10.4 || 2882 || 12.6 || 1938 || 8.5
|-
| Independent work || Storing || 2154 || 9.4 || 2007 || 8.8 || 2013 || 8.8
|-
| Independent work || Transmission || 994 || 4.4 || 384 || 1.7 || 1703 || 7.5
|-
| Independent work || ∑ || 7728 || || 7392 || || 7687 ||
|-
| Independent work || E || || 33.9 || || 32.4 || || 33.7
|}

Let us analyze the results.

It is generally known that the lecture is the main form of teaching, intended for the assimilation of theoretical material. Table 4 shows that in gathering information during lectures (17.1%), the most significant subject of the educational process is the teacher (10.1%); 5.5% of the operation is performed by the student and 1.5% by the ICPE. This is explained by the fact that the teacher plays the main role in determining and selecting the content and material of lectures. However, a lecture implies not a passive reception of knowledge by students but their active involvement in the learning process and their preparation for lectures, which is supported by the use of the ICPE. We should note that active cognitive activity of students during lectures is possible given basic preparation, which includes familiarization with the theme of the lecture and its plan, the main content of the theme in the textbook, revision of the content of previous themes, etc.

According to the survey results, the subjects' contributions to information processing in the lecture (33.4%) are approximately the same: 12.2% - Teacher, 11.2% - ICPE, 10% - Student.
This is because the teacher coordinates the processing of educational information, while the student may be an active participant in this process. The activity of ICPE is due to a shift of emphasis in the methods and means of processing used by students: from note-taking (a full or thesis-style synopsis) to computer processing of the information received.

Storing of the educational information of the lecture (30%) was distributed between the subjects of the educational process as follows: 6.7% for the Teacher, 11.1% for ICPE and 12.2% for the Student.

According to the questionnaire results, transmission of the educational information of the lecture (19.5%) is implemented by the teacher (9.3%) and ICPE (6.3%), although students (3.9%) provide additional information, interesting facts and problematic issues. The task of the teacher at this stage is to transfer adapted information that discloses the nature of scientific concepts, the genesis of scientific theories, ideas, etc.; aggregated information is transmitted using pedagogical software, e-presentations, etc.

The distribution of the three subjects of training in operations with information during a lecture is presented in Fig. 5.

Fig. 5. The significance of the subjects of the educational process during operations with information on the lecture

Thus, according to students, the most important subject of the lecture organization is the teacher (38.3%), while the student (31.6%) and ICPE (30.1%) provide processing and preservation of educational information. In general, during the lecture the principal place in the work with information belongs to processing (33.4%) and storing (30%), followed by transmission (19.5%) and collection (17.1%).

Let's analyze the data from Table 4 with respect to the importance of the subjects of study during the preparation and conduct of practice.
As is known, a practical lesson is a class in which the teacher organizes a detailed study of individual theoretical provisions of the discipline and the development of skills in their practical application through individual performance of related tasks.

The analysis of the data in Table 4 concerning the collection of information (22.6%) during the practical sessions revealed that the students' contribution is 7.7%, while the contributions of the teacher and ICPE are 9.2% and 5.7% respectively. Compared with a lecture, students' activity increased by 2.2%, due to the testing of students' theoretical knowledge, the development of skills based on acquired knowledge and, as a result, a detailed collection of information for further processing.

In the collection (22.6%) and processing (26.8%) of information during preparation and practice, according to students, the teacher and the student have the greatest significance, which is confirmed by the data obtained: 9.2% and 14.3% for the teacher, 7.7% and 8.7% for the student. This is due to students' interest in learning, deepening and refinement of knowledge, development of skills, primary accumulation of experience, professional motivation and, consequently, activity in learning. The significance of ICPE gradually increases in storing and transmission of information, as evidenced by the statistical data: 11.1% and 10.7%. The results are presented visually in Fig. 6.

Fig. 6. The significance of the subjects of the educational process during operations with information during the practice

Thus, the teacher is the most important subject of practical training organization (36.6%), although the student (32.1%) and ICPE (31.3%) are equal subjects. In general, during practical classes the importance of operations with information is as follows: processing (26.8%), transmission (26.2%), storing (24.4%) and collecting (22.6%).
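The totals quoted for the practical class can be re-derived from Table 4 with formula (1). The following Python sketch is only an illustration: the percentages are copied from the Practice rows of Table 4, while the names `practice`, `total_per_subject` and `total_per_operation` are hypothetical helpers introduced here and are not part of the original study.

```python
# Percent scores K_ijk for the practical class, copied from Table 4.
# Order of values: collecting, processing, storing, transmission.
OPERATIONS = ["collecting", "processing", "storing", "transmission"]

practice = {
    "Student": [7.7, 8.7, 5.7, 10.0],
    "Teacher": [9.2, 14.3, 7.6, 5.5],
    "ICPE": [5.7, 3.8, 11.1, 10.7],
}


def total_per_subject(scores):
    """Formula (1): E_ij = K_ij1 + K_ij2 + K_ij3 + K_ij4 for one form j."""
    return {subject: round(sum(k), 1) for subject, k in scores.items()}


def total_per_operation(scores):
    """Share of each operation, summed over the three subjects."""
    return {
        op: round(sum(k[idx] for k in scores.values()), 1)
        for idx, op in enumerate(OPERATIONS)
    }


E_practice = total_per_subject(practice)
op_practice = total_per_operation(practice)
print(E_practice)
print(op_practice)
```

Summing along a subject gives the E row of Table 4, and summing along an operation gives the shares of processing, transmission, storing and collecting discussed in the text. Rounding to one decimal place is an assumption that matches the precision used in the table.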
Let's analyze the importance of training subjects in the organization of students' individual work. As is known, independent work of a student is the primary means of mastering academic material in the time free from mandatory training sessions. The leading role in collection, processing, storing and transmission of material belongs to the student (33.9%), according to the relevant data: 9.7%, 10.4%, 9.4% and 4.4%. This is primarily due to the students' understanding of the importance of having theoretical knowledge, developing skills and accumulating their own professional experience and, as a result, operating with information according to the educational goals. Practice has proved that the most active in independent work is the student who is more motivated to master his future profession.

According to the questionnaire, the importance of the teacher is on average 32.4% and that of ICPE is 33.7%: independent educational and cognitive work of students is carried out on the teacher's assignment and under his guidance, but without his direct involvement and with widespread use of the information and communication pedagogical environment. The importance of the subjects of education in independent work is shown in Fig. 7.

So, according to students, all three components of the didactic system are important subjects of independent work: Student (33.9%), Teacher (32.4%) and ICPE (33.7%). During independent work with information, processing occupies the principal place (31.5%), then collection (27.9%), followed by storing (27%) and transmission (13.6%).

Fig. 7. The significance of the subjects of the educational process during information operations in class work

The analysis of the survey results showed that, according to students, all three constituents are important and significant components of the didactic Student-Teacher-ICPE system.
This is confirmed by the weights of the components: Student (V1 = 32.5%), Teacher (V2 = 35.8%), ICPE (V3 = 31.7%). It is important to underline that the students' survey data correlate well with the similar data of the experts' assessment of the importance of the components of the didactic system. The student is a significant subject of the teaching process at the University, as learning outcomes largely depend on his purposeful mastering of the profession. This is evidenced by the high significance of students in storing and processing information during lectures; collecting, processing and transmitting information during practice; and collecting, processing and storing information during independent work.

The highest indicators of the importance of the teacher are observed in collection, processing and transmission of information during the lecture, which is determined by the specifics of this type of training session, namely teaching theoretical material. During practical sessions and independent students' work the teacher has the greatest importance in information processing.

ICPE's significance is high in storing information during lectures and practical lessons; in independent work, ICPE acts as an equal subject of the educational process. This is because ICPE provides access to informational resources at any convenient time, makes it possible to find all necessary information quickly and easily, and provides flexible and convenient information sharing between students. However, on average the significance of ICPE is inferior to that of the teacher, as the teacher manages the students' learning and cognitive activity and coordinates their independent improvement of knowledge, skills and abilities. It should be noted that, with systematic involvement of ICPE in the learning process, its role as a new subject will gradually grow as learning outcomes improve.

Summary results of the survey are presented in Fig. 8.
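The component weights quoted above follow from formula (2) applied to the E rows of Table 4. The short Python sketch below reproduces them; the E values are copied from Table 4, whereas the dictionary `E` and the function `overall_assessment` are hypothetical names used only for this illustration, and rounding to one decimal place is an assumption.

```python
# E_ij values from Table 4, in the order: lecture, practice, independent work.
E = {
    "Student": [31.6, 32.1, 33.9],
    "Teacher": [38.3, 36.6, 32.4],
    "ICPE": [30.1, 31.3, 33.7],
}


def overall_assessment(e_by_form):
    """Formula (2): V_i = (E_i1 + E_i2 + E_i3) / 3, rounded to 0.1%."""
    return round(sum(e_by_form) / len(e_by_form), 1)


V = {subject: overall_assessment(e) for subject, e in E.items()}

# Subjects ordered from the most to the least significant.
ranking = sorted(V, key=V.get, reverse=True)
print(V)
print(ranking)
```

Under these assumptions the sketch yields the same V1, V2 and V3 as reported, with the Teacher ranked first, the Student second and ICPE third.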
Fig. 8. The importance of the subjects of the educational process during operations with information

5 Conclusions

Thus, the analysis of the scientific literature and the theoretical and experimental study of transforming learning into different didactic models showed that the information and communication pedagogical environment is an important subject in the process of learning at the University. ICPE transforms the traditional subject-subject model of study into a three-subject one, directly affecting and slightly changing the role and function of the other subjects of study and partly taking over their functions itself, particularly in transient conditions while performing operations with information in various forms of training.

References

1. Fokin, J. G.: Competencies of Education in University. Academy, Moscow (2002)
2. Kendall, H. L., Sugimoto, R. A.: The Didactic Theory of Wolfgang Ratke. California State University (1976)
3. Petukhova, L. E.: Theoretical Bases for Training Primary School Teachers in Information and Communication Teaching Environment. Scientific Monograph. Ayilant, Kherson (2007)
4. The Third Teacher: 79 Ways You Can Use Design to Transform Teaching & Learning by OWP/P Architects, VS Furniture, Bruce Mau Design, Abrams (2010)
5. Kravtsov, H. M.: Design and Implementation of a Quality Management System for Electronic Training Information Resources. In: Ermolayev, V. et al. (Eds.) Proc. 7-th Int. Conf. ICTERI 2011, Kherson, Ukraine, May 4-7, 2011, CEUR-WS.org/Vol-716, ISSN 1613-0073, pp. 88-98, CEUR-WS.org/Vol-716/ICTERI 2011-CEUR-WS-paper-6-p-88-98.pdf. (2011)
6. Rowe, G., Wright, G.: Expert Opinions in Forecasting: the Role of the Delphi Technique. In: J.S. Armstrong (Ed.) Principles of Forecasting – a Handbook for Researchers and Practitioners, pp. 125-144.
Kluwer Academic Publishers, Boston, MA (2001)

Conception of Programs Factory for Representing and E-Learning Disciplines of Software Engineering

Ekaterina Lavrischeva, Artem Dzyubenko and Andrey Aronov

Taras Shevchenko Kiev National University, Kiev, Ukraine

lavryscheva@gmail.com, asmer@asmer.com.ua

Abstract. The paper presents a new idea of knowledge representation for students studying software engineering by developing artifacts and software components accumulated in libraries or repositories for further reuse. The idea is based on the concept of assembly line by V. Glushkov, further elaborated by Soviet and foreign specialists (A. Ershov, V. Lipaev, J. Greenfield, G. Lenz, Y. Bai, M. Fowler). The paper introduces the elements of program factory: reusable components, their interfaces, and assembly lines for designing and assembling complex software products from components. It is shown that modern operating environments provide prerequisites for such factories. The students' program factory, implemented at Taras Shevchenko Kiev National University, is then described. The technologies behind the factory, as well as its goals are studied.

Keywords. Artifact, reusable component, applied system, interface, assembling line

Key terms. Methodology, Technology, Process, Qualification

1 Introduction

The paper presents the elements of programs factories: reusable components (RC) and their interfaces, and assembly lines for designing and assembling complex programs from RCs. It is shown that modern operating systems provide tools for creating specialized programs factories (MS.NET AppFabric, SOAFab, SCAFab, IBM VSphere, CORBA, etc.). Factories are built by different commercial structures for software product development, as well as for studying purposes (e.g., the programs factory implemented in Taras Shevchenko Kiev National University for assembling artifacts from the disciplines that are studied at the university).
The purpose of such factories is to study computer science, software engineering, programming technology and software systems with electronic textbooks, to develop applied projects and thus to train highly competent professionals for the software industry.

Over the last decades, a huge number of programs has been accumulated in the informational world that may be used as ready-made products for developing complex programs. Therefore, a new approach has formed in programming, namely reusability: the reuse of ready-made software resources (reuses, assets, services, components, etc.), hereafter referred to as reusable components (RC). This term is used in the informational world to represent new knowledge acquired through research in certain fields of computer science. When such knowledge is needed by somebody, it may be used in solving certain problems concerning similar artifacts, as well as for the development of new software systems (SS), applied systems (AS) or software product families (SPF).

All software artifacts and RCs may be stored in public warehouses (libraries, repositories) that may be searched by professionals to identify the data necessary for their own research activities. That is, reuse of ready-made resources becomes a capital-intensive activity in the field of software engineering, and it is particularly important that universities possess so-called factories for scientific artifacts, programs and RCs needed by other students and professionals. With these factories, students may participate in the industrial development of scientific artifacts for mass use [1–3].

Based on such innovative ideas, Prof. E. Lavrischeva proposed that fourth-year students at KNU establish the first programs factory in the course of theoretical and practical labs on software engineering. This factory is focused on artifacts, software development, and repository maintenance.
The students' programs factory has operated on the web site (http://programsfactoty.univ.kiev.ua) since December 2011. The web site has been visited by more than 5,000 people: students, scientists and teachers.

2 Establishing Programs Factory

History of Software Industry in USSR. The idea of an industry for computers and supporting software was formulated by Academician V. Glushkov at the Cybernetics Institute of NAS of Ukraine in the 1960s-1970s. Under his guidance, a family of small computers 'Mir' (1967–1975) was developed together with a language of analytical and formula transformations for solving differentiation, integration and formula calculus problems. In addition, other computers (Dnepr, Dnepr-2, Kyiv-67, 70, macro-conveyor, etc.) were elaborated with autocode-type programming languages (PL) to develop information processing programs and automated management systems (AMS) for various organizations and enterprises. For their development, the problems of software quality and of increasing the production of reusable components were investigated with computers at state institutions [4–6].

In 1975 V. Glushkov first formulated the concept of an assembling conveyor consisting of technological lines for software products (SP). The core of Glushkov's paradigm is to accelerate the transition from programming as an art to industrial methods of SP production in order to solve various economical, business and scientific problems with automated management systems.

Then, in 1978, software development as joint scientific-technical production and projects for automating SP creation were decreed. The first pilot factory for software engineering was established for mass production of various AMSs (1978, Kalinin); but, because of the lack of ready-made programs and the immaturity of programming technology for industrial production, the factory lasted for two years and was closed [7].
Nevertheless, the experiments aiming to elaborate programming automation tools continued until the collapse of the USSR.

V. Glushkov's idea concerning programs factories is now realized at several industrial factories explored by different authors theoretically and in practice [8–13]:

* Conveyor by K. Czarnecki and U. Eisenecker
* Software factories for assembling applications by J. Greenfield, K. Short et al.
* Continuous integration by Martin Fowler
* EPAM assembly line for building various types of software, improving software quality and reducing risk
* Automated assembling of multi-language programs in heterogeneous environments by Y. Bai (VC++, VBasic, Matlab, Java, Visual Works, Smalltalk and others)
* Team development and assembling of programs for software projects in MS Visual Studio Team Suite based on contracts
* G. Lenz's program factory utilizing UML in .NET
* Assembling, configuration and certification of global scientific-technical software on the European Grid factory
* Compositional (assembling) programming by E. Lavrischeva for developing software products from reuses, services, artifacts, and so on
* Experimental KNU factory

Careful analysis allowed singling out the key components of programs factories [13]:

* Prepared software resources (artifacts, programs, systems, reuses, assets, components, etc.)
* Interface as a mediator between two components, containing passport information for heterogeneous resources in a certain specification language (IDL, API, SIDL, WSDL, RAS, etc.)
* Operating environment with system facilities and tools supporting assembly lines with heterogeneous software resources
* Technological lines or product lines for mass production and assembling of products
* Methods of product development

The elements listed are the fundamentals of the SP production industry at the factories that operate at major foreign software companies such as Microsoft, IBM, Intel, Apple, Oberon, etc.
At the KNU students' programs factory, these fundamentals are implemented as certain lines for programs and scientific artifacts, which are the best among student-developed programs factories.

The authors of the KNU programs factory consider it an integrated infrastructure for organizing production of mass-usage SPs that are needed by customers and users from the fields of computer science, state government, commerce, etc. The factory is equipped with technology lines (TL) or product lines [7, 12], as well as a collection of products, tools and services needed for automated execution of processes over these lines in modern operational environments (MS.NET, IBM, Sun Microsystems and so on).

Web Site of KNU Programs Factory. The main objectives of the web site are: improving students' skills in software development using the system for exchange of KNU students' certified software products and scientific artifacts; increasing SP quality and reliability; and learning to support the methods of SS industrial production. The web site presents lifecycle models, SP building lines, the line for program production with the help of the MS.NET platform, and examples of student programs. Obligatory requirements are enforced during certification for storing software products in the repository of the factory (the library pool).

The main activities on the factory site are:

* Organizing program and artifact development
* Familiarizing students with tools and methods for program and SS development
* Representing students' software products in the repository
* Citing excerpts from articles and textbook materials concerning various disciplines

The factory is equipped with tools that allow broadening its functionality by specifying new software artifacts and storing them in the repository for further reuse. Each artifact is uniformly documented based on the WSDL and IDL standards used in the global Grid project.
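The idea of uniformly documented artifacts that enter the repository only after certification can be illustrated with a small Python sketch. It is not the factory's actual implementation (the site itself is written in PHP): the classes `Passport` and `Repository`, their fields and the sample entries are all hypothetical and only show one possible shape of a passport record and a search over it.

```python
from dataclasses import dataclass, field


@dataclass
class Passport:
    """Hypothetical passport record describing one stored artifact."""
    name: str
    author: str
    platform: str
    interface_language: str  # e.g. "WSDL" or "IDL"
    description: str = ""


@dataclass
class Repository:
    """Hypothetical in-memory repository of certified artifacts."""
    items: list = field(default_factory=list)

    def store(self, passport, certified):
        """Accept an artifact only if it has passed certification."""
        if not certified:
            return False
        self.items.append(passport)
        return True

    def search(self, text="", platform=None):
        """Return passports matching a text fragment and an optional platform."""
        text = text.lower()
        return [
            p for p in self.items
            if (platform is None or p.platform == platform)
            and (text in p.name.lower() or text in p.description.lower())
        ]


repo = Repository()
repo.store(Passport("Shingles algorithm", "student A", ".NET", "WSDL",
                    "text similarity"), certified=True)
repo.store(Passport("Draft parser", "student B", "Java", "IDL"),
           certified=False)
found = repo.search("shingles", platform=".NET")
print([p.name for p in found])
```

In this sketch the uncertified artifact is rejected, so a search restricted to the Java platform returns nothing. A real passport would carry the full WSDL or IDL interface description rather than a single language tag.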
3 Factory Lines and Components

Development of Technological Lines. Technological lines are created at the technological pre-production stage [7, 12] before SS production. They include activities for designing the TL scheme from processes and actions that determine the processing order of SS elements with appropriate technological modules (TM) or programming systems. The basic requirement of TL engineering imposed on the production of programs and components is to assemble TLs from lifecycle processes meeting problem domain goals using standard tools, TMs, and the system of regulatory documents. The TL is then supplied with ready-made components, tools and instruments that generate and implement specific functions or elements, as well as with the management plan for the processes that change the states of the elements and provide quality evaluation [13-16].

The RC model for component-based development has the following specification:

<math>RC = \{T, I, F, Imp, S\},</math> (12)

where T is type, I is interface, F is functionality, Imp is implementation, and S is interoperability service.

Basic operations over components are:

* Specifying components and their interfaces (pre- and post-conditions, which must be satisfied by the caller) in such languages as IDL, API, WSDL, etc.
* Maintenance of components, reuses and artifacts in the component repository for search, change and future integration into applied systems
* Integration of components into applications, domains, applied systems, software product families, etc.

All aspects of RC development and their usage in SP and software product families are goals for reusability disciplines and building material for these systems.

An applied system is a collection of software means (or functions of SS), including general tools (DBMS, protection systems, system services, etc.), constructed subsystems or components together with the tools for marshalling from one AS to another [14].

Product Lines at SEI.
The product line and product family (PF) are defined in ISO/IEC FDIS 24765:2009 (E) – Systems and Software Engineering Vocabulary: “Product line is a set of products or services that share a common, managed set of features satisfying the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way. Synonym: product family”.

SEI professionals propose two models for representing the activities of SS development, namely the engineering model and the process model. The engineering model assumes three activities: RC development, PF development through RC configuration, and management of both of these activities. The activity of RC development presupposes PF scoping and production planning for the SS collection, accounting for the context of SS usage, production limitations and the chosen strategy. The PF development activity includes designing each specific SS implementation based on the set of developed RCs, and building software systems according to the PF implementation plan. The management activity is based on balancing the activities of RC development and PF maintenance tasks, and includes both organizational and technical management.

According to the process model, a set of processes is performed at two levels, namely domain engineering, also referred to as development “for reuse”, and application (or SS) engineering, that is, development “with reuse” [15]. The latter is performed over an assembly line using ready-made RCs to shorten time and increase SS availability. Therefore, configuring the product family from RCs according to the specific requirements and needs of a particular market segment is the final step in the cycle of production activities.

4 KNU Students' Programs Factory

From the theoretical standpoint, program factories are based on the assembling conveyor that includes a collection of various more or less complex production lines for software artifacts, programs and RCs.
Conveyor lines execute processes using system tools or technological modules that automate process execution to obtain interim or final results.

From the perspective of information technology, the factory provides the data processing toolset for the transition from individual programming of particular resources to the industry of mass-usage SP. The factory increases SP development productivity during each lifecycle process due to the use of RCs that possess the necessary functionality with due quality guaranteed by their developers. The assembling (composition, configuration) line may benefit through reduction of effort because of the use of ready-made artifacts or RCs stored in the repository.

Fig. 1. Main page of KNU programs factory (screenshot: site menu with Home, Publications, Learning, Repository, Contacts and Link; search forms for RCs, manuals and publications; lists of recent reusable components and recent books; visitor counter)

Programs Factory Technological Lines.
The TL development is, as a rule, matched with some lifecycle, e.g., implemented in the MS.NET environment with guides, frameworks, programming languages, common library templates, and system tools that support new subject-oriented languages such as Domain Specific Language, etc. For complex SP development, an assembly (compositional) line is proposed as a tool for composing RCs using their interfaces from various interim libraries within development environments, as well as from RC repositories (Fig. 1). The figure shows the main page of the web site of the factory: lines 1, 2, 3, 4 and a textbook for e-learning fundamental aspects of software engineering [2].

The assembling conveyor has four implemented technological lines [1, 2] that, since 2011, aid in artifact and program development. The structure of these lines is designed according to the technology explored by the author (Prof. E. Lavrischeva) in 1987–1991 [7, 12-16]. Simple TLs in the factory are as follows:

* E-learning C# in VS.NET environment (Fig. 2)
* Saving components into corresponding repositories and selecting them from the repository to meet market demands on specific SPs
* Assembling or configuring RCs into complex program structures (SP, FS)
* E-learning basic knowledge on software engineering with the dedicated e-textbook

Fig. 2. Chart for components design in VS.NET environment

Web Site Lines.
The depicted TL for learning C# programming is designed according to the ISO/IEC 12207 standard of lifecycle processes, while allowing for .NET specifics as to how to perform the following design tasks:

* Exploring demands on software products, singling out features and methods for automated generation
* Fixing requirements on implementations of SP functions for the domain
* Specifying software elements or artifacts, documenting their passport data and interfaces with a WSDL-like language
* Storing created software artifacts and programs in the repository

The line for repository maintenance includes the mechanisms for uniformly documenting stored RCs with their WSDL passports, as well as the tools for selecting ready-made RCs and artifacts from the repository based on their passport data, functions and relevant solution examples (see Fig. 3). This line pertains to the processes of quality assessment of the artifacts or programs created, verifying them from the perspective of reliability and quality.

Fig. 3. Flowchart of the line for component search in repository

The line for AS development at the factory includes the tools for engineering specific required components using various PLs, programs and artifacts, both newly developed and selected from the repository; the line also supports building multilingual RCs and marshalling non-relevant types of data exchanged by the components [11]. The constituents of this line include standard tools for building or configuring multilingual components, testing both the sample components and the links between RCs being composed together, as well as the tools for reliability and quality assessment and certification of the product obtained.

The line for remote e-learning with the dedicated Software Engineering textbook [14, 15] is now actively used in studying topics of this discipline with KNU students.
This line may also be used for independent study in other higher education institutions where software engineering or computer science courses are established.

Design of Programs Factory. At the students' factory, Visual Studio .NET licensed by KNU is used as the foundation for the programs factory; the capabilities of the MS.NET platform in providing tools for multilingual AS development and support using components in C#, C++, Basic, etc., are utilized as well. Consequently, third-party software developers for the factory need not be limited to a single PL.

The factory web site is developed using the PHP programming language, and HTML5, CSS3 and JavaScript for its external representation. The system core is independently engineered by the authors while relying on known web frameworks [2].

5 E-learning SE Disciplines

Teaching students the aspects of software industry at Ukrainian universities is at its initial stage. To solve several education problems, we have introduced a new approach to e-learning various aspects of SE, which assists in acquiring knowledge on software industry.

A new concept for breaking down the software engineering disciplines (Fig. 4), which is necessary in industrial factory production, was proposed [7, 14]. Basic goals of software engineering disciplines are as follows:

* Scientific discipline consists of the classic sciences (theory of algorithms, set theory, logic theory, proofs, and so on), lifecycle standards, theory of integration, theory of programming and the corresponding language tools for creating abstract models and architectures of the specified objects, etc.
* Engineering discipline is a set of technical means and methods for software development by using standard lifecycle models; software analysis methods; requirement, application and domain engineering with the help of product lines; software support, modification and adaptation to other platforms and environments
* Management discipline contains the generic management theory, adapted to team-based software development, including job schedules and their supervising, risk management, software versioning and support
* Economy discipline is a collection of the expert, qualitative and quantitative evaluation techniques of the interim artifacts and the final result of product lines, and the economic methods of calculating duration, size, efforts, and cost of software development
* Product discipline consists of product lines, utilizing software resources (reusable components, services, aspects, agents, etc.), taken from libraries and repositories; it also contains assembling, configuring and assessing quality of software

Fig. 4. Software engineering disciplines

In our opinion, all the SE disciplines considered above and their theoretical foundations must become independent subjects to be taught to students specializing in the field of SE, with the orientation toward the industrial production of SPs from ready-made components (reuses, services, assets, and so on) [17]. The production cycle is repeated when changes are brought into the product structure, as is done in the technology of continuous integration by M. Fowler and in AppFabric in MS.NET (Fig. 5) [18].

Approach to Teaching SE. The SE educational course must include intertwined theoretical and practical fundamental positions and achievements in software development and integration [15, 17, 19].
The directions of the SE teaching course are as follows:

* Base concepts, principles and methods that constitute the basis of SE knowledge and programming technology (e.g., the five SE disciplines, life cycle, project management, quality, configuration) and have proved their productivity in practice
* Mathematics of systems analysis for a subject domain, using elements of the theory of algorithms, logic and semantics of programming for the formal design of the key notions of the domain and for reflecting their communications and relations when formally specifying their models and SP architecture
* General principles and methods of designing programs, software products and software product families using ready-made objects, components, services, aspects, etc.
* Modern applied tools for representation of software products, which are widely used by professionals in research and development (systems analysis, decomposition, architecture, design, OOP, ontology, etc.)
* Methods of measuring the quality of software products
* Development environments (MS.NET, IBM, CORBA, Protégé, Eclipse, etc.)

Fig. 5. Structure of Microsoft AppFabric

The directions of e-learning are summarized in the Software Engineering textbook. The students learn them and apply them in practice using system tools, artifacts or programs developed by other students. Certain students’ achievements are certified by the teacher and can be added to the repository of the factory.

In terms of teaching SE, the author’s first textbook (2001), written in Ukrainian [18], is dedicated to the foundations of SE from Curricula-2001; the second textbook, in Russian (2006), teaches the basics of Curricula-2004 [12]; the third textbook is developed for a modern approach to teaching SE [15], including topical outlines of some of the above-mentioned disciplines and fundamental aspects such as reliability and quality engineering.
In the new textbook, basic elements and engineering tools are presented for the development of various target SE objects and lifecycle processes, along with methods for design and for managing teams of developers, quality, schedules, and cost. The textbook on the web site describes new SE disciplines and fundamental aspects of SE. The structure of the textbooks corresponds to the typical Curricula-2004 program and to the modern requirements on the subject imposed by the program of the Ministry of Education and Science of Ukraine (2007).

6 Conclusions

The result of the authors’ work is an experimental programs factory web site, accessible on the Internet at http://www.programsfactory.univ.kiev.ua/. This web site is proposed for e-learning of product line development at higher education institutions in the specialties of informatics, computer sciences, information systems and technology. The following concrete lines are established at the factory:

* Production of reusable components and artifacts
* Development of console applications, DLL component libraries, and local Windows applications with C# in VS.NET
* Developing Java programs (a manual by I.
Habibulin)
* Assembling programs from RCs in the MS.NET environment
* E-learning software engineering with dedicated textbooks on the web site in Ukrainian (sestudy.edu-ua.net) and in Russian (intuit.ru)

The prospects of future factory evolution lie in its further extension with new resources in the field of software engineering that are currently being prepared by the students, namely:

* Description of the process of development of complex programs and SS using DSL languages (Eclipse-DSL, Microsoft DSL Tools)
* Generation tools for transforming general data types into fundamental data types from the perspective of the ISO/IEC 11404-2007 standard
* Ontological representations of new disciplines for study (e.g., computational geometry, lifecycle domains, verification)
* New applied product lines for business developed with appropriate mechanisms; the SEI product lines approach, and so on

References

1. Lavrischeva, E.: Concept of Scientific Software Industry and Approach to Calculation of Scientific Problems. Problems in Programming, 1, 3–17 (2011) (in Ukrainian)
2. Aronov, A., Dzyubenko, A.: Approach to Development of Students’ Program Factory. Problems in Programming, 3, 42–49 (2011) (in Ukrainian)
3. Anisimov, A., Lavrischeva, E., Shevchenko, V.: On Scientific Software Industry. Technical report, Conf. Theoretical and Applied Aspects of Cybernetics (2011) (in Ukrainian)
4. Glushkov, V., Stogniy, A., Molchanov, I.: Small Algorithmic Digital Computer. MIR, 11 (1971) (in Russian)
5. Glushkov, V.: Basic Research and Technology Programming. Programming, 2, 3–13 (1980) (in Russian)
6. Kapitonova, U., Letichevsky, A.: Paradigms and Ideas of Academician Glushkov. Naukova Dumka, Kiev (2003) (in Russian)
7. Lavrischeva, E.: Formation and Development of the Modular-Component Software Engineering in Ukraine. V. Glushkov Institute of Cybernetics, Kiev (2008) (in Russian)
8.
Czarnecki, K., Eisenecker, U.: Generative Programming: Methods, Tools, and Applications. Addison-Wesley, Boston, MA (2000)
9. Bai, Y.: Applications Interface Programming Using Multiple Languages: A Windows Programmer’s Guide. Prentice Hall Professional, Upper Saddle River (2003)
10. Greenfield, J., Short, K., Cook, S., Kent, S.: Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools. Wiley, Hoboken (2004)
11. Lavrischeva, E., Koval, G., Babenko, L., Slabospitska, O., Ignatenko, P.: New Theoretical Foundations of Production Methods of Software Systems in Generative Programming Context. Electronic Monograph, UK-2011, 67, VINITI RAN, Kiev, Moscow (2012) (in Ukrainian)
12. Lavrischeva, E., Grischenko, V.: Assembly Programming. Basics of Software Industry. 2nd ed. Naukova Dumka, Kiev (2009) (in Russian)
13. Andon, P., Lavrischeva, E.: Evolution of Programs Factories in Information World. News of NANU, 10, 15–41 (2010) (in Ukrainian)
14. Lavrischeva, E.: Classification of Software Engineering Disciplines. Cybernetics and Systems Analysis, 44(6), 791–796 (2008)
15. Lavrischeva, E.: Software Engineering. Akademperiodika, Kiev (2008) (in Ukrainian)
16. Lavrischeva, E.: Theory and Practice of Software Factories. Software–Hardware Systems. Cybernetics and Systems Analysis, 47(6), 961–972 (2011)
17. Lavrischeva, E., Ostrovski, A., Radetskyi, I.: Approach to E-Learning Fundamental Aspects of Software Engineering. In: Ermolayev, V. et al. (eds.) Proc. 8th Int. Conf. ICTERI-2012, CEUR-WS, vol. 848, pp. 176–187, CEUR-WS, online (2012)
18. Kolesnyk, A., Slabospitskaya, O.: Tested Approach for Variability Management Enhancing in Software Product Lines. In: Ermolayev, V. et al. (eds.) Proc. 8th Int. Conf. ICTERI-2012, CEUR-WS, vol. 848, pp. 155–162, CEUR-WS, online (2012)
19. Babenko, L., Lavrishcheva, E.: Foundations of Software Engineering.
Znannya, Kiev (2001) (in Ukrainian)

Public Information Environment of a Modern University

Natalia Morze1, Olena Kuzminska2 and Galina Protsenko3

1 Kyiv Boris Grinchenko University, Vorovskogo St. 18/2, Kyiv, Ukraine
n.morze@kmpu.edu.ua
2 National University of Life and Environmental Sciences of Ukraine, Heroev Oborony St. 16a, Kyiv, Ukraine
olena_k@bk.ru
3 Pecherska Gymnasium № 75, A. Ivanova St. 11, Kyiv, Ukraine
galinapro@gmail.com

Abstract. The globalization and technification of society require changes in the training of modern specialists and therefore involve changes in educational systems, including the creation of Information environments of educational institutions. The article is devoted to the design, development and implementation of Information environments in the educational process and scientific activities of universities. The model of the environment developed by the authors has been used to construct the information environment of the Kyiv Boris Grinchenko University and of the National University of Life and Environmental Sciences of Ukraine. The article describes approaches to training the students and teaching staff of these universities for the effective implementation and development of the resources of the Information environment. Monitoring of the usage of Information environment resources confirms the prospects of the authors' model and of the methodology of its introduction.

Keywords. Information environment, experience, knowledge technology, learning platform, repository

Key terms. Environment, Academia, Development, Information Communication Technology

1 Conceptual Foundations of an Information Environment

Globalization of education leads to significant changes in teaching systems: globalized learning goals, unified content and methods. New forms of technology and education focused on the integration of information and communication technologies (ICT) appear in the learning process.
Especially significant changes have occurred in the means of learning. A transition was made from the concept of "learning tools" in the traditional model of education to the educational environment in activity-oriented teaching practice, then to the educational space in the context of a person-centered, individualized approach, and finally to the Information environment (IE), which is realized in the process of the development of ICT [1].

IE is defined as a structured set of resources and technologies, based on consistent technological and educational standards, which ensures free access of the participants of the educational process to information resources, as well as their effective communication and cooperation within such an environment, for achieving educational goals that are known in advance, understandable, achievable and concrete for them.

The IE of an educational institution represents (has to represent) an adaptive model of the global and national information spaces and inherits their most characteristic functional properties. In particular, in the communicative aspect IE is a space of co-curricular activities based on ICT; in the integration aspect it provides for the implementation of joint actions by establishing appropriate rules and adopting regulations. This means that the environment can emerge and develop only in accordance with the goals and objectives of the above-mentioned spaces, including the regulatory framework in the field of information policy at national and international levels, the state and prospects of development of information technology, and the characteristics of the learning process in the educational establishment [2].
The annual reports of the New Media Consortium list the computer technologies that will have (or already have) a significant impact on the organization of the educational process in the near future, namely: mobile technology and cloud computing (2009); open content, e-books, personal web (2010–2011); semantically compatible programs, smart objects, augmented reality (2012–2014); educational games, sensor devices and interfaces, data visualization, learning analytics (2015–2016) [3].

In world teaching practice, Web 2.0 services are regarded as qualitatively new means of distributing and storing teaching materials and as effective tools of educational platforms [4, 5]. Wikis, blogs, social networks, websites and audio streaming, and news channels allow users to collaborate: to share information, store links and multimedia documents, create and edit content, solve practical problems, carry out educational and research projects, etc.

That is why understanding the nature and objectives of constructing, using and developing the information environment, a clear understanding of its structure and components, the systematic development and selection of high-quality resources, and the selection of effective services based on Web 2.0 technologies are among the main tasks of a modern University. Taking into account the current requirements for the operation and the system of educational institutions, general management principles and the principles of educational systems, the following leading principles for designing the information and educational environment of an institution and its overall architecture should be singled out:

1. The principle of a systematic approach. The model should be built on a systematic analysis of the educational establishment. This means that the structural elements and the internal and external communications should be identified, so that the educational establishment is considered as an open system.
2. The principle of modular structuring of information and data. The main purpose is to provide the information and data needed in the most complete form, which makes it possible to characterize the state of the system and to provide adequate tools for the implementation of administrative functions and educational tasks.
3. The principle of modification, addition and permanent renewal. Implementation of this principle allows the model to be expanded, upgraded and updated with additional indicators and measurement data that are specific and understandable to its users. Thus, the model can be changed or adjusted in accordance with the specific educational establishment, its traditions, mission and tasks.
4. The principle of approximation, which states that the system should correspond in its complexity, structure, functions, etc. to the conditions in which it operates and to the requirements that are set for it.
5. The principle of providing the necessary and sufficient information for the management of the educational establishment.
6. The principle of data sharing. The same data can be used by several users. In addition, each user should receive this information in an easy-to-view form at any time and from any place.
2 Information Environment of a University – Concept Realization

At the present stage, the Information environment of an institution should include:

* Personal computing devices – a means of the educational, research and administrative activities of the institution
* An environment supporting collective and individual communication and cooperation
* Open educational resources – objects of educational activities and interactions
* Centralized and decentralized training platforms
* Means of information security and centralized filtering of content incompatible with the educational process, and more

The overall architecture (the organizational structure and the associated operation of technological systems in education) is the basis of the process of creating educational technology systems adequate to the conditions of their use, in particular: an unlimited amount of resources that can be integrated into the educational process, a large number of users that can use the tools and technologies of the technological systems, and the number of students who may be involved in the joint solution of one educational task. The educational environment for such systems is provided by international technological standards for interfaces, formats and communication protocols, which ensure mobility, interoperability, stability, efficiency and so on.

In this approach, the Internet is considered as a global platform for the creation and dissemination of collective knowledge. The Information environment is a means and a place for the creation, accumulation and harmonization of educational resources, and for efficient communication and cooperation, education and training of students, teachers and administrators alike. The proposed model allows us to implement a set of technological principles of the open Information environment of the university, such as adaptability, integrity, complexity and interoperability.
Construction of such an Information environment presupposes a clear design of its objectives, functionality, access channels, the organization of communication among students, teachers and researchers, and a system of continuous monitoring. The main features of the educational process in an open Information environment are:

* Openness of the environment – students and teachers actively participate in the development of educational resources and of the Information environment
* Willingness of participants – formation of the need to build individual learning trajectories, positive motivation to cooperate and work in a team, and willingness to disseminate the results of their own educational activities in public access
* Monitoring of the objects and subjects of the environment – monitoring the quality of the created resources, providing access to them and tracking the efficiency of their usage, observing the activities of the subjects of the educational process, and organizing feedback and assessment

Implementation of the proposed model in a particular educational institution involves the selection of platforms (Fig.
1) and resources:

* Scientific articles of the educational-research staff and Master's students of the National University of Life and Environmental Sciences of Ukraine: http://elibrary.nubip.edu.ua
* Abstracts of theses defended in the National University of Life and Environmental Sciences of Ukraine: http://elibrary.nubip.edu.ua
* Conference proceedings of the National University of Life and Environmental Sciences of Ukraine: http://elibrary.nubip.edu.ua
* Works of Master's students of the National University of Life and Environmental Sciences of Ukraine: http://elibrary.nubip.edu.ua
* Training materials to support the educational process of the National University of Life and Environmental Sciences of Ukraine: http://elibrary.nubip.edu.ua
* e-Learning of the National University of Life and Environmental Sciences of Ukraine: http://moodle.nauu.kiev.ua
* Distance learning: http://agrowiki.nubip.edu.ua
* Harmonized standards (supplemented by expert comments of the National University of Life and Environmental Sciences of Ukraine and by links to internal and external resources): http://agrowiki.nubip.edu.ua
* Regulations of the National University of Life and Environmental Sciences of Ukraine: http://elibrary.nubip.edu.ua
* Thematic practice-oriented information articles: http://agroua.net

Fig. 1. Open Information Environment of National University of Life and Environmental Sciences of Ukraine

Effective usage of the resources of the Information environment largely depends on the willingness of the teaching staff to implement innovative pedagogical and information technologies and to work with students in IE.
The question of training faculty can be solved on the basis of a properly designed training system focused not so much on the study of specific technologies as on:

* Forming among teachers a methodical approach to selecting and using IE resources in their professional activities, in order to achieve educationally meaningful results in the context of ensuring the availability of educational materials and improving the quality and effectiveness of the educational process
* Developing skills in organizing the educational process with the use of IE resources and in managing innovative educational projects
* Logical design and creation of ICT-oriented learning tools

While building a training system that prepares teaching staff to use IE resources in educational activities, it is necessary to take into account the need for:

* Modular structuring of content that reflects the technological and didactic possibilities of using specific IE resources
* Balancing and harmonizing the individual content modules of the training program for teachers

In the process of training, teachers act as students (http://lilia.moyblog.net/category/); this makes it possible to simulate educational situations, identify problems and use IE components to create training courses of a new type (http://moodle.kmpu.edu.ua/dn/course/view.php?id=144&notifyeditingon=1).
Topics of workshops:

* Institutional repository and its role in the creation of the electronic educational and research environment of the University
* The Moodle e-learning platform
* Safe work on the Internet
* Creating a modern ontology on the basis of wiki-portals
* Role of ICT in the use of formative assessment
* Use of Web 2.0 for students' individual work
* Blogs and their usage patterns in the educational process
* Use of podcasts in the educational process
* Role of ICT in the organization of cooperation and communication

Creation of the Information environment will provide flexible formation of educational and methodical complexes according to different models of learning, make teaching materials cheaper and more accessible, and improve learning efficiency by enabling students and teachers to share experience and a variety of educational materials.

The organization of educational and research activities in the Information environment determines what skills a student should possess, namely:

* Access to information data and resources – the ability to search for, collect and store information data
* Management of information data resources – the ability to choose existing resources for categorizing and structuring information data
* Critical evaluation of information data and resources – the ability to make judgments about data quality, importance, usefulness or effectiveness, as well as reliability, specificity and target orientation
* Creation of information data – the ability to interpret and present data, and to generate data and knowledge
* Exchange of information data – the ability to transmit information data properly by means of the information environment

The mentioned skills and abilities are acquired in the process of independent student activity, such as the preparation and defense of a Master’s thesis. The wiki-portal EcoAgroWiki was chosen as the technological platform for the informational support of Master’s educational activity.
Taking advantage of the functionality of wiki technology, the authors of the article managed to organize a community of practice of Master's students, teachers and academic advisors (see Table 1). In order to simplify page layout on the EcoAgroWiki portal, templates of teachers' and students' portfolios were designed according to the IMS ePortfolio Information Model standard (http://www.imsglobal.org/ep/epv1p0/imsep_bestv1p0.html). The results of the network cooperation of its members are collections of useful links and digital electronic resources, targeted selection and description of the use of modern software tools, organization of project activities and collaboration, and a system of effective communication, consultation and expert evaluation.

Table 1. Subject of seminars series “Presenting of Masters’ scientific researches’ results using ICT”

Annotation | Resources
Topic 1. Electronic publishing |
Scientometrical bases (EBSCO). | http://web.ebscohost.com/ehost/search
FAO resources: AGORA. | http://www.fao.org/index_en.htm
Topic 2. Institutional repository |
Institutional repository of NUBiP of Ukraine. | http://elibrary.nubip.edu.ua
OAI harvester of Ukraine. | http://oai.org.ua
International repository. | http://arxiv.org/
Topic 3. Bibliography |
Resources description variants. Personal bibliographic managers. | http://agrowiki.nubip.edu.ua/wiki/index.php/Zotero
Topic 4. Research results publication in the Internet |
Google documents, blogs, forums, conferences. | http://apctt.blogspot.com
Topic 5. Cooperation organization |
Google groups. Wiki-portal. | http://agrowiki.nubip.edu.ua
University portal. LMS. | http://it.nubip.edu.ua/ http://nubip.edu.ua

3 Experience of Using the Resources of the Open Information Environment by Students and Teachers

The real impact of the Open Information environment on the organization of educational activities at the university was determined by an on-line poll on the wiki portal EcoAgroWiki and by on-site discussions of the educational process.
Master’s programme students of the faculty of Computer Sciences and of the faculty of Ecology of the NUBiP of Ukraine took part in the poll.

Before the series of seminars, the authors examined the attitude of students towards the implementation of Information environment resources in teaching and research activities. The vast majority of students (189 of 200 respondents) believe that the creation of an Open Information environment significantly enhances the information support of educational activities. However, 60% of Master's students in computer sciences (hereinafter referred to as Group 1) and 20% of ecologists (hereinafter referred to as Group 2) consider the use of e-learning courses developed on the LMS Moodle platform to be effective, and 50% of Group 1 and 30% of Group 2 use the institutional repository for viewing topics and presentations of Master's theses of previous years. When asked which search engines they use to find the required information, including information of a scientific character, the overwhelming number of respondents named Google; to save search results, 70% of Group 1 and 50% of Group 2 use personal folders and file cards on personal computers. As for communication between students, most of them named social networks, and as for communication with teachers (Academic Advisors), personal correspondence via e-mail.

In the students’ opinion, there are three most important directions of Information environment use. Students argued their choices from the position of information literacy [6]. 80% of Group 2 and 60% of Group 1 noted acquiring the skills of analyzing the obtained data, sites and resources actively and productively; planning and managing their studies; establishing effective communication and cooperation; and solving problems together, selecting the most effective resources and technologies to solve specific tasks.
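The poll figures quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only: the helper names `share` and `gap` and the dictionary keys are hypothetical, and the numbers are simply copied from the text (189 of 200 respondents, and the percentages reported for Group 1 and Group 2).

```python
# Illustrative sketch only: all names are hypothetical; the figures are
# copied from the poll results reported in the text above.

def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole

# Percentage of each group that named a resource
# (Group 1: computer sciences, Group 2: ecology).
usage = {
    "Moodle e-learning courses": {"group1": 60, "group2": 20},
    "institutional repository": {"group1": 50, "group2": 30},
    "personal folders for search results": {"group1": 70, "group2": 50},
}

def gap(resource: str) -> int:
    """Difference in percentage points between Group 1 and Group 2."""
    row = usage[resource]
    return row["group1"] - row["group2"]

if __name__ == "__main__":
    print(f"Positive attitude: {share(189, 200):.1f}% of respondents")
    for name in usage:
        print(f"{name}: Group 1 exceeds Group 2 by {gap(name)} points")
```

Under these assumptions, 189 of 200 respondents corresponds to 94.5%, and the difference between the two groups is largest for the Moodle courses (40 percentage points).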
The interviewed teachers (42 educational research workers of NUBiP of Ukraine) noted the increased rigor of Master's theses, especially in the analysis of the state of the research problem and in the use of modern resources, in particular materials of open scientific journals and bibliographic description. In addition, the Academic Advisors who joined the experiment noted an increase in their own computer literacy through the use of means of scientific communication and through cooperation in the process of joint work with students. The creation of teachers’ portfolios also helps to promote University activities, to search for partners in joint projects, etc.

4 Conclusions

The experience of creating the Information environment and using its resources in the Kyiv Borys Hrinchenko University and NUBiP of Ukraine suggests the following arguments in favor of the proposed model: individualization and personalization of the academic activity, quality, flexibility, ability to meet the educational requirements, timeliness, self-control and mutual control, and cooperation. An Open Information environment has to be built as a system of functionally and structurally interconnected information and technological elements, the skillful use of which allows a teacher to solve didactic tasks in practice on a technological basis, with guaranteed quality, in the age of education informatization.

Creation of the Information environment at the level of an educational institution makes educational materials and services available to every subject of the educational process. As a result, conditions for equal access to quality education will be formed: the opportunity for everyone to learn at any place and at any time becomes a reality.
Under these conditions, the Information environment is potentially unlimited in the quantity and quality of the educational resources that can be used in the educational process, in the number of users who can use its resources and technologies, and in the number of subjects of educational activities who can work together on solving educational tasks.

References

1. Manako, A. F.: Evolyutsiya ta Konvergentsiya Informatsiynyh Tehnologiy Pidtrymky Osvity ta Navchannya. In: Proc. ITEA-2011, pp. 3–19, IRTC, Kyiv (2011) (in Ukrainian)
2. Bykov, V. Yu.: Avtomatyzovani Informatsiyni Systemy Yedynoho Informatsiynogo Prostory Osvity i Nauky. Zbirnyk Naukovyh Prats Umanskoho Derzhavnoho Pedahohichnoho Universytetu im. Pavla Tychyny, Ch. 2, 47–56 (2008) (in Ukrainian)
3. The Horizon Report: 2011 K-12 Edition, New Media Consortium, http://www.nmc.org/pdf/2011-Horizon-Report-K12.pdf (2011)
4. Blees, I.: Web 2.0 Learning Environment: Concept, Implementation, Evaluation. eLearning Papers, 15, 18, http://www.elearningeuropa.info/en/article/Web-2.0-Learning-Environment%3A-Concept%2C-Implementation%2C-Evaluation (2009)
5. Malinka, I.: Involving Students in Managing their Own Learning. eLearning Papers, 21, 13, http://www.elearningeuropa.info/en/article/Involving-students-in-managing-their-own-learning (2010)
6. Information Literacy at Otterbein College. Otterbein University, http://www.otterbein.edu/resources/library/information_literacy/index.htm

Designing Massive Open Online Courses

Vladimir Kukharenko 1

1 National Technical University “Kharkiv Polytechnic Institute”, Frunze Street 21, 61002 Kharkiv Ukraine
kukharenkovn@gmail.com

Abstract. Connective massive open online courses (MOOC) for teachers from Ukraine and Russia were conducted in 2011-2013. They were: Strategy of Distance Learning in the Organization, Social Services in Distance Learning, Distance Learning from A to Z, Designing Online Courses.
The accumulated experience made it possible to develop recommendations for each ADDIE step of MOOC design for the Russian-speaking audience.

Keywords. Connectivism, massive open online course, personal learning environment, ADDIE

Key terms. Competence, Didactics, TeachingMethodology, TeachingProcess, ICTEnvironment

1 Introduction

The term “massive open online course” (MOOC) was introduced by Dave Cormier in George Siemens’s distance course “Connectivism and Connective Knowledge” in 2008 and 2010 (http://connect.downes.ca/). This course, in which more than 2200 people studied, was devoted to the problems of a new learning theory, connectivism, according to which learning is the process of creating a network. Units of such a network are external entities (people, organizations, libraries, websites, books, journals, databases, or any other sources of information). The act of learning implies the creation of the external network units. Over the last few years several dozen open online courses have been conducted. Those courses are based on the new approach named “connectivism” and are therefore abbreviated as cMOOC.

cMOOC is characterized [1] by a structured network, the use of a daily bulletin, a large amount of information material, and the social approach to teaching. cMOOC enables people to cluster around the central core.

In cMOOC the teacher plays many roles [2]: he is an amplifier and a tutor, he directs and socially manages the creation of meanings, he filters, models and is always present.

The student’s success in cMOOC is ensured by his ability to navigate the network, by the formed personal learning environment (PLE) and personal learning network (PLN), as well as by his personal goals. Personality development and personal learning play a central part in cMOOC [1].

Experts believe that cMOOC [3] is suitable for effective independent learners who have learned to select content. The reduced participation can be neither good nor bad.
MOOC can be most effective as a form of continuous education and of perfecting skills.

2 Analysis of cMOOC of National Technical University “Kharkiv Polytechnic Institute”

The courses are free of traditional content. For each week, an extended abstract and a set of links to various materials on the subject are given. A brief analysis of the topic is given at the webinar, and the position of experts is considered at the guest webinar.

Before the start of the course, students are given instructions about the features of a connective course. The instructions note the desirability of selectively viewing the recommended materials, preparing remixes, posting material online and actively participating in the discussion.

2.1 Distance Course "Strategy of e-learning Development in the Organization"

The open distance course "Strategy of e-learning development in the organization" lasted 6 weeks in February–April 2011 [4].

The main objectives of this course were to show the possible uses of e-learning in the organization and to assist in the development of a learning strategy that takes into account the overall strategy of the organization; to learn how to design the learning process in an open distance course; and to assess the readiness of the Russian-speaking audience to study in the new environment.

The target audience comprises teachers, post-graduate students, and heads of educational departments from various organizations. The participants are expected to have skills in working with the Internet, social networks and means of web communication (synchronous and asynchronous) for communication, collaboration and exchange of information.

45 persons were registered for the course, the pages of the course were visited by more than 100 people, and 12 people passed the final survey. The questionnaire was completed by the same number of participants from the academic and corporate sectors, with experience in educational work of more than 5 years (83%) and experience of distance learning of more than 3 years (67%).
Three participants could not formulate their goal in taking this course; the rest of the participants (8 people) aimed to get acquainted with the features of an open distance learning course and with new social services.

The participants' main activity could be traced through the mailing list (about 200 messages, half of them sent during the first two weeks of classes, after which activity decreased). In Moodle only the get-acquainted forum was active; all other invitations to discussion went unanswered. Based on the course materials, two participants created blogs. Generally, the group worked passively: its members read the proposed materials but did not share their own sources or give their points of view on specific topics of the course.

Six weekly webinars plus an introductory and a final webinar were held, each attended by about 10 participants. The webinars were conducted in the WIZIQ environment; one of them was held in the virtual world “VAcademy”, where the participants got acquainted with the possibilities of virtual environments. In addition, 3 guest webinars took place.

The experience of running this course shows that an open course is a new and not always obvious concept for the CIS audience; the large amount of instructional material and the absence of clearly stated ideas cause great difficulties for participants. The limited set of social services used, together with the dismissal and misunderstanding of Twitter, made it hard to follow the work of tutors and fellow participants. Apart from that, the author believes that the topic of the course was rather challenging for the learners. The fact that the participants had not formed a personal learning environment was a further source of problems during the learning process [4].

2.2 Distance Course “Social Services in Distance Learning”

The open distance course “Social Services in Distance Learning” [5] was held from May 23 to July 3, 2011 [6].
For this course, a wiki, a mailing list, Twitter, DIIGO, learners’ blogs and the aggregator netvibes.com were used.

This study examined the hypothesis that introductory webinars on forming a personal learning environment (PLE) and the Workshop would increase student activity and that material would be prepared for discussion by the time of the face-to-face session. In addition, the introductory webinars were supposed to be held by a group of tutors.

Active visiting and browsing of the course pages began as soon as registration opened. By the time classes began, 43 people had registered. Visits and page views peaked at the beginning of the course and then gradually decreased. It should be noted that the number of visitors (Ukraine – 64%, Russia – 23%, the USA – 9%, Belarus – 4%) was twice the number of registered participants.

30 people completed the entry questionnaire: 83% of them were university lecturers and the others were from the corporate sector; 74% had more than 5 years of teaching experience, 80% had experience in creating distance courses, 57% had experience as tutors, and 37% used mobile phones to access the Internet.

Importantly, the participants' personal learning environments were poorly formed: only 10-20% of participants used most of the services, which hindered communication.

Most participants stated that their aim was to master the new social services and to outline new approaches to distance training based on them.

On completing the course, 10 participants filled in the final questionnaire. On average, the learners spent 6-8 hours a week on the course.

2.3 Distance Course “Distance Course from A to Z”

The open distance course “Distance Course from A to Z” includes two parts.
The first part, held from December 5, 2011 to January 22, 2012, was devoted to trends in building a distance training system at the current stage of Internet development. The second part covers distance course design methods and the distance learning process.

The main objective of the course is to analyze the level of distance training development in Ukraine on the basis of webinars held in May-October 2011 [7], to consider trends in distance training abroad, and to outline the requirements for a modern distance training system.

The course is based on blogs and articles published 2-3 months before the course started. These sources reflect current trends in distance learning systems, mainly abroad. Twitter was used to compile the list of sources.

Before the course started, the following webinars were held: “Social Services in Teaching Process”, “Twitter”, “Personal Learning Environment”. Their purpose was to help the learners acquire the skills of using social services during the course.

In total, 31 people registered for the course. According to Google Analytics, the course was visited by 430 people from 117 cities in 29 countries, including Ukraine – 76%, Russia – 13%, Belarus – 5%. On average there were 30 visitors a day. Views of the weekly pages ranged from 550 to 200. During the course, 12 people wrote 68 blog posts, and 85 messages were left in the course.

More than 80% of participants had more than 5 years of teaching experience, and 68% had more than 10 years. Their experience of using distance courses ranged from 1 to 10 years.

22 participants completed the final questionnaire. They were experienced teachers with 10 years of pedagogical work and 5 years of distance training experience; half of them spent about 3 hours on the course (3 persons spent more than 8 hours).
During the course, most participants stated their aims, which can be summarized as follows: to obtain and systematize ideas about modern distance training and the experience of other universities.

When asked what was new in the distance course, all participants mentioned services and concepts such as Twitter, DIIGO, Creative Commons licences, the personal learning environment, open learning resources, open distance courses, and new approaches in distance education.

The participants liked the systems approach to the problem, the format of the course, the unobtrusive and flexible management, the large number of varied information resources, the possibility of working with information and new services at a convenient time, the openness of communication, the active sharing of information and experience, and the potential efficiency of the course: under certain conditions a mechanism of self-reproduction can be launched, that is, the course will start working without visible participation of the organizers.

The learners noted the challenges of an open distance course, such as mastering new tools, defining a personal aim of study (dynamic and not always concrete), and working with a large volume of unstructured information in foreign languages.

The participants’ work and their impressions of the training process in the massive open distance course were discussed at the Xth International Seminar “Modern Pedagogical Technologies in Education”, which took place on January 31 – February 2, 2012 [8]. One of the suggestions was to organize groups by interest.

It should be noted that visits to the courses did not stop after they officially closed in April 2012.

Over the whole year the course was visited by 2492 users, most of them from Ukraine (Kharkiv, Kyiv, Odessa) – 60%, Russia (Yaroslavl, Moscow, Yekaterinburg) – 20%, and Kazakhstan – 3%.
2.4 Distance Course "E-learning Design"

In the 2012-2013 academic year, the Research Laboratory of Distance Learning attempted to hold a combined MOOC, "E-learning Design", which consists of a course for beginners (an xMOOC based on the constructivist approach) and a course for heads of distance learning centers in organizations (a cMOOC based on the connectivist approach). The free program of the course includes three modules: "Basics of Distance Learning" (6 weeks), "Technology of the Development of Distance Learning Course" (12 weeks) and "Practicum for Tutors" (6 weeks).

The objective of the course is to improve and standardize the level of teacher training in distance learning for colleges and universities in Ukraine.

Experienced distance learning professionals (managers) take the course built on connectivist principles in order to organize training for distance course developers in their organizations. The educational process encourages students to exchange experience, which improves the quality of the course. Students plan the training for distance course developers either independently or by using the proposed distance course for beginners. The overall goal of this course is to make the training of distance course developers more efficient. To participate, one should be able to use social services (Twitter, RSS, netvibes, paper.li, scoop.it and others) and register in the open course [9].

The second course, built on constructivist principles, is where beginners learn to create their own distance courses. The tutor provides general management of the educational process, and the heads of distance learning centers may act as local tutors for their teachers. If desired, these students can also take part in the first distance course.
As many as 90 individuals from the academic (91%) and corporate (9%) sectors registered for the course; 45% had distance learning experience and 46% had tutoring experience. Most of the students (64%) had more than 5 years of professional experience. The largest group (43%) spent over 5 hours a week working on the course, 31% spent 3-5 hours, and the remaining 25% spent less than 3 hours.

The students' intentions focused mainly on observing course events (80%), participating in discussions (75%), establishing new contacts (72%), and creating a distance course (68%).

After the first module, "Basics of Distance Learning", the open connectivist part of the course on the wiki was closed because of low student activity there, although the course participants were active in completing their assignments.

The second part of the course, "Technology of the Development of Distance Learning Course", envisaged the involvement of groups of distance course developers from various universities together with their instructors. However, only the teaching staff of Kharkiv National Pharmaceutical University accepted the invitation. During the educational process they were very active in the forums, helped each other, and created distance learning courses.

Among the 61% of registered students who took part in the training, 40% worked actively, yet only 11% (9 people) completed the course program.

Apart from creating distance courses, the students filled out workbooks (5 tasks, completed by 12 to 29 participants) and questionnaires (6 tasks, completed by 12 to 34 participants), took two tests (32 and 27 participants), and discussed various issues in 10 forums. As noted above, 43% of students spent more than 5 hours a week on the course, 31% spent 3 to 5 hours, and 25% spent less than 3 hours.

3 cMOOC Design

The courses held make it possible to formulate recommendations on designing cMOOCs for Russian-speaking audiences.
It should be noted that MOOC design currently focuses on technical aspects, i.e. the choice of platforms for hosting content, communication tools, aggregators and other services. Stephen Downes [10] has given some recommendations for cMOOC design.

MOOC design is supposed to use the ADDIE model (Analysis, Design, Development, Implementation, Evaluation). This approach adapts the design methods of technical systems to pedagogy.

3.1 Analysis

A cMOOC has an uncertain audience and a variety of learner purposes. Therefore the themes of a distance course are determined by the author of the course or, preferably, by the author and his/her team. The themes must be relevant and contemporary, and the selected area should be one in which a lot of unstructured information is being generated. A theme is the best choice if it is the object of the author’s research. In this case, the author can formulate his/her research aims, which lead to the program of the course.

The main problems at this stage are a small and inactive audience and poorly formed PLEs. Most often, students aim at getting acquainted with some information and establishing new contacts. A course participant must have substantial tacit knowledge of the course theme and the technical skills to work with social services and time-management tools.

3.2 Design

After completing the course program, the duration of the course should be determined. It is desirable that the course not exceed 6-8 weeks, as students find it difficult to concentrate for longer periods. The next step is to draft descriptions for the sections of the course. A most critical step is to select informative materials. To do that, the author of the course should possess content curation skills.
In this case, by the time the course is ready, a vast number of links to various information resources (blogs, e-magazines such as scoop.it and much more) will have been collected via Twitter, which is one of the content curator’s functions.

The most recent links to informative materials should be selected, and it is desirable that the number of links per topic exceed Dunbar’s number, so that students have to process the information selectively. Dunbar's number is a cognitive limit on the number of people with whom one can maintain stable social relationships (estimates range from 100 to 230; 150 is the commonly used value) [11]. With a small number of links, students tend psychologically to try to read all the materials.

After selecting the informative materials, it is advantageous to write a brief abstract for each link (e.g. using the tool cruxlight.com) and to sort the links by area so as to make students’ work easier.

3.3 Development

The next stage is to write a summary for every practical session, setting the problem and specifying the tasks for students. A list of recommended student activities is desirable, e.g. to make about 10 retweets or to write 2-3 blog posts reviewing some sources. The course should contain a variety of recommendations and references for those who have little experience of using social services.

It is also necessary to develop entry and final questionnaires. Weekly questionnaires should include questions such as: Which materials were of interest? Why (not)? What was new and interesting in the webinar? Which can be considered an alternative blog?

Students can choose between filling out the questionnaires and writing a blog post (remix, reflection).

The next step is to select guest lecturers for each week. They can be the author’s colleagues or participating students who are leading experts in some aspect of the course.

3.4 Implementation

A MOOC should be conducted by a team of tutors. Their roles may differ.
For example, they can share functions determined by the main author of the course. Alternatively, the tutors can be independent, each preparing his/her own materials for the course and giving his/her opinions during webinars without prior consultation. The most important thing is that the team of tutors works very actively, thus setting the pace of work for the audience.

Before the MOOC starts, it is advisable to hold 1-2 webinars on the tools employed, especially Twitter, Evernote, mailing lists, blogs, and translation tools. This helps students prepare for the educational process.

Designing a distance course is an iterative process of creating a fundamentally new service. It requires studying other authors’ experience and new social services, and continuous work with new information related to the course themes.

4 Summary

The experience shows that it is very difficult to introduce cMOOCs in education and distance learning in the CIS. This could be due to the choice of course themes, the low activity of the pedagogical community, and the small number of social services used. For example, Twitter is not very popular in the educational community. At the same time, the four courses held led to the formation of a community of 30 to 40 members who actively participate in all cMOOCs, and not only those organized in the CIS.

The low activity of cMOOC participants does not allow the principles of connectivism to be implemented. Therefore, open courses on the use of social services in education are needed first, followed by training for content curators. Furthermore, the specific features of the CIS audience must be considered when designing cMOOCs.

References

1. Downes, S.: Education as Platform: The MOOC Experience and What we can Do to Make it Better. March 12, http://halfanhour.blogspot.com/2012/03/education-as-platform-mooc-experience.html?spref=tw (2012)
2. Bosman, J.: Teacher Roles and MOOC, http://moocblogcalendar.wordpress.com/2012/03/19/change11-teacher-roles-and-mooc/ (2012)
3. Stevenson, D.: MOOC - The Recent Discussions about MOOCs. http://learning-aworkinprogress.blogspot.com/2012/03/change11-mooc-recent-discussions-about.html (2012)
4. Kukharenko, V.N.: Innovation in e-Learning: Massive Open Online Course. High Education in Russia, 10, 93–98 (2011) (in Russian)
5. Distance Course: Social Services in Distance Learning. http://el-ukraine.wikispaces.com/ (in Russian)
6. Kukharenko, V.M.: Learning Process in Massive Open Online Course. Management Theory and Practice in Social Systems, 1, 40–50 (2012) (in Ukrainian)
7. Webinar Records of Chief of Center of Distance Learning Ukraine Universities. http://dl.khadi.kharkov.ua/mod/resource/view.php?id=8404 (in Ukrainian)
8. X Workshop Materials, http://dl.kharkiv.edu/mod/resource/view.php?id=11229 (in Ukrainian)
9. Distance Course: Design e-Learning. http://de-l.wikispaces.com/ (in Russian)
10. Downes, S.: Creating the Connectivist Course. http://moocblogcalendar.wordpress.com/2012/01/03/creating-the-connectivist-course/ (2012)
11. Wikipedia: http://en.wikipedia.org/wiki/Dunbar's_number (2012)

The Role of Informatization in the Change of Higher School Tasks: the Impact on the Professional Teacher Competences

Dmitry Bodnenko

BGKU, Borys Grinchenko Kyiv University, 18/2 Vorovskogo st., 04053 Kyiv, Ukraine
bodnenko@kmpu.edu.ua

Abstract. The last decade has been characterized by trends in the development of the modern higher school that are based on the informatization of education. In this work, a system of psychological and pedagogical characteristics of higher school teachers was created on the basis of the didactics of higher school pedagogy.
Psychological and educational requirements for the professional competence of university teachers are based on the components of the pedagogical skills of teachers of higher education institutions and cover almost all of their functions, duties and skills; however, the increased use of ICT requires a more specific classification system of psychological and educational requirements for the information and communication competence of the university teacher.

Keywords. TeachingMethodology, competence, distance education, technologies of distance education, tutor, listener, course of distance education, informative resource, network services

Key terms. Didactics, CompetenceFormationProcess, InformationCommunicationTechnology, TeachingMethodology

1 Introduction

With the pedagogical paradigm being updated, network technologies emerging and spreading, and, consequently, the personal aspects of modern teacher preparation in higher school being enriched, the interpretation of the concept of the teacher's professional competence acquires great importance.

The problem explored in this article is how the informatization of education influences the requirements for the professional competence of university teachers.

In recent years, ICT development has been defined as one of the priorities in Ukraine. In 2007 the Parliament passed the Law "About the Basic Principles of Information Society’s Development in Ukraine in 2007-2015" (№ 537-V of 9.01.2007), pursuant to which an Action Plan was also adopted. Both documents were designed to promote the development of the information society and the introduction of information technology as a priority direction of state policy.
The need for further development and implementation of ICT is also confirmed by a number of national documents, such as the Draft National Education Strategy in Ukraine for 2012-2021 (CMU meeting of 11.09.2012) and the Law of Ukraine "About the conception of the national programme of informatization" (№ 75/98-VR of 04.02.1998), as amended by the laws N 3421-IV (3421-15) of 09.02.2006 (VVR, 2006, N 22, article 199) and N 3610-VI (3610-17) of 07.07.2011.

Unfortunately, these intentions mostly remain at an embryonic stage, which is reflected in Ukraine's low ratings of competitiveness and network readiness.

The implementation of informatization at universities has not yet been fully explored. This topic opens up many opportunities for further study: the implementation of educational programs and websites, the use of network services, the invention of new forms and methods, and the exploration of university teachers' potential in the new open education environments. In writing this article we set the following goals: to explore university teachers’ professional competence, and to define and justify the psychological and educational requirements for the information and communication competence of the university teacher.

2 The Main Part of Our Research

In recent years, the term "tutor" has become well known and has gained considerable popularity in higher education, second in frequency of use only to the term "teacher". These notions are almost synonymous, which could raise doubts about the need to introduce a new foreign term. In fact, however, the term "tutor" expands the notion of "teacher", especially in the context of the gradual integration of Ukraine into the European educational space. S.
Goncharenko [4] notes that a tutor (English tutor, from Latin tueor, to observe, to care) is a teacher-mentor in British "public schools", the upper forms of grammar schools, and teacher training colleges.

In the modern educational paradigm, regardless of the form of learning, the principle of the student’s primacy has been established: the student is the dominant figure. Accordingly, teachers act as assistants, friends and mentors who support students in getting an education.

With the rapid growth of the information and communication component of the educational process, particularly with the introduction of distance learning, it is difficult to imagine a teacher who just transmits information to the listener, even through a video lecture. The teacher needs to be a coordinator and facilitator who "synthesizes and accompanies the student’s resources" [6]; the student, in turn, has much greater freedom of choice (educational content, time, place, methods of education) than a traditional student.

We agree with the specifics of network training pointed out in the studies of V.M. Kukharenko and V.Y. Bykov [5], who claim that the teacher-tutor should focus on practice, conducting classes for the listeners of the e-learning course.

However, working in the field of distance learning, the teacher communicates with a diverse contingent. Each person is an individual with particular needs, abilities and opportunities. That is why the teacher’s task is to choose the best way to run the learning process.

Teachers have to coordinate their activities (pace, educational content, methods, tools) with the potential of the audience, or take responsibility themselves and set a new pace of study. They should also select the group of listeners who will actually master the course (not all students will necessarily be able to keep up with this pace).
Since the research assumes that teachers of higher educational institutions are being trained to implement e-learning courses, we agree with S.S. Vitvytskaya [3] that teachers have to learn to perform the following functions: organizational (the head, the leader in the maze of knowledge and skills); informational (the carrier of the latest information); transformational (transforming the socially meaningful content of knowledge into an act of personal knowledge); orientational and regulative (the structure of the teacher’s knowledge determines the structure of the student's knowledge); catalytic (transforming the object of education into its subject). In our view, it is difficult for a teacher who has not mastered the pedagogy of higher education at the highest level to prepare for the implementation of distance learning in higher education.

According to A.I. Kuzmynskyi [7], higher school teachers possess a set of competences (high professional competence, pedagogical competence, social and economic competences, communicative competence) and a high level of general culture; we would also highlight the functional responsibilities of university teachers as especially important, in our view, for the introduction of ICT competences. All these components, along with the pedagogical skills (except some components of educational technology) outlined in A.I. Kuzmynskyi’s works [7], form the basis of the psycho-pedagogical portrait of the teacher; by transforming the system MDs, we will in the future obtain a distance learning tutor or teacher who has information and communication competences and is ready to work under a new paradigm of education. In particular, the pedagogical skills include: moral and spiritual qualities, professional knowledge, social and pedagogical qualities, psychological and pedagogical skills, and pedagogical technique.
Russian authors such as G. Adrionova [1], M.V. Vislobokova [11] and V.P. Verzhbytskyy [2] say that the traditional teacher and the e-learning teacher are mutually opposite personalities with different characteristics. We disagree with this opinion, because the teaching skills of higher school teachers (one of their key characteristics) make it possible to form a teacher oriented toward the use of new teaching technologies, including ICT. It is true that many characteristics are opposite, because they depend on the tasks set for the teacher and the student.

Most scholars and teachers assert [8, 9] that the teacher is the basis of the educational process, its most important component, who organizes the e-learning process and ensures its quality. V.M. Kukharenko [5] emphasizes the role of the tutor in the e-learning system: "Any course requires a tutor, but a good course requires a skillful tutor".

Distance learning technologies, introduced at a time of rapid development of interactive technologies, absorb the current dominant of the educational paradigm: the activity of the teacher-tutor is aimed at organizing, promoting and supporting the students’ independent learning in distance courses, which is based on developing creativity and the abilities to search for, analyze and organize information and to make the right decisions on that basis.

In our opinion, only a teacher who is also a creative person can develop the creative abilities of the listeners. In line with this is the statement of A.M. Eagle [10] that the tutor is a high-level teacher who is able to interact with the audience, producing new, relaxed and fun lessons for all.
The researcher states that the tutor does not have to teach the audience; rather, the tutor has to support the student until the student gains sufficient independence and competence. This distribution is supported by the work of the authors from the Problem Laboratory NAM NTU "KPI" [5], in which the experience of foreign researchers [12,13] is adapted to the Ukrainian Distance Learning System and some of the tutor's responsibilities are outlined for two stages (development of the course and organization of the educational process). Note that in this case the tutor can be a user of an already created e-learning course.
Thus, by systematizing the material outlined above and relying on the generally accepted didactic principles, which are clearly defined in A.I. Kuzminskiy's work [7], we form the system of psychological and educational requirements for the information and communication competences of the university teacher.
Information and communication skills include the following components:
To have command of at least one information-educational environment
To know the range of services provided by the environment and the technology for handling these services
To know the basic principles of telecommunication systems
To know the specifics of webinars, audio and video teleconferencing, chats and forums
To know the rules of conduct (etiquette) during interactive dialogue
To know the specifics of working with information resources (databases, information services)
To be able to use the communication capabilities of computer networks to organize fruitful communication between the participants of the educational process
To organize and conduct a telecommunication project
To master and use network services in professional activity
Didactic skills:
To create and shape the course material for students of e-learning courses with an optimal (understandable, accessible, scientific) layout of the information, so that learning is personal, effective and independent of the listener's time and workplace
To implement psycho-pedagogical monitoring (preliminary, current, interim, final)
To manage the independent educational and cognitive activity of the students, to develop their intellectual capabilities and to form their motivation for learning
To teach students efficient and effective methods of independent activity in the educational process
Constructive skills:
To integrate and combine full-time, part-time, external and distance learning
To create an e-learning course and/or correct an existing one, according to the requirements of the educational process
To adopt effective types and forms of participants' activity in the e-learning/network educational process
To select the methods and means of education
To have skills of information navigation
To plan the prospective stages of managing the student group (small group)
To organize, using distance learning technologies, an individually oriented approach to the audience
Organizational skills:
To balance the demands of the discipline with the students' needs
To demonstrate to the listeners their personal potential with regard to the provided educational information
To carry out a systematic discussion of the students' needs for continuous improvement of the distance learning process
To provide the necessary support and assistance to the ENK students
To organize and conduct network role-playing games
To organize and manage the students' activity in a small group, to create optimal conditions for the development of their independence and competence, and to ensure pedagogically effective activity of the listeners
To organize meetings of the participants of the e-learning process
Cognitive skills:
To study the physical, psychological and social components of the listeners' individual development, their needs, social self-determination, etc.
To analyze the individual styles of the listeners' cognitive-educational activity
To draw on modern pedagogical (tutor) experience (cooperative learning, the small groups method, the project method, different-level education, formative evaluation, research and exploratory methods, etc.) and creatively apply it in one's own tutoring practice
To master new scientific information in the subject field and in teaching methods, and to use it rationally in scientific-pedagogical work
To search for facts that stimulate the cognitive activity of students in the information-educational environment and to apply them
To generate new ideas and perspectives for tutor and student activities and to apply modern technologies, forms and methods of distance learning
Communicative skills:
To determine the feasibility of relationships between the subjects of the educational process
To coordinate interpersonal relationships between the students in the group or between students in small groups
To prevent conflict situations that may arise in the process of e-learning and to resolve them
To apply collective activity, cooperate, determine a common strategy of activity and prove its relevance, and be able to admit one's own mistakes
To communicate simply, easily and tolerantly with students of any age, social or ethnic category
Perceptive skills:
To understand (by the look of the students) the incentives that activate students' activity in the information-educational environment and to be able to apply such knowledge
To be concerned with the inner world of the audience and understand their mental state
To observe the special features of the student's independent activity in the information-educational environment during the e-learning process
To directly adjust the technology, information saturation and activity, depending on the needs of the group (the audience)
Suggestive skills:
To master a method of forming systematic and critical thinking
To form reflection in students as a means of evaluating their activities with the purpose of further improvement
To influence students (in the emotional and volitional aspect) with the forms, methods and means of e-learning in order to create in them a certain mental state that prompts them to definite actions
Applied skills:
To master additional hardware, software, and psychological and educational equipment
To create web pages, publications, websites, blogs, wikis, etc.
To have skills to program in specialized environments
Skills in the psycho-technical sphere: to knowingly and properly use findings from psychology in the field of applying network services.
3 Conclusion and Future Work
The psychological and educational requirements for the professional competence of university teachers are based on the components of the pedagogical skills of teachers of higher education institutions and absorb almost all of their functions, duties and skills. However, the increased usage of ICT makes it necessary to specify the classification system of psychological and educational requirements for the information and communication competence of the university teacher. According to the results obtained during the research, we can outline the basic theoretical and empirical achievements of the author in the context of the study objectives:
The professional competence of teachers of higher education under the conditions of changing higher education tasks and the transition to the knowledge society was outlined.
The system of psycho-pedagogical requirements for the information and communication competence of the university lecturer was formed and explored.
The system consists of a list of skills: information and communication, didactic, constructive, organizational, cognitive, communicative, perceptive, suggestive, applied, and the skills of the psycho-technical sphere. Prospects for further scientific research are seen in the detailed application of the professional competence of university teachers in practical activities, including the usage of network services in teaching students of Humanitarian specialties.
References
1. Adrianova, G.: Typologiya Subjectov Distantcionnogo Obucheniya. In: Proc. All-Russian Remote Teachers Council (2000) http://www.eidos.ru/books/read-room/andrianova1.htm (in Russian)
2. Verzhbitsky, V.P.: Distance Education in Russia. http://www.tcde.ru/de/st001.html (in Russian)
3. Vitvitcka, S.S.: Osnovy Pedagogiky Vyschoi Shkoly. Textbook. Second Edition. Center of the Education, Kyiv (2006) (in Ukrainian)
4. Goncharenko, S. V.: Ukrainsky Pedagogichny Slovnyk. Second Edition. Lybid, Kyiv (1997) (in Ukrainian)
5. Bykov, V. Y., Khuharenko, V. N. (eds.): Distance Education Process. Textbook. Millenium, Kyiv (2005) (in Ukrainian)
6. Koycheva, T. I.: Preparation of Future Humanities Teachers as Tutors of Distance Learning Systems. Manuscript. Konstantin Ushinsky South Ukrainian National Pedagogical University, Odessa (2004) (in Ukrainian)
7. Kuzminsky, A. I.: Pedagogika Vyshoi Shkoly. Textbook. Knowledge, Kyiv (2005) (in Ukrainian)
8. Morze, N. V.: Osnovy Iformatciyno-Comynikaciynyh Tehnologiy. Publishing Group, Kyiv (2006) (in Ukrainian)
9. Oliynyk, V. V.: Distantciyna Osvita za Kordonom ta v Ukraine: Stysliy Analitychniy Oglyad. CIPPO, Kyiv (2010) (in Ukrainian)
10. Orel, A. M: Tutor s Raznyh Tochek Zreniya. http://www.ou-link.ru/654/bulletin_654_8/tutor-3.htm (in Russian)
11. Starov, M. I., Chvanova, M. S., Vislobokova M. V.: Psihologo-Pedagogicheskie Problemy pri Distantcionnom Obuchenii.
Distance education, 12(2), 26–30 (2012) (in Russian)
12. Ragan, L. C.: Good Teaching is Good Teaching: an Emerging Set of Guiding Principles and Practices for the Design and Development of Distance Education. CAUSE/EFFECT journal, 22(1), http://eee.educuse.edu/ir/library/html/cem9915.html (1999)
13. McKenzie, B., Mims, N., Bennett, E., Waugh, M.: Needs, Concerns and Practices of Online Instructors. Online Journal of Distance Learning Administration, 3(3) (2011)

1.5 ICTERI Tutorials

UML Profile for MARTE: Time Model and CCSL
Frédéric Mallet
Université Nice Sophia Antipolis, Aoste team INRIA/I3S, Sophia Antipolis, France
Frederic.Mallet@unice.fr
Abstract. This 90-minute tutorial gives a basic introduction to the UML Profile for MARTE (Modeling and Analysis of Real-Time and Embedded systems) adopted by the Object Management Group. After a brief introduction to the UML profiling mechanism, we give a broad overview of the MARTE Profile. Then, the tutorial focuses on the time model of MARTE and its companion language CCSL (Clock Constraint Specification Language).
Keywords. UML Profile, Real-Time, Embedded systems, MARTE, CCSL
Key terms. StandardizationProcess, UbiquitousComputing, ConcurrentComputation, ModelBasedSoftwareDevelopmentMethodology
1 Audience and focus
The targeted audience is academics and industry practitioners interested in high-level modeling with UML and its application to the analysis of real-time and embedded systems. The tutorial does not require any preliminary background, as it gives a high-level and broad description of the UML Profile as well as a closer focus on its time model. To ensure that a large public can follow the presentation, we start with a brief overview of the UML lightweight extension mechanism, the so-called profiling mechanism.
2 Topic
The UML profile for Modeling and Analysis of Real-Time and Embedded systems, referred to as MARTE [1], was adopted by the OMG in November 2009 and revised in June 2011.
It extends the Unified Modeling Language (UML) [2] with concepts required to model embedded systems. This ninety-minute tutorial gives a basic introduction to the UML Profile for MARTE. After a broad view of the Profile, the tutorial focuses on the time model of MARTE and its companion language CCSL (Clock Constraint Specification Language).
2.1 General Introduction to MARTE
Figure 1 shows the general structure of MARTE. The General Component Modeling (GCM) and Repetitive Structure Modeling (RSM) packages support capturing the application functionality. GCM defines basic concepts such as data flow ports, components and connectors. RSM provides concepts for expressing repetitive structures and regular interconnections. It is essential for the expression of parallelism, in both application modeling and execution platform modeling, and for the allocation of applications onto execution platforms.
Fig. 1. Structure of MARTE specification
The Hardware Resource Modeling (HRM) package, which specializes the concepts of GCM into hardware devices such as processors, memories or buses, allows the modeling of execution platforms in MARTE. The Allocation (Alloc) package allows the modeling of the space-time allocation of application functionality on an execution platform. Both the HRM and Alloc packages can be used with the RSM package for a compact modeling of repetitive hardware (e.g., grids of processing elements) and of the data and computation distributions of a parallel application onto such repetitive hardware.
The models described with the previous packages can be refined with temporal properties specified within the Time package [3]. Such properties are typically clock constraints denoting activation rate constraints on the considered components.
The concepts of the Time package are often used with the Clock Constraint Specification Language (CCSL) [4, 5], which was initially introduced as a non-normative annex of MARTE.
2.2 MARTE Time Model
In MARTE, a clock $c$ is a totally ordered set of instants, $I_c$. In the following, $i$ and $j$ are instants. A time structure is a set of clocks $C$ and a set of relations on instants $I = \bigcup_{c \in C} I_c$. CCSL considers two kinds of relations: causal and temporal ones. The basic causal relation is causality/dependency, a binary relation on $I$: $\preccurlyeq\ \subset I \times I$. $i \preccurlyeq j$ means $i$ causes $j$ or $j$ depends on $i$. $\preccurlyeq$ is a pre-order on $I$, i.e., it is reflexive and transitive.
The basic temporal relations are precedence ($\prec$), coincidence ($\equiv$), and exclusion ($\#$), three binary relations on $I$. For any pair of instants $(i, j) \in I \times I$ in a time structure, $i \prec j$ means that the only acceptable execution traces are those where $i$ occurs strictly before $j$ ($i$ precedes $j$). $\prec$ is transitive and asymmetric (irreflexive and antisymmetric). $i \equiv j$ forces instants $i$ and $j$ to be coincident, i.e., they must occur at the same execution step, both of them or none of them. $\equiv$ is an equivalence relation, i.e., it is reflexive, symmetric and transitive. $i \mathbin{\#} j$ forbids the coincidence of the two instants, i.e., they cannot occur at the same execution step. $\#$ is irreflexive and symmetric.
A consistency rule is enforced between causal and temporal relations: $i \preccurlyeq j$ can be refined either as $i \prec j$ or as $i \equiv j$, but $j$ can never precede $i$.
In this paper, we consider discrete sets of instants only, so that the instants of a clock can be indexed by natural numbers. For a clock $c \in C$, and for any $k \in \mathbb{N}^\star$, $c[k]$ denotes the $k$-th instant of $c$.
Specifying a full time structure using only instant relations is not realistic, since clocks are usually infinite sets of instants. Thus, an enumerative specification of instant relations is ruled out.
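The instant relations above can be made concrete on a finite execution trace. The following is a minimal sketch, not part of MARTE or of any CCSL tool: it assumes a trace is a dictionary from hypothetical instant names to the execution step at which they occur (`None` when an instant has not occurred), and it treats a precedence as satisfied while the later instant has not yet occurred.

```python
from typing import Dict, List, Optional, Tuple

# A finite execution trace: instant name -> execution step at which it occurs.
# None means the instant has not occurred in this trace (an assumption of the sketch).
Trace = Dict[str, Optional[int]]
Constraint = Tuple[str, str, str]  # (relation name, instant i, instant j)


def precedes(trace: Trace, i: str, j: str) -> bool:
    """Precedence: if j occurs, i must have occurred at a strictly earlier step."""
    if trace[j] is None:
        return True
    return trace[i] is not None and trace[i] < trace[j]


def coincides(trace: Trace, i: str, j: str) -> bool:
    """Coincidence: both instants occur at the same step, or neither occurs."""
    return trace[i] == trace[j]


def excludes(trace: Trace, i: str, j: str) -> bool:
    """Exclusion: the two instants never share an execution step."""
    return trace[i] is None or trace[i] != trace[j]


def causes(trace: Trace, i: str, j: str) -> bool:
    """Causality, refined as precedence or coincidence; j never precedes i."""
    return precedes(trace, i, j) or coincides(trace, i, j)


RELATIONS = {
    "precedes": precedes,
    "coincides": coincides,
    "excludes": excludes,
    "causes": causes,
}


def violations(trace: Trace, constraints: List[Constraint]) -> List[Constraint]:
    """Return the constraints that the trace does not satisfy."""
    return [c for c in constraints if not RELATIONS[c[0]](trace, c[1], c[2])]


trace = {"a1": 0, "b1": 1, "a2": 1, "b2": None}
spec = [
    ("precedes", "a1", "b1"),
    ("coincides", "b1", "a2"),
    ("excludes", "a1", "a2"),
    ("precedes", "a2", "b1"),  # violated: a2 and b1 share step 1
]
print(violations(trace, spec))  # [('precedes', 'a2', 'b1')]
```

Such an enumeration only works for a finite prefix, which is exactly the limitation that motivates clock-level patterns.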
The Clock Constraint Specification Language (CCSL) defines a set of time patterns between clocks that apply to infinitely many instant relations [4].
The UML Profile for MARTE proposes several specific stereotypes in the Time chapter to capture CCSL specifications. Figure 2 briefly describes the three main stereotypes. Boxes with the annotation metaclass denote the UML concepts on which our profile relies, the so-called metaclasses. Boxes with the annotation stereotype are the concepts introduced by MARTE, i.e., the stereotypes. Arrows with a filled head represent extensions, whereas normal arrows indicate properties of the introduced stereotypes. Clock extends UML Events to identify those events that can be used as time bases to express temporal or logical properties. ClockConstraint extends UML Constraints to make an explicit reference to the constrained clocks. TimedProcessing extends Action to make its start and finish events explicit. When those events are clocks, a ClockConstraint can constrain the underlying action to start or finish its execution as defined in a CCSL specification.
Fig. 2. Excerpt of the MARTE Time Profile
2.3 The Clock Constraint Specification Language
On top of MARTE clocks, the Clock Constraint Specification Language defines a set of operators (relations and expressions) [4]. As an example, consider the clock relation precedence (denoted $\prec$), a transitive asymmetric binary relation on $C$: $\prec\ \subset C \times C$. If $\mathit{left}$ and $\mathit{right}$ are two clocks, $\mathit{left} \prec \mathit{right}$, read "left precedes right", specifies that the $k$-th instant of clock $\mathit{left}$ precedes the $k$-th instant of clock $\mathit{right}$, for all $k$. More formally: for a pair of clocks $(\mathit{left}, \mathit{right}) \in C \times C$, $\mathit{left} \prec \mathit{right}$ means $\forall k \in \mathbb{N}^\star,\ \mathit{left}[k] \prec \mathit{right}[k]$. Similarly, let us consider the transitive and reflexive binary relation on $C$ called isSubclockOf and denoted $\subset$.
$\mathit{left} \subset \mathit{right}$ (read "left is a subclock of right") means that for all $k$, the instant $\mathit{left}[k]$ of $\mathit{left}$ coincides with exactly one instant of $\mathit{right}$. More formally: $\mathit{left} \subset \mathit{right}$ means $\forall k \in \mathbb{N}^\star,\ \exists n \in \mathbb{N}^\star \mid \mathit{left}[k] \equiv \mathit{right}[n]$. The relation $\subset$ is order-preserving. All the coincidence-based relations are based on isSubclockOf. When both $\mathit{left} \subset \mathit{right}$ and $\mathit{right} \subset \mathit{left}$ hold, we say that $\mathit{left}$ and $\mathit{right}$ are synchronous: $\mathit{left} = \mathit{right}$.
A CCSL specification consists of clock declarations and conjunctions of clock relations between clock expressions. A clock expression defines a set of new clocks from existing ones; most expressions deterministically define one single clock. An example of a clock expression is delay (denoted $\$ : C \times \mathbb{N}^\star \to C$). $c\ \$\ n$ specifies that a new clock is created and is the exact image of $c$ delayed by $n$ instants: $o = c\ \$\ n$ defines a clock $o \in C$ such that $\forall k \in \mathbb{N}^\star,\ o[k] \equiv c[k + n]$.
By combining primitive relations and expressions, we derive a very useful clock relation that denotes a bounded precedence. $\mathit{left} \prec_n \mathit{right}$ is equivalent to the conjunction of $\mathit{left} \prec \mathit{right}$ and $\mathit{right} \prec (\mathit{left}\ \$\ n)$. The special case when $n$ is equal to 1 is called alternation and is denoted $\mathit{left} \sim \mathit{right}$ (read "left alternates with right").
3 Conclusion
The UML Profile for MARTE is dedicated to the modeling and analysis of real-time and embedded systems. Its time model relies on a notion of clock borrowed from the synchronous languages [6] and their polychronous extensions [7]. Those clocks can be logical or physical. The time model also provides support for building causal and temporal constraints that force the clocks to tick according to predefined patterns. Thus, the evolution of the clocks imposes an execution semantics on the underlying UML MARTE model.
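To illustrate how the clock relations of Section 2.3 restrict the ticks of clocks, the sketch below checks them on finite prefixes. It is an illustration under stated assumptions, not the CCSL semantics of [4] and not the API of any CCSL tool: a clock prefix is assumed to be a strictly increasing list of execution steps, the function names are hypothetical, and only the instants present in the prefix are checked.

```python
from typing import List

# A clock prefix: strictly increasing execution steps at which the clock ticks.
# clock[k - 1] is the step of the k-th instant (Python lists are 0-based).
Clock = List[int]


def precedes(left: Clock, right: Clock) -> bool:
    """left precedes right: each k-th tick of left is strictly before the k-th tick of right."""
    return all(k < len(left) and left[k] < right[k] for k in range(len(right)))


def is_subclock(left: Clock, right: Clock) -> bool:
    """left is a subclock of right: every tick of left coincides with a tick of right."""
    return set(left) <= set(right)


def delay(clock: Clock, n: int) -> Clock:
    """Delay expression: the k-th tick of the result coincides with tick k + n of clock."""
    return clock[n:]


def bounded_precedes(left: Clock, right: Clock, n: int) -> bool:
    """Bounded precedence: left precedes right, and right precedes left delayed by n."""
    return precedes(left, right) and precedes(right, delay(left, n))


def alternates(left: Clock, right: Clock) -> bool:
    """Alternation is bounded precedence with n equal to 1."""
    return bounded_precedes(left, right, 1)


left, right = [0, 2, 4], [1, 3, 5]
print(alternates(left, right))        # True
print(alternates([0, 1], [2, 3]))     # False: left ticks twice before right ticks
print(is_subclock([2, 4], left))      # True
```

Because the prefixes are finite, a `True` result only says that no violation has occurred so far; it says nothing about the infinite clocks themselves.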
Whereas the MARTE time model provides the notions of clocks and constraints, its companion language, the Clock Constraint Specification Language, provides a syntax to define the constraints themselves. This brief tutorial introduces the main concepts of the MARTE time model and gives an overview of CCSL.
Acknowledgments
This work has been partially funded by the PRESTO Project (ARTEMIS-2010-1-269362).
References
1. OMG: UML Profile for MARTE, v1.1. Object Management Group. (June 2011) formal/2011-06-02.
2. OMG: UML Superstructure, v2.4.1. Object Management Group. (August 2011) formal/2011-08-06.
3. André, C., Mallet, F., de Simone, R.: Modeling time(s). In: 10th Int. Conf. on Model Driven Engineering Languages and Systems (MODELS ’07). Number 4735 in LNCS, Nashville, TN, USA, ACM-IEEE, Springer (September 2007) 559–573
4. André, C.: Syntax and semantics of the Clock Constraint Specification Language (CCSL). Research Report 6925, INRIA (May 2009)
5. Mallet, F., André, C., de Simone, R.: CCSL: specifying clock constraints with UML/Marte. Innovations in Systems and Software Engineering 4(3) (2008) 309–314
6. Benveniste, A., Caspi, P., Edwards, S.A., Halbwachs, N., Le Guernic, P., de Simone, R.: The synchronous languages 12 years later. Proc. of the IEEE 91(1) (2003) 64–83
7. Le Guernic, P., Talpin, J.P., Le Lann, J.C.: Polychrony for system design. Journal of Circuits, Systems, and Computers 12(3) (2003) 261–304
8. Mallet, F.: Logical Time in Model Driven Engineering. Habilitation à diriger des recherches, Université Nice Sophia-Antipolis (November 2010)
Biography
Frédéric Mallet is an associate professor at Université Nice Sophia Antipolis. He is a permanent member of the Aoste team-project, a joint team between INRIA and I3S laboratory. He received a PhD in Computer Science in 2000 and his habilitation degree [8] in 2010.
Since 2007, he has been heavily involved in the definition, finalization and revision of the UML Profile for MARTE¹ and has been a voting member of the successive finalization and revision task forces for MARTE at the OMG. His main research interests include the definition of models for the specification of functional and non-functional properties of real-time and embedded systems. He also develops tools and techniques for the validation and verification of such models.
¹ http://www.omgmarte.org

Ontology Alignment and Applications in 90 Minutes
Vadim Ermolayev¹ and Maxim Davidovsky¹
¹ Department of IT, Zaporozhye National University, Zhukovskogo st. 66, 69063 Zaporozhye, Ukraine
vadim@ermolayev.com, m.davidovsky@gmail.com
Abstract. In this paper, we describe the structure and outline the content of a short tutorial on Ontology Alignment. The tutorial is planned in three parts within an overall timeframe of 90 minutes. Part 1 covers the fundamentals of ontology alignment and offers basic definitions, problem statements and a problem classification based on the span, dynamics, direction, and distribution settings. This material is illustrated by: (i) using a walkthrough example of two elementary ontologies in the Bibliographic domain; and (ii) offering a deeper discussion of one of the exemplar problems of ontology alignment, ontology instance migration, which has practical utility for real world applications. The second part presents a software solution for the ontology instance migration problem. The solution is demonstrated on the pair of Bibliographic ontologies of our walkthrough example. Part 3 puts ontology alignment in the context of several categories of applications which are important for the industries and the knowledge economy as a whole. The applications of ontology alignment in those categories are overviewed and the requirements for the solutions are extracted.
Keywords.
Ontology, ontology alignment, knowledge-based application, agent, argumentation, negotiation, information flow, ontology instance migration
Key terms. KnowledgeRepresentation, KnowledgeManagementMethodology, KnowledgeManagementProcess, KnowledgeTechnology, ICTTool
1 Introduction
This paper outlines the tutorial on the basics and problems of Ontology Alignment. The material is illustrated by our agent-based solution for the ontology instance migration problem, one of the practically important sub-problems of ontology alignment. The demand for ontology alignment in real world applications is also presented. The tutorial, though given for the first time, is based in part on our previous tutorial on Agent-Based Ontology Alignment [1]. This tutorial differs from [1] in the following: (i) it is broader in scope, as it covers not only agent-based approaches to aligning ontologies; and (ii) it is more oriented towards reviewing industrial applications of ontology alignment and analyzing their requirements for the technology.
1.1 Structure and Timeframe
Part 1 targets a broad audience of those who are interested in the problems of ontology alignment in general and starts at a relatively basic level. It begins with an informal definition of ontology alignment and puts the problem into the context of the other knowledge harmonization and integration problems. It then explains the motivation to study the methods of ontology alignment. Next, the basic formalisms for ontology alignment are introduced and explained using an incremental approach. The generic ontology alignment problem is stated first and illustrated by the walkthrough example. This generic problem statement is further refined by offering a classification of the types of ontology alignment problems. Particular attention is paid to the ontology instance migration problem as a sub-problem of ontology alignment.
The time frame for the first part of the tutorial is 30 minutes¹. A standard configuration of presentation equipment is required: 1 beamer, 1 presentation screen, 1 microphone for the presenter, 1 additional microphone for the questions from the audience.
Part 2 offers more practical material, as it focuses on the presentation of the agent-based software solution for the ontology instance migration problem. The material of this part covers: (i) the solution architecture; (ii) the methodology shaping the workflow; (iii) a software demonstration that migrates instances from one ontology of our walkthrough example to the other. The time frame for the second part of the tutorial is also 30 minutes. Part 2 uses two independent presentation channels: one for the tutor and the other for the software demonstration. Therefore it requires an enhanced configuration of presentation equipment: 2 beamers, 2 presentation screens, 2 microphones for the presenters, 1 additional microphone for the questions from the audience.
Part 3 focuses on the discussion of the importance of ontology alignment technology for real world applications. It starts with revisiting the motives to have this technology in place and argues the necessity of having solutions for several categories of ICT applications, particularly in information and knowledge processing. A review of applications, their specific requirements, and available solutions is given in this concluding part of the tutorial to provide a holistic, cross-domain view on the role of ontology alignment as a fundamental technology for today’s knowledge economy. Similarly to parts 1 and 2, the time frame for part 3 of the tutorial is 30 minutes. Similarly to part 1, part 3 requires a standard set of presentation equipment.
The whole tutorial is therefore given in 90 minutes.
A small break could be planned after Part 2 if the audience wishes to have some discussion or pose in-depth questions. Though questions may be posed at any time, all three parts are planned with 5-minute question and answer sessions at their ends.
¹ Timings are given approximately. Small deviations could occur depending on the number of questions coming from the audience.
1.2 A Walkthrough Problem and Example
An example problem of ontology alignment that is used throughout the tutorial for detailed discussions is the ontology instance migration problem. The problem statement for ontology instance migration is presented in Section 2. The approach and software for solving this problem are demonstrated in Section 3. The applications that require the migration of ontology instances are mentioned among those discussed in Section 4.
Besides that, a very simple and artificial example of two different Biblio ontologies is used for illustrations throughout the tutorial. The structural schemas and assertional parts of these ontologies are provided in the support material at http://isrg.kit.znu.edu.ua/a-boa/index.php/A-BOA_Walkthrough_Problem_and_Example.
1.3 Support Materials, Discussions, and Contributions
For additional support materials the reader is advised to visit the A-BOA Wiki (isrg.kit.znu.edu.ua/A-BOA/), which has been developed for our previous tutorial on Agent-Based Ontology Alignment [1]. The A-BOA Wiki is a Semantic MediaWiki based collaborative platform and a resource providing teaching content and discussion functionality.
1.4 Motivation to Study Ontology Alignment
The world around us is multi-faceted and polysemic in the sense that a model of the world developed in the mind of an individual or by a social group may differ from the models of others. Knowledge-based systems reflect this fact in their knowledge representations.
However, we do many things across several facets or even across subject domains. So, the knowledge representations of the corresponding facets have to be brought into a harmonized or aligned state to enable proper communication, coordination or information processing.
Biblio ontologies give a simple example of such different facets, or knowledge representations, for the same body of knowledge about conference papers. Imagine that Biblio-2 is the knowledge representation of a conference management system, while Biblio-1 is the model for a paper repository used by a publisher for book production. The descriptions of the papers that have been accepted for a conference have to appear in the publisher’s paper repository. Similarly, the publisher’s information about the page limits has to be given to the conference management system to instruct the authors at the proper time. The knowledge representations of Biblio-1 and Biblio-2 therefore have to be aligned to enable seamless transformation and transfer of individual records between these two distributed knowledge-based systems. The tutorial will teach how such alignments can be done and what the complications in that activity are.
An attendee will learn that an alignment is essentially a result of applying a set of formal transformations to a knowledge representation, that is, to its structure and individual assertions. An alignment allows interpreting knowledge that is external to the interpreter in the same way it interprets its own knowledge schema and assertions. For example, if an alignment of Biblio-2 to Biblio-1 exists, the publisher, who is the owner of Biblio-1, may seamlessly import the assertions about the accepted papers into its production repository. Similarly, an alignment of Biblio-1 to Biblio-2 is required by the conference organizers to get the publisher’s information about publication constraints like page limits.
In summary, ontology alignment has to be a technology at hand for all those who develop distributed constellations of knowledge-based systems that require collaboration across the nodes. Building ontology alignments efficiently and effectively is also important for the management and maintenance of such systems. Indeed, the fact that you have developed a perfect ontology alignment for your system does not yet allow you to retire. The world changes, and these changes are reflected in some facets of knowledge representations sporadically and without informing the other nodes. Hence the alignment activity has to be repeated in order to bring the whole system back to a harmonized state.
2 Basics and Problems of Ontology Alignment
This section of the tutorial presents the formal problem statement and a classification of ontology alignment problems, and discusses one of the problem statements, that of the ontology instance migration problem, in more detail.
Following Euzenat and Shvaiko [2], an ontology is formally denoted as a tuple $O = \langle C, P, I, T, V, \le, \perp, \in, = \rangle$, where $C$ is the set of concepts (or classes); $P$ is the set of properties (object and datatype properties); $I$ is the set of individuals (or instances); $T$ is the set of datatypes; $V$ is the set of values; $\le$ is a reflexive, anti-symmetric and transitive relation on $(C \times C) \cup (P \times P) \cup (T \times T)$ called specialization, which forms partial orders on $C$ and $P$ called the concept hierarchy and the property hierarchy respectively; $\perp$ is an irreflexive and symmetric relation on $(C \times C) \cup (P \times P) \cup (T \times T)$ called exclusion; $\in$ is a relation over $(I \times C) \cup (V \times P)$ called instantiation; $=$ is a relation over $I \times P \times (I \cup V)$ called assignment. The sets $C, P, I, T, V$ are pairwise disjoint.
It is also assumed (cf. [3]) that an ontology $O$ comprises its schema $S$ and the assertional part $A$ (see also Fig. 2):
$O = \langle S, A \rangle;\quad S = \langle C, P, T \rangle;\quad A = \langle I, V \rangle \qquad (1)$
The ontology schema is also referred to as a terminological component (TBox).
It contains the statements describing the concepts of $O$, the properties of those concepts, and the axioms over the schema constituents. The set of individuals, also referred to as an assertional component (ABox), is the set of the ground statements about the individuals and their attribution to the schema, i.e., where these individuals belong.
Ontology matching is denoted as a process of discovering the correspondences (or mappings) between the elements of different ontologies. A mapping (or a mapping rule [2]) is a tuple $m = \langle e, e', r, n \rangle$, where: $e, e'$ are the elements of $C, P, I, T, V$ of the respective ontologies $O$ and $O'$; $r$ is a relation from a given set of relations; and $n$ is a confidence value (typically in the range $[0, 1]$). Finally, ontology alignment is denoted as the result of applying the discovered set of mapping rules to the respective ontologies. A generic ontology matching process and ontology alignment are described and pictured in more detail at http://isrg.kit.znu.edu.ua/a-boa/index.php/Basic_Definitions_and_Generic_Problem_Statement.
Based on the features of the participating ontologies and the span of $e, e'$ across the sets $C, P, I, T, V$ of $O$ and $O'$, a classification of the problems of finding ontology alignments can be outlined and formally stated. A graphical interpretation of some of these problems is given at http://isrg.kit.znu.edu.ua/a-boa/index.php/Classification_of_Ontology_Alignment_Problems. The dimensions along which the problems are classified are:
Complete (C), structural (S), or assertional (A) alignment
Static (S) versus dynamic (D) aligned ontologies
Bi-directional (B) versus uni-directional (U) alignment
Fully distributed (D) settings versus the presence of a central (C) referee ontology
A generic ontology alignment process may therefore be classified as a complete static bi-directional alignment using a central referee ontology (CSBC).
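The definitions above can be sketched as plain data structures. This is only an illustration under stated assumptions: the class and function names, the element names `Paper` and `Article`, and the relation label `"equivalence"` are hypothetical and are not taken from the Biblio ontologies or from [2]; the specialization, exclusion, instantiation and assignment relations are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class Ontology:
    concepts: Set[str] = field(default_factory=set)     # C
    properties: Set[str] = field(default_factory=set)   # P
    individuals: Set[str] = field(default_factory=set)  # I
    datatypes: Set[str] = field(default_factory=set)    # T
    values: Set[str] = field(default_factory=set)       # V

    def schema(self) -> Tuple[Set[str], Set[str], Set[str]]:
        """Terminological component (TBox): S = (C, P, T)."""
        return (self.concepts, self.properties, self.datatypes)

    def assertions(self) -> Tuple[Set[str], Set[str]]:
        """Assertional component (ABox): A = (I, V)."""
        return (self.individuals, self.values)


@dataclass(frozen=True)
class Mapping:
    source: str        # element e of ontology O
    target: str        # element e' of ontology O'
    relation: str      # relation r (label chosen here for illustration)
    confidence: float  # confidence value n

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


# One code letter per classification dimension, in the order listed in the text.
DIMENSIONS = (
    {"complete": "C", "structural": "S", "assertional": "A"},
    {"static": "S", "dynamic": "D"},
    {"bi-directional": "B", "uni-directional": "U"},
    {"distributed": "D", "central": "C"},
)


def classify(span: str, dynamics: str, direction: str, distribution: str) -> str:
    """Build the four-letter code of an ontology alignment problem."""
    choices = (span, dynamics, direction, distribution)
    return "".join(codes[choice] for codes, choice in zip(DIMENSIONS, choices))


biblio2 = Ontology(concepts={"Paper"})
biblio1 = Ontology(concepts={"Article"})
rule = Mapping("Paper", "Article", "equivalence", 0.9)
print(classify("complete", "static", "bi-directional", "central"))  # CSBC
```

Keeping the mapping immutable and validating the confidence on construction is a design choice of this sketch, made so that a discovered rule cannot be altered silently before it is applied.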
Our walkthrough problem of ontology instance migration can be classified as an assertional, static, uni-directional, distributed (ASUD) ontology alignment problem.
Yet another important feature for classifying ontology alignment processes is the presence of iterations for the refinement of alignments. All the processes discussed above are one-shot. However, the resulting alignments may turn out to be of insufficient quality after their evaluation. Iterative ontology alignment processes aim at remedying this shortcoming by incorporating the evaluation step and the refinement cycle in the process; please refer to http://isrg.kit.znu.edu.ua/a-boa/index.php/Classification_of_Ontology_Alignment_Problems for a graphical illustration. The iterative ontology instance migration process is discussed in more detail below. Our agent-based software prototype toolset for solving this problem is presented in Section 3.
One of the practically important ontology alignment problems, especially in fully distributed and dynamic settings, is the problem of transferring the individuals of one (source) ontology to the empty assertional part of the other (target) ontology [4].
Let us consider two arbitrary ontologies $O^s = (S^s, A^s)$ and $O^t = (S^t, A^t)$ conceptualizing the semantics of the same universe of discourse $U$; for example, $O^s$ and $O^t$ are two ontologies describing the same subject domain. $U$ could be regarded as a collection of ground facts: $U = \{f\}$. Essentially, $O^s$ and $O^t$ are interpretations of $U$. These ontologies would be considered identical if and only if:
$\forall f \in U:\ int_{I^s}(f) = int_{I^t}(f), \qquad (2)$
where $int_I(f)$ is the interpretation of the fact $f$ by the individuals from $I$ of ontology $O$.
Consequently, an abstract metric of interpretation difference $idiff(U, O^s, O^t)$ could be introduced.
The value of $idiff$ is equal to zero for identical ontologies and increases monotonically to one as the number of facts $f \in U$ such that $int_{I^s}(f) \neq int_{I^t}(f)$ grows. Hence, $idiff = 1$ iff $\forall f \in U: \; int_{I^s}(f) \neq int_{I^t}(f)$. The value $(1 - idiff)$ may further be interpreted [4] as balanced F-measure.

Ontologies $O^s$ and $O^t$ are structurally different if their schemas differ: $S^s \neq S^t$. This structural difference may be presented as a transformation $T: S^s \to S^t$. Transformation $T$ may be sought in the form of a set of nested transformation rules over the constituents of $S^s$ resulting in the corresponding constituents of $S^t$.

Let us assume now that, given two structurally different ontologies $O^s$ and $O^t$, the ABox of $O^s$ contains individuals ($I^s \neq \emptyset$), while the ABox of $O^t$ is empty ($I^t = \emptyset$). The problem of minimizing $idiff(U, O^s, O^t)$ by: (i) taking the individuals from $I^s$; (ii) transforming them according to the structural difference between $O^s$ and $O^t$ using $T$; and (iii) adding them to $I^t$ is denoted as the ontology instance migration problem. Theoretically, the ontology instance migration problem can be solved in one shot. In practice, however, each of the sub-tasks (ii-iii) may result in the loss of assertions [4]. Therefore an iterative refinement of the solution could yield results with a lower resulting $idiff$ value. Hence, the problem has to be solved using an iterative ontology alignment process. Essentially, an iterative solution of the ontology instance migration problem develops a sequence of $O^s$ states $O^s_{st_i}$ so as to minimize $idiff(U, O^s, O^t)$ in such a way that:

$idiff(U, O^s_{st_i}, O^t) > idiff(U, O^s_{st_j}, O^t), \quad i < j$, (3)

where: $O^s_{st_i}$ is $O^s$ in the state after accomplishing iteration $i$; $i, j$ are iteration numbers.

3 A Solution for Ontology Instance Migration Problem

This section demonstrates our agent-based solution for the ASUD ontology alignment problem stated above as the ontology instance migration problem.
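The problem statement and inequality (3) can be made concrete with a toy computation. The sketch below assumes, purely for illustration, that idiff is the fraction of facts in U that are interpreted differently; the text only requires idiff to be 0 for identical ontologies and to grow monotonically to 1, so this particular formula is an assumption. All function names, concept names, and rule sets are hypothetical, and flat concept-to-concept rules stand in for the nested transformation rules of the actual solution.

```python
def idiff(universe, interp_a, interp_b):
    """Fraction of facts interpreted differently (an assumed concrete form of idiff)."""
    if not universe:
        return 0.0
    differing = sum(1 for f in universe if interp_a.get(f) != interp_b.get(f))
    return differing / len(universe)


def migrate(individuals, rules):
    """Apply transformation rules to source individuals; log the failures."""
    migrated, log = {}, []
    for fact, concept in individuals.items():
        if concept in rules:
            migrated[fact] = rules[concept]
        else:
            log.append((fact, concept))  # no rule covers this individual
    return migrated, log


universe = ["f1", "f2", "f3", "f4"]
# Source ABox: each fact is attributed to a (made-up) source concept.
source_abox = {"f1": "Book", "f2": "Article", "f3": "Thesis", "f4": "Book"}
# Assumed reference: how a lossless migration would attribute each fact in the target.
reference = {"f1": "Monograph", "f2": "Paper", "f3": "Dissertation", "f4": "Monograph"}

# Rule sets as they might look after each manual refinement (hypothetical).
rule_sets = [
    {"Book": "Monograph"},
    {"Book": "Monograph", "Article": "Paper"},
    {"Book": "Monograph", "Article": "Paper", "Thesis": "Dissertation"},
]

history = []
for rules in rule_sets:
    target_abox, failures = migrate(source_abox, rules)
    history.append(idiff(universe, reference, target_abox))
```

In this toy run each refinement of the rule set removes some entries from the failure log, so the recorded idiff values decrease from iteration to iteration, which is the behaviour that inequality (3) asks for.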
This problem has been chosen because it is of significant practical interest in real-world applications, in particular for Ontology Engineering and Management in distributed and dynamic settings [4]. Instance migration in our solution is performed iteratively, so the alignment is refined from iteration to iteration.

Many influential publications, for example [5], envision that intelligent software components, like agents, need to be used together with ontologies for making semantic technologies accepted and effective in open and decentralized scenarios. For such agent-based solutions, including industrial applications, heterogeneity is the challenge that has to be faced. Ontology alignments are a means to meet this challenge. On the other hand, agents, being the recipients of ontology alignment solutions, may help in solving ontology alignment problems.

For a graphical illustration and more details of a simplified agent-based architecture for solving a generic ontology alignment problem please refer to http://isrg.kit.znu.edu.ua/a-boa/index.php/Theoretical_Foundations_and_Demonstration. The architecture introduces the wrapper agents $W'$ and $W''$ for ontologies $O'$ and $O''$ respectively. Agent $R$ wraps the central referee ontology $O^r$ and helps $W'$ and $W''$ find the proper mappings using $O^r$ (a matchmaker function). $W'$ and $W''$ produce their own sets of mappings $M'$ and $M''$ in collaboration with each other (a fully distributed problem setting) or also in collaboration with $R$ (the problem setting with a central referee ontology). At the Apply Mappings step $M'$ and $M''$ are autonomously applied by $W'$ and $W''$ to $O'$ and $O''$. A problem in developing such an agent-based solution is how the agents collaborate and develop these mappings. The presented solution is based on automated meaning negotiations between agents [6] as a way to discover structural differences between the schemas of $O'$ and $O''$.
Similarly to [7], this approach aims at aligning ontologies by parts (contexts) that are relevant to a particular negotiation encounter. Negotiations imply iterative monotonic reduction of semantic distances between the contexts. An agent uses propositional substitutions which may reduce the distance and supports them with argumentation. The process is stopped when the distance reaches a commonly accepted threshold or the involved parties exhaust their propositions and arguments. As opposed to the Argumentation Framework based approaches, this approach addresses the entire process of semantic reconciliation between ontologies and does not require off-the-shelf mappings.

The methodology used in our solution comprises several steps in the workflow. Steps (I) and (II) correspond to Discover Mappings, step (III) is for Applying Mappings, and step (IV) corresponds to the step of evaluation and making a decision about undertaking one more iteration. The iteration loop, however, does not involve mappings discovery in our solution. Instead, the mappings are revised manually by a knowledge engineer based on the list of migration failures in the migration log. Step (V), though important in practice, is not demonstrated.

Biblio-1 and Biblio-2 are used as examples of $O'$ and $O''$. The demonstrated agent-based solution is evaluated by comparing it to our former work [4], where the Ontology Difference Visualizer (ODV) tool [8] was used for discovering the structural difference between aligned ontologies.

The ontology instance migration process starts with step (I) of discovering the structural difference between $O'$ and $O''$. Only the TBoxes of the ontologies are used as the sources. The structural difference is discovered by the SDiff Discovery Engine (SDDE) [9], a system of collaborative software agents negotiating on semantic contexts [10] for finding mappings $M: S' \to S''$. For demonstration purposes the discovered structural difference is visualized using a UML extension [8].
The mappings are further written down by SDDE as instance transformation rules [4] at the subsequent step (II). The Instance Migration Engine (IME) is invoked at step (III) to perform the instance transformations according to these transformation rules. All the cases in which IME fails to perform the transformation are recorded to the instance migration log. Step (IV) involves a knowledge engineer who checks the migration log and decides if a refinement is required. If so, the engineer starts a new iteration by refining the set of the transformation rules based on the analysis of the failure cases and using the rule editor of IMS at step (II). The refined set of rules is fed to the IMS at step (III). The loop continues until the knowledge engineer decides that further refinement is not possible, or all the instances of $I^s$ are migrated to $I^t$.

4 Applications of Ontology Alignment

In this part of the tutorial a few selected categories of applications that require aligning information or knowledge representations are analyzed. A broader spectrum of applications is surveyed in [11]. In particular, attention is paid to the requirements related to ontology alignments that are posed by the applications in each category. A particular ontology alignment problem fitting these requirements is also outlined.

A good survey of ontology-based applications is [12]. Ontology matching and alignment applications are discussed in [2]. Another comprehensive summary of ontology matching techniques and applications is [13]. In addition to these surveys, the publications surveying or reporting ontology alignment approaches are, for example, Chuttur [14], Vázquez-Naya et al. [15], Zhdanova et al. [16], Euzenat et al. [17]. Based on these inputs the following typical application categories are analyzed in the tutorial with a focus on real-world applications.
4.1 Distributed Information Retrieval

Distributed Information Retrieval (DIR) is an important category of applications that assist retrieving and fusing information from heterogeneous, distributed, and independent information resources. Ontologies in DIR are used for representing the structures of information at different nodes and for translating or transforming user queries and system responses. In particular, ontologies in DIR are important for extracting information or knowledge satisfying the semantics and the context of a user query. Ontology alignments are required:

At the query transformation step, for correlating query structure and semantics with different information resource schemas and metadata and building respective partial queries
At the query result fusion step, for transforming and putting together the retrieved information instances

Hence, a solution of an SSUD ontology alignment problem is required for query transformation and of an ASUD problem for results fusion and delivery to a user. A critical requirement at the latter step is high recall, as it is important not to miss any potentially relevant information while irrelevant individuals can be filtered out using other techniques. One more important requirement for an ontology alignment solution in DIR is its scalability in terms of the complexity and number of aligned ontologies.

4.2 Human-Machine Dialogues

Ontology alignments are used in human-machine interaction for providing mutual understanding between a user and a processing node. A software agent may represent a processing node in such interactions as an intelligent wrapper. Ontologies and their alignments can be used to obtain a formalizable set of requirements, structures, queries, etc. from informal or poorly structured user descriptions. As a rule such dialogs are run in an iterative way. Hence, iterative ontology alignment methods fit this category of applications better.
Brasoveanu et al. [18] argue for the importance of using generic multimodal ontologies on the Semantic Web and propose an approach to enhance human-agent interaction based on multimodal ontologies. Guzzoni et al. [19] propose a toolkit-based approach for modeling human-agent interaction. Their toolset provides a means to model different aspects of an intelligent assistant such as: ontology-based knowledge structures; service-based primitive actions; composite processes and procedures; natural language and dialog structures. Tijerino et al. [20] report a framework for human-agent collaboration for the purpose of problem solving on the Semantic Web. In human-machine dialogue scenarios the most critical requirements are adaptability, integrativity, and scalability that allow enhancing human-machine mutual understanding.

4.3 Ontology Evolution, Versioning, Refinement

Ontology evolution, versioning, and refinement are important problems in Ontology Engineering (OE) and Management (KM) applications. Solutions are required for adequately representing knowledge in changing domains. Ontology alignment is one of the enabling technologies in these applications. Indeed, all three problems deal with transforming a source ontology revision to a target state (revision) that fits the requirements causing the transition. Important aspects of this transition are that the target revision has to: (i) be consistent; (ii) re-use the source as much as allowed by the requirement of being consistent.

Ontology alignments are used both to ensure consistency and the maximal possible degree of re-use. Provided that the source revision is consistent, for proving that the resulting ontology revision is consistent it is sufficient to build the complete static bi-directional alignment (CSBC or CSBD problems). For the proper re-use of the source revision the solution of a uni-directional alignment problem will fit.
For example, a typical sub-task in an ontology refinement process is ontology instance migration from the source revision to the target revision [4]. A balanced combination of appropriately high recall and precision is an essential requirement for the instance migration solution.

4.4 Service Composition

The automation of web service composition or orchestration at run time is a challenging problem in Service Science which has been intensively researched in the last decade. The complexity of the problem is caused by the inherent distributed character of software systems based on the use of services (for example Web services), the openness of these systems, and the dynamic character of their configurations and constellations. A sub-stream of research in the field develops the frameworks for services that intensively use ontologies as service descriptions: Semantic Web Services. Two prominent examples of these frameworks are OWL-S [21] and WSMO/L/X [22], which however do not fully solve the runtime service composition problem. More advanced approaches exploit collaborative agents as service wrappers for managing services and service brokers or mediators for manipulating their descriptions in a runtime composition process (for example [23]). As in Ontology Engineering and Management, a balanced combination of appropriately high recall and precision is an essential requirement for service composition. The scalability of the solution is also important.

The aspects of ontology reconciliation with respect to Web services and their composition are elaborated in [24, 25, 26]. An important requirement for such systems is the capability of adaptation and integration for providing compliant access and making the use of aggregate and atomic services more convenient.
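Several of the requirements above are stated in terms of recall and precision. A minimal sketch of how these two measures and the balanced F-measure are commonly computed for a set of migrated individuals (or discovered mappings) against a reference set is given below. The function name and the sample identifiers are hypothetical, and the reference set is assumed to be known, which in practice requires a manually verified alignment.

```python
def precision_recall_f1(found, reference):
    """Return (precision, recall, balanced F-measure) of `found` against `reference`."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up identifiers: four individuals should have been migrated,
# three were produced, and two of those three are correct.
reference = {"i1", "i2", "i3", "i4"}
found = {"i1", "i2", "i5"}
p, r, f1 = precision_recall_f1(found, reference)
```

A recall-oriented application such as result fusion in DIR would watch `r`, whereas instance migration and service composition call for a high value of the balanced measure `f1`.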
5 Learning Outcomes

By the end of the tutorial the participants will:

Learn the basics of ontology alignment that will enable them to understand the notions of an ontology, ontology mapping, the process of ontology matching, and the alignment as a result of the matching process
Learn the generic ontology alignment problem and the classification of its flavors based on the features of distributedness, the span of alignment, the direction of alignment, and the dynamic character of the source ontologies. Specifically, learn about the ontology instance migration problem as one of the ontology alignment problems
Be able to differentiate between one-shot and iterative ontology alignment methods and judge the appropriateness of using either kind of method in a particular setting
Learn about one of the agent-based solutions for ontology alignment (ontology instance migration problem)
Learn that ontology alignment is a very important, enabling technology for several kinds of applications of distributed knowledge-based systems. In particular, learn which of the requirements of these applications make ontology alignment a challenging task

References

1. Ermolayev, V., Davidovsky, M.: Agent-Based Ontology Alignment: Basics, Applications, Theoretical Foundations, and Demonstration. Tutorial Paper. In: Dan Burdescu, D., Akerkar, R., Badica, C. (eds.) Proc. WIMS 2012, 11-22, ACM (2012)
2. Euzenat, J., Shvaiko, P.: Ontology Matching. Berlin Heidelberg, Springer-Verlag (2007)
3. Nardi, D., Brachman, R. J.: An Introduction to Description Logics. In: Baader, F., Calvanese, D., McGuinness, D. L., Nardi, D., Patel-Schneider, P. F. (eds.) The Description Logic Handbook. Cambridge University Press New York, NY, USA (2007)
4. Davidovsky, M., Ermolayev, V., Tolok, V.: Instance Migration between Ontologies having Structural Differences. Int. J. on Art. Int. Tools. 20(6), 1127-1156 (2011)
5. Berners-Lee, T., Hendler, J., Lassila, O.: The Semantic Web. Scientific American, 284, 28–37 (2001)
6. Ermolayev, V., Keberle, N., Matzke, W.-E., Vladimirov, V.: A Strategy for Automated Meaning Negotiation in Distributed Information Retrieval. In: Y. Gil et al. (Eds.): ISWC 2005 Proc. 4th Int. Semantic Web Conference (ISWC'05), 6–10 November, Galway, Ireland. LNCS 3729, pp. 201–215 (2005)
7. Atencia, M., Schorlemmer, M.: Formalising Interaction-Situated Semantic Alignment: The communication product. In: Proc. of the Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM'08), Fort Lauderdale, Florida, USA (2008)
8. Ermolayev, V., Copylov, A., Keberle, N., Jentzsch, E., Matzke, W.-E.: Using Contexts in Ontology Structural Change Analysis. In: Ermolayev, V., Gomez-Perez, J.-M., Haase, P., Warren, P. (eds.) Proc. CIAO 2010, CEUR-WS, vol. 626 (2010)
9. Davidovsky, M., Ermolayev, V., Tolok, V.: Agent-Based Implementation for the Discovery of Structural Difference in OWL DL Ontologies. In: Mayr, H. C., Ginige, A., Liddle, S. (eds.) Proc. 4th Int. United Information Systems Conference (UNISCON 2012), LNBIP 137, Springer-Verlag, Berlin Heidelberg (2013)
10. Ermolayev, V., Ruiz, C., Tilly, M., Jentzsch, E., Gomez-Perez, G. M., Matzke, W.-E.: A Context Model for Knowledge Workers. In: Ermolayev, V., Gomez-Perez, J.-M., Haase, P., Warren, P. (eds.) Proc. CIAO 2010, CEUR-WS, vol. 626 (2010)
11. Davidovsky, M., Ermolayev, V., Tolok, V.: A Survey on Agent-Based Ontology Alignment. In: Proc. 4th Int. Conf. on Agents and Artificial Intelligence ICAART’12, pp. 355–361 (2012)
12. Gargantilla, J., Gomez-Perez, A.: OntoWeb: A Survey on Ontology-Based Applications. Deliverable 1.6. OntoWeb Consortium IST Project IST-2000-29243 (2004)
13. Scharffe, F., Euzenat, J., Le Duc, C., Mocan, A., Shvaiko, P.: Analysis of Knowledge Transformation and Merging Techniques and Implementations. KWEB/2004/D2.2.7/0.8 (2007)
14. Chuttur, M. Y.: Challenges Faced by Ontology Matching Techniques: Case Study of the OAEI Datasets. J. of Information Technology, 3(1), 33–42 (2011)
15. Vázquez-Naya, J. M., Romero, M. M., Loureiro, J. P., Sierra, A. P.: Ontology Alignment Overview. Encyclopedia of Artificial Intelligence 2009, pp. 1283–1289 (2009)
16. Zhdanova, A. V., de Bruijn, J., Zimmermann, K., Scharffe, F.: Ontology Alignment Solution. Deliverable D14 v2.0 (2004)
17. Euzenat, J., Laera, L., Tamma, V., Viollet, A.: Negotiation and Argumentation Techniques among Agents Complying to Different Ontologies. Deliverable D2.3.7, KWEB, v1.0 (2006)
18. Brasoveanu, A., Manolescu, A., Spânu, M. N.: Generic Multimodal Ontologies for Human-Agent Interaction. Int. J. of Computers, Communications & Control, 5(5), 625–633 (2010)
19. Guzzoni, D., Baur, C., Cheyer, A.: Modeling Human-Agent Interaction with Active Ontologies. Artificial Intelligence, SS-07-04, 52–59 (2007)
20. Tijerino, Y. A., Al-Muhammed, M., Embley, D. W.: Toward a Flexible Human-Agent Collaboration Framework with Mediating Domain Ontologies for the Semantic Web. In: Proc. ISWC 2004 Workshop on Meaning Coordination and Negotiation, Hiroshima, Japan, pp. 131–142 (2004)
21. Martin, D., Paolucci, M., McIlraith, S., Burstein, M., McDermott, D., McGuinness, D., Parsia, B., Payne, T., Sabou, M., Solanki, M., Srinivasan, N., Sycara, K.: Bringing Semantics to Web Services: The OWL-S Approach. In: Cardoso, J., Sheth, A. (eds.) Proc. SWSWPC 2004, LNCS 3387, pp. 26–42 (2004)
22. Roman, D., Keller, U., Lausen, H., de Bruijn, J., Lara, R., Stollberg, M., Polleres, A., Feiera, C., Bussler, C., Fensel, D.: Web Service Modeling Ontology. Applied Ontology, 1(1), 77–106 (2005)
23. Ermolayev, V., Keberle, N., Kononenko, O., Plaksin, S., Terziyan, V.: Towards a Framework for Agent-Enabled Semantic Web Service Composition. Int. J. of Web Services Research, 1(3), 63–87 (2004)
24. Li, L., Yang, Y.: Agent Negotiation Based Ontology Refinement Process and Mechanisms for Service Applications. Service Oriented Computing and Applications, 2, 15–25 (2008)
25. Paurobally, S., Tamma, V., Wooldridge, M.: A Framework for Web Service Negotiation. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2(4) (2007)
26. Huang, J., Zavala, R., Mendoza, B., Huhns, M. N.: Reconciling Agent Ontologies for Web Service Applications. In: Proc. of Multiagent System Technologies: Third German Conference (MATES-05). LNAI 3550, pp. 106–117 (2005)

Biographies

Vadim Ermolayev is an associate professor at the Department of Information Technologies (IT) of Zaporozhye National University and the lead of the Intelligent Systems Research Group. He is also a research consultant in Semantic Technologies, Intelligent Software Systems, Distributed Artificial Intelligence. The research projects he took part in were focused on: intelligent systems and knowledge representations for enterprises; business and informal process dynamics; intelligent distributed information retrieval; the confluence of agent-based systems and Semantic Web services; ontology engineering, evolution, and refinement; performance management in engineering design. Alignment of knowledge representations was one of the important topics in those projects.

Maxim Davidovsky is a PhD candidate at the Department of Mathematical Modeling (MM) of Zaporozhye National University. He also works for the Laboratory of Web-based Technologies and Distance Learning and is a member of the Intelligent Systems Research Group at the Department of IT. Maxim received his MSc degree in applied mathematics and accomplished his postgraduate course in mathematical modeling and computational methods at the Department of MM. His research interests are in distributed and decentralized knowledge-based systems and software development.
The focus of his current research activity is agent-based ontology alignment and instance migration, specifically in distributed and decentralized settings.

Part 2. ICTERI Workshops

2.1 2nd International Workshop on Information Technologies in Economic Research (ITER 2013)

Foreword

We would like to offer you the section including the ten papers we have selected for the 2nd instance of our International Workshop on Information Technologies in Economic Research (ITER 2013), which has been organized as a session in the technical program of the 9th International Conference on ICT in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer (ICTERI 2013) held at Kherson, Ukraine on June 19-22, 2013.

The necessity to support decisions by the means of IT at different levels of the business processes of any organization, to verify economic hypotheses, and to use the gained knowledge in the learning process requires the use of IT to process relevant information. Skills of analytical information processing for decision making can effectively be realized only by using information and communication technologies. A large number of economic studies are not supported by a modern mathematical framework and ICT, which leads to poor quality of the research at the micro, macro, and industry levels. ITER is intended to familiarize researchers with modern ICT and mathematical methods of information processing in areas such as:

Business process management: business process for firms, suppliers, customers, information systems in small and medium business, IT innovations in the management process, R&D company, business intelligence approach, management in virtual organization, e-commerce, cloud technology in business, e-governance.
Quantitative methods in economics: econometric research on micro and macro levels, economic modeling of business processes, software packages for economic research, modeling industry mergers and acquisitions, finance modeling.
IT education for economists: business informatics curricula, IT-professional training for economic organizations, software programs for economic education, mobile technologies in economic education, e-learning for economists, business games.

In research of international level, many scientists and researchers use the Ukrainian and Russian languages, which greatly limits the possibilities for the world community to become familiar with the published papers. This restricted the number of accepted papers to 10 of the 20 submitted to ITER:

Business process management:
Decision Supporting Procedure for Strategic Planning: DEA Implementation for Regional Economy Efficiency Estimation
Applying of Fuzzy Logic Modeling for the Assessment of ERP Projects Efficiency
Matrix Analogues of the Diffie-Hellman Protocol
Binary Quasi Equidistant and Reflected Codes in Mixed Numeration Systems

Quantitative methods in economics:
Mechanism Design for Foreign Producers of Unique Homogeneity Product
Are securities secure: Study of the Influence of the International Debt Securities on the Economic Growth
How to make high-tech industry highly developed? Effective model of national R&D investment policy

IT education for economists:
Econometric Analysis on the Site “Lesson Pulse”
Mathematical Model of Banking Firm as Tool for Analysis, Management and Learning
Features of National Welfare Innovative Potential Parametric Indication’ Information-Analytical Tools System in the Globalization Trends’ Context

Only timely and high-quality preparation of an economic study will provide recommendations and suggestions for decision-makers and promote the efficient use of material and budgetary resources in an organization.

June, 2013
Tanya Payentko
Sergey Kryukov
Vitaliy Kobets

Binary Quasi Equidistant and Reflected Codes in Mixed Numeration Systems

Evgeny Beletsky1 and Anatoly Beletsky1

1 Department of Electronics, National Aviation University of Kiev, 1, av.
Cosmonaut Komarov, 03680, Kiev, Ukraine
ebeletskiy@gmail.com, abelnau@ukr.net

Abstract. The problem of constructing quasi-equidistant and reflected binary Gray code sequences, as well as codes in mixed factorial, Fibonacci, and binomial numeration systems, is considered in the article. Some combinatorial constructions and machine algorithms for the synthesis of sequences, based on the method of directed enumeration, are offered. For selected parameters of sequences all quasi-equidistant (in individual cases, reflected) codes with Hamming distance equal to 1 are found.

Keywords. Reflected codes, quasi equidistant sequence, Hamming distance

Key terms. Research, CodingTheory, MathematicalModelling

1 Introduction

Coding theory is one of the most important areas of modern applied mathematics. The formation of mathematical coding theory dates back to 1948, when the famous article by Claude Shannon [1] was published. The growth of codes was originally stimulated by tasks of communication. Later the constructed codes found many other applications. Now codes are used to protect data in computer memory, in cryptography, data compression, etc.

The work is devoted to a rather small, but extremely important for applications, subset of so-called quasi-equidistant and reflected codes. The class of quasi-equidistant codes consists of sequences of uniform (i.e., containing the same number of bits) binary code combinations in which any two adjacent (neighboring) code words are at the same Hamming distance d, a fixed natural number (i.e. d = 1, 2, …) [2]. Equidistant sets include such codes in which any two words (code combinations) are at the same distance d [3]. Finally, we call reflected the subset of quasi-equidistant codes with distance d = 1 whose formation is based on the principle of "mirror" reflection [4].
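The difference between quasi-equidistant and equidistant codes can be checked mechanically. The following sketch is only an illustration of the definitions: the function names are hypothetical, and code words are assumed to be given as equal-length strings of '0' and '1'.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two equal-length code words differ."""
    if len(a) != len(b):
        raise ValueError("code words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_quasi_equidistant(words, d=1):
    """True if every pair of adjacent words in the sequence is at distance d."""
    return all(hamming(a, b) == d for a, b in zip(words, words[1:]))


def is_equidistant(words, d):
    """True if every pair of words, adjacent or not, is at distance d."""
    return all(hamming(a, b) == d for a, b in combinations(words, 2))


# The classical three-bit Gray code: adjacent words differ in exactly one bit.
gray = ["000", "001", "011", "010", "110", "111", "101", "100"]

# The even-weight three-bit words: any two of them differ in exactly two bits.
even_weight = ["000", "011", "101", "110"]

gray_is_quasi = is_quasi_equidistant(gray, d=1)
gray_is_equi = is_equidistant(gray, d=1)
even_is_equi = is_equidistant(even_weight, d=2)
```

The Gray sequence passes the adjacent-pair test but fails the all-pairs test, which shows that quasi-equidistance is the weaker of the two properties.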
If only one mirror is used, however, the code sequence will contain the original sequence followed by the same sequence rewritten in reverse order, which is unacceptable since it leads to code repetition. The repetition can be eliminated by first expanding the number of digits of the combinations. The essence of the "mirror" reflection with expansion is explained below on the example of canonical reflected Gray codes and in other sections of this article.

The main objective of this study is to develop algorithms for constructing quasi-equidistant and reflected binary Gray codes as well as code sequences in mixed factorial, Fibonacci, and binomial bases. The algorithms of computer synthesis of sequences are based on the method of direct enumeration.

2 Basics of Number Systems

The history of discrete mathematics and computer science is directly related to the development and introduction of new principles of representing and encoding digital information, which are based on numeration systems. By a numeration system we understand a way of representing sets of numbers using a limited set of characters that form its alphabet, in which the characters (elements of the alphabet) are arranged in an established order, occupying certain positions [5]. Any numeration system encodes a finite set of non-negative numbers, its range. The range always includes the number 0, followed by the natural numbers starting with 1 [6].

There are various numeration systems (as well as methods for their classification), and their number is constantly growing. All systems can be divided into the following main classes: positional, non-positional, and mixed. In positional numeration systems the same numeric character (digit) has different meanings depending on the position (level) where it resides.
By a positional numeration system one generally understands the p-ary numeration system, which is defined by an integer $p > 1$ called the base of the numeration system. An unsigned integer $N$ in the p-ary numeration system is represented as a finite linear combination of powers of $p$:

$N = \sum_{k=1}^{n} \alpha_k p^{k-1}$, (1)

where $\alpha_k$ are integers satisfying the inequality $0 \le \alpha_k \le (p - 1)$ and $n$ is the number of digits of the number. The simplest examples of positional systems (1) are the binary, decimal, and other numeration systems.

In non-positional numeration systems the value denoted by a digit does not depend on its position in the number. At the same time the system may impose restrictions on the position of digits, for example, that they are arranged in descending order. The Roman and many other systems belong to the non-positional systems.

The mixed numeration system is a generalization of the p-ary system and is often also classed with the positional numeration systems. The base of a mixed numeration system is an increasing sequence of numbers $p_k$, $k = 1, 2, \dots$, and each number $N$ is represented as a linear combination

$N = \sum_{k=1}^{n} \alpha_k p_k$,

where some restrictions are imposed on the coefficients $\alpha_k$. One of the known examples of a mixed system is the factorial numeration system, in which the bases are the sequence of factorials $p_k = k!$. Another commonly used system is the Fibonacci numeration system, which is based on Fibonacci numbers. The binomial system, in the form in which it is presented in the relevant section of this article, is also classed here as a mixed numeration system.

A positive integer is written in an arbitrary numeration system as a sequence of symbols $N = \alpha_n \alpha_{n-1} \dots \alpha_k \dots \alpha_2 \alpha_1$, where $N$ is the representation of the number in this numeration system and each symbol $\alpha_k$ occupies, in the general case, $r_k$ bits (if the binary alphabet is used).

Note the following general characteristics of quasi-equidistant codes with Hamming distance $d = 1$. Let us agree that each code sequence starts with the zero code.
And as result of this agreement the following code after the zero code should be placed with weights 1 and 2, and Further weight codes must alternate even (E) — odd (O) under the scheme 012OEOE E(O) . (2) Scheme (2) is a symbolic form of the tree sequence code combinations. Let’s ne and no to be the amount of even and odd code words in a sequence. If the sequence (2) ends up with odd code combination this means ne no , and if even — ne no 1 . This becomes evident: Statement 1. Inequality 0 ( ne no ) 1 , (3) is a necessary (but not always sufficient) condition for the construction of quasi equi- distant codes. 3 Sequences of Gray Codes Classic Gray codes [7] may be called canonical, since for arbitrary length sequence of combinations are not only quasi equidistant, but also reflected. Let’s G ( n ) se- quence of n-bites classical Gray codes. To construct ( n 1) bites reflected Gray Codes, let’s us note as Grc ( n 1) codes, it is just enough to prefix for each source code G ( n ) the 0 digit and 1 to the left of code group G R (n ) constructed by reflected (reflex or reverse) mirror of G ( n ) sequence, i.e. 314 E. Beletsky and A. Beletsky Grc ( n 1) 0G ( n ) ||1G R ( n ) , (4) where || - is a symbol of concatenation (conjunction of sequences). According to (4), Grc ( n 1) G ( n 1) and as a result sequences of Gray codes of G ( n ) number of digits n 2 are both quasi equidistant and reflected, and besides the line of reflection goes through 2n 1 and (2n 1 1) code combinations. On the basis of the canonical code G ( n ) , n 2 , the equidistant Gray codes can be con- structed. For example, Tab. 1 show the three 12-bit code quasi equidistant sequences, one of which corresponds to the canonical version of the Gray code. The first six variants of sequences in the table constructed of canonical option 1 as a result of a variety column rearrangement saving the Hamming distance d 1 of related code combinations. 
Table 1. Three-bit quasi-equidistant Gray codes (variants of sequence)

 1    2    3    4    5    6    7    8    9    10   11   12
000  000  000  000  000  000  000  000  000  000  000  000
001  100  100  001  010  010  100  001  010  010  001  100
011  110  101  101  110  011  101  101  110  011  011  110
010  010  001  100  100  001  111  111  111  111  111  111
110  011  011  110  101  101  110  011  011  110  101  101
111  111  111  111  111  111  010  010  001  100  100  001
101  101  110  011  011  110  011  110  101  101  110  011
100  001  010  010  001  100  001  100  100  001  010  010

The first six variants in the table are obtained from canonical variant 1 by all possible permutations of its columns, which preserve the Hamming distance $d = 1$ between adjacent code combinations. Variants 7-12 are formed by reversing the order of the non-zero code combinations of the corresponding variants 1-6.

As follows from Tab. 1, only variants 1 (canonical) and 6 belong to the set of three-bit reflected codes. At the same time, each three-bit sequence produces, by relation (4), a member of the subset of four-bit reflected Gray codes. Thus the following is true.

Statement 2. The total number $L_G^{(rc)}(n+1)$ of reflected Gray codes with $n+1$ digits is given by

$$L_G^{(rc)}(n+1) = \begin{cases} n, & \text{if } n = 2; \\ 2 \cdot n!, & \text{if } n \ge 3. \end{cases}$$

3 Factorial Sequence

A positive integer $N$ in the factorial numeration system can be represented as

$$N = \sum_{k=1}^{n} \alpha_k \, k!, \qquad (5)$$

where $k = 1, 2, \dots, n$ and $0 \le \alpha_k \le k$. The expanded form of (5) is

$$N = \alpha_n n! + \alpha_{n-1} (n-1)! + \dots + \alpha_2 \, 2! + \alpha_1 \, 1!. \qquad (6)$$

Expression (6) is the so-called numerical, or digit, function [8] of the factorial system. The first 120 decimal numbers, defined by their coefficients $\alpha_k$ in the factorial numeration system, are given in Tab. 2.

Table 2.
Binary representations of decimal numbers in the factorial numeration system (N is the decimal number, followed by its binary factorial code)

N   Code     N   Code     N   Code      N   Code      N    Code
0   0        24  100000   48  1000000   72  1100000   96   10000000
1   1        25  100001   49  1000001   73  1100001   97   10000001
2   10       26  100010   50  1000010   74  1100010   98   10000010
3   11       27  100011   51  1000011   75  1100011   99   10000011
4   100      28  100100   52  1000100   76  1100100   100  10000100
5   101      29  100101   53  1000101   77  1100101   101  10000101
6   1000     30  101000   54  1001000   78  1101000   102  10001000
7   1001     31  101001   55  1001001   79  1101001   103  10001001
8   1010     32  101010   56  1001010   80  1101010   104  10001010
9   1011     33  101011   57  1001011   81  1101011   105  10001011
10  1100     34  101100   58  1001100   82  1101100   106  10001100
11  1101     35  101101   59  1001101   83  1101101   107  10001101
12  10000    36  110000   60  1010000   84  1110000   108  10010000
13  10001    37  110001   61  1010001   85  1110001   109  10010001
14  10010    38  110010   62  1010010   86  1110010   110  10010010
15  10011    39  110011   63  1010011   87  1110011   111  10010011
16  10100    40  110100   64  1010100   88  1110100   112  10010100
17  10101    41  110101   65  1010101   89  1110101   113  10010101
18  11000    42  111000   66  1011000   90  1111000   114  10011000
19  11001    43  111001   67  1011001   91  1111001   115  10011001
20  11010    44  111010   68  1011010   92  1111010   116  10011010
21  11011    45  111011   69  1011011   93  1111011   117  10011011
22  11100    46  111100   70  1011100   94  1111100   118  10011100
23  11101    47  111101   71  1011101   95  1111101   119  10011101

Denote by $\Phi(k)$ the sequence of $k$-bit factorial codes. If a code combination from the set $\Phi(k)$ has fewer than $k$ digits, it is prefixed with the required number of zeros. Let $\Phi_d(k)$ be the sequence of quasi-equidistant $k$-bit factorial codes in which the Hamming distance between adjacent combinations equals $d$. From the data of Tab. 2 it is easy to build (Tab. 3) the sequences $\Phi_1(k)$ for $k = 1$ (the singular case), and also for $k = 2$ and $k = 3$, obtained by permuting the columns of the base sequences (variant 1).

Table 3.
Sequences of quasi-equidistant factorial codes $\Phi_1(k)$

k = 1   k = 2     k = 3
1       1   2     1    2    3    4    5    6
0       00  00    000  000  000  000  000  000
1       01  10    010  010  100  100  001  001
        11  11    011  110  101  110  011  101
        10  01    001  100  001  010  010  100
                  101  101  011  011  110  110
                  100  001  010  001  100  010

Table 3 illustrates one possible method of synthesizing quasi-equidistant codes. The idea is as follows. At the first stage, a base sequence of quasi-equidistant codes with $n$ digits is created by some method (for example, the method of direct search examined below). At the second stage, all possible permutations of the columns of the base sequence are performed (see Tab. 3, where the base sequences are numbered 1), which yields $n!$ different quasi-equidistant codes. Finally, at the third stage, the sequences that contain forbidden code combinations are excluded from these $n!$ sequences. In Tab. 3 such combinations are the codes 110. Thus, of the six three-bit sequences, only two are quasi-equidistant factorial sequences.

Starting from $k = 4$, it is possible to create, apart from the quasi-equidistant sets, reflected factorial codes $\Phi_{rc}(k)$. The algorithm for creating reflected codes depends on their number of digits. In particular, the following is easily proved by direct verification.

Statement 3. The set of uniform reflected factorial codes is defined by the recurrence relation

$$\Phi_{rc}(k) = 0\Phi_1(k-1) \,\|\, 1\Phi_1^R(k-1).$$

Let us discuss the problem of synthesizing quasi-equidistant factorial codes with $n = 4, \dots, 7$ digits. Taking the data from Tab. 2, we build a preliminary distribution of the weights of the $n$-bit code combinations, which results in Tab. 4.
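The weight distribution of the $n$-bit factorial codes can be cross-checked by enumeration. The sketch below is ours, not the authors': it assumes the bit packing that can be read off Tab. 2, namely that the digit $\alpha_k$ occupies as many bits as are needed to write its maximum value $k$ (1, 2, 2, 3, ... bits for $k$ = 1, 2, 3, 4, ...). The function names are hypothetical.

```python
from collections import Counter


def factorial_code(value):
    """Binary factorial code of a non-negative integer, leading zeros dropped.

    Assumption (inferred from Tab. 2): digit a_k, 0 <= a_k <= k,
    is written with k.bit_length() bits.
    """
    digits = []  # pairs (a_k, field width), least significant digit first
    k = 1
    while value > 0:
        value, a_k = divmod(value, k + 1)
        digits.append((a_k, k.bit_length()))
        k += 1
    bits = "".join(format(a, "0{}b".format(w)) for a, w in reversed(digits))
    return bits.lstrip("0") or "0"


def weight_distribution(n_bits):
    """Count the weights of all factorial codes that fit into n_bits bits."""
    counts = Counter()
    value = 0
    while len(factorial_code(value)) <= n_bits:
        counts[factorial_code(value).count("1")] += 1
        value += 1
    return dict(sorted(counts.items()))


for n in range(4, 8):
    dist = weight_distribution(n)
    n_even = sum(c for w, c in dist.items() if w % 2 == 0)
    n_odd = sum(c for w, c in dist.items() if w % 2 == 1)
    print(n, dist, "n_e =", n_even, "n_o =", n_odd)
```

Under this assumption the script gives 12, 24, 48 and 96 codes for $n$ = 4, 5, 6, 7, with equal numbers of even-weight and odd-weight codes in each case, which agrees with the totals listed in Tab. 4.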
The numbers of codes with even and odd weights in Tab. 4 satisfy inequality (3) for all values of $n$, which means that the necessary conditions for constructing quasi-equidistant factorial codes are met.

According to Tab. 4, scheme (2) of the alternation of weights of the uniform codes $\Phi(4)$ is

012O2O2O2O2O (7)

Table 4. Weight distribution of the code words $\Phi(n)$

Code weight   Number of bits of the code combinations
              n = 4   n = 5   n = 6   n = 7
0             1       1       1       1
1             4       5       6       7
2             5       9       14      20
3             2       7       16      30
4             -       2       9       25
5             -       -       2       11
6             -       -       -       2
n_e           6       12      24      48
n_o           6       12      24      48
In all        12      24      48      96

Of the five odd elements (O) of sequence (7), two are equal to 3 and the rest are equal to 1. This means that there are ten possible variants of trees of quasi-equidistant factorial codes with $n = 4$ digits, one of which is shown, for illustration, in Fig. 1.

Fig. 1. Tree $\Phi_1(4)$ of the sequence 012321232121

The symbolic form of the tree of the sequence of code combinations $\Phi_1(5)$ can be represented by the scheme

012OEOEOEOEOEOEOEOEOEOEO, (8)

one variant of which is shown in Fig. 2.

Fig. 2. A variant of the tree of the sequence $\Phi_1(5)$

Let us now count the total number of tree variants $\Phi_1(5)$. First of all, note (Fig. 2) that the code combinations of weight 4 must lie between codes of weight 3; this is required to keep the distance between adjacent combinations equal to 1. We merge each pair of codes of weight 3 with the code of weight 4 lying between them into a single code of weight 3. In this way two pairs of codes of weights 3 and 4 are removed from column $n = 5$ of Tab. 4, and scheme (8) can be rewritten as

012O2O2O2O2O2O2O2O2O. (9)

Scheme (9) contains a group of nine odd (O) code combinations, of which four have weight 1 and five have weight 3. Evidently there are 126 variants of incomplete trees of the sequence $\Phi_1(5)$, equal to the number of combinations of nine elements taken four at a time.
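The two counts above are plain binomial coefficients and can be checked by enumerating the weight schemes directly. This is a sketch under our own naming (`weight_schemes` is a hypothetical helper): it fills the odd positions O of a scheme of the form 012O2O…O with the weights 1 and 3.

```python
from itertools import combinations
from math import comb


def weight_schemes(n_odd_slots, n_threes, prefix="012"):
    """Enumerate schemes prefix + O2O2...O with n_threes slots set to 3, the rest to 1."""
    result = []
    for threes in combinations(range(n_odd_slots), n_threes):
        odd = ["3" if i in threes else "1" for i in range(n_odd_slots)]
        result.append(prefix + "2".join(odd))
    return result


# Scheme (7): five odd slots, two of them of weight 3.
phi4_trees = weight_schemes(5, 2)
print(len(phi4_trees), comb(5, 2))
print(phi4_trees)

# Scheme (9): nine odd slots, four of weight 1 (equivalently five of weight 3).
phi5_incomplete = weight_schemes(9, 5)
print(len(phi5_incomplete), comb(9, 4))
```

The list for scheme (7) contains the tree 012321232121 of Fig. 1 among its ten entries, and the count for scheme (9) equals $\binom{9}{4} = 126$.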
Now take into account that in each of the 126 variants of symbolic form (9), the operation inverse to the "merge" operation described above restores the complete tree scheme (8). Since the inverse operation can be performed in 10 possible ways, the total number of trees $\Phi_1(5)$ equals 1260. Counting the number of trees $L_\Phi(6)$ of the sequences $\Phi_1(6)$ in the same way, we get $L_\Phi(6) = 1513512$. As the number of digits $n$ grows, the complexity of the combinatorial count $L_\Phi(n)$ and the number of trees $\Phi_1(n)$ increase dramatically. As an example, all 10 variants of the trees $\Phi_1(4)$ are shown in Tab. 5.

Table 5. Trees $\Phi_1(4)$

№   Tree variant    №    Tree variant
1   012323212121    6    012123212321
2   012321232121    7    012123212123
3   012321212321    8    012121232321
4   012321212123    9    012121232123
5   012123232121    10   012121212323

First of all, we build the sequence of uniform codes $\Phi(4)$ ranked by weight $v$ (Tab. 6).

Table 6. Ranked codes $\Phi(4)$

№   Code weight v
    0      1      2      3
1   0000   0001   0011   0111
2          0010   0101   1011
3          0100   0110
4          1000   1001
5                 1010

In accordance with the scheme of the sixth tree variant (Tab. 5), we choose the codes 0000 and 0001 as the first two codes of the sequence, which we call layers of a tree branch. The layer 0010 could be chosen instead of 0001. The third layer must be a code of weight 2 at Hamming distance 1 from the code 0001. The suitable codes are those in rows 1, 2 and 4 of the weight-2 column of Tab. 6. The code with the smallest number is taken as the base code, and the rest as alternatives. Continuing to choose codes for the sequence $\Phi_1(4)$ in the same way, following the scheme of the chosen tree, we arrive at Tab. 7.

Table 7.
Synthesis of a branch of $\Phi_1(4)$

№   Code weight   Base code   Alternative codes
1   0             0000
2   1             0001        0010
3   2             0011        0101, 1001
4   1             0010
5   2             0110

The ninth layer of the tree under synthesis must be a code of weight 2 that lies at distance 1 from the previous code. However, Tab. 6 contains no such code that has not already been used. To get out of this deadlock we proceed as follows. We move back up the rows of Tab. 7 to the nearest row that has an alternative code and replace its base code with the nearest alternative code to its right. In this case the base code 0011 is replaced with the alternative code 0101, after which the synthesis procedure for $\Phi_1(4)$ continues.

An example of quasi-equidistant codes $\Phi_1(4)$ synthesized by the method of direct enumeration is shown in Tab. 8.

Table 8. Sequences $\Phi_1(4)$ corresponding to the tree 012321212321

Tier   Tree   Branch of the tree
              1     2     3     4     5     6     7     8     9     10
0      0      0000  0000  0000  0000  0000  0000  0000  0000  0000  0000
1      1      0001  0001  0010  0010  0010  0010  0100  0100  0100  0100
2      2      1001  1001  0011  0011  1010  1010  0101  0101  1100  1100
3      3      1011  1101  1011  1011  1011  1011  1101  1101  1101  1101
4      2      0011  0101  1010  1010  0011  0011  1100  1100  0101  0101
5      1      0010  0100  1000  1000  0001  0001  1000  1000  0001  0001
6      2      1010  1100  1001  1100  0101  1001  1001  1010  0011  1001
7      1      1000  1000  0001  0100  0100  1000  0001  0010  0010  1000
8      2      1100  1010  0101  0101  1100  1100  0011  0011  1010  1010
9      3      1101  1011  1101  1101  1101  1101  1011  1011  1011  1011
10     2      0101  0011  1100  1001  1001  0101  1010  1001  1001  0011
11     1      0100  0010  0100  0001  1000  0100  0010  0001  1000  0010

4 Fibonacci Sequences

Fibonacci codes are a generalization of the classical binary code [9]. Any non-negative integer $N = 0, 1, 2, \dots$ can be uniquely represented by the numerical Fibonacci function

$$N = \alpha_n F_n + \alpha_{n-1} F_{n-1} + \dots + \alpha_k F_k + \dots + \alpha_2 F_2 + \alpha_1 F_1.$$
(10)

In addition, the sequence $\{\alpha_k\}$ in (10) contains no pairs of adjacent ones, which is ensured by an equivalent transformation called the "folding" operation: $011 \to 100$. This operation makes it possible to represent a Fibonacci number in the so-called "minimal" form, whose code combination has the minimal weight. For example [10],

$$01111011001 \to 10011100001 \to 10100100001. \qquad (11)$$

In example (11) the underlined codes are those to which the folding operation was applied. As follows from this example, the folding operations reduce the weight of the code combinations: the number of ones in the final code is smaller than in the original one.

Using the folding operation, it is easy to arrive at an algorithm for representing multi-digit binary Fibonacci numbers. As an example, consider a method of representing the natural sequence of decimal numbers (including zero) by four-digit Fibonacci codes. We agree to number the positions of a code from right to left, so that the lowest (rightmost) position has number 1, the next one number 2, and so on. We choose the following coding of the first three decimal numbers 0, 1 and 2:

$$0_{10} = 0000; \quad 1_{10} = 0001; \quad 2_{10} = 0010. \qquad (12)$$

The transition from the decimal number $k_{10}$ to the number $(k+1)_{10}$ in Fibonacci codes (denoted $F_k$ and $F_{k+1}$ respectively) is performed by the following rule: if the lowest position of $F_k$ contains 0, it is replaced with 1 in the code $F_{k+1}$. If the lowest position of $F_k$ contains 1, this 1 moves to the second position and 0 is written in the lowest position. This rule is used in system (12) for the transition from $F_1$ to $F_2$.

Let us represent the number $3_{10}$ by a Fibonacci code. Following the rule described above, we first obtain the code $3_{10} = 0011$, which the folding operation brings to its minimal form

$$3_{10} = 0100.$$
(13)

According to expressions (12) and (13), the three lowest positions of Fibonacci codes are used to represent the decimal numbers 1, 2 and 3, respectively. These values are generalized by the following recurrent block algorithm for synthesizing binary Fibonacci sequences. Let $F(k)$ be the set of Fibonacci numbers of the same length, including 0. Then we have:

Statement 4. The set of $k$-bit Fibonacci numbers of the same length is defined by the recurrence relation

$$F(k) = 10 \,\|\, F(k-2). \qquad (14)$$

The statement just formulated is easily proved by direct verification. On the right-hand side of (14) the set $F(k-2)$ consists of $(k-2)$-position numbers. It follows that if any Fibonacci numbers included in $F(k-2)$ have fewer than $k-2$ digits, those numbers are prefixed with the required number of zeros. Algorithm (14) is valid for any $k \ge 2$. Indeed, if $k = 2$ then $F(2) = 10 \,\|\, F(0)$. Since the set $F(0)$ is empty, the set $F(2)$ contains the single Fibonacci number 10, which corresponds to the decimal number $2_{10}$.

Tab. 9 lists the Fibonacci codes for a limited sequence of decimal numbers, calculated by the recurrence formula with initial condition (12). The zeros to the left of the leading one in the Fibonacci codes have been removed. Column F of Tab. 9 gives the number of codes that can be created with a fixed number of binary positions. For example, F = 3 means that three Fibonacci codes can be formed from four-bit combinations that contain 1 in the highest position. Writing down the values of column F, we get the sequence 1, 1, 2, 3, 5, 8, 13, …, which is the classical sequence of Fibonacci numbers.

We now turn to estimating the number of variants of quasi-equidistant Fibonacci code trees. For this purpose, based on the data of Tab. 9, we build a preliminary table of the distribution of the weights of the code combinations included in $F(k)$, $k = 4, \dots, 7$ (Tab. 10).
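The minimal form, the folding operation and the weight counts can be illustrated with a short Python sketch. It rests on assumptions of ours rather than on the authors' program: positions are weighted 1, 2, 3, 5, 8, … from the right (consistent with (12) and (13)), the minimal form is obtained greedily, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product


def fibonacci_code(value):
    """Minimal-form Fibonacci code of a non-negative integer (greedy choice)."""
    weights = [1, 2]
    while weights[-1] <= value:
        weights.append(weights[-1] + weights[-2])
    bits = []
    for w in reversed(weights):
        if w <= value:
            bits.append("1")
            value -= w
        else:
            bits.append("0")
    return "".join(bits).lstrip("0") or "0"


def fold(word):
    """Apply the folding operation 011 -> 100 until no group 011 remains."""
    word = "0" + word  # room for a carry into a new leading position
    while "011" in word:
        word = word.replace("011", "100")
    return word.lstrip("0") or "0"


def fibonacci_weight_distribution(k):
    """Weights of all k-bit words without adjacent ones."""
    counts = Counter()
    for bits in product("01", repeat=k):
        word = "".join(bits)
        if "11" not in word:
            counts[word.count("1")] += 1
    return dict(sorted(counts.items()))


print([fibonacci_code(v) for v in range(8)])
print(fold("01111011001"))
for k in range(4, 8):
    print(k, fibonacci_weight_distribution(k))
```

The call `fold("01111011001")` passes through the same intermediate word as example (11), and the enumeration finds 8, 13, 21 and 34 admissible words for $k$ = 4, 5, 6, 7.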
Analysis of the data of Tab. 10 leads to the following conclusion. Quasi-equidistant sequences of four-digit Fibonacci numbers end with a code combination of weight 1; five- and six-digit sequences end with a combination of weight 2; seven-digit sequences end with a combination of odd weight, equal to 1 or 3.

Table 9. Fibonacci numbers ($k_{10}$ is the decimal number, $F_k$ its Fibonacci code, F the number of codes of the given length)

k10   Fk      F     k10   Fk       F     k10   Fk        F
0     0             13    100000   8     21    1000000   13
1     1       1     14    100001         22    1000001
2     10      1     15    100010         23    1000010
3     100     2     16    100100         24    1000100
4     101           17    100101         25    1000101
5     1000    3     18    101000         26    1001000
6     1001          19    101001         27    1001001
7     1010          20    101010         28    1001010
8     10000   5                          29    1010000
9     10001                              30    1010001
10    10010                              31    1010010
11    10100                              32    1010100
12    10101                              33    1010101

Table 10. Distribution of the weights of the code combinations $F(k)$

Weight of code   Number of code digits (k)
combinations     4     5     6     7
0                1     1     1     1
1                4     5     6     7
2                3     6     10    15
3                -     1     4     10
4                -     -     -     1
n_e              4     7     11    17
n_o              4     6     10    17
All together     8     13    21    34

It is not difficult to calculate the number $L_F(k)$ of variants of trees of the quasi-equidistant Fibonacci sequence $F_1(k)$. The result of this calculation for the chosen values of $k$ is shown in Tab. 11.

Table 11. Cardinality of the subset of trees $F_1(k)$ (number of tree variants)

k        4    5    6     7
L_F(k)   1    5    126   205920

For reflected Fibonacci codes the following holds.

Statement 5. The set of uniform $k$-bit reflected Fibonacci codes is defined by the recurrence relation

$$F_{rc}(k) = 00F_1(k-2) \,\|\, 10F_1^R(k-2), \qquad (15)$$

where the sequence $F_1^R(k)$ is the inverse of $F_1(k)$, i.e. the sequence of quasi-equidistant codes $F_1(k)$ written in reverse order.

As an example, Tab. 12 shows computer-calculated branches of one of the trees $F_1(6)$.

Table 12.
Sequences $F_1(6)$ of the tree 012321232123232121212

Tier   Tree   Branch of the tree
              1       2       3       4       5       6       7       8
0      0      000000  000000  000000  000000  000000  000000  000000  000000
1      1      000001  000001  000010  000100  000100  001000  001000  010000
2      2      000101  010001  100010  000101  010100  001010  101000  010001
3      3      010101  010101  101010  010101  010101  101010  101010  010101
4      2      010100  000101  001010  010001  000101  101000  100010  010100
5      1      000100  000100  001000  000001  000001  100000  100000  000100
6      2      100100  100100  101000  100001  100001  100001  100001  000101
7      3      100101  100101  101001  100101  100101  101001  101001  100101
8      2      100001  100001  001001  100100  100100  001001  001001  100100
9      1      100000  100000  000001  100000  100000  000001  000001  100000
10     2      100010  100010  010001  100010  100010  010001  010001  100010
11     3      101010  101010  010101  101010  101010  010101  010101  101010
12     2      101000  101000  000101  101000  101000  000101  000101  101000
13     3      101001  101001  100101  101001  101001  100101  100101  101001
14     2      001001  001001  100001  001001  001001  100100  100100  100001
15     1      001000  001000  100000  001000  001000  000100  000100  000001
16     2      001010  001010  100100  001010  001010  010100  010100  001001
17     1      000010  000010  000100  000010  000010  010000  010000  001000
18     2      010010  010010  010100  010010  010010  010010  010010  001010
19     1      010000  010000  010000  010000  010000  000010  000010  000010
20     2      010001  010100  010010  010100  010001  100010  001010  010100

5 Binomial Sequences

Many methods are known for creating binomial codes and, on their basis, binomial sequences [11]. In this section we consider two ways of synthesizing uniform binomial codes. We call the first the "algorithm of A. Borysenko" and the second the "algorithm of A. Beletsky", referred to hereinafter as the alternative algorithm.

The idea of the first algorithm, for non-uniform binary binomial codes, which corresponds to the algorithm of full summing binomial arithmetic, is described in [12], page 124.
Of course, any non-uniform binary code can be converted into a uniform code with $n$ digits (length $n$). For this purpose it suffices to prefix the code combination with as many zeros as are needed to make the total number of digits equal to $n$.

To construct Borysenko's algorithms of binomial arithmetic it suffices to define two parameters, $k$ and $n$: the first defines the maximal number of ones in the codes, and the second, through the value $r = n - 1$, defines the maximal length of a non-uniform binomial number. The decimal zero in Borysenko's binomial code is written as $l = n - k$ zeros, and the range $P$ of binomial numbers determines the maximal number by the formula $F_{\max} = P - 1$.

Tab. 13 gives several examples of binomial numbers $B_x$ (algorithm of A. Borysenko) corresponding to the decimal values $x$.

Table 13. Variants of binomial number sequences

n = 6, k = 4
x    Bx       x    Bx
0    00       10   11010
1    010      11   11011
2    0110     12   11100
3    01110    13   11101
4    01111    14   1111
5    100
6    1010
7    10110
8    10111
9    1100

n = 6, k = 2
x    Bx       x    Bx
0    0000     10   10000
1    00010    11   10001
2    00011    12   1001
3    00100    13   101
4    00101    14   11
5    0011
6    01000
7    01001
8    0101
9    011

n = 6, k = 3
x    Bx       x    Bx
0    000      10   1000
1    0010     11   10010
2    00110    12   10011
3    00111    13   10100
4    0100     14   10101
5    01010    15   1011
6    01011    16   11000
7    01100    17   11001
8    01101    18   1101
9    0111     19   111

Denote by $B(n, k)$ the sequence of binomial numbers created by Borysenko's algorithm. Analysis of Tab. 13 leads to the following conclusion.

Statement 6. The direct and inverse binomial sequences are linked by the relation

$$B(n, k) = \bar{B}^R(n, n-k),$$

where $\bar{B}^R(n, n-k)$ is the sequence of binomial codes obtained from $B(n, n-k)$ by, first, writing its codes in reverse order and, second, inverting every position of each code (i.e. replacing 0 with 1 and vice versa).

Let us find out whether quasi-equidistant codes $B_1(n, k)$ can be created from the set of binomial numbers $B(n, k)$.
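Statement 6 can be checked numerically on the sequences of Tab. 13. The sketch below is an illustration under an assumption read off that table, not Borysenko's published algorithm: a non-uniform binomial code word is taken to be complete as soon as it contains $k$ ones or $n - k$ zeros, and the words are listed in lexicographic order. The names `binomial_codes`, `invert` and `parity_counts` are hypothetical.

```python
def binomial_codes(n, k):
    """Enumerate the non-uniform binomial codes B(n, k) in lexicographic order.

    Assumption (read off Tab. 13): a word is complete once it holds
    k ones or n - k zeros.
    """
    codes = []

    def extend(word, ones, zeros):
        if ones == k or zeros == n - k:
            codes.append(word)
            return
        extend(word + "0", ones, zeros + 1)
        extend(word + "1", ones + 1, zeros)

    extend("", 0, 0)
    return codes


def invert(word):
    """Replace every 0 with 1 and every 1 with 0."""
    return word.translate(str.maketrans("01", "10"))


def parity_counts(codes):
    """Return (n_e, n_o): numbers of code words of even and odd weight."""
    n_even = sum(1 for w in codes if w.count("1") % 2 == 0)
    return n_even, len(codes) - n_even


b64 = binomial_codes(6, 4)
b62 = binomial_codes(6, 2)
print(b64)
# Statement 6: B(n, k) is B(n, n - k) taken in reverse order with inverted bits.
print(b64 == [invert(w) for w in reversed(b62)])
print(parity_counts(b64), parity_counts(b62), parity_counts(binomial_codes(6, 3)))
```

The parity counts printed in the last line make it easy to compare $n_e$ and $n_o$ against inequality (3) for any pair of parameters $n$ and $k$.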
For this purpose, using the data of Tab. 13, we build a table of the distribution of the weights of the code combinations included in the set $B(n, k)$ (Tab. 14). From the data of Tab. 14, and from a comparison of the values $n_e$ and $n_o$ obtained for many other parameters $n$ and $k$, we conclude that inequality (3) does not hold for the codes $B(n, k)$ and, as a consequence, the following is true.

Table 14. Distribution of the weights of the code combinations $B(n, k)$

Weight of code combination   B(6, 4)   B(6, 2)   B(6, 3)
0                            1         1         1
1                            2         4         3
2                            3         10        6
3                            4         -         10
4                            5         -         -
n_e                          9         11        7
n_o                          6         4         13
All together                 15        15        20

Statement 7. Binomial codes do not form quasi-equidistant sequences.

Let us turn to the creation of alternative binomial codes. We introduce the numerical function

$$B = \alpha_n C_n^{\alpha_n} + \alpha_{n-1} C_{n-1}^{\alpha_{n-1}} + \dots + \alpha_k C_k^{\alpha_k} + \dots + \alpha_1 C_1^{\alpha_1}, \qquad (16)$$

where

$$C_k^l = \binom{k}{l} = \frac{k (k-1) \cdots (k+1-l)}{l!}$$

is the binomial coefficient, equal to the number of combinations of $k$ elements taken $l$ at a time. The coefficients $\alpha_k$ take the values $\alpha_k = 0, \dots, \lceil k/2 \rceil$, where $\lceil x \rceil$ denotes rounding of the number $x$ up to the nearest integer. Series (16) is written in terms of the binary coefficients $\alpha_k$, each of which is assigned a limited number of positions, equal to the number of digits required for the binary representation of the value $\lceil k/2 \rceil$. The coefficient $\alpha_k$ unambiguously defines the value of the monomial $\alpha_k C_k^{\alpha_k}$, as shown in Tab. 15 (where the value $k = 7$ is chosen as an example).

Table 15. An example of calculating a monomial of series (16)

α_7               0    1    2     3     4
C_7^{α_7}         1    7    21    35    35
α_7 C_7^{α_7}     0    7    42    105   140

For the sequence of binomial codes created by numerical function (16) we introduce the notation $B(n, r)$, in which the parameter $n$ is called the power of the function and $r$ the order of the function, equal to the coefficient $\alpha_n$. A fragment of the binomial codes is shown in Tab. 16.

Table 16.
The sequence of binomial numbers $B(4, 2)$ (the code digits are the binary fields of $\alpha_4$, $\alpha_3$, $\alpha_2$, $\alpha_1$, leading zeros omitted)

N   Code     N    Code
0   0        10   10111
1   1        11   11001
2   10       12   11010
3   11       13   11011
4   101      14   100010
5   110      15   100011
6   111      16   100101
7   1001     17   100110
8   1010     18   100111
9   1011     19   101001
             20   101010
             21   101011

To decide whether quasi-equidistant binomial sequences can be created, we build a table of the weights of the set of code combinations (Tab. 17).

Table 17. Distribution of the weights of the code combinations $B_1(n, r)$

Weight   Number of digits of the binomial sequence
         3    4    5    6    7    8    9    10
0        1    1    1    1    1    1    1    1
1        2    2    2    2    2    2    2    2
2        3    5    5    6    6    6    6    6
3        1    2    4    9    9    12   12   12
4        -    -    2    4    7    15   15   17
5        -    -    -    -    2    12   12   23
6        -    -    -    -    -    4    8    29
7        -    -    -    -    -    -    2    18
8        -    -    -    -    -    -    -    4
Even     4    6    8    11   14   26   30   57
Odd      3    4    6    11   13   26   28   55
In all   7    10   14   22   27   52   58   112
Sign     +    –    –    +    +    +    –    –

As an example, Tab. 18 shows the results of creating quasi-equidistant codes by the method of direct enumeration, based on one of the trees for $B(4, 2)$.

Table 18.
Results of computer synthesis of the codes $B_1(4, 2)$

Tier   Tree   Branch of the tree
              1       2       3       4       5       6       7       8
0      0      000000  000000  000000  000000  000000  000000  000000  000000
1      1      000001  000001  000001  000001  000001  000001  000001  000001
2      2      000101  000101  000101  000101  000101  000101  000101  000101
3      3      100101  100101  100101  100101  100101  100101  100101  100101
4      4      100111  100111  100111  100111  100111  100111  100111  100111
5      3      100011  100011  100110  100110  100110  100110  100110  100110
6      2      000011  100010  100010  100010  100010  100010  100010  100010
7      3      001011  101010  100011  100011  101010  101010  101010  101010
8      2      001010  001010  000011  000011  001010  001010  001010  001010
9      3      011010  011010  001011  001011  001011  011010  011010  011010
10     4      011011  011011  011011  101011  011011  011011  011011  011011
11     3      011001  011001  011001  101001  011001  001011  011001  011001
12     2      001001  001001  001001  001001  001001  001001  001001  001001
13     3      101001  101001  101001  011001  101001  101001  101001  101001
14     4      101011  101011  101011  011011  101011  101011  101011  101011
15     3      101010  001011  101010  011010  100011  100011  100011  001011
16     2      100010  000011  001010  001010  000011  000011  000011  000011
17     1      000010  000010  000010  000010  000010  000010  000010  000010
18     2      000110  000110  000110  000110  000110  000110  000110  000110
19     1      000010  000010  000010  000010  000010  000010  000010  000010
20     2      000110  000110  000110  000110  000110  000110  000110  000110
21     3      000111  000111  000111  000111  000111  000111  000111  000111
22     4      010111  010111  010111  010111  010111  010111  010111  010111
23     3      010110  010110  010110  010110  010110  010110  010110  010110

A feature of the alternative binomial codes is that they do not allow quasi-equidistant codes to be created in full, as can be seen from Tab. 18. In particular, in all the sequences shown in Tab. 18 the last codes lie at a Hamming distance of 3, rather than 1, from the preceding codes, as would be required for the sequence $B_1(4, 2)$.
This feature of the alternative binomial codes appears in all possible variants $B_1(n, r)$.

6 Conclusions

The main result of this research is the formulation of generalized conditions for the existence of quasi-equidistant and reflected codes formed by uniform sequential binary code combinations in mixed numeration systems. Besides Gray codes, the set of such codes includes the Fibonacci, factorial and binomial codes in which the Hamming distance between adjacent code combinations equals 1. The main method of synthesizing quasi-equidistant codes is computer-based direct enumeration. The results of this research can be generalized to cases in which the Hamming distance is greater than 1.

References

1. Shannon, C. E.: A Mathematical Theory of Communication. Bell Syst. Tech. J., 27, 379–423, 623–656 (1948)
2. Efimenko, V. V., Karpjuk, B. V., Stukalin, Iu. A.: An Algorithm for Synthesis of Binary Quasi Equidistant Codes. Journal of Acad. Science, USSR, AVTOMETRIJA, 5, 109–115 (1968) (In Russian)
3. Bogdanov, G. T., Zinovjev, V. A., Todorov, T. J.: On the Construction of Quasi Equidistant Codes. Journal of Problems of Information Transmission, 43(4), 13–36 (2007) (In Russian)
4. Beletsky, A. Y., Beletsky, E. A.: Quasi Equidistant Codes. NAU Publishing, Kiev (2008) (In Russian)
5. Banja, E. N., Selivanov, V. L.: About the Features of the Construction of Various Number Systems. Journal of NTUU "KPI" Informatics, Management and Computer Science, 49, 68–73 (2008) (In Russian)
6. Borysenko, A. A., Cherednychenko, V. B.: Number Systems in Computing. Bulletin of the SSU, Engineering Series, 4, 162–177 (2009) (In Russian)
7. Gray, F.: Pulse Code Communication. Pat. USA, № 2632058 (1953)
8. Borysenko, A. A.: Discrete Mathematics. Textbook. Publishing house SSU (2007) (In Russian)
9. Stahov, A. P.: Codes of Golden Proportion. Radio Communication, Moscow (1984) (In Russian)
10.
Stahov, A. P.: Fibonacci Codes, http://goldenmuseum.com/1010FibCodes_rus.html
11. Zanten, A. Ja.: Binomial System and Enumerations of Combinatorial Objects. Journal of Discrete Analysis and Operations Research, Series 1, 6, 12–18 (1999) (In Russian)
12. Borysenko, A. A.: Binomial Count. Theory and Practice. Publishing house SSU (2004) (In Russian)

Mechanism Design for Foreign Producers of Unique Homogeneity Product

Vitaliy Kobets1

1 Kherson State University, 1, 40 rokiv Zhovtnya Street, 73000, Kherson, Ukraine
vkobets@kse.org.ua

Abstract. The paper concerns the impact on customs receipts of the customs house changing the duty rate for foreign producers from a single rate to differentiated ones. To obtain maximal customs receipts in pursuit of a social goal, the state may introduce differentiated duty rates for foreign producers of a unique product. The success of this state policy will depend on the effectiveness of the incentive compatibility conditions for these producers.

Keywords. Mechanism design, single duty, differentiated duty, custom policy, social choice function

Key terms. MechanismDesign, RevelationPrinciple, SocialGoal, MathematicalModel, IncentiveCompatible

1 Introduction

A mechanism is a mathematical structure that models an institution: it defines the set of rules that govern the actions available to the participants and determines how the participants' strategies in a given communication system are transformed into outcomes. In the absence of a cooperation mechanism between the participants, the final outcome can differ substantially from the socially optimal one. A mechanism implements a given objective function, realizing it on the space of the participants' types [2; 8].

The structure of a mechanism includes [7]:
1. Social choice function (SCF, the final result demanded by society);
2. Implementation mechanism (realization of the SCF by payoffs and by functions distributing product and money);
3. Revelation mechanism of the participants' types (by a social planner);
4.
Motivating mechanism (intended to create the conditions under which participants reveal true information about their types [6]).

The objective function F is a composition of the messages and the result h (Fig. 1).

Fig. 1. Mechanism M and objective function F

A mechanism can enforce the rules of cooperation when participants take actions that violate the established rules [4]. The two extreme types of mechanism are the centralized one (a planned system) and the decentralized one (such as a competitive market); between them lies a continuum of other mechanisms.

The decentralized mechanism (a confidentiality-preserving mechanism) implies private expenditure (on collecting information and verifying its reliability) [1]. Existing mechanisms can be complemented or replaced by new ones, for example by means of changes in legislation.

Reasons for introducing a new mechanism:
- Unsatisfactory aspects of the activity of existing economic systems or institutions are revealed (market failures)
- The established economic system gives an advantage only to certain participants

Tasks of a mechanism:
- To ground a social choice function with the characteristics desirable for society
- To develop incentive-compatible conditions under which participants reveal their true types (reservation price, costs etc.)
- To implement the social choice function by means of the chosen mechanism (direct or indirect)

Direct mechanisms provide for the direct transfer by the agents to the public planner of truthful private information about their types (not a realistic mechanism). Indirect mechanisms create incentives under which it is more profitable for the agents to reveal true information than to conceal or distort it (a more realistic mechanism) [10].

In organizing a customs mechanism, as with any other, the participants are the social planner (government) and the agents (payers of the customs tax).
The agent is a selfish person who has private information only about his or her own type (for example, personal income, costs, profit). The economic environment is an exogenous variable, given by nature or inherited from previous periods (type of competition, technology, rules of customs policy). In the model neither the agents nor the mechanism designer know the prevailing environment. The mechanism designer knows: (i) the class of environments for which the mechanism should be developed; (ii) the desired criteria for the SCF [5; 9].

The SCF represents the criteria for evaluating the result, rather than a means of achieving the goal, as a mechanism does.

For the customs house, the SCF maps the space of types (the importers' average costs of production) into the space of results (customs receipts). The participant's type (average costs) defines its message (the invoice cost of the goods), which produces the final result (customs receipts).

Thus the purpose of a customs mechanism can be the maximization of receipts from customs taxes to the state budget, subject to the creation of an appropriate motivating system for the importers (increase of the invoice price, a preferential duty rate regime).

This paper is structured as follows. The literature is reviewed in this first part. The problem statement and the basic assumptions of the model are presented in the second part. Part 3 deals with the main results for the participants under fixed and differentiated duty rates. The last part concludes.

2 Problem Statement

The search for effective ways of replenishing the state budget by means of indirect taxes requires the introduction of flexible customs duties for foreign producers, provided for by the relevant government laws on customs payments. The criteria by which the state seeks to set the duty rate on imported commodities are the protectionism principle for domestic producers, the profitability principle for the state and the utility principle for domestic consumers.
A peculiarity of optimizing the import duty rate is that foreign producers, forming a competitive domestic market, maximize their own profits taking the market price into account [3], whereas for the state the amount of duty depends on the invoice value of the commodity, which the customs house can adjust upward and which has to correspond to the prevailing (equilibrium) market price.

The product invoice price indicates the value of the commodity that crosses the customs border of Ukraine. If the invoice price stated in the freight customs declaration is below the average price in the database of the Government customs service of Ukraine, the product price is raised to the average level before the product receives customs clearance. Duty and VAT, which are transferred to the budget, are calculated from the customs value of the product. The duty is calculated as follows:

$$B = t \cdot TRN,$$

where $B$ is the duty sum; $TRN$ is the part of the product invoice price that exceeds the untaxed amount (in UAH); $t$ is the duty rate (in per cent of the product invoice price), which in Ukraine is currently 10%.

To construct a model that describes the interaction of foreign producers and customs, we assume:
* $n$ foreign firms produce a homogeneous product, which is supplied to the domestic market and has no domestic analogues;
* the firms compete in quantities (Cournot competition);
* the cost functions of all firms are linear in output (constant returns to scale), and the inverse domestic demand function is linear in the quantity of foreign products;
* information about the average costs of foreign firms and domestic market demand is uniformly (symmetrically) distributed among all participants (foreign producers, domestic consumers and government).
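The duty calculation described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and applying the untaxed amount to the adjusted customs value (rather than to the declared invoice price) is an assumption about how the two rules combine.

```python
def customs_duty(invoice_price, reference_price, untaxed_amount, duty_rate=0.10):
    """Duty B = t * TRN for one consignment (all money values in UAH).

    Hypothetical helper: the names and the way the two rules are combined
    are illustrative assumptions, not taken from customs legislation.
    """
    # An invoice price below the reference (average) price is raised to that
    # average before clearance; a higher invoice price is used as declared.
    customs_value = max(invoice_price, reference_price)
    # TRN: the part of the value that exceeds the untaxed amount (never negative).
    taxable_part = max(0.0, customs_value - untaxed_amount)
    return duty_rate * taxable_part


# Under-declared invoice: the reference price of 1000 is used instead of 900.
duty_low_invoice = customs_duty(900.0, 1000.0, 200.0)
# Invoice above the reference price: the declared 1200 is used.
duty_high_invoice = customs_duty(1200.0, 1000.0, 200.0)
print(duty_low_invoice, duty_high_invoice)
```

The default rate of 0.10 mirrors the 10% rate quoted in the text; every other number in the usage lines is made up for illustration.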
Participants’ objective functions:

1. Foreign producers. The total cost of producer $i$ consists of variable cost (fixed cost is assumed to be zero in the long run; $v_i$ is the average (variable) cost of producer $i$) and the duty sum:

$$TC_i^F = v_i q_i + t\,P\,q_i,$$

where $P_f = P$ is the invoice price of a unit of product. The profit of producer $i$ is $\pi_i^F = P q_i - v_i q_i - t P q_i$, or

$$\pi_i^F = (1-t)\,P\,q_i - v_i q_i \to \max_{q_i \ge 0}, \quad i = 1,\dots,n,$$

where $t$ is an endogenous variable, i.e. the duty rate determined by the government.

2. State (Ukrainian Income and Duty Ministry). Tax proceeds to the state budget are

$$B = t\,P_f \sum_{i=1}^{n} q_i.$$

3. Domestic market. The inverse linear function of domestic demand is

$$P = b - c\,Q = b - c\sum_{i=1}^{n} q_i,$$

where $P$ is the market price of the product, $b$ is the maximal price of the foreign product on the domestic market (under zero import supply), and $c$ is the slope of the inverse demand function.

3 Results

3.1 Customs Receipts Model Construction for Fixed Duty Rate

3.1.1 Producer profit maximization

After substituting the market price into the profit function of producer $i$ we obtain:

$$\pi_i^F = (1-t)\Bigl(b - c\sum_{j=1}^{n} q_j\Bigr) q_i - v_i q_i, \quad i = 1,\dots,n.$$

The first order condition (FOC) for profit gives:

$$\frac{\partial \pi_i^F}{\partial q_i} = (1-t)\Bigl(b - 2c\,q_i - c\sum_{j\ne i} q_j\Bigr) - v_i = 0.$$

Similarly we get the partial derivatives of the profit functions of all producers. Algebraic transformation yields:

$$\begin{cases}
2q_1 + q_2 + \dots + q_n = \dfrac{b}{c} - \dfrac{v_1}{c\,(1-t)},\\[4pt]
q_1 + 2q_2 + \dots + q_n = \dfrac{b}{c} - \dfrac{v_2}{c\,(1-t)},\\[4pt]
\dots\\[4pt]
q_1 + q_2 + \dots + 2q_n = \dfrac{b}{c} - \dfrac{v_n}{c\,(1-t)}.
\end{cases}$$

Solving the system by the matrix approach gives the optimal sales of each foreign producer (duty rate $0 \le t < 1$):

$$q_j = \frac{1}{(n+1)\,c}\left(b - \frac{(n+1)\,v_j - n\,\bar v}{1-t}\right), \quad j = 1,\dots,n, \qquad (1)$$

where $\bar v = \frac{1}{n}\sum_{i=1}^{n} v_i$ is the average product cost of all foreign producers. If in equation (1) the average cost of producer $j$ is below the average cost of all producers (more precisely, by (1), if $(n+1)\,v_j < n\,\bar v$), then its optimal sales rise as the duty rate $t$ increases. Conversely, if $v_j > \bar v$, the optimal sales of producer $j$ decrease. Using (1), the total quantity supplied by foreign producers equals:

$$Q = \sum_{j=1}^{n} q_j = \frac{n\,\bigl[(1-t)\,b - \bar v\bigr]}{c\,(n+1)\,(1-t)}. \qquad (2)$$

Thus growth of the duty rate always leads to a decrease in the total quantity of the unique good supplied by foreign producers to the domestic market: $\dfrac{dQ}{dt} < 0$.

3.1.2 Budget Customs Receipts Maximization

After substituting the total import quantity from expression (2), the receipts from customs duties on foreign producers are:

$$B = t\,P_f\,Q \to \max_{t \ge 0},$$

where $P_f = \mathrm{const}$ is the product unit invoice price. The first order condition for maximizing customs receipts is $\dfrac{dB}{dt} = 0$, which is equivalent to the equation $b\,t^2 - 2b\,t + b - \bar v = 0$; hence the equilibrium duty rate equals:

$$t^* = 1 - \sqrt{\frac{\bar v}{b}}. \qquad (3)$$

The equilibrium uniform duty rate (3) is inversely related to the average cost of all foreign producers and directly related to the maximal domestic product price. The invoice price of the product is set at the level $P = b - c\,Q$, so taking into account (2) and (3) we get the equilibrium invoice price and sales, respectively:

$$P_f^* = \frac{\sqrt b\,\bigl(\sqrt b + n\sqrt{\bar v}\bigr)}{n+1}, \qquad Q^* = \frac{n\,\sqrt b\,\bigl(\sqrt b - \sqrt{\bar v}\bigr)}{(n+1)\,c}.$$

Further, from the expression $B = t\,P_f\,Q$ we find that the equilibrium (maximal) customs sum is:

$$B^* = \frac{n\,\sqrt b\,\bigl(\sqrt b - \sqrt{\bar v}\bigr)^2\bigl(\sqrt b + n\sqrt{\bar v}\bigr)}{c\,(n+1)^2}. \qquad (4)$$

The dependence between the equilibrium duty rate and customs receipts is shown in fig. 2.

[Plot: "Dependence of custom receipt on duty rate"; horizontal axis: duty rate, %; vertical axis: customs receipts; series B and Bd]

Fig. 2. Laffer curve: dependence between customs receipts and duty rate (n=10, b=40, v=5.25, c=0.01, P=24.04), t=73%.

3.2 Customs Receipts Model Construction for Differentiated Duty Rate

3.2.1 Producer profit maximization

The profit of foreign producer $i$ is given by the expression:

$$\pi_i^F = (1-t_i)\,P\,q_i - v_i q_i \to \max_{q_i \ge 0},$$

where $t_i$ is the differentiated duty rate for foreign producer $i$. The FOC for the profit function gives ($i = 1,\dots,n$):

$$\frac{\partial \pi_i^F}{\partial q_i} = (1-t_i)\Bigl(b - 2c\,q_i - c\sum_{j\ne i} q_j\Bigr) - v_i = 0.$$

Similarly we obtain the partial derivatives of the profit functions of the other foreign producers:

$$\begin{cases}
2q_1 + q_2 + \dots + q_n = \dfrac{b}{c} - \dfrac{v_1}{c\,(1-t_1)},\\[4pt]
q_1 + 2q_2 + \dots + q_n = \dfrac{b}{c} - \dfrac{v_2}{c\,(1-t_2)},\\[4pt]
\dots\\[4pt]
q_1 + q_2 + \dots + 2q_n = \dfrac{b}{c} - \dfrac{v_n}{c\,(1-t_n)}.
\end{cases}$$

Solving the system by the matrix approach gives the optimal sales of foreign producers on the domestic market (duty rates $0 \le t_i < 1$, $i = 1,\dots,n$):

$$q_j = \frac{1}{(n+1)\,c}\left(b - \frac{n\,v_j}{1-t_j} + \sum_{i\ne j}\frac{v_i}{1-t_i}\right), \quad j = 1,\dots,n. \qquad (5)$$

3.2.2 Budget customs receipts maximization

Receipts from differentiated duty rates on foreign producers of a homogeneous product equal:

$$B_d = P_f \sum_{i=1}^{n} t_i\,q_i \to \max_{t_i \ge 0,\; i = 1,\dots,n},$$

where $P_f = \mathrm{const}$ is the invoice price per unit of product for foreign producers. The FOC for maximizing customs receipts is defined by the following $n$ conditions: $\dfrac{\partial B_d}{\partial t_i} = 0$, where $i = 1,\dots,n$. After equivalent algebraic transformations, the resulting system of $n$ equations in $n$ unknown duty rates $t_i$ defines the reaction curves (6), $t_i = f_i(t_{-i})$, which show how the duty rate $t_i$ of the $i$-th producer depends on the duty rates $t_{-i}$ of all its rivals. In this function the duty rates for foreign producers have to change in the same direction: increasing the optimal duty rate for one of the producers requires raising the duty rates for all other foreign producers.

$$t_i = 1 - \sqrt{\frac{v_i\,\Bigl(1 + \sum_{j\ne i}(1-t_j)\Bigr)}{b + \sum_{j\ne i}\dfrac{v_j}{1-t_j}}}, \quad i = 1,\dots,n. \qquad (6)$$

This adjustment of duty rates proceeds until the equilibrium value of each duty rate is reached. Solving the system of $n$ equations formed from functions (6) gives the following equilibrium duty rates for foreign producers:

$$t_i^* = 1 - \sqrt{\frac{v_i}{b}}, \quad i = 1,\dots,n. \qquad (7)$$

The result shows an inverse dependence between average cost and the duty rate on the imported product, $\dfrac{dt_i}{dv_i} < 0$, and a direct dependence between the maximal price on the domestic market and the duty rate.
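The closed-form results of Section 3 can be checked numerically. The sketch below is a minimal illustration, not the author's computation: the cost vector and the values of b, c and P_f are hypothetical, and the reaction curve is coded in the form t_i = 1 - sqrt(v_i (1 + sum_{j≠i} (1 - t_j)) / (b + sum_{j≠i} v_j / (1 - t_j))), which is the form implied by the first-order conditions for B_d. It verifies that the quantities from (1) satisfy every producer's first-order condition, that a grid search over t lands on the rate given by (3), and that the rates from (7) are a fixed point of the reaction curves.

```python
import math


def cournot_quantities(costs, b, c, t):
    """Closed-form sales, equation (1), under a uniform duty rate t."""
    n = len(costs)
    v_bar = sum(costs) / n
    return [(b - ((n + 1) * v - n * v_bar) / (1 - t)) / ((n + 1) * c)
            for v in costs]


def foc_residuals(quantities, costs, b, c, t):
    """Value of each producer's first-order condition; zero at equilibrium."""
    total = sum(quantities)
    # b - 2c*q_i - c*sum_{j != i} q_j equals b - c*q_i - c*Q
    return [(1 - t) * (b - c * q - c * total) - v
            for q, v in zip(quantities, costs)]


def receipts(costs, b, c, t, invoice_price):
    """Customs receipts B = t * P_f * Q with the invoice price held constant."""
    return t * invoice_price * sum(cournot_quantities(costs, b, c, t))


def reaction_rate(i, rates, costs, b):
    """Reaction curve (6): rate for producer i given the rivals' rates."""
    rivals = [j for j in range(len(costs)) if j != i]
    numerator = costs[i] * (1 + sum(1 - rates[j] for j in rivals))
    denominator = b + sum(costs[j] / (1 - rates[j]) for j in rivals)
    return 1 - math.sqrt(numerator / denominator)


# Hypothetical illustration values (not the parameters of fig. 2).
costs = [4.0, 5.0, 6.0]
b, c, p_f = 40.0, 0.01, 20.0

# Uniform rate: equation (3) versus a brute-force grid search over t.
t_star = 1 - math.sqrt(sum(costs) / len(costs) / b)
grid = [k / 10000 for k in range(1, 7000)]
t_grid = max(grid, key=lambda t: receipts(costs, b, c, t, p_f))

# Quantities from (1) should make every first-order condition vanish.
residuals = foc_residuals(cournot_quantities(costs, b, c, t_star),
                          costs, b, c, t_star)

# Differentiated rates: equation (7) should reproduce itself under (6).
rates_7 = [1 - math.sqrt(v / b) for v in costs]
fixed_point_gap = max(abs(reaction_rate(i, rates_7, costs, b) - rates_7[i])
                      for i in range(len(costs)))

print(round(t_star, 4), t_grid, max(abs(r) for r in residuals), fixed_point_gap)
```

The grid is cut off at 0.7 so that all three hypothetical producers keep positive sales over the whole search range; with other cost vectors the cut-off would need to be chosen accordingly.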
From expression (7) it follows that, in order to maximize customs receipts, more effective producers (with average cost lower than the industry average cost) will be charged a higher duty rate than less effective ones.

Using expression (7) and the linear function of domestic demand, the equilibrium invoice price and quantity sold are set at the following levels:

$$P_{fd}^* = \frac{\sqrt b\,\Bigl(\sqrt b + \sum_{i=1}^{n}\sqrt{v_i}\Bigr)}{n+1}, \qquad Q_d^* = \frac{\sqrt b\,\Bigl(n\sqrt b - \sum_{i=1}^{n}\sqrt{v_i}\Bigr)}{(n+1)\,c}.$$

An important circumstance is that after differentiation of duty rates for the foreign producers, imports of the product to the domestic market will drop, $Q_d^* < Q^*$, which will result in a higher price for consumers. In addition, the possibility of charging lower duty rates to one producer and higher rates to others will generate corruption. To prevent it, objective indicators for differentiating these rates are necessary. Introducing differentiated rates is expedient only if customs receipts increase, $B_d > B$, in comparison with the fixed duty rate (fig. 3).

[Plot: customs receipts Bd and B (80000 to 82500) against duty rate (71% to 76%)]

Fig. 3. Comparison of the influence of the differentiated and fixed duty rates on customs receipts (n=10, b=40, v=5.25, c=0.01, P=24.04)

4 Conclusions

If the differentiation of tariffs is based on public accountants' reports from audits of firms' financial records concerning the formation of total cost, the customs house will have fewer opportunities for unfounded differentiation of duty rates.

At the same time, more effective producers will be interested in not disclosing information about their true total cost in order to lower their duty rates. Less effective producers, conversely, will have motives to reveal their total cost, which is above the average cost per unit.
Thus if less effective producers prove that the effective ones gave false information, this will become grounds for raising duty rates for more effective producers and lowering the duty rate for less effective ones. Such a customs policy will allow the state to shift the transaction cost of obtaining information about producers from itself onto less effective producers. For their part, more effective producers will have motives to prove that less effective producers overstate their inefficiency.

Collusion among all foreign producers on non-disclosure of information about their own costs will be highly unlikely when the number of foreign producers grows, and highly probable when firm concentration is high.

To obtain maximal customs receipts in pursuit of the social aim, the state may implement differentiated duty rates for foreign producers of a unique product, which depend on the producers’ costs. The success of this state policy will depend on the effectiveness of incentive compatibility conditions for these producers, which means extracting true information about costs from foreign producers by means of a cross-sectional audit of firms.

References

1. Mookherjee, D.: Decentralization, Hierarchies, and Incentives: A Mechanism Design Perspective. Journal of Economic Literature, 44, 367–390 (2006)

2. Jehle, G. A., Reny, P. J.: Advanced Microeconomic Theory. Prentice Hall, New York (2005)

3. Williamson, O. E.: Markets and Hierarchies: Analysis and Antitrust Implications. Free Press, New York (1975)

4. Maskin, E.: Mechanism Design: How to Achieve Social Goals. HSE, Moscow (2009) (In Russian)

5. Nikolenko, S. I.: Mechanism Design Theory. Binom, Moscow (2009) (In Russian)

6. Archibald, G. C.: Information, Incentives and the Economics of Control. Cambridge University Press, London (2005)

7. Narahari, Y., Garg, D., Narayanam, R., Prakash, H.: Game Theoretic Problems in Network Economics and Mechanism Design Solutions. Springer Series in Advanced Information and Knowledge Processing (AIKP). Springer-Verlag London Limited, London (2009)

8. Hurwicz, L., Reiter, S.: Designing Economic Mechanisms. Cambridge University Press, Cambridge (2006)

9. Izmalkov, S., Sonin, K., Yudkevich, M.: Mechanism Design Theory. Questions of Economics, 1, 4–26 (2008) (In Russian)

10. Bergemann, D., Morris, S.: Robust Mechanism Design. Econometrica, 73, 1771–1813 (2005)

11. Myerson, R.: Game Theory: Analysis of Conflict. Harvard University Press, Cambridge (1997)

Features of National Welfare Innovative Potential Parametric Indication Information-Analytical Tools System in the Globalization Trends’ Context

Elena Lazareva

Southern Federal University, 105, str. Bolshaya Sadovaya, 344006, Rostov-on-Don, Russia

el_lazareva@mail.ru

Abstract. The article exposes the innovation-reproductive and rent-generating function of national welfare, argues the necessity of revising the methodology and tools for look-ahead analytical estimates of the innovative potential of national welfare in the context of globalization trends and proposes practical ways of doing so, and presents a complex analysis, based on the author's set of instruments, of the results of system parametric indication of the strategy of national welfare development in the interests of innovative economic growth.

Keywords. National welfare, innovation as a new form of combining industrial, intellectual and social resources; innovation rent; information-analytical tools system

Key terms. NationalWelfare, CorporateModel, SpatialStrategy

1 Introduction

In the context of the modern model of economic development, the essence of national welfare is expressed in new aspects: it is not only the accumulated result of the repeated reproduction process, but is also converted into an integrated resource-factor of innovation-oriented economic growth.
This conversion is connected with the movement of world and national economic systems towards an innovative «knowledge economy», the shift of the center of gravity of competition to the sphere of science, education and innovative activity, and the increasing role of intangible assets in the economic reproduction process.

Resource-rich countries have an export-raw-material model of economy. Compared with other countries, their development is characterized by an uneven, spasmodic rate, caused mainly by raw material prices and economic instability. Such development is inevitably accompanied by problems which hinder economic modernization and its social and innovative orientation. By contrast, the development of countries that pursue a policy of increasing the quality of human capital, national well-being and high technologies provides advantages in world socio-economic evolution and raises the competitiveness of the national «intellectual» economy.

The increasing importance of the quality of human development for generating economic growth and competitiveness prompted growing interest among economists in the role of the subjective factor (human capital) in the progress of production. This gradually promoted the inclusion of national welfare parameters (at first individual, chiefly economic ones; later social, public ones) into the system of research on the resource supply of economic dynamics. Globalization, accompanied by substantial capital mobility and the increasing openness of national economies, transforms the economic content of national welfare and the forms in which it manifests itself in the reproduction process, and modernizes its structure and functions in the conditions of transition to an innovation-focused economy.

These tendencies are reflected in the new methodology for researching long-term economic trends, a methodology which considers the interests of society and of the economy equally. National welfare becomes the major element of the productive forces and the integrated institutional condition for the reproduction of human capital.
The world financial-economic crisis, which showed the critical dependence of national economies upon mobile global resources (financial, information and technological resources) and, in particular, exposed the fact that the dynamics of Russian GDP is still to a considerable extent determined by external market factors, made topical the problem of finding internal, innovative resources for development. The present situation requires internal innovative resources of social-economic development, first of all the existing resources of national welfare, whose reserves in Russia are still not fully used because the institutions for converting them into competitive factors of production and putting them to active use are underdeveloped.

In this light, special topicality attaches to the technical-methodological analysis of national welfare resources in the system of global competitive resources, to determining their role in the process of social innovative reproduction and the conditions and mechanisms of their conversion into innovative factors of production, and to the integrated evaluation of the human capital of the country, its efficient use, and a higher rate of innovation-oriented development of economic subjects and of the economy overall.

For these reasons, the strategy for the economy-oriented modernization of economic subjects' policies has to be based upon the evolutional-cyclical, informational-innovative paradigm of economic development theory and upon resource analysis, according to which national welfare in the postindustrial society plays the role of an integrated resource for innovative economic trends. One observes not only a different nature of the contribution of national welfare to the reproduction process, but also a different composition: apart from traditional material elements, which have a cost measurement (income level, volume and structure of the personal consumption fund etc.), greater importance attaches to its social elements (level and quality of education of the population, level of its health, housing conditions, degree of security within the society, quality of the social-ecological environment, social capital, social-economic mentality, condition of general and spiritual culture in the society, the set of symbolic benefits etc.), which have no market value and often have the nature of social benefits, i.e. they create the general social conditions for personal fulfillment and creative freedom.

In accordance with the above, the innovative social-economic policies of economic subjects have to comprise not only strategies and mechanisms for direct support of innovative processes, but also the creation of person-oriented, comfortable general social conditions for innovation-oriented economic development, realized in the form of welfare and better living standards and ensuring efficient reproduction of human capital [1].

An orientation towards greater human and social capital and, consequently, greater investment in anthropo-social capital in the process of social reproduction of intellectual resources as integral parts of national welfare constitutes the basis for forming its innovative, resource-reproducing functions. It follows that the principal problem of continued innovative economic development lies in the social-economic mechanisms for converting national welfare into innovation-initiating human and social resources, the factors of production inherent in the “economy of knowledge”.
The economic essence of this institutional conversion of national welfare components, human and social benefits, into innovative resources is the capitalization of their comparative advantages (competitiveness) within the framework of countries’ integration into world economic relations, i.e. the transformation of these advantages into a source of added value and into objects of the innovative activities of global companies, businesses, integrated structures and states.

The innovative rent, which is received through the reproduction of national welfare and its conversion into economic resources of innovative-intellectual production, constitutes the economic basis of innovation-oriented development. Within the framework of cluster theory and the “network economy”, the innovative economic rent is the result of the efficient use of the national welfare components located in the country. The variety of categories of innovative rent is due to the different categories of benefits (resources of national welfare) which constitute the source of rent formation.

The research is based upon the works of such founders of the theory of innovation-oriented economic development as D. Bell, A. Buzgalin, V. Inozemtsev, N. Kondratjev, S. Kuznets, B. Kuzyk, G. Mensch, B. Milner, R. Nizhegorodtsev, D. North, V. Ovchinnikov, J. Osipov, D. Tis, E. Toffler, J. Schumpeter, J. Jakovets etc.

The scientific basis of welfare economics for evaluating national welfare potential in the system of innovation-oriented economic development is considered in the works of the following authors: J. Bentham, S. Valentej, L. Walras, A. Marshall, L. Nesterov, V. Pareto, A. Pigou, A. Smith, J. Hicks, L. Erhard etc., who analyze the problems of the value of benefits, of wealth formation and its distribution, the conditions for market balance as a principal factor of social welfare, and the problems of harmonizing individual and social welfare according to different criteria. D. Buchanan, J. Galbraith, J. Mill, W. Eucken, J. Rawls, V. Cherkovets, R. Ehrenberg etc. analyze a great number of social-economic factors which affect the growth of social welfare in the market economy.

Different aspects of identifying the role of national welfare in the system of innovation-oriented economic development are researched in the works of P. Aguillon, R. Barro, A. Varshavskij, J. Vinslav, S. Glazjev, I. Diskin, J. Coleman, V. Kostjuk, D. Lvov, V. Makarov, N. Moisejev, N. Rimashevskaja, D. Rodrik, S. Rosefielde, K. Salomon, A. Sen, R. Solow, J. Stiglitz, M. Todaro, F. Fukuyama etc. Study of their works made it possible to specify scientific interpretations of national welfare from the point of view of the new institutional evolutional-cyclical paradigm of economic development.

The nature and specificity of the functioning of national welfare systems, taking into account their correlations with the innovative development of the economy and its different institutional structures, were researched by A. Auzan, P. Drucker, V. Ivanter, G. Kleiner, A. Prokhorovskij, V. Tambovtsev, F. Hayek, J. Jasin etc.

The methods of parametric evaluation of national welfare resources are studied in the works of S. Ajvazjan, G. Becker, N. Zubarevich, I. Maslova, M. Mozhina, R. Nurejev, L. Ovcharova, V. Polterovich, J. Rjumina, A. Shevjakov etc. Applicable mechanisms and decision-making technologies in the sphere of national welfare resource management are analyzed in the works of M. Baskova, O. Bogomolov, A. Dynkin, M. Musin, O. Pchelintsev, S. Rosenfeld, S. Sampler, V. Tretjak, T. Schultz, M. Jagolnitser etc.

Theoretical analysis of such phenomena as “informational civilization” (R. Abdejev, S. Djatlov, M. Kastels, S. Parinov, F. Jansen), “national innovative systems” (K. Bagrinovskij, M. Bendikov, O. Golincheko, I. Dezhina, J. Lotosh), “intellectual capital” (E. Brooking, A. Gaponenko, M. Malone, T. Sakaya, L. Edwinsson), “cluster development strategy” (T. Anderson, A. Weber, M. Iversen, A. Isaksen, N. Kaljuzhnov, R. Kachalov, J. Christensen, B.-A. Lunvall, A. Ljamzin, L. Markov, N. Nagrudnaja, P. Nertog, L. Nesta, M. Porter, M. Enright) was also important for forming the conception of the paper.

Fig. 1. National welfare as the innovative process rent-yielding factor

Acknowledging the high importance of the research of the aforementioned scientists and noting that there are fundamental approaches to exposing separate facets of the topic considered in this article, it is necessary to underline, however, that an approach based on a complex evaluation of national welfare as an integrated resource of innovation-oriented economic development has not yet been realized, and its innovative-reproductive function within the framework of involving such factors as the knowledge and intellect of the nation in the economic system has not been exposed. Little practical research has been done on modernizing the mechanisms of its conversion into innovative economic resources.

The insufficient conceptual-methodological development of the resource approach to national welfare analysis in the system of innovation-oriented economic development and of the resource support of the innovation vector of economic development, in conjunction with its theoretical and applied topicality, determined the purpose of the research.

2 Intermediate Results of Stages of Research

The purpose of the article is to form a methodological basis and to elaborate a theoretical-conceptual model and an information-analytical tools system for the complex analysis of the innovation-reproduction function of national welfare and of the conditions, mechanisms and instruments for using its resources in the interest of developing an innovative economic system. Achieving this goal determined the necessity and logical sequence of solving a set of stage-by-stage theoretical and applied tasks. The results of fulfilling these tasks can be formulated as follows:

1.
Innovation-oriented development of present-day national economies within the framework of long-term global evolutional trends is increasingly determined by the level of national welfare and its dynamics. The accumulation of national welfare resources brings about a higher volume and quality of human capital, higher labor efficiency, modernization, and efficient innovation-oriented development of national economies.

2. The structure of national welfare encompasses not only the traditional material benefits-resources characteristic of pre-industrial and industrial societies (real monetary income, volume and structure of the personal consumption fund, housing conditions, employment etc.), but also new benefits-resources with higher marginal utility (level and quality of education and health of the population, quality of the social-ecological environment, freedom of access to new technologies and scientific discoveries, social capital etc.). The final result of involving these factors in the innovative production cycle is income in the form of innovative rent, which ensures the competitiveness of the entire production process.

The economic essence of the conversion of such national welfare resources is the capitalization of their competitive advantages in the course of countries' integration into the world market and network world-economic relations, especially in high-tech spheres, based upon high quality of human capital; in other words, the transformation of these advantages into a source of added value and into objects of global investment activities. Within the framework of innovation-oriented dynamics, national welfare thus assumes the function of its resource-factor, whose growth within the world and state structure of social-economic relations becomes a key prerequisite for the innovative trend of economic development.

3.
Globalization processes exert a contradictory influence upon the economic mechanisms for using national welfare resources to support innovation-oriented development. On the one hand, they broaden the innovative-economic space of the country and the possibilities of converting the resources of its national welfare competitive potential into innovative-intellectual resources; on the other hand, globalization gives an additional impetus to greater inter-state asymmetry and polarization of countries' innovative development. Analysis of the macroeconomic indicators that characterize the level of countries' integration into the global innovation-oriented economy showed a high level of interstate developmental inequality and a lack of competitiveness of a number of components of national welfare. This situation lowers the level of conversion of individual components of national welfare into resources of innovative economic development.

4. The correlation between the accumulation and consumption (reserve-flow) values of national welfare resources is a distinctive indicator of the cyclical development of the innovative reproduction process. During stagnation, when the rate of economic dynamics is low, accumulated national welfare is depleted (as a result of mobilizing a certain part of it to ensure innovative economic growth); during a rise, when the growth rate is high, the situation is the opposite: national welfare is accumulated through added national income, thus creating an integrated basis for a long-term rising trend of innovation-oriented social-economic development.

5. The greater role and larger scale of the innovative factors of economic development change the traditional perception of the classic stages of modern expanded production. The stage of accumulating intangible assets, the factors of production that create the resource basis of the innovative economy, becomes the initial and principal stage in the new scheme of economic relations of reproduction.
This stimulates the accumulation of national welfare with a view to achieving higher productivity, first of all of intellectual resources and human capital, the creation of an institutional environment favorable to the elaboration and diffusion of innovations and, owing to these factors, a higher rate of innovative economic dynamics.

6. The need to convert the accumulated tangible and intangible national welfare resources into factors of innovative development is embodied in the new priorities and strategies of long-term state economic policy, according to which the innovative growth of the economy depends on observing the principle of correlation and balance of the imperatives of economic efficiency, social justice and ecological stability as the three principal criteria of innovation-oriented reproductive development at a high level of aggregation. Moreover, state innovative policy assumes a new objective function: balancing the social-economic interests of national, regional (local) and global economic subjects in the process of accumulating, reproducing and using national welfare resources with a view to innovative growth.

Studies of the new model of subject-object relations in the system of national welfare reproduction and of the conversion of its elements into resource sources of innovative growth showed that the conditions of the “network reality” make topical the elaboration of a collective strategy for national welfare development, in which the aspect of “co-operation” prevails over the aspect of “competition”, and the classical model of civil society, based on legal definitions of liberalism and market regulation, is replaced by the corporate community model (fig. 2).

Fig. 2. Subject-object relations’ conceptual model

Coordination and balance of the specific interests of the subjects of innovation-oriented development in the reproduction of national welfare as a national benefit, based upon joint advantages, trust and state-personal partnership, is one of the important methodological principles for forming the corporate strategy, which helps to optimize management and to achieve higher efficiency in using national welfare with a view to national economic growth. Determining the ideal “hierarchical chain” of interests of the economic subjects and orienting an adequate stimulation policy towards it is one of the alternatives for realizing the coordinative methodological principle.

7. The inclusion of national welfare resources in the economic asset balance of the country signifies their interpretation as a source of added value in the long-term innovative cycle of economic dynamics.

The most adequate approach to the economic evaluation of the tangible and intangible components of national welfare is a modified variant of the Hartwick-Solow principle, according to which the innovative rent should be considered the principal source of national income, a part of which is channeled into national welfare accumulation. This raises the resource potential of long-term innovation-oriented economic development: the rent is re-invested in better quality of human capital and improvement of the social-ecological conditions of its reproduction (education, healthcare, fundamental science, social infrastructure, lower environmental pollution).

8. Rent income received through the efficient use of the new categories of intangible benefits (informational, innovative, infrastructural and intellectual benefits that directly ensure the reproduction of human capital) plays an increasing role in the innovation-oriented modernization of national economies.
As shown by the analysis of these processes in Russia, given the existing institutional deficits, the underdeveloped venture business, the lack of a systematic, high-quality network habitat favorable for the diffusion of innovations, and the weak interest of the economic subjects in their elaboration and implementation, capitalization of the present innovative potential of national welfare (infrastructural, educational-intellectual, informational welfare) is rather difficult, whereas innovative rent is gained only in separate, isolated cases. As a result, a considerable part of the existing tangible and intangible national welfare resources, first of all intellectual and human resources, is not capitalized. This brings about a lower competitiveness of the country.

9. The need to indicate and ensure the elaboration of mechanisms for converting national welfare components into innovative economic resources presupposes analysis and evaluation, to be carried out in the state management system, of the level and dynamics of reproduction of its four components: quality of the population proper, material living standards of the population, quality of the social habitat, and quality of the ecological state of the natural-economic complex. Complex evaluation of these four components of national welfare with a view to achieving innovative economic growth is based methodologically upon a sophisticated theoretical-analytical set of implements, including a set of formalized methods and models for determining latent connections between national welfare and innovation-oriented economic growth (which form a unity of the innovative reproduction process), as well as evaluation of the innovative effects due to a higher level of converting different national welfare components into factors of innovative growth.
The elaborated set of implements makes it possible to analyze the efficiency of the existing national welfare resource structure, to expose its limiting components and to form on this basis a strategy for a long-term economic policy aimed at developing institutions that increase the competitiveness of national welfare resources and the level of their conversion into productive sources of innovative economic growth.

A distinctive feature and advantage of the elaborated model set of implements is that it can be used to accumulate analytical information on the results and parameters of economic, social and ecological strategies related to the accumulation and increment of national welfare resources with a view to achieving higher national economic dynamics, and thus to provide (as opposed to the traditional implements) a more adequate evaluation of the mechanisms used in state economic policies supporting innovation-oriented economic development trends (figure 3) [2].

10.
Diagnostics carried out on the basis of the set of implements with regard to the state of the art of national welfare as an integrated resource of innovation-oriented economic development of Russia within the global coordinates framework (figure 4), and to the integral innovative effect of its increment, showed that, through realized innovative welfare management strategies (including strategies of a higher level and quality of education and a lower sickness rate of the population, higher purchasing power of monetary income per person and a lower level of poverty, development of the social infrastructure, higher social-territorial mobility and better level and conditions of employment, development of small business and greater freedom of entrepreneurs, creation of a dynamic information infrastructure and better access to technologies and science, etc.), the indicators of the Russian economy subjects may be increased approximately 1.5 times, mostly by means of better quality of the social sphere [3].

The indicators of the innovative effect due to national welfare increment and its transformation into innovative economy resources (the level of economic subjects’ innovative activities, the level of conversion of national welfare into competitive factors of innovative growth, and the innovative rent capitalization level) are the key parameters which characterize the proportions between accumulation and consumption of national welfare resources in state policy.
The coefficients calculated (for Russia) in order to verify the proposed structural model empirically, which reflect the dependence between the dynamics of the parameters of the economic subjects’ innovative activity and the parameters of the national welfare resources (average life expectancy, GDP at PPP per person, Gini index and ecological stability index), showed that, within the integrated effect indicator, the greatest importance among the four basic components of national welfare is held by the social sphere resources. This reflects the priority of reproducing social, socially advantageous benefits as sources of social capital accumulation. These sources are characterized by properties that are important from the point of view of innovative growth: positive network effects and higher marginal utility in the course of their use. Therefore, the level of conversion of national welfare resources into factor sources of innovative growth greatly depends upon the state of the social sphere: the elasticity coefficient was 1.724 and the correlation coefficient was 0.671. Then, in order of decreasing dependence, come the quality of the ecological habitat (0.463 and 0.324, respectively), the quality of the population (0.137 and 0.393) and the material living standards (0.057 and 0.442).

Fig. 3. Model set of implements for analytical evaluation of state strategy

Fig. 4. National welfare state-of-the-art estimation

11. In the course of interregional comparison of the conditions and existing limits for realizing the state policy of optimizing the reproductive proportions between accumulation and consumption of national welfare resources, a dominant correlation was detected between the indicator of the economic subjects’ innovative activity and the indicators of the achieved level of conversion.
The calculations carried out by the author according to special methods showed, in particular, the following typology of the dependence of the features of the economic subjects’ innovative activity on the parameters of conversion of the dominant kinds of national welfare resources (social sphere resources) into innovative growth factors, which determines the priorities of the long-term state economic policy. For an economy characterized by a low, medium or high (fully realized) dependence of the features of the economic subjects’ innovative activity upon the parameters of social sphere resource conversion, priority belongs, respectively, to the strategy of developing social infrastructure and a higher level and quality of employment of the population; to strategies of developing small business and greater freedom of entrepreneurs; and to strategies of easier access to scientific achievements and new technologies and of information infrastructure development (figure 5). The detected innovative effects indicate the priorities of the social-economic policy, in which the main role belongs to investments into the innovative national welfare resources: housing conditions, social and information infrastructure, science, education, healthcare, culture, etc.

3 Conclusions

The obtained results show that, given the more important role of social conditions, factors and motives of behavior and the greater importance of social capital resources, it is necessary to elaborate a harmonized systematic program of modernization of the innovation-oriented long-term economic policy and to create a favorable social-economic climate in the country on the basis of the existing national welfare.

The systematic approach means a reconsidered hierarchy of social-economic priorities within the framework of the person-oriented innovative economic growth paradigm.
In this light the state has the following tasks of paramount importance: improvement of the overall conditions of employment and housing of the population; restoration of the reproductive function of salary (first of all, on the basis of adequate evaluation of the level and quality of education); accelerated development of the intangible investment complex and social infrastructure; realization of human-saving social programs; and a consistent industrial policy which would activate the mechanisms of innovative activity and socially responsible behavior of the corporate subjects capable of contributing to the development of national welfare and the human potential of the nation.

Fig. 5. Spatial strategic “developmental crystal” of Russian national welfare

In the present situation, efficient mechanisms of balanced innovation-oriented economic development may be formed only on the basis of integrated efforts of the state, civil society and business to achieve consistent expansion and rectification of opportunities for the representatives of different social, professional and territorial population groups via reproduction of the national welfare resources as a social benefit. This has to be reflected in the system of strategic management of innovation-oriented long-term social-economic development.

The national welfare resources will be realized efficiently within the framework of an innovative economy only if there is a stable need for them from the reproductive process. The strategic task is to bring about long-term correlation of the demand for and supply of national welfare resources in the innovative development of the economy.
The developed theoretical-analytical tool makes it possible not only to evaluate the efficiency of accumulation and usage of national welfare resources but also to determine the innovative effect due to a higher level of their conversion into sources of innovative growth.

References

1. Lazareva, E. I.: National Welfare as an Integrated Resource of Innovation-Oriented Development of Economy. Theoretical-Methodological Aspect. Publishing House of the Southern Federal University, Rostov-on-Don (2009) (In Russian)

2. Lazareva, E. I.: Strategy of National Welfare in Interest of Innovative Economic Growth Development: Results of Systematic Parametrical Indication. Economic Newssheet of the Rostov State University, 3, 65–74 (2010) (In Russian)

3. Aleshin, V. A., Lazareva, E. I.: National Welfare Increment as the Imperative Institutional Determinant of Regional Systems’ Development in the Innovative Processes’ Globalization Context. Social Inequality and Economic Growth, 4, 9–18 (2012)

Matrix Analogues of the Diffie-Hellman Protocol

Alexsander Beletsky1, Anatoly Beletsky1 and Roman Kandyba1

1 Department of Electronics, National Aviation University of Kiev, 1, av. Cosmonaut Komarov, 03680, Kiev, Ukraine

alexander.beletsky@gmail.com, abelnau@ukr.net, romankandyba@mail.ru

Abstract. This paper presents a comparative analysis of several matrix analogs of the Diffie-Hellman algorithm, namely the Yerosh-Skuratov and Megrelishvili protocols, as well as alternative protocols based on irreducible polynomials and primitive Galois or Fibonacci matrices. A binary matrix is primitive if the sequence of its powers in the ring of residues mod 2 forms a sequence of maximum length (an m-sequence). We propose alternative protocols and discuss ways to improve their reliability.

Keywords. Encryption key exchange protocol, irreducible polynomials, primitive element of a Galois field, primitive binary matrix

Key terms.
Research, CryptographyTheory, MathematicalModelling

1 Introduction

The Diffie-Hellman algorithm (DH-algorithm) [1] assumes that two subscribers, Alice and Bob, both know the public keys $p$ and $q$, where $p$ is a large prime number and $q$ is a primitive root. Alice generates a large random number $a$, computes $A = q^a \bmod p$ and sends it to Bob. In turn, Bob generates a large random number $b$, computes $B = q^b \bmod p$ and sends it to Alice. Then Alice raises the number $B$ received from Bob to her secret power $a$ and calculates $K_a = B^a \bmod p = q^{ba} \bmod p$. Bob acts similarly, calculating $K_b = A^b \bmod p = q^{ab} \bmod p$. Both parties obtain the same number $K$, because $K_a = K_b$. Alice and Bob can then use $K$ as a secret key, e.g. for symmetric encryption, because an adversary who intercepts the numbers $A$ and $B$ faces the problem of calculating $K$, which is practically unsolvable in a reasonable time provided that the numbers $p$, $a$ and $b$ are chosen large enough.

2 Yerosh-Skuratov Protocol

To form a secret encryption key for subscribers Alice and Bob over a public network, the authors of [2] propose to use the DH protocol in the cyclic group of matrices $\langle M \rangle$, where the matrix $M$ is public information. Alice generates a random exponent $x$, calculates the matrix $M^x$ and sends it to Bob. In turn, Bob generates a random exponent $y$, calculates the matrix $M^y$ and sends it to Alice. Then both subscribers raise the matrix obtained from the partner to their own secret power and calculate the shared matrix (encryption key) $K = M^{xy} = M^{yx}$. The matrix $M$ must be of high order (at least 100); in that case, the authors assert (without proof), cracking the key has prohibitive complexity. However, it was proved in [3] that the Yerosh-Skuratov protocol can easily be cracked using the generalized Chinese remainder theorem.
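The two exchanges described above can be illustrated with a minimal sketch. The function names, the toy parameters ($p = 23$, $q = 5$) and the 3×3 binary matrix are illustrative assumptions that do not come from the paper, and they are far too small for real use. The sketch only shows why both parties arrive at the same value.

```python
def dh_shared(p, q, a, b):
    """Classical DH: return (K_a, K_b) for public (p, q) and secret exponents a, b."""
    A = pow(q, a, p)          # Alice -> Bob
    B = pow(q, b, p)          # Bob -> Alice
    return pow(B, a, p), pow(A, b, p)


def mat_mul(X, Y):
    """Product of two square binary matrices over GF(2)."""
    n = len(X)
    return [[sum(X[i][k] & Y[k][j] for k in range(n)) & 1
             for j in range(n)] for i in range(n)]


def mat_pow(M, e):
    """M**e over GF(2) by square-and-multiply."""
    n = len(M)
    R = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while e:
        if e & 1:
            R = mat_mul(R, M)
        M = mat_mul(M, M)
        e >>= 1
    return R


def matrix_dh_shared(M, x, y):
    """Matrix analogue: Alice sends M**x, Bob sends M**y, both derive M**(x*y)."""
    Mx = mat_pow(M, x)        # Alice -> Bob
    My = mat_pow(M, y)        # Bob -> Alice
    return mat_pow(My, x), mat_pow(Mx, y)


if __name__ == "__main__":
    Ka, Kb = dh_shared(23, 5, 6, 15)
    assert Ka == Kb
    M = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]   # hypothetical public matrix
    Kx, Ky = matrix_dh_shared(M, 5, 3)
    assert Kx == Ky
```

Both keys agree because powers of the same element commute, which is also why the security rests entirely on the difficulty of recovering the exponent.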
3 Megrelishvili Protocol

The essence of the protocol [4] is the following. A binary initialization vector $V$ and a primitive matrix $M$ of order $n$ are accepted as the public key. Alice generates a random exponent $x$, calculates the vector $V_a = V \cdot M^x$ and sends it to Bob. In turn, Bob generates a random exponent $y$, calculates the vector $V_b = V \cdot M^y$ and sends it to Alice. Then Alice computes the key $K_a = V_b \cdot M^x = V \cdot M^{y+x}$, and Bob computes the key $K_b = V_a \cdot M^y = V \cdot M^{x+y}$. With this exchange both parties obtain the same private key $K$, because $K_a = K_b = K$.

The algorithm for generating the matrices in the Megrelishvili protocol is fairly simple and is explained by the recursive calculation scheme (1), which starts from $M_1 = 1$ and builds $M_3$ from $M_1$ and $M_5$ from $M_3$ by adding border rows and columns around the previous matrix. [Scheme (1): the individual border entries are garbled in the source text and are not reproduced here.]

As follows from (1), the matrices $M_i$, $i = 1, 2, \ldots$, are of odd order only, which can cause some difficulties when they are used in cryptography. This shortcoming was remedied by replacing matrices of type (1) by primitive matrices of arbitrary order synthesized on the basis of the so-called generalized Gray transforms [5]. The essence of these transforms is explained below.

The matrix forms of the direct (for simplicity denoted by the number 2) and inverse (denoted by the number 3) classical Gray transforms (codes) [6] can be presented as

$$2: \begin{pmatrix} 1&1&0&0\\ 0&1&1&0\\ 0&0&1&1\\ 0&0&0&1 \end{pmatrix}; \qquad 3: \begin{pmatrix} 1&1&1&1\\ 0&1&1&1\\ 0&0&1&1\\ 0&0&0&1 \end{pmatrix}, \qquad (2)$$

where, as an example, the order of the matrix is set to $n = 4$.

Matrices (2), which we call left-sided Gray transform matrices, correspond to the right-sided transform matrices defined by the relations

$$4: \; 1 \cdot 2 \cdot 1 = 2^T; \qquad 5: \; 1 \cdot 3 \cdot 1 = 3^T, \qquad (3)$$

where

$$1: \begin{pmatrix} 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0 \end{pmatrix} \qquad (4)$$

is the matrix (operator) of the inverse permutation.

The set of operators (2) – (4), supplemented by the operator 0, or $e$ (the identity matrix), forms a complete set of simple Gray operators.
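The relations between the simple Gray operators can be checked numerically. The sketch below builds operators 1, 2 and 3 for an arbitrary order $n$ and verifies, over GF(2), that operator 3 inverts operator 2 and that relations (3) hold. The helper names are illustrative assumptions, not the authors' notation.

```python
def gray_direct(n):
    """Operator 2: ones on the main diagonal and the first superdiagonal."""
    return [[int(j == i or j == i + 1) for j in range(n)] for i in range(n)]


def gray_inverse(n):
    """Operator 3: upper triangular matrix of ones."""
    return [[int(j >= i) for j in range(n)] for i in range(n)]


def reversal(n):
    """Operator 1: ones on the auxiliary (anti-) diagonal."""
    return [[int(i + j == n - 1) for j in range(n)] for i in range(n)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def mat_mul(X, Y):
    """Matrix product over GF(2)."""
    n = len(X)
    return [[sum(X[i][k] & Y[k][j] for k in range(n)) & 1
             for j in range(n)] for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def vec_mul(v, M):
    """Row vector times matrix over GF(2)."""
    n = len(v)
    return [sum(v[i] & M[i][j] for i in range(n)) & 1 for j in range(n)]


if __name__ == "__main__":
    n = 4
    g2, g3, r1 = gray_direct(n), gray_inverse(n), reversal(n)
    assert mat_mul(g2, g3) == identity(n)                 # 3 inverts 2
    assert mat_mul(mat_mul(r1, g2), r1) == transpose(g2)  # 4 = 1*2*1 = 2^T
    assert mat_mul(mat_mul(r1, g3), r1) == transpose(g3)  # 5 = 1*3*1 = 3^T
    # Binary 1011 maps to its Gray code 1110 under operator 2.
    assert vec_mul([1, 0, 1, 1], g2) == [1, 1, 1, 0]
```

Multiplying a row vector by operator 2 yields the usual binary-to-Gray conversion, which is what makes these matrices convenient building blocks for composed codes.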
From the elements of simple Gray operators, one can form so-called composed Gray codes (CGC) generated by the product of simple (elementary) Gray codes. The simplest examples of CGC, 121 and 131, can be seen in (3). Both simple and composed Gray codes have a number of remarkable properties. Firstly, the corresponding transformation matrices are nondegenerate and, therefore, invertible. Secondly, there are simple inversion algorithms for CGC. And, finally, there are “crypto-order” CGC which have the property of primitiveness. Examples of such codes are given in Table 1.

{| class="wikitable"
|+ Table 1. Composed Gray codes that give binary matrices the property of primitiveness
! colspan="4" | The order of the matrix (n)
|-
! 32 !! 64 !! 128 !! 256
|-
| 2244424 || 22533435 || 2425535 || 22533435
|-
| 2442224 || 22534335 || 2433534 || 22534335
|-
| 12242253 || 24334225 || 2435334 || 24334225
|-
| 12242443 || 25224334 || 22524224 || 25224334
|-
| 12252242 || 222524424 || 22533334 || 2222535224
|}

Suppose $M$ is a primitive binary matrix generated by the CGC $G$. With respect to such matrices, the following assertion can easily be verified by direct testing.

Assertion. The primitiveness of matrices $M$ is invariant under the group of linear transformations of the CGC $G$ generating the matrix $M$ and under similarity transformations of these matrices.

The group of linear transformations includes the following operators: cyclical shift, assess statement, inversion and conjugation, as well as arbitrary combinations of these operators. The similarity transformation forms the matrix $M_p$, which is similar to $M$ and determined by the relation $M_p = P \cdot M \cdot P^{-1}$, where $P$ is a permutation matrix.

4 Alternative Protocols

This section proposes two options for alternative matrix protocols of secret key exchange over an open communication channel. The procedure for forming the encryption key $K$ in the first version of the protocol is based on the use of two public keys and one private key for each subscriber.
As public keys, a binary initialization vector $V$ of order $n$ and any irreducible polynomial (IP) $\varphi_n$ of degree $n$ are chosen. The private keys are primitive (forming) elements $\omega$ of the Galois field $GF(2^n)$ over the IP $\varphi_n$, from which the subscribers (Alice and Bob) form the primitive secret transformation matrices $G_n^{(\omega_a)}$ and $G_n^{(\omega_b)}$ respectively. The element $\omega$ of the field $GF(2^n)$ is primitive over the IP $\varphi_n$ if the minimum exponent $e$ for which $(\omega^e - 1) \bmod \varphi_n = 0$ equals $e = 2^n - 1$. We call the matrices $G_n^{(\omega)}$ Galois matrices. The algorithm for synthesizing such matrices is explained by a concrete example. Let the IP be $\varphi_8 = 100101101$ and the generating element (GE) of Alice be $\omega_a = 111$. We obtain

$$A = G_a = \begin{pmatrix} 1&1&1&1&0&1&1&1\\ 1&1&1&0&1&1&0&1\\ 1&1&1&0&0&0&0&0\\ 0&1&1&1&0&0&0&0\\ 0&0&1&1&1&0&0&0\\ 0&0&0&1&1&1&0&0\\ 0&0&0&0&1&1&1&0\\ 0&0&0&0&0&1&1&1 \end{pmatrix}. \qquad (5)$$

According to (5), the matrix $G_a$ is filled in as follows. First, the GE $\omega_a$ is placed in the bottom row of the matrix, and the elements of this row to the left of the GE are filled with zeros. Subsequent rows of the matrix (from bottom to top) are produced by shifting the previous row. If the leftmost element of the row being shifted is 0, a cyclic shift by one bit to the left is performed. If the leftmost element is 1, a conventional shift by one bit to the left is performed and 0 is written to the vacated rightmost position. The bit width of such rows is one bit more than the order of the matrix. The vectors corresponding to these rows are reduced modulo the IP $\varphi_n$, which returns them to a width equal to the order $n$ of the matrix. Bob similarly forms the Galois matrix $B = G_b$ using his primitive element $\omega_b$.

The introduced Galois matrices have some interesting properties. First, the matrix product is commutative, i.e. $A \cdot B = B \cdot A$.
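The fill procedure can be sketched in a few lines. Rows are stored as integers whose most significant of $n$ bits is the leftmost matrix column. The function names and Bob's generating element `0b101101` are assumptions made for illustration, while $\varphi_8$ and $\omega_a$ are the values of the example above. Each row is the previous one multiplied by $x$ and reduced modulo the IP, which reproduces matrix (5).

```python
def galois_matrix(omega, phi, n):
    """Rows of the Galois matrix, top to bottom, built from the bottom row omega.

    Each next row (going up) is the previous one shifted left by one bit and
    reduced modulo the irreducible polynomial phi when bit n becomes set.
    """
    rows = []
    r = omega
    for _ in range(n):
        rows.append(r)
        r <<= 1
        if (r >> n) & 1:
            r ^= phi
    rows.reverse()
    return rows


def vec_mat(v, rows, n):
    """Row vector v (n-bit integer) times matrix over GF(2)."""
    out = 0
    for i, row in enumerate(rows):
        if (v >> (n - 1 - i)) & 1:   # bit i counted from the left
            out ^= row
    return out


def mat_mat(A, B, n):
    """Matrix product A*B over GF(2), both given as lists of row integers."""
    return [vec_mat(row, B, n) for row in A]


if __name__ == "__main__":
    n, phi = 8, 0b100101101
    A = galois_matrix(0b111, phi, n)        # Alice, as in example (5)
    B = galois_matrix(0b101101, phi, n)     # Bob, hypothetical GE
    for row in A:
        print(format(row, "08b"))
    assert mat_mat(A, B, n) == mat_mat(B, A, n)
    V = 0b11010010
    assert vec_mat(vec_mat(V, B, n), A, n) == vec_mat(vec_mat(V, A, n), B, n)
```

In this representation, multiplying a vector by a Galois matrix coincides with multiplying the corresponding polynomials modulo the IP, so the commutativity of the matrices follows from the commutativity of polynomial multiplication.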
Second, if at least one of the GEs is not a primitive element over the IP, the commutative property of the matrices is lost.

Based on the above properties of Galois matrices, a key exchange protocol was proposed. We assume that the initialization vector $V$ and the IP $\varphi$ are known. Alice chooses a secret GE $\omega_a$ primitive over $\varphi$, forms a Galois matrix $A$, calculates the vector $V_a = V \cdot A$ and sends it to Bob. In turn, Bob selects a primitive GE $\omega_b$, forms a matrix $B$, calculates the vector $V_b = V \cdot B$ and sends it to Alice. After that, both parties multiply the vector obtained from the partner by their own secret Galois matrix. Thus a shared secret key $K$ is formed, owing to the fact that the product of primitive Galois matrices over the same IP is commutative, which implies the identity $K_a = V_b \cdot A = V \cdot B \cdot A = K_b = V_a \cdot B = V \cdot A \cdot B$.

Instead of Galois matrices $G$, Fibonacci matrices $F$ can be used in the protocol with the same success. A Fibonacci matrix $F$ is the right transpose of the corresponding Galois matrix $G$, and vice versa, where right transposition means transposition with respect to the auxiliary diagonal of the matrix.

In the second alternative embodiment of the protocol the secret key $K$ is computed in two rounds. In the first round, which repeats the first version of the protocol considered above, a secret binary vector $V_p$ of order $n$ common to both subscribers is formed. On the basis of this vector, Alice and Bob compute the common permutation matrix $P$. Different ways of constructing the matrices $P$ can be proposed. Let us consider one of them. Let $n = 8$ and let $N$ be the decimal equivalent of the vector $V_p$. The task is to create a permutation matrix $P_8$ of order eight for the value $N$. Choose some way of numbering the elements of the matrix $P_8$ from 0 to 63. Calculate the value $n_8 = N \bmod 64$ and write 1 in the element of the matrix whose number is equal to $n_8$. After that, delete from the matrix $P_8$ the row and column which contain the 1.
We obtain a matrix $P_7$ of order 7, whose elements are numbered from 0 to 48. Find the value $n_7 = N \bmod 49$, which determines the location of the 1 in the matrix $P_7$ and, consequently, in the matrix $P_8$. Following the proposed method, one can construct a permutation matrix $P$ of any order.

Let us proceed to the second variant of the encryption key protocol. This protocol uses two public keys, namely the initialization vector $V$ and the irreducible polynomial $\varphi$, and also two private keys. These keys are generated by Alice and Bob as random GEs $\omega_a$ and $\omega_b$ primitive over the IP. The protocol runs in two rounds. In the first round, based on the public keys $V$, $\varphi$ and the secret GEs, the subscribers calculate the common permutation matrix $P$. The second round is performed in the following order. Alice chooses a GE $\omega_a$ primitive over $\varphi$, forms the Galois matrix $A$, then the similar matrix $A_p = P \cdot A \cdot P^{-1}$, computes the vector $V_a = V \cdot A_p$ and sends it to Bob. In turn, Bob chooses a GE $\omega_b$ primitive over $\varphi$, forms the Galois matrix $B$, then the similar matrix $B_p = P \cdot B \cdot P^{-1}$, computes the vector $V_b = V \cdot B_p$ and sends it to Alice. After that, both parties multiply the vector obtained from the partner by their own secret similar Galois matrix. Thus the shared key $K$ is generated owing to the fact that the matrices $A_p$ and $B_p$ retain the properties of primitiveness and commutativity of the original matrices $A$ and $B$, respectively.

5 Protocol of Vagus Keys

One of the major drawbacks of the alternative key generation algorithms for public-key cipher infrastructure, in particular of the above-mentioned way of synthesizing the Galois matrix (by the diagonal fill method), is that it can be easily compromised. To prove this, consider the vector

$$V_a = V \cdot G_{f_n}^{(\omega_a)}, \qquad (6)$$

created by Alice.
From the theory of polynomials in one variable $x$ we know that the product of any polynomial $\omega_n(x)$ of degree $n$ by $x$ is equivalent either to a simple shift of the polynomial by one bit to the left or to incrementing the degree of the polynomial:

$$x \cdot \omega_n(x) = \omega_{n+1}(x). \qquad (7)$$

Using formula (7), let us represent the Galois matrix $G_{f_n}^{(\omega)}$ of order $n$ as

$$G_{f_n}^{(\omega)} = \begin{pmatrix} \omega x^{n-1} \\ \omega x^{n-2} \\ \vdots \\ \omega x \\ \omega \end{pmatrix} (\bmod f_n) = \omega \begin{pmatrix} x^{n-1} \\ x^{n-2} \\ \vdots \\ x \\ 1 \end{pmatrix} (\bmod f_n) = \omega \cdot E, \qquad (8)$$

where $E$ is the unit matrix. From formulas (6) and (8) we get

$$V_a = V \cdot \omega_a \pmod{f_n}, \qquad (9)$$

where all parts are known except $\omega_a$. Solving equation (9), we find

$$\omega_a = V_a \cdot V^{-1} \bmod f_n. \qquad (10)$$

For example, let us use a matrix $G_{f_n}^{(\omega_a)}$ constructed as in expression (5), where $n = 8$, $\omega_a = 101101$ and $f_8 = 101001101$, so that $f_8$ is the public key and $\omega_a$ is the private key of the protocol. As the initialization vector we choose $V = 11010010$, which corresponds to the vector $V^{-1} = 110010$ inverse modulo $f_8$. By formula (9) we get $V_a = 10111111$. Substituting $V_a$ and $V^{-1}$ into the right side of expression (10) and reducing the product of the vectors modulo $f_8$, the adversary (Eva) obtains the private key $\omega_a$ of Alice. In the same way, Eva can find the secret key $\omega_b$ of Bob. After the secret keys $\omega_a$ and $\omega_b$ are found, it is trivial to calculate the secret key $K$.

The security of the alternative protocols can be increased up to the security level of algorithms based on the problem of factorization of modular multiplication of big numbers if we assume that there is a secret parameter $\delta$ known to both Bob and Alice. The modification of protocol [6] is the following. Assume that the authorized subscribers share a secret parameter $\delta$, a binary vector of order $n$. The parameter $\delta$ can be transported from Alice to Bob (or vice versa), e.g. by the RSA protocol. Alice generates a random number $a$ of order $n$ and computes the generating element

$$\omega_a = a \cdot \delta \bmod f_n; \qquad (11)$$

by means of this generating element Alice forms the Galois matrix $G_{f_n}^{(\omega_a)}$, calculates the vector $V_a = V \cdot G_{f_n}^{(\omega_a)}$ and sends it to Bob. In the same way, Bob sends to Alice the vector $V_b = V \cdot G_{f_n}^{(\omega_b)}$, where $\omega_b = b \cdot \delta \bmod f_n$.
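The recovery of the generating element by formulas (9) and (10) can be sketched as follows, using the values $n = 8$, $f_8 = 101001101$, $V = 11010010$ and $\omega_a = 101101$ from the example. The function names are illustrative assumptions. The inverse of $V$ is computed by the sketch itself through exponentiation in $GF(2^n)$ (one standard technique, not necessarily the one the authors used), so the sketch does not rely on the printed value of $V^{-1}$.

```python
def gf_mul(a, b, f, n):
    """Product of a and b in GF(2^n) defined by the irreducible polynomial f."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:      # reduce as soon as the degree reaches n
            a ^= f
    return r


def gf_pow(a, e, f, n):
    """a**e in GF(2^n) by square-and-multiply."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, f, n)
        a = gf_mul(a, a, f, n)
        e >>= 1
    return r


def gf_inv(a, f, n):
    """Inverse of a nonzero field element: a**(2^n - 2)."""
    return gf_pow(a, (1 << n) - 2, f, n)


def recover_generating_element(V, Va, f, n):
    """Formula (10): omega_a = Va * V^{-1} mod f."""
    return gf_mul(Va, gf_inv(V, f, n), f, n)


if __name__ == "__main__":
    n, f = 8, 0b101001101
    V, omega_a = 0b11010010, 0b101101
    Va = gf_mul(V, omega_a, f, n)           # formula (9), what Alice transmits
    assert Va == 0b10111111
    assert recover_generating_element(V, Va, f, n) == omega_a
```

The attack needs only the public values $V$, $f_n$ and the transmitted vector $V_a$, which is why the secret parameter $\delta$ of formula (11) is introduced next.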
As shown above, the generating elements $\omega_a$ and $\omega_b$ can be easily computed, so the authorized subscribers Alice and Bob, but not Eva, can calculate the secret parameter of the partner. For example, by formula (11) Bob calculates $a = \omega_a \cdot \delta^{-1} \bmod f_n$, which gives him and Alice the ability to calculate the secret key $K = a \cdot b \bmod f_n$. The key $K$, as well as any function of it, can be taken as a secret parameter for session key generation for public-key cipher channels. We call this way of key generation the protocol (algorithm) of vagus keys. The vagus keys algorithm can be used in both protocols mentioned above. The major benefit of the vagus key generation algorithm is protection from "man-in-the-middle" attacks. It is achieved by including in the generating elements of the Galois matrices the secret element $\delta$, known only to Bob and Alice. If the secret element $\delta$ is replaced by Eva's element $\delta_e$, it becomes impossible for Eva to calculate the parameters $a$, $b$ as well as the common cipher key $K$.

6 Conclusions

The article analyzes the known matrix algorithms for exchanging encryption keys between subscribers of a network of open communication channels. The algorithms are based on the modified asymmetric Diffie-Hellman protocol. The essence of the modification is the replacement of the large prime numbers of the Diffie-Hellman algorithm by guaranteed nondegenerate primitive binary matrices of high order. Methods of synthesizing these matrices are proposed based on both the generalized Gray codes and irreducible polynomials. New key exchange matrix protocols have been developed. The protocols developed are superior in cryptographic strength to known cryptographic protocols, particularly the Yerosh-Skuratov and Megrelishvili protocols described in this paper.
The proposed variants of vector-matrix protocols for exchanging cryptographic keys over open communication channels have good prospects for application in symmetric encryption in computer networks protected from data substitution, providing the necessary level of protection of private keys from unauthorized access. These protocols can compete strongly with the more resource-intensive RSA protocol.

References

1. Diffie, W., Hellman, M. E.: New Directions in Cryptography. IEEE Transactions on Information Theory, IT-22(6), 644–654 (1976)

2. Eros, I. L., Skuratov, V. V.: Addressing Message Transmitting Using Matrices Over GF(2). Problems of Information Security. Computer Systems, 1, 72–78 (2004) (In Russian)

3. Rostovtsev, A. G.: On the Matrix Encryption (Criticism Yerosh-Skuratov Cryptosystem), http://www.ssl.stu.neva.ru/psw/crypto/rostovtsev/Erosh_Skuratov.pdf (In Russian)

4. Megrelishvili, R. P., Chelidze, M. A., Besiashvili, G. M.: Unidirectional Matrix Function - High-Speed Diffie-Hellman’s Analog. In: Proc. 7-th Int. Conf. Internet - Education - Science 2010. VNTU, Vinnitsya, 341–344 (2010) (In Russian)

5. Beletsky, A. Ja., Beletsky, A. A., Beletsky, E. A.: Gray Transformations. V. 1. Fundamentals of the theory. V. 2. Applied aspects. NAU Publishing House, Kiev (2007) (In Russian)

6. Beletsky, A. Y., Beletsky, A. A.: Synthesis of Primitive Matrices over a Finite Galois Fields and their Applications. Information Technology in Education: Collected Works, 13. Kherson: KSU, 23–43 (2012) (In Russian)

Are Securities Secure: Study of the Influence of the International Debt Securities on the Economic Growth

Darya Bonda1 and Sergey Mazol2

1 Belarus State Economic University, Minsk, Belarus

bondadasha@gmail.com

2 Academy of Public Administration, Minsk, Belarus

mazols@yandex.ru

Abstract. The paper studies the interdependence of the amount of international debt securities, amounts outstanding by country (borrowers) and the GDP growth by country.
The authors have chosen 34 countries that represent every region included in the BIS classification, that is, developed countries, offshore centers, developing Europe, Latin America, Asia and the Pacific, and Africa. It was found that an excessive amount of this type of securities in comparison with GDP leads to a slowdown in economic growth in the following year.

Keywords. International debt securities, Economic growth, Financial crisis

Key terms. Development, MathematicalModel

1 Introduction

The last financial turmoil has revealed the drawbacks of the existing global financial system. Surprisingly, the worst crisis since the Great Depression has offered a range of opportunities to the world society: to examine the system, exclude “toxic” elements and introduce a new methodology of financial regulation.

During the last decades new avenues for financing were created, deepening the financial system and widening the choice of monetary instruments [1], which caused overestimation of assets and, consequently, financial collapse.

Although this issue is under the thorough control of the Bank for International Settlements, the Securities and Exchange Commission, the International Swaps and Derivatives Association and the Securities Industry and Financial Markets Association, every scientist, analyst, governor, outstanding person and regular student has their own interpretation of how the crisis works, its causes and consequences.

One of the reasons for the growing financial instability was the excessive amount of various types of securities both in national economies and in the international arena, as well as the complexity of the securities issued. New types of financial instruments are usually first accepted as a great invention of humanity; then, especially during recessions, they are blamed for crises on the grounds of speculation [7]. After the recovery, they are still widely spread all over the world.
Futures, options and other derivatives have experienced such an attitude [4].

International debt securities are considered to be a financial instrument. The amount of securities outstanding in 2007, i.e. a country’s liabilities, could prevent countries from sustainable growth in 2008. From the authors’ point of view, it is reasonable to study the interdependence of the amount of international debt securities, amounts outstanding by country (borrowers) and the GDP growth by country. The presence of such interdependence would allow us to criticize this type of securities and advise countries to minimize their usage for the sake of sustainable economic growth.

2 Results

The authors analyze the interdependence of the amount of international debt securities outstanding in 2007 and the economic growth expressed by the GDP index.

A debt security is a negotiable financial instrument serving as evidence of debt [5]. The statistics on international debt securities issues cover long-term bonds, notes and short-term money instruments [2]. Debt securities include government bonds, corporate bonds, CDs, municipal bonds, preferred stock and collateralized securities (such as CDOs, CMOs, GNMAs). Debt securities may be protected by collateral or may be unsecured, which underlines the importance of scrutinizing them as one of the key instruments of securitization. Collateralized debt obligations are considered to be a risky instrument, since their coupons and principal repayments depend on a diversified pool of loan and bond instruments, either purchased in the secondary market or taken from the balance sheet of an original asset owner (Handbook of Securities Statistics). Consequently, through assessing the value of underlying assets, collateralized debt obligations as well as other asset-backed securities spread risk while diversifying it, meanwhile creating a range of credit derivatives.
These instruments are widely used to make debt more liquid, to make the money lent work as if it had not been borrowed and, furthermore, to earn a margin. Therefore, the authors find it crucial to pay significant attention to this kind of financial instrument as a means of spreading the risk of insolvency of an entity on the international scale.

The BIS definition of international securities (as opposed to domestic) is based on three major characteristics of the securities: the location of the transaction, the currency of issuance and the residence of the issuer. International issues comprise all foreign currency issues by residents and non-residents in a given country and all domestic currency issues launched in the domestic market by non-residents [2].

GDP in current prices, at purchasing power parity, is the second element of the research. It was chosen as an indicator of national output, combining the real and financial sectors and thus reflecting the size of the economy. The amount of international debt securities can be compared to GDP, as both indicators reveal the capacity of the countries’ economies.

362 D. Bonda and S. Mazol

In figure 1 one can see the historical correlation between international debt securities and GDP in current prices (data from [1], [2], [3], [8]). It shows that the interdependence between the two components really exists; in addition, the slope of the line varies over 2002-2008. In 2004 and 2005 the inclination is lower, which means a slowdown in the GDP growth rate and in IDS amounts; on the contrary, in 2007 and 2008 the graph indicates the rise of the world economies.

[Figure: Historical correlation between IDS and GDP, world, 2002-2008. Horizontal axis: IDS, bln USD; vertical axis: GDP, bln USD.]

Fig. 1. Historical correlation between IDS and GDP, world 2002-2008

As mentioned in the title, the graph shows the world’s tendency.
The last economic crisis has damaged major economies while leaving some small and developing economies untouched, despite its scale. This means that a by-country analysis is necessary to provide real evidence of such a correlation.

The authors have chosen 34 countries that represent every region included in the BIS classification, that is, developed countries, offshore centers, developing Europe, Latin America, Asia and the Pacific, and Africa. Although the greatest variation in the amount of securities outstanding was observed in the data from developed countries, the data used represent each continent. Africa and the Middle East are represented by Egypt, Lebanon, Saudi Arabia and the UAE; Asia and the Pacific by the Philippines, Singapore, Japan and China; developing Latin America by Argentina, Brazil, Colombia and Costa Rica; developing Europe by Belarus, Bulgaria, the Czech Republic, Estonia, Russia and Ukraine; offshore centers by the Bahamas; and developed economies by Australia, Austria, Belgium, Canada, Finland, France, Germany, Greece, Iceland, Italy, Norway, Spain, Sweden, the UK and the US.

The independent variable X is the amount of international debt securities outstanding by country in 2007 divided by nominal GDP in billions of USD in 2007, whereas the dependent variable is the GDP growth rate in 2008 in comparison with 2007. The linear regression is shown in figure 2.

[Figure: Correlation between IDS/GDP 2007 and GDP growth rate 2008/2007. Fitted line: y = -10,071x + 20,058; R2 = 0,5193. Horizontal axis: IDS/GDP coefficient; vertical axis: GDP growth rate 2008/2007.]

Fig. 2.
Correlation between IDS/GDP 2007 and GDP growth rate 2008/2007

The model is described by formula (1):

GDP = 20,058 - 10,071 * COEF    (1)

where GDP is the nominal growth rate in 2008 compared to 2007, and COEF is the IDS/GDP coefficient, i.e. the ratio of IDS amounts outstanding in 2007 to GDP (in current prices, purchasing power parity, 2007).

The slope is negative, which means an inverse correlation between the variables: the bigger the coefficient in 2007, the lower the GDP growth rate in 2008, which makes sense and confirms the authors’ theory. The correlation coefficient, which measures the strength of the linear relationship between the variables, is -0,72, so the inverse relation is considered to be strong. The determination coefficient is 0,52, which means that 52% of the variation of the dependent variable is explained by the independent one.

Linear regression was chosen because the goal of the authors was to establish that a relationship between the variables exists and that it is negative, rather than to find the most appropriate functional form; the complexity of the relationship is acknowledged.

Among the obstacles that prevent the linear model from being the “best fit” are, first of all, the different levels of financial market development and, thus, the differing need for specific financial instruments. For example, Belarus is not yet ready for developing credit derivatives; besides, the amount of its securities outstanding has been the same for a couple of years, which means that the amount of securities is not the principal reason for the economic crisis there. Another reason is overstated prices in some countries, which result in high inflation that raises the GDP index too far for it to depict the precise relationship between the variables. Examples are Ukraine, Russia and Argentina. As far as offshore centers are concerned, their GDP doesn’t always illustrate real production but rather the value of financial operations, so they need an individual approach. There are also some deviations in the model: for example, Iceland is at the lowest point because of negative GDP growth, while the US and the UK have the lowest GDP growth in 2008/2007 and the highest amount of securities among the examined countries.

In the real economic world one can hardly obtain “Pareto efficient” outcomes, in which no sector of the economy can be made better off without making another worse off [6]. Therefore, the increase of international debt securities, i.e. the rise in liquid liabilities, results in lower GDP growth rates. Even though GDP measures only market production and cannot be fully used as a measure of a country’s well-being [6], a better measurement of economic activity hasn’t been offered yet. Thus, the influence of the financial sector on the overall economy proves the existence of a “systemic externality” [6] of the financial market.

[Figure: Ranging countries. Series: IDS/GDP coefficient 2007 and nominal GDP growth rate, plotted by country.]

Fig. 3. Ranging countries

Figure 3 is built to support the main idea of the research: the bigger the volume of IDS outstanding (here the IDS/GDP coefficient, to make it comparable across economies of different size), the lower the GDP growth rate. Though the graph doesn’t fully reflect the inverse dependence, it shows the tendency and supports the idea for the majority of countries: while the slope of the GDP growth rate curve is positive, the slope of the IDS/GDP coefficient curve is negative.
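As a minimal sketch of how formula (1) can be read, the snippet below evaluates the fitted line for a few hypothetical IDS/GDP coefficients and checks that the reported correlation coefficient agrees with the reported R2. The coefficients are copied from formula (1) and Fig. 2 with decimal commas written as points; the function names and the sample inputs are illustrative assumptions, not part of the original study.

```python
import math

# Coefficients copied from formula (1); decimal commas written as points.
INTERCEPT = 20.058
SLOPE = -10.071
R_SQUARED = 0.5193  # determination coefficient printed in Fig. 2


def predicted_growth(coef: float) -> float:
    """Fitted 2008/2007 GDP growth for a given 2007 IDS/GDP coefficient."""
    return INTERCEPT + SLOPE * coef


def zero_growth_coef() -> float:
    """IDS/GDP coefficient at which the fitted line crosses zero growth."""
    return -INTERCEPT / SLOPE


def implied_correlation() -> float:
    """With one regressor, r is the square root of R^2 carrying the slope's sign."""
    return math.copysign(math.sqrt(R_SQUARED), SLOPE)


if __name__ == "__main__":
    for coef in (0.5, 1.0, 2.0):  # hypothetical coefficients, for illustration only
        print(f"COEF={coef:.2f} -> fitted growth {predicted_growth(coef):.3f}")
    print(f"zero-growth COEF: {zero_growth_coef():.3f}")
    print(f"implied r: {implied_correlation():.2f}")
```

Under these assumptions the fitted line reaches zero growth at an IDS/GDP coefficient just below 2, and the square root of R2 reproduces the correlation of about -0,72 quoted in the text.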
Among the factors that prevent an ideal illustration of the theory is the nominal character of the variables. One of the elements is security market capitalization, which cannot be converted into real terms; therefore the usage of the nominal GDP growth rate is reasonable. Another factor is the uniqueness of the economic development of each country, such as its inflation and unemployment levels and the unique interdependence of its economic sectors. Consequently, each country needs to develop its own targeting rules, using its own IDS/GDP ratio, and to work out a development model adequate to the specifics of its economy.

Perhaps it is possible to work out a limit on the IDS/GDP ratio which countries shouldn’t exceed in order not to affect sustainable development, but this work requires, first of all, an individual approach and demands profound knowledge of the economic development of each country.

3 Conclusions

The overall significance of the model is to show the impact of the issuance of international debt securities on the economic growth of countries. Since the slope of the regression line is negative, an excessive amount of this type of security relative to GDP leads to a slowdown in economic growth the following year.

Because the issuance of securities is partly managed by governments and financial organizations (in the case of the US, the Securities and Exchange Commission (SEC), for example), it can and should be regulated. After the first wave of the financial crisis had passed, some rating agencies, for example S&P, offered a downgrading system for some risky instruments, thus protecting the market from the distribution of excessive risk, and the SEC has been promoting a new bill to the White House for a while (Reuters).
This information shows that the global financial community is already trying to react and eliminate some of the causes of the financial crisis, although not yet fruitfully.

The global financial crisis has proved the necessity of permanent monitoring and control of the financial system, especially with regard to financial architecture and innovations [6]. The research has revealed that the volume of international debt securities should be a subject of a country’s guidelines for optimal control and efficient rules of financial policy. An increase in the IDS/GDP ratio may lead to financial instability and increased involvement in the global financial market. In addition, the use of secure financial instruments with careful risk control limits allows the financial system and, therefore, the whole economic system to reach sustainable development.

References

1. Bank for International Settlements: Financial Market Developments and their Implications for Monetary Policy. №39. BIS, Basel (2008)
2. Bank for International Settlements: BIS Guide of Securities Statistics. №39. BIS, Basel (2009)
3. Bank for International Settlements: BIS Quarterly Review, December 2009. BIS, Basel (2009)
4. Chibrikov, V.: History of the Derivatives. Economic Issues, 3 (2008)
5. International Swaps and Derivatives Association: Handbook on Securities Statistics, ISDA (2007)
6. Stiglitz, J.: Regulation and Failure. In: New Perspectives on Regulation, The Tobin Project, pp. 11–24, Cambridge (2009)
7. Suetin, A.: About the Reasons for Financial Crisis. Economic Issues, 1 (2009)
8. World Bank Database, www.worldbank.org

How to Make High-tech Industry Highly Developed? Effective Model of National R&D Investment Policy

Oksana Moiseeva1 and Sergey Mazol2

1 Belarus State Economic University, Minsk, Belarus oksuta13@mail.ru
2 Academy of Public Administration, Minsk, Belarus mazols@yandex.ru

Abstract.
The paper validates the relations between the share of public and private R&D spending and the effectiveness of the national R&D sector. It states that in order to implement an effective and profitable “high-tech policy”, governments have to increase the share of the business sector in Gross Domestic Expenditures on R&D. At the same time it is necessary to preserve a definite “government share” in R&D investments, as reducing it beyond a certain extent has a negative effect.

Keywords. High-tech industry, R&D investment policy, Private v. Public R&D Spending, Exports of High Technology Products

Key terms. Development, MathematicalModel

1 Introduction

It is impossible to deny that people all over the world benefit from new technologies, which lead to healthier lives, greater social freedoms, increased knowledge and more productive livelihoods. Each day sees additions to the literature, much of which includes reports on the establishment or expansion of R&D facilities and programs that are designed to take the best advantage of highly qualified resources. Nowadays there are practically no governments or politicians that would miss a chance to stress the importance of innovation in the economy. According to some experts, technological progress and innovation determine 50-90% of GDP growth in developed countries [7]. Developing countries, in their turn, badly need a competitive high-tech industry, not only because, being usually one of the most profitable and cost-efficient industries, it contributes to economic prosperity by itself, but also because technological achievements give them a chance to promote all other economic sectors and make them competitive on the global arena, thus narrowing the economic gap between the highly developed countries and the developing ones.

Still, the results of R&D policy in the post-Soviet countries frequently leave much to be desired.
The most prominent achievements in the sphere of industrial R&D belong to the most developed economies, such as the USA, Japan and the European Union. Current literature is replete with reports on expanding R&D activities in China, India, South Korea and Singapore. Meanwhile Belarus, Ukraine, Uzbekistan and Moldova still cannot boast prominent commercial achievements in R&D. That is why this research paper aims at analyzing the factors and incentives that turn national innovative efforts, resources and potential into visible and profitable high-tech results.

Research and development (R&D) comprise creative work undertaken on a systematic basis in order to increase the stock of knowledge, including knowledge of man, culture and society, and the use of this stock of knowledge to devise new applications [2]. The UNESCO Institute for Statistics claims that the clearest trend in global R&D activity between 1996 and 2005 was the increasing percentage of GDP devoted to R&D by countries all over the world (R&D intensity has more than doubled in 9% of the countries surveyed, including China, Thailand, Tunisia and others; in 48 out of 89 countries surveyed the percentage of GDP devoted to R&D has significantly increased) [4]. Surely, sustained R&D investment is a key to economic growth. But these are strong words that are easy to follow in good economic times and more difficult to follow in bad ones. R&D expenditures are among the first to be cut during recessions. Preliminary data (official statistics on R&D are available only until 2007) suggest that companies have reduced their R&D investment in the aftermath of the crisis. In 2008 industrial companies, despite the challenging economic times, continued growing their R&D budgets, expanding them by nearly 6,1%, or more than $60 billion, over what they spent in 2007. Despite their good intentions, when the downturn turned from mild to severe, industrial firms were forced to cut their R&D budgets.
Total industrial R&D spending dropped by 1%, or nearly $10 billion overall, in 2009 from what was spent in 2008 [6]. These findings are consistent with historical trends showing that R&D expenditure exhibits larger variations than gross domestic product (GDP) over the business cycle. Hence, any drop in GDP would result in an even larger decrease in R&D expenditure [3].

The 2010 Global R&D Forecast, created by Battelle analysts and the editors of R&D Magazine, predicts that overall global R&D will increase 4.0% in 2010, to $1,156.5 billion from $1,112.5 billion spent in 2009. This increase will mostly be driven by continued spending by China and India, which will drive a 7.5% increase in Asian R&D. American R&D spending is expected to increase 3.2% to $452.8 billion, while EC spending will only increase 0.5% to $268.5 billion in 2010. The forecast especially stresses that spending by both the Americas (U.S., Canada, Mexico, Brazil and Argentina) and the EU is falling behind the spending levels seen in Asian countries (India and China). Even Japan, the 2nd largest R&D spender in the world, is now trailing the level of spending by China and India [6].

It is really hard to measure innovation, as its manifestation within the economy is larger and more complex than what one indicator or index can capture and reflect. Many aspects of technology creation, diffusion and human skills are hard to quantify. Still, in order to estimate a nation’s technological achievements and the level of innovative progress it is possible to use a great variety of indicators.
The most frequently used ones are the following:

* The number of patents granted to residents (per million people), the number of new trademarks
* Receipts of royalty and license fees (US$ per person)
* The number of researchers in R&D (per million people or per thousand employees)
* Population with tertiary education and youth education achievement level, new science and engineering (S&E) graduates per 1000 population
* Science and engineering degrees (% of all new degrees)
* % of firms with new-to-market product innovations (as % of all firms)
* Sales and exports of high technology products and many others

However, most of the indicators mentioned above describe only the quantitative side of the innovation process, not the efficiency of national R&D investment policy. Furthermore, some of these indicators are not representative due to considerable differences in legislation among countries (for example, low patenting activity in India and China is explained more by an underdeveloped system of intellectual property rights protection than by a lack of innovation) [3]. That is why this research paper concentrates mainly on the share of high-technology exports as % of manufactured exports, as the most representative indicator of competitive commercialization of national scientific research (Y-variable).

While speaking about the factors of successful innovation policy, it is important to remember that there are no ideal models in complex economic systems, and each economic or social parameter is subject to a multitude of different impacts and factors.
As regards national high-tech development, it is affected by such economic conditions as:

* International openness to trade and investments (assessed by such indicators as the ratio of exports and imports to GDP, (Exp+Imp)/GDP*100%, and the level of the trade-weighted average tariff)
* The commitment to market values and developed market economy infrastructure
* The accessibility of venture financing for start-ups
* The competitiveness of the national economy on the global arena
* The volume of R&D spending (in billions of $)
* Industry innovation expenditures
* The influx of direct foreign investments and many others

Public and private R&D expenditures (% of GDP)

Research and development (R&D) expenditure is one of the most widely used measures of the innovative efforts of firms and countries. Most surveys devoted to the technological achievements of countries concentrate mainly on the level of R&D intensity of a country as the main factor. But while figures for R&D as a percent of GDP are bandied about as indicators of the strength of the national commitment to scientific research, they have relatively little meaning in terms of just how that investment contributes to the growth and welfare of the country.

The more important data are those that tell you who is providing the funding, who is doing the work, how the money is being spent, and what the priorities, thrusts and directions are. In brief, it is the internal structure of the R&D enterprise and the roles and interplays among the different sectors that have a bearing on the manner in which the investment in R&D has the desired societal benefit outcomes of economic security, improved health care, and the like. R&D expenditure is generally broken down among 4 sectors: business enterprise, government, higher education and private non-profit institutions.
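The two ratios mentioned above, trade openness as (Exp+Imp)/GDP*100% and the breakdown of R&D expenditure by funding sector, can be sketched as follows. All figures in the example are hypothetical and only illustrate the arithmetic; the function and key names are assumptions made for this sketch.

```python
def trade_openness(exports: float, imports: float, gdp: float) -> float:
    """Openness indicator (Exp + Imp) / GDP * 100, in percent."""
    return (exports + imports) / gdp * 100.0


def sector_shares(funding: dict[str, float]) -> dict[str, float]:
    """Share of each funding sector in total R&D expenditure, in percent."""
    total = sum(funding.values())
    return {sector: amount / total * 100.0 for sector, amount in funding.items()}


if __name__ == "__main__":
    # Hypothetical R&D funding by sector, in arbitrary currency units.
    funding = {
        "business_enterprise": 120.0,
        "government": 60.0,
        "higher_education": 14.0,
        "private_non_profit": 6.0,
    }
    for sector, share in sector_shares(funding).items():
        print(f"{sector}: {share:.1f}%")
    # Hypothetical trade figures, in the same units as GDP.
    print(f"openness: {trade_openness(40.0, 35.0, 100.0):.1f}%")
```

In this sketch the business enterprise share is the quantity that plays the role of the X variable in the analysis.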
In this research the share of business-financed R&D was selected for thorough econometric analysis (X variable).

2 Results

The basic hypothesis, based on preliminary insights into the statistical data, suggests that in order to implement an effective and profitable “high-tech policy”, governments have to increase the share of the business sector in GERD (Gross Domestic Expenditures on R&D). But at the same time it is necessary to preserve a definite “government share” in R&D investments, as reducing it beyond a certain extent has a negative effect.

Figure 1 illustrates the average values of high-technology exports (red line) and business expenditures on R&D (blue line) during the period 2000-2005 for 20 countries (the countries were arranged in order of increasing business R&D share) [1], [3], [7]. We can see from the graph that the supposed rule is valid for the countries located in the range of 0-60% share of business sources in R&D expenditures: a higher share of the business sector in R&D means higher values of high-tech exports.

Fig. 1. The average indications of high-technology exports and business expenditures on R&D. Business R&D – % of Gross Domestic Expenditure on R&D financed by Business Sector; HTexport – % of high-technology export in total manufactured export

However, we cannot fully rely on average values, as each country has its own peculiarities, historic and geographic conditions and many other factors that can determine the high-technology specialization of its exports. Moreover, only 20 countries were taken into account here.

That is why it is more interesting and important to examine the changes in the share of high-technology exports depending on the changes in the structure of R&D financing.

The Model description:

X (∆privR&D) – shift in the share of business sector in R&D financing.
Business R&D expenditures as % of total R&D expenditures: the indicator reflects the percentage of total investment in research and development originating from the business sector;

Y (∆HTexp) – shift in the share of high-technology exports as % of total manufactured exports during the period 2000-2005. High-technology exports are exports of products with a high intensity of research and development. They include high-technology products such as those used in aerospace, computers, pharmaceuticals, scientific instruments and electrical machinery [4].

The statistical analysis of data across 63 countries for the period 2000-2006 was intended to reveal the correlation between changes in the structure of R&D expenditures and the structure of exports. A linear regression is studied. The statistical analysis revealed the following general tendency: the share of high-tech exports increases in most of the countries surveyed along with the growth of the business-financed share in R&D investments.

The econometric model built on the statistical data has the following form (figure 2):

∆HTexp = –1,8071 + 0,7129 * ∆privR&D (R2 = 0,636),

which means that the high-technology export share increases by nearly 0,7 percentage points if the share of private R&D expenditures grows by 1 percentage point.

Fig. 2. The linear regression

For further analysis the countries were classified into 3 categories:

The countries with a traditionally high share of the private sector in R&D (>60%). This group includes Belgium, Denmark, Finland, Germany, Ireland, Israel, Japan, China, Luxembourg, Switzerland and the USA (figure 3).

Fig. 3. The countries with high share of private sector in R&D

The countries with a medium share of private R&D expenditures, in the range from 40% to 60%: Austria, Brazil, Croatia, Cuba, Czech Republic, France, Hungary, Netherlands, Spain and Great Britain (figure 4).

Fig.
4. The countries with medium share of private sector in R&D

The countries with a traditionally low share of the private sector in R&D (<40%): Azerbaijan, Belarus, Bulgaria, India, Iran, Latvia, Pakistan, Poland, Portugal, Russia and Ukraine.

Fig. 5. The countries with low share of private sector in R&D

3 Conclusions

1. For all 3 groups of countries (with different levels of business expenditures) the trend line has a positive slope, which confirms the basic hypothesis. The peculiarity here is that the elasticity of high-tech exports with respect to private R&D investments is higher for the 2nd group of countries (where 40-60% of R&D is financed by the business sector). This is the range in which the most drastic changes in export structure happen with an increase or decrease of the business share in R&D investments. It is also important to highlight that the 3rd group of countries (with low participation of the business sector) mainly shows a declining share of high-tech exports (most of these countries are located in the 3rd quadrant of the graph).

2. A share of business investments in R&D of more than 80% may cause a decline in high-tech exports. So government expenditures are an important factor in accelerating further R&D investments. According to the Harrod–Domar theory, more investment leads to capital accumulation, which generates economic growth. Regarding R&D investment policy, it is possible to assume that government expenditures on R&D create the basis and indispensable minimum from which further R&D investment activity is multiplied.

3. The time lag according to the statistical data is no more than 1 year.

Explanation of the results received in the research.
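To make the results being explained concrete, the regression from Section 2 and the three-way grouping of countries can be sketched as below. The coefficients are copied from the reported model with decimal commas written as points; the function names, group labels and sample inputs are assumptions introduced for illustration, and the thresholds simply restate the >60%, 40-60% and <40% bands given in the text.

```python
# Coefficients of the reported model dHTexp = -1.8071 + 0.7129 * dprivR&D.
INTERCEPT = -1.8071
SLOPE = 0.7129


def predicted_export_shift(delta_priv_share: float) -> float:
    """Fitted shift in the high-tech export share (percentage points)."""
    return INTERCEPT + SLOPE * delta_priv_share


def break_even_shift() -> float:
    """Shift in the business R&D share at which the fitted export shift is zero."""
    return -INTERCEPT / SLOPE


def share_group(business_share: float) -> str:
    """Assign a country to one of the three groups by business share of R&D (%)."""
    if business_share > 60.0:
        return "high"
    if business_share >= 40.0:
        return "medium"
    return "low"


if __name__ == "__main__":
    for delta in (0.0, 5.0, 10.0):  # hypothetical shifts, percentage points
        print(f"delta={delta:4.1f} -> fitted export shift {predicted_export_shift(delta):.3f}")
    print(f"break-even shift: {break_even_shift():.2f}")
    for share in (72.0, 55.0, 30.0):  # hypothetical business shares, %
        print(f"business share {share:.0f}% -> {share_group(share)} group")
```

Read this way, the negative intercept implies that, under the fitted line, the business share has to grow by roughly 2.5 percentage points before the high-tech export share stops declining.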
The government is involved mainly in financing fundamental investigations and basic research (basic research is experimental or theoretical work undertaken primarily to acquire new knowledge of the underlying foundation of phenomena and observable facts, without any particular application or use in view [2]), which implies a low degree of commercialization of that kind of R&D.

The main commercial projects of the government in the R&D sphere are concentrated in such fields as defense, healthcare, space programs and infrastructure. The governments of developing countries invest in import-substituting industries, which also means a low level of transformation of the financed R&D into high-technology exports.

Finally, even in cases when the government takes the lead in innovation financing and implements different governmental programs for innovative development, it cannot respond to changing market needs and conditions better than private investors and companies, which are interested to the maximum extent in the commercial success of their investigations. Governmental programs, on the other hand, frequently tend just to expand the range of goods rather than improve the technological structure of industry and its qualitative parameters.

Policy recommendations.

It is important to focus on increasing efficiency in R&D spending rather than meeting a specific spending level. The efficiency and competitiveness of R&D investment policy, in its turn, can be achieved by expanding the role of the business sector in R&D financing up to 70-80% (it is important to use economic incentives such as tax exemptions, for example).

It is necessary to preserve government spending on R&D at the level of 15-20% of total expenditures, while maintaining flexibility in allocating public R&D funds. The government should concentrate mainly on basic research and in this way accelerate other fields of R&D investment.

References

1.
OECD: Main Science and Technology Indicators, http://www.oecd-ilibrary.org/science-and-technology/main-science-and-technology-indicators/volume-2008/issue-2_msti-v2008-2-en-fr;jsessionid=2obsve2k0wjt3.x-oecd-live-01 (2008)
2. OECD: Factbook 2008: Economic, Environmental and Social Statistics, http://www.oecd-ilibrary.org/economics/oecd-factbook-2008/world-population_factbook-2008-graph1-en (2008)
3. OECD: Science, Technology and Industry Scoreboard, http://www.oecd-ilibrary.org/content/book/sti_scoreboard-2009-en (2009)
4. UNDP: Human Development Report 2007/2008, http://hdr.undp.org/en/reports/global/hdr2007-2008 (2007)
5. Battelle and R&D Magazine: Global R&D report 2008, http://www.asiaing.com/2008-global-r-d-report.html (2008)
6. Battelle and R&D Magazine: 2010 global R&D funding forecast, http://www.rdmag.com/topics/global-r-d-funding-forecast?page=2 (2010)
7. World Bank database, www.worldbank.org

Econometric Analysis on the Site “Lesson Pulse”

Alexander J. Weissblut

Kherson State University, 1, 40 rokiv Zhovtnya Street, 73000, Kherson, Ukraine veits@ksu.ks.ua

Abstract. In this article the site “Lesson pulse” is considered as a tool allowing the teacher to receive objective information on the course and results of a lesson online. However, adequate interpretation of the results of such surveys is impossible unless we separate true students from the others. Besides, interpreting the survey results and making decisions grounded on them requires understanding what exactly this particular group means by clearness of an explanation, objectivity of the marks etc. For anonymous surveys this requires correlation and regression analysis of their results and an estimation of the statistical significance of the results obtained, that is, econometric analysis.

Keywords.
Factor, statistical, econometric, analysis, correlation, decision-making

Key terms. Research, Management, Model, KnowledgeManagementProcess, KnowledgeManagementMethodology, MathematicalModeling

1 Introduction

In this article the site “Lesson pulse” is considered as a tool allowing the teacher to receive objective information on the course and results of a lesson online. It allows the student or pupil to react to the course of the lesson at any moment by answering one or several questions, for example:

1. Do you find the lesson interesting?
2. Is the explanation accessible (clear)?
3. Are you tired? Does the pace of the explanation suit you?
4. Do you have questions for the teacher?
5. Are the marks objective?

(The wording of the questions is defined by the teacher.) By averaging these responses the site displays on the screen data about the state of the lesson, its “pulse”, online. At any moment the teacher can ask everyone to answer simultaneously these or more profound groups of questions (examples are given below), and so measure the “lesson pulse” at that very moment. Such surveys do not necessarily require a computer classroom: they can be carried out on one tablet, and the results can then be transferred to the site.

However, adequate interpretation of the results of such surveys is impossible unless we separate true students, for whom the educational process is a considerable part of their life, from those who would prefer to keep far away from it. Besides, interpreting the survey results and making decisions grounded on them requires understanding what exactly this particular group means by clearness of an explanation, lesson atmosphere, objectivity of the marks etc.
For anonymous surveys this requires correlation and regression analysis of their results and an estimation of the statistical significance of the results obtained, that is, econometric analysis.

1. All groups of questions considered below were chosen as a result of “brainstorming” in which fourth-year students of the Faculty of Physics, Mathematics and Informatics at Kherson State University acted as experts. This expert survey was organized according to the “six thinking hats” technique of E. de Bono [1], which provides maximal openness and ease among participants. In all cases the opinion was unanimously expressed that the given set of questions is complete and fair.

2. Students of the specialities “physics”, “mathematics”, “informatics” and “program engineering” of Kherson State University were then surveyed under the following essential conditions. The respondents rate each question from 0 (a firm “no”) to 10 (a firm “yes”). Each respondent arbitrarily names the folder containing his or her answers (i.e. the key). A volunteer, one of the survey participants, collects all folders in one main folder and sorts them there (i.e. shuffles them). Only after that is the main folder transferred to the teacher: this simple and open procedure guarantees the anonymity of the survey to participants. Alternative and technically simpler variants are answers submitted to the site or on a tablet: the choice of variant is defined by the kind of survey and the level of trust of the audience in the interviewing teacher.

3. The survey results are then transferred to the site “Lesson pulse”, which is implemented in PHP and uses a MySQL database (see [2]). The queries currently implemented on the site output the results of the econometric analysis of the survey.
They include the multiple correlation analysis of factors and an estimate of the statistical significance of the results using the Student and Fisher criteria (see [3]). The site interface is aimed at a user who, generally speaking, knows nothing about econometric analysis.

2 The Analysis of Surveys on Lesson Results and Feedback

The results of the lesson survey and of the Feedback survey naturally vary widely depending on the lesson, the teacher, the audience, etc. However, the correlation analysis of factors led to similar outcomes (at the 20% significance level by the Student criterion). Everywhere below we use the surveys of group 421 (speciality “mathematician”), which have a typical form (fig. 1).

Fig. 1. Histogram for distribution of correlation coefficients Q1

This is the histogram of the correlation coefficients between the answers to the question “Was the lesson pleasant to you?” and the following factors:

1. Is the explanation accessible (clear)?
2. Does the pace of the explanation suit you?
3. Are you tired at the lesson?
4. Lesson atmosphere: is it comfortable, is it pleasant to you at the lesson?
5. Is the presentation sufficiently illustrated with examples?
6. Objectivity of the marks given at the lesson.
7. Do you have any questions for the teacher?
8. Do you want another lesson on this theme?
9. Did you prepare for this lesson?
10. Do you intend to continue studying at home?
11. Correspondence of the lesson to the tasks of independent (home) work.
12. Are you interested in the lesson?
13. Did you take away something useful from the lesson, or do you regret the time spent?

The most significant factors (in decreasing order) turned out to be 1 (0.91), 4 (0.87), 12 (0.83), 13 (0.75), 5 (0.63), 9 and 11 (0.59).
Objectivity of marks comes only after them (0.51), and the inverse correlation of –0.39 for factor 7 indicates that for the majority a good lesson is one after which no questions to the teacher remain (fig. 2).

Fig. 2. Histogram for distribution of correlation coefficients Q2

This is the histogram of the correlation coefficients between the answers to the Feedback question “Is the teacher pleasant to you?” and the following factors:

1. Were the lessons pleasant to you?
2. The student's estimate of the knowledge received at the lessons.
3. Is the explanation accessible (clear)?
4. How accessible (clear) and reliable are the answers to students' questions?
5. The explanation is sufficiently illustrated with examples.
6. Use of various approaches in teaching.
7. Does the teacher strive to interest and motivate students?
8. Lesson atmosphere: is it comfortable, is it pleasant to you at the lesson?
9. Availability of the teacher, his inclination to listen to students and to hold discussions with them.
10. The teacher's knowledge of the subject.
11. Insistence (regular and sufficiently frequent control of knowledge).
12. Punctuality (comes to lessons on time).
13. Command of the audience (students do not sleep and do not make too much noise at lessons).
14. Objectivity of the teacher's assessment of the student. Are the assessment criteria identical in all subgroups?
15. Correspondence of the lesson to control tasks.

The most significant factors (in decreasing order) are 8 (0.92), 7 (0.85), 6 (0.775), 4 (0.75), 9 (0.72), 3 (0.675), 13 (0.58).

Only then, with a correlation coefficient of 0.51, comes factor 1, whether the lessons were pleasant to you. Thus the major factors in the assessment of the teacher and of the lesson differ considerably.
The histogram of the differences between the correlation coefficients for the questions “Is the teacher pleasant to you?” and “Were the lessons pleasant to you?” is given below.

The factors much more essential for the assessment of a teacher than of a lesson are 6 (use of various approaches in teaching) and 7 (does the teacher strive to interest and motivate students). On the contrary, for the assessment of a lesson the much more essential factors are 14 (correspondence of the lesson to control tasks) and 10, insistence (regular and sufficiently frequent control of knowledge): probably, according to students, insistence is good for the lesson and not so good for the teacher.

Fig. 3. Histogram for distribution of correlation coefficients Q3

Certainly, the correlation matrix also contains the decomposition on factors for each of the 15 questions. Thus it turns out that 5 (the explanation is sufficiently illustrated with examples) is most closely connected with 15 (correspondence of the lesson to control tasks); 3 (are you tired at the lesson) with 7 (presence of questions to the teacher); 13 (command of the audience) with 14 (objectivity of the assessment of the student).

It is interesting to compare 12 (are you interested in the lesson) with 13 (did you take away something useful from the lesson) from the survey about the results of the lesson (fig. 4).

Fig. 4. Histogram for distribution of correlation coefficients Q4 (series: Interesting, Useful)

As we see, from the student's point of view interesting and useful are not the same. Thus 4 (lesson atmosphere) correlates much more strongly with the factor interesting, while factor 5 (is the presentation sufficiently illustrated with examples) correlates with 11 (correspondence of the lesson to the tasks of independent (home) work).
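The significance check used throughout this section (correlation coefficients tested by the Student criterion at the 20% level) can be sketched as follows. This is a minimal illustration under stated assumptions, not the site's PHP implementation: the function names are hypothetical, the coefficient is assumed to be the ordinary Pearson coefficient, and the critical value 1.330 (two-sided 20% level, 18 degrees of freedom for a group of 20 respondents) is taken from a standard Student table.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two samples of equal length, at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Student statistic for H0 'no correlation': t = r * sqrt((n - 2) / (1 - r^2))."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(r, n, t_critical):
    """True if |t| exceeds the critical value supplied by the caller."""
    return abs(t_statistic(r, n)) > t_critical


# Assumed critical value: two-sided 20% level, n - 2 = 18 degrees of freedom.
T_CRITICAL_20_PERCENT_18_DF = 1.330

# The coefficient 0.51 reported for "objectivity of marks", with 20 respondents.
print(is_significant(0.51, 20, T_CRITICAL_20_PERCENT_18_DF))
```

With 20 respondents even moderate coefficients pass this test, which is why the discussion above concentrates on the ranking of the factors rather than on significance alone.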
3 The Analysis of Surveys about the Factors Influencing a Lesson

Unlike the surveys about lesson results and Feedback, the results of surveys about the factors influencing the course of a lesson are fairly close in different groups. The histogram of the survey answers on the attitude to lessons (in group 421) is given below (fig. 5).

Fig. 5. Histogram for distribution of correlation coefficients Q5

Here:

1. Is study pleasant to you? Is it interesting to you?
2. Do you believe that education is the “road to the future”?
3. Is your speciality pleasant to you?
4. Does the training program for your speciality satisfy you?
5. Does the teaching level at the university satisfy you?
6. Did you choose the university and the speciality of your own will?
7. Would you like to change your speciality or to receive additional higher education?
8. Is your attendance of lessons regular?
9. Do you regularly prepare homework?
10. Have you had conflicts with teachers?
11. Have you been afraid of expulsion from the university?
12. Do you wish to take part in scientific work, in Olympiads on your speciality?
13. Do fellow students often ask you for help with lessons?
14. Do you wish to enter postgraduate study after graduation?
15. How much time do you spend preparing for lessons (on average, hours per day)?

Similar results of the survey on external factors follow (fig. 6):

Fig. 6. Histogram for distribution of correlation coefficients Q6

Here:

1. Close dialogue with teachers.
2. Accessibility of the Internet at the university.
3. Readiness of the classroom for a lesson (working order of projectors, computers, the software; comfort of the classroom).
4. Presence of enough points for centralized catering.
5. Accessibility of contacts with future employers.
6. Accessibility of summer health improvement.
7. Participation in scientific work.
8. Teaching level at the university.

In the correlation matrix over all these factors only a few factors have correlations close to 1. These are:

1. Is your attendance of lessons regular, with factors
(a) do you regularly prepare homework (0.87)
(b) teaching level at the university (0.84)
(c) participation in scientific work (0.63)
(d) have you prepared for this lesson (0.59)
(e) accessibility of summer health improvement (–0.5)
2. Do you regularly prepare homework, with factors
(a) is your attendance of lessons regular (0.87)
(b) teaching level at the university (0.815)
(c) have you prepared for this lesson (0.66)
(d) participation in scientific work (0.56)
(e) accessibility of summer health improvement (–0.52)
3. Teaching level at the university, with factors
(a) is your attendance of lessons regular (0.843)
(b) do you regularly prepare homework (0.815)
(c) did you choose the university and the speciality of your own will (0.65)
(d) have you prepared for this lesson (0.59)
(e) participation in scientific work (0.56)
(f) accessibility of summer health improvement (–0.55)

Besides these, correlation coefficients above 0.7 appear only twice more: between the factors have you had conflicts with teachers and have you been afraid of expulsion from the university (0.85); and between the factors participation in scientific work and does the training program for your speciality satisfy you (0.74). The appearance of the factor teaching level at the university in this group is probably the best compliment to the Faculty of Physics, Mathematics and Informatics of the Kherson State University in all its history.
Our main task is to use the mental orientation thus captured in the correlation analysis of factors to separate true students, for whom the educational process is a considerable part of their life, from those who would prefer to keep far away from it. Using the data already cited and Table 1, we choose the factor regularly prepare homework as the differentiating sign between the groups.

Table 1. Correlation analysis of factors

Factor                              Average value   Root-mean-square deviation
Teaching level at the university    7.02            2.17
Regular attendance of lessons       8.85            2.3
Regular preparation of homework     8.4             2.6

For this factor the mutual correlations of the defining sign are closer to 1, and the dispersion is larger, which indicates greater variability of respondents under this factor. Besides, among the selected factors it best corresponds to such a sign by common sense.

4 Results of Surveys about Lesson and Feedback on Subgroups

By the selected differentiating sign, among the 20 respondents of group 421 we single out the 12 participants who answered the question do you regularly prepare homework with 10 or 9 points. The complementary subgroup consists of 8 respondents. Do these subgroups correspond to the required division into true students and the others? Below is the histogram of the average survey results about the lesson for the two subgroups (fig. 7).

Fig. 7. Histogram for distribution of correlation coefficients Q7 (series: Prepare, Don't prepare)

So, the factors that differ considerably between the subgroups (in decreasing order of the absolute differences between the subgroup averages) are:

2 Is the explanation accessible (clear)? (7.92 – 4.25 = 3.67)
1 Was the lesson pleasant to you? (7.1 – 4.38 = 2.72)
12 Correspondence of the lesson to the tasks of independent (home) work. (9.91 – 7.5 = 2.41)
6 Is the presentation sufficiently illustrated with examples? (9.41 – 7 = 2.41)
10 Did you prepare for this lesson? (8.75 – 6.5 = 2.25)
13 Are you interested in the lesson? (7.92 – 5.87 = 2.05)
14 Did you take away something useful from the lesson? (9.1 – 7.4 = 1.7)
5 Lesson atmosphere (7.66 – 6.25 = 1.41)
9 Do you want another lesson on this theme? (3.66 – 2.65 = 1.01)

The averages of the complementary subgroup are higher in only two cases:

4 Are you tired at the lesson? (6.5 – 7.62 = –1.12)
3 Does the pace of the explanation suit you? (5.5 – 6.37 = –0.87)

The last result seems strange at first sight, but it is stable across all groups and is easy to explain psychologically: the less a student is disposed to study, the more he would like acceleration, a faster course of time.

Similar results for the Feedback survey follow (fig. 8).

Fig. 8. Histogram for distribution of correlation coefficients Q8 (series: Prepare, Don't prepare)

Here the factors that differ considerably between the subgroups are:

5 The explanation is sufficiently illustrated with examples (9.17 – 6.37 = 2.8)
6 Use of various approaches in teaching (6.36 – 4 = 2.36)
16 Correspondence of the lesson to control tasks (8.9 – 6.87 = 2.03)
4 How accessible (clear) and reliable are the answers? (7 – 5.25 = 1.75)
9 Lesson atmosphere (6.36 – 4.85 = 1.51)

The obtained data agree with the hypothesis about the required division into groups; in any case they do not contradict it.

5 The Latent Division in Group

The site “Lesson pulse” also offers a division of the group into classes with a given value of mutual correlation: between any two respondents of one class there is a chain of respondents of this class such that the correlation of answers between consecutive respondents of the chain is not less than the given value. Such a division into subgroups reveals distinctions in the group that are not directly noticeable.
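The chain condition above amounts to finding connected components in a graph whose vertices are respondents and whose edges join pairs with correlation not less than the given value. The sketch below illustrates this reading; it is an assumption about how the classes could be computed, not the site's actual code, and the function names and the toy answer vectors are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; a constant answer vector is treated as 0 (assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_classes(answers, threshold):
    """Split respondents into classes linked by chains of correlations >= threshold.

    answers[i] is the list of scores given by respondent i.
    Returns a list of classes, each a sorted list of respondent indices.
    """
    n = len(answers)
    # Build the graph: an edge joins two respondents whose answers correlate enough.
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(answers[i], answers[j]) >= threshold:
                neighbours[i].add(j)
                neighbours[j].add(i)
    # Collect connected components with a depth-first search.
    seen = set()
    classes = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:
            node = stack.pop()
            members.append(node)
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        classes.append(sorted(members))
    return classes


# Toy data: respondents 0 and 2 do not correlate directly, but both correlate with 1.
toy_answers = [[6, 4, 5], [8, 4, 3], [6, 6, 3], [2, 6, 7]]
print(correlation_classes(toy_answers, 0.6))
```

In the toy data respondents 0 and 2 end up in one class only through the intermediate respondent 1, which is exactly the chain property described in the text; raising the threshold breaks the chain and splits the class.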
For the mental-orientation survey about the factors influencing a lesson, with the minimum level of mutual correlation set to 0.6, the test group 421 split into 3 classes: of 4 respondents, of 5 respondents, and a main class of 11 respondents. Let us compare the averages of the main class with the averages of the first and second classes for those factors in which noticeable differences came to light.

1. Does the training program for your speciality satisfy you?
2. Would you like to change your speciality or to receive additional higher education?
3. Do you wish to take part in scientific work, in Olympiads on your speciality?
4. Do you wish to enter postgraduate study after graduation?
5. Participation in scientific work.
6. Readiness of the classroom for a lesson.
7. Accessibility of summer health improvement.
8. Accessibility of contacts with future employers.

Fig. 9. Histogram for distribution of correlation coefficients Q9 (series: Class 1, Class 2, Main class)

Respondents from classes 1 and 2 are much less satisfied than the main class with the training program of the speciality (point 1). They would like to change the speciality or to receive additional higher education considerably more than the main class (point 2). The difference shows most sharply in point 3: unlike the main class, they do not wish at all to take part in scientific work or in Olympiads on the speciality. So, apparently, the speciality has now lost its appeal for them. Respondents from class 2 are not interested in postgraduate study (point 4); however, they are not against scientific work (point 5). Most importantly, they have the greatest interest in contacts with employers (point 8). Apparently, this is a search for an application outside the speciality. Respondents from class 1 are oriented differently: they have little interest in scientific work and employers (points 5 and 8), but they wish to enter postgraduate study (point 4).

References

1. De Bono, E.: Six Thinking Hats. Penguin Books (1997)
2. PHP Book, http://www.phpreferencebook.com (2012)
3. Hansen, B. E.: Econometrics. Textbook, http://www.ssc.wisc.edu (2012)

Decision Supporting Procedure for Strategic Planning: DEA Implementation for Regional Economy Efficiency Estimation

Karine Mesropyan

Institute of Socio-Economic and Humanitarian Researches of Southern Scientific Center of RAS, 41 Chekhova prospect, 344006 Rostov-on-Don, Russia
karineemesropyan@gmail.com

Abstract. The paper suggests an algorithm for a decision supporting procedure based on Data Envelopment Analysis (DEA) along with the Malmquist Productivity Index. The core of the procedure is a complex of evaluations for preliminary data processing and adjustment, as well as for creating analytic materials in the field of regional strategy planning. The crucial issue of the study is to define the boundaries of DEA applicability in this field and to eliminate DEA shortcomings, such as the dependence of the scores on the set of inputs and outputs. Efficiency scores of the Russian regional agrarian sector are obtained in order to verify the procedure and to add knowledge to current indicator systems of regional economic efficiency by improving the objectiveness of the approach. It is shown how the obtained results can be applied in strategic planning to increase the effectiveness of state regional policy.

Keywords. Efficiency, Malmquist Index, Data Envelopment Analysis, Region, Procedure, Strategy Planning

Key terms. DecisionMaking, MathematicalModel, Methodology, Development, Management

1 Introduction

The theoretical model of this study is based on the Pareto-Koopmans concept (1951) [1]. A system technology is Pareto-Koopmans efficient if and only if the object cannot improve any of its resources (inputs) or products (outputs) without sacrificing some other input or output. Charnes et al. (1978) proposed Data Envelopment Analysis (DEA) based on this concept of efficiency, combining operational research tools with the works of Koopmans (1951) and Farrell (1957) [2, 3].

DEA is a non-parametric frontier approach for comparative efficiency measurement in which a set of similar objects with multiple inputs and outputs is analyzed. The aim of this study is to suggest a procedure that supports strategy planning with analytical reports based on DEA scores.

Productivity analysis by DEA has at least three current issues. The first is to define the set of objects to be compared in the study. The second is to formulate suitable conditions for using particular modifications of the models. The third is to improve discrimination capability. Therefore, an efficiency assessment procedure by DEA is primarily based on the following grounds: the formation of the set of objects to be compared, the identification of inputs and outputs, and model selection.

Taking into account the issues mentioned above, it is necessary to adapt the basic DEA models and their implementation. Furthermore, the DEA procedure is considered the core of the evaluation of regional economy efficiency scores.

This paper consists of five parts. We state the main issues in this first part. The background of the study is presented in the second part. Part 3 describes the suggested evaluation procedure. The application of the procedure to the investigation of the agrarian sector of the Russian regions for management tasks is reported in part 4. We draw conclusions in the last part.

2 Theoretical and Methodological Background

DEA application has a large number of advantages. First of all, an integrated assessment is calculated for each region, reflecting the efficiency with which input factors are used to produce outputs. Besides, the Pareto-optimal set of efficient regions in the multidimensional space of inputs and outputs is obtained.
Secondly, there is no need to involve expert knowledge for an a priori assignment of weights to the variables corresponding to inputs and outputs. Nevertheless, additional data on a region's external factors is helpful for building the right model. Thirdly, it is very important that there are no restrictions on the functional form of the relation between inputs and outputs.

The study is carried out under the hypothesis that DEA implementation needs a formal procedure in order to obtain stable scores and to apply research results to the analytic background of current regional strategic planning.

A multilateral and thorough analysis of DEA possibilities and of the restrictions on its application is presented in Dyson et al. (2001) and Cook and Seiford (2009) [4, 5]. Along with these works there are reviews of applications of the method, for instance in Avkiran and Parker (2010) and Liu et al. (2012) [6, 7]. The common bases of productivity measurement are presented in Caves et al. (1982) [8].

The possibilities of applying the Malmquist Productivity Index in various intertemporal comparisons are described by Färe and Grosskopf (1996) [9]. Tsuneyoshi et al. (2012) used the Malmquist Index, calculated by DEA models, for a comparative analysis of 97 countries over the period 1981-2004 [10]. Yamamura and Shin (2008) determined the nature of the impact of inequality on capital accumulation and growth performance by evaluating DEA indexes from 1965 to 1990 [11].

According to the review presented in [12], although DEA has advantages, its general shortcoming is considered crucial: the scores depend significantly on the set of inputs and outputs. This study suggests a special procedure for DEA implementation for the needs of regional strategic planning.
It results from an attempt to eliminate the mentioned DEA drawback and to support the decision process of strategic planning with analytic materials. Golany and Roll (1989) and Emrouznejad and De Witte (2010) offered procedures of DEA application which are very useful in the general case [13, 14]. This study is based on the results of these works.

3 Efficiency Estimation Procedure

Different levels of regional economy scale and the returns-to-scale effect are considered a reason for inequality between regional output performances. That is why the model with variable returns to scale is suggested for this study. This model was introduced by Banker et al. (1984).

Data for a DEA study is presented by a number of indicators in the form of a matrix of inputs Xt = {xtij} and a matrix of outputs Yt = {ytkj}. The efficiency criterion for a multidimensional assessment of an object is to assign input and output parameters and some weights to all objects, and then to calculate and maximize the ratio (1) for each object, where:

j – index of the estimated production facility, j = 1, …, n;
xij – matrix of input parameters that reflect the system resources, i = 1, …, m;
ykj – matrix of outputs that reflect the products of the system, k = 1, …, s;
ui / wk – weights for outputs / inputs.

According to the DEA framework, this function should be maximized under the restrictions (2) for all objects.

The Malmquist Productivity Index (3) is calculated using such DEA efficiency scores to evaluate the change in total factor productivity.

The suggested procedure for regional efficiency assessment includes a complex of procedures for preliminary data processing and adjustment (fig. 1).

Fig. 1. Five stages of Efficiency Estimation Procedure

The dual linear program model (4) for evaluating the criterion given above uses:

η – comparative efficiency score of region j (j = 1, …, n);
λj – dual model variables.

If η < 1, the region belongs to the inefficient set; otherwise (η = 1) it is a part of the Pareto set.
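For orientation, the relations referred to as (1)-(4) can be written in their standard textbook form, as sketched below. This is a reconstruction under assumptions rather than the author's exact notation: the weights are written u_k for outputs and w_i for inputs so that their indices match y_{kj} and x_{ij}, the evaluated object is denoted j_0, D^t denotes the efficiency score measured against the period-t frontier, and the dual model is assumed to be input-oriented with variable returns to scale, which agrees with the rule that η < 1 marks an inefficient region.

```latex
% (1) Ratio to be maximized for the evaluated object j_0 (assumed standard ratio form)
\begin{equation}
\max_{u,\,w}\; h_{j_0} = \frac{\sum_{k=1}^{s} u_k\, y_{k j_0}}{\sum_{i=1}^{m} w_i\, x_{i j_0}}
\tag{1}
\end{equation}

% (2) Restrictions for all objects
\begin{equation}
\frac{\sum_{k=1}^{s} u_k\, y_{k j}}{\sum_{i=1}^{m} w_i\, x_{i j}} \le 1,
\quad j = 1,\dots,n; \qquad u_k \ge 0,\; w_i \ge 0
\tag{2}
\end{equation}

% (3) Malmquist Productivity Index between periods t and t+1
\begin{equation}
M^{t,t+1} = \sqrt{
\frac{D^{t}\!\left(x^{t+1}, y^{t+1}\right)}{D^{t}\!\left(x^{t}, y^{t}\right)}
\cdot
\frac{D^{t+1}\!\left(x^{t+1}, y^{t+1}\right)}{D^{t+1}\!\left(x^{t}, y^{t}\right)}}
\tag{3}
\end{equation}

% (4) Dual (envelopment) model, input-oriented, variable returns to scale (assumed)
\begin{equation}
\begin{aligned}
\min_{\eta,\,\lambda}\;& \eta \\
\text{s.t.}\;& \sum_{j=1}^{n} \lambda_j\, x_{ij} \le \eta\, x_{i j_0}, \quad i = 1,\dots,m, \\
& \sum_{j=1}^{n} \lambda_j\, y_{kj} \ge y_{k j_0}, \quad k = 1,\dots,s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0, \quad j = 1,\dots,n
\end{aligned}
\tag{4}
\end{equation}
```

The convexity constraint on the λ_j is what distinguishes the variable-returns-to-scale model from the constant-returns model; dropping it would give the constant-returns counterpart of (4).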
As the procedure is carried out, the result of all evaluations using (3) and (4) is a set of regional types by the character of their dynamics.

Generally, the result of a DEA application also includes the set of scores that show the ways of improving comparative efficiency for each inefficient region. Nevertheless, this issue is not treated in this procedure because its complexity requires special attention and investigation.

4 Evaluation by Using Procedure

We examined the issue of inequality of regional economy performance for the period from 2008 to 2010. Federal State Statistics data from www.gks.ru is used [15]. We developed a set of indicators that can be used in a broader context in order to identify factors influencing regional underdevelopment.

Stage 1. The set of indicators for the models consists of the resources and results of regional agrarian sector performance. The indicators are reported in Table 1.

Stage 2. The set of model variables is defined as follows: five resources are taken as inputs, while three results are taken as outputs. The size of the region's population is considered as the special variable for normalization of the set.

Table 1. The set of regional performance indicators

Type                         Indicator
Resources                    number of cattle, thousand heads (x1)
                             organizations acreage under crops, ha (x2)
                             average number of employees, thousand people (x3)
                             power capacity, thousand horsepower (x4)
                             equipment park (tractors), units (x5)
Results                      gross grain yield, thousand tons (y1)
                             production of milk, thousand tons (y2)
                             production of livestock and poultry, thousand tons (y3)
Variable for Normalization   volume of region population, thousand people

Stage 3. Next, the homogeneity of conditions was checked for all agrarian regional systems, and asymmetry in land quality was found. The input “organizations acreage under crops” is adjusted by the coefficient of cadastral value of agricultural land.
The rule on the ratio between the number of variables and the number of objects is observed; therefore, modeling is performed for 53 quite similar agrarian Russian regions.

An additional restriction on weights is used in order to improve the discrimination capability of the model. A direct restriction on the ratio of the number of employees to the power capacity is given by the ratios (5)–(7), where:

price_x3 – regions' average monthly salary;
price_x4 – regions' average energy price;
x3, x4 – inputs taken in the normalized forms according to the second stage.

Stage 4. According to (3) and (4), the calculation cycle is carried out, and the obtained scores are insensitive to changes in the model parameters. This was confirmed by reducing the set of analyzed objects. Besides, the values of the Malmquist Indices agree with current expert opinion on the character of the tendencies of technological progress in the industry for the analyzed period.

Stage 5. The quantitative scores, combined with a qualitative evaluation of the risks and conditions of regional development, make it possible to establish a taxonomy of regions using the obtained knowledge on the type of efficiency dynamics.

The procedure was carried out from the first to the fifth stage given in Fig. 1.

The most significant agrarian regions of Russia are located in the Southern territory, which consists of two state districts, namely the Southern Federal District and the Northern Caucasus Federal District. There are two strategies for these regions: the Southern Federal District Strategy and the Northern Caucasus Federal District Strategy [16, 17]. The obtained results can be part of the analytic reports for these policy development documents (tables 2-3). In the tables, development scores are presented for the period 2008-2010. The indicator is equal to «+»/«–» in the case of positive/negative dynamics.

Table 2. Efficiency Scores for the Southern Federal District Strategy

Region                   Development Score   Region Type
Republic of Adygeya      ++                  Stable Growth
Republic of Kalmykiya    ++                  Stable Growth
Krasnodar Region         –+                  Unstable Decline
Astrakhan Region         +–                  Unstable Growth
Volgograd Region         ++                  Stable Growth
Rostov Region            +–                  Unstable Growth

Table 3. Efficiency Scores for the Northern Caucasus Federal District Strategy

Region                                Development Score   Region Type
Daghestan                             ––                  Stable Decline
The Ingush Republic                   ++                  Stable Growth
Republic of Kabardino-Balkariya       ––                  Stable Decline
Republic of Karachaevo-Cherkesiya     ––                  Stable Decline
Republic of Northern Osetia Alaniya   –+                  Unstable Decline
The Chechen Republic                  ––                  Stable Decline
Stavropol Region                      ++                  Stable Growth

Although the analyzed period covers the crisis years, the agrarian production of the South of Russia shows a reserve of stability. Besides, it is found that the Southern regions belong to the Pareto-efficient set of Russian regions.

Thus, only 4 of the 13 regions of the South are estimated as being in stable decline. The economic development opportunities of these regions are significant; nevertheless, their considerable potential is not being used.

Further analysis of such indicators can be used to adjust scenario data tasks in the field of regional development foresight and strategic planning. The obtained results can also be suitable for the design of an equalization policy aimed at steadying the level of regional efficiency over a long-term period. In addition, it is important that the risks, conditions and possible consequences of the policy be assessed for each scenario of regional development.

5 Conclusions

The verification of the suggested procedure together with the DEA model demonstrates positive results that confirm the possibility of applying the procedure to prospective studies in the field of production analysis as well as strategic management.
As was shown, the obtained results can be part of the quantitative investigations for the current strategies of development policy. In addition, the development scores can be used together with the analysis of regional development risks and opportunities and with indicators of economic efficiency, such as gross domestic product per capita, enterprise profitability, etc. Thus, the obtained scores will add knowledge to current indicator systems of regional economic efficiency and improve the objectiveness of the approach and the effectiveness of state regional policy.

This article was prepared within the Program of the Presidium of the Russian Academy of Sciences № 32 “Fundamental Issues of Polyethnic Region Modernization in Terms of Tensions Growth”.

References

1. Koopmans, T. C.: An Analysis of Production as an Efficient Combination of Activities. Activity Analysis of Production and Allocation. Cowles Commission for Research in Economics. Monograph No. 13, New York: Wiley, pp. 15–32 (1951)
2. Farrell, M. J.: The Measurement of Productive Efficiency. J. of the Royal Statistical Society, Series A (General), Part III. 120, 253–281 (1957)
3. Charnes, A., Cooper, W. W., Rhodes, E.: Measuring the Efficiency of Decision Making Units. European J. of Operational Research, 2, 429–444 (1978)
4. Cook, W. D., Seiford, L. M.: Data Envelopment Analysis (DEA) – Thirty Years On. European J. of Operational Research, 192, 1–17 (2009)
5. Dyson, R. G., Allen, R., Camanho, A. S., Podinovski, V. V., Sarrico, C. S., Shale, E. A.: Pitfalls and Protocols in DEA. European J. of Operational Research, 132, 245–259 (2001)
6. Avkiran, N. K., Parker, B. R.: Pushing the DEA Research Envelope. Socio-Economic Planning Sciences, 44, 1–7 (2010)
7. Liu, J. S., Lu, L. Y. Y., Lu, W.-M., Lin, B. J. Y.: Data Envelopment Analysis 1978–2010: A Citation-Based Literature Survey. Omega (2012)
8. Caves, D. W., Christensen, L. R., Diewert, W. E.: The Economic Theory of Index Numbers and the Measurement of Inputs, Outputs and Productivity. Econometrica, 50(6), 1393–1414 (1982)
9. Färe, R., Grosskopf, S.: Intertemporal Production Frontiers: With Dynamic DEA. Kluwer Academic, Boston, MA (1996)
10. Tsuneyoshi, T., Hashimoto, A., Haneda, S.: Quantitative Evaluation of Nation Stability. J. of Policy Modeling, 34, 132–154 (2012)
11. Yamamura, E., Shin, I.: Effects of Income Inequality on Growth through Efficiency Improvement and Capital Accumulation. MPRA Paper No. 10220, http://mpra.ub.uni-muenchen.de/10220/ (2008)
12. Mesropyan, K., Goryushina, E.: Economic Heterogeneity and Political Instability: Experience and Prospects for Cross-Country Comparisons. The Region Economy: Problems, Findings, Prospects, Issue 13. OON RAN, ISERH SSC RAS, Volgograd, 58–66 (2012) (in Russian)
13. Emrouznejad, A., De Witte, K.: COOPER-Framework: A Unified Process for Non-Parametric Projects. European J. of Operational Research, 207(3), 1573–1586 (2010)
14. Golany, B., Roll, Y.: An Application Procedure for DEA. Omega, 17(3), 237–250 (1989)
15. Federal State Statistics Data, www.gks.ru (in Russian)
16. Southern Federal District Strategy, http://www.minregion.ru (in Russian)
17. Northern Caucasus Federal District Strategy, http://www.minregion.ru (in Russian)

Applying of Fuzzy Logic Modeling for the Assessment of ERP Projects Efficiency

Andriy Semenyuk

Lviv Academy of Commerce, Tugan-Baranovskoogo. 10, 79005 Lviv, Ukraine
andriy.semenyuk@gmail.com

Abstract. ERP software is one of the most costly and crucial projects for business investment. Nowadays enterprises can rarely afford to implement long-term projects; in most cases the duration of implementation varies from 3-4 months (automation of an individual store of a retail chain) to 1-1.5 years when it comes to big projects.
Only a successful combination of analytical tools and methodologies allows a project team to implement ERP solutions for commercial enterprises on time and according to the set business requirements. This paper proposes a practical assessment model that applies both the fuzzy analytic logic approach and the Expert Judgment method to evaluate whether an ERP software implementation project has succeeded.

Keywords. Modeling, Efficiency, Project, Information Technology, Implementation, Enterprise, ERP

Key terms. Model, Methodology, Process, Management, Approach

1 Introduction

Today it is especially important for Ukrainian enterprises to be able to analyze all aspects of the environment in which they operate and to plan all needed resources as accurately as possible. To gain competitive advantages in a changing economy and unstable markets, it is not enough for an enterprise to have the most modern production lines or well-educated personnel; it also needs advanced technologies and modern information management systems that allow it to react quickly and adapt to further changes. That is why more and more enterprises in Ukraine from different sectors of the economy choose to implement an Enterprise Resource Planning (ERP) system.

ERP plays an important role in integrating an organization's information and functions across all areas of enterprise activity and across departments, which ultimately results in successful operation on the market. However, ERP implementation as a project is costly and time-consuming, and a wrong approach or an inefficient implementation can lead to the loss of many valuable company resources. So it is critically important for enterprises to understand clearly the value achieved from an ERP initiative [2]. Many factors are essential in determining the efficiency of ERP projects.
Since most of these factors are qualitative and the relations between them are very complicated, determining their exact quantitative values is quite difficult. Combining the Expert Judgment method with fuzzy logic can simplify the calculations and lead to a more precise generalized efficiency value for the implemented ERP project. In this research, we gather optimal KPI values from different enterprise activities and departments based on Expert Judgment data and design a combined Fuzzy Model to assess the efficiency of an ERP project [4].

2 Problem Statement and Goals of the Paper

The most common aspects of ERP system development and of the implementation methods of ERP projects have been widely observed and investigated in the works of many foreign and Ukrainian researchers [1], [2], [4], [5]. However, the problems related specifically to the efficiency of such projects are not sufficiently covered, so the subject of this research remains topical.

The main aim of the paper is to present a model for evaluating the effectiveness of ERP projects implemented on the markets of underdeveloped economic systems; the model combines fuzzy logic and expert judgment methods. Because the initial data for measuring the effectiveness of an ERP project are mostly inaccurate and variable, using fuzzy logic techniques to enhance the data gathered from experts is feasible here.

3 Proposed Model and Approach

To develop the model for the assessment of ERP project efficiency, a set of Expert Judgment questionnaire sessions was first conducted on the proposed assessment methods and on the optimal values of key performance indicators (KPIs). A KPI is a type of performance measurement.
An organization may use KPIs to evaluate its success, or to evaluate the success of a particular activity in which it is engaged. Sometimes success is defined in terms of making progress toward strategic goals, but often success is simply the repeated, periodic achievement of some level of operational goal (e.g. zero defects, 10/10 customer satisfaction, etc.). Accordingly, choosing the right KPIs relies upon a good understanding of what is important to the organization [9].

The basic input data of our research are KPIs of ERP projects that were received from managers of ERP projects and classified according to the criterion of "trend change". The further scope of KPIs and the major ERP success factors were also determined from KPI values in the reports of various international consulting agencies, and the ERP project statistics obtained from such reports were additionally verified with the involved experts [6], [7], [8].

As a result, four main groups of indicators were identified: 1) X, the "increase group of KPIs" (indicators whose actual value increased for the enterprise after ERP implementation); 2) Y, the "reduce group of KPIs" (indicators whose value decreased for the enterprise after ERP implementation); 3) W, project financial and investment indicators; 4) Z, generally optimized qualitative KPIs for different process aspects within the enterprise.

To assess the effectiveness of an ERP project we developed a structural combined model (depicted in Fig. 1). The model consists of methodological approaches and a structure of key performance indicators for determining the effectiveness of an ERP project. The summarized decision tree inference, presented at the bottom of the model, reflects the hierarchy of input variables.

Fig. 1.
Combined Model of ERP Projects Efficiency Assessment

A questionnaire was developed to determine the fuzzy (if-then) rules, with the three optimal KPI values obtained from experts as inputs and the ERP project efficiency value as output. The validity and reliability of the questionnaire were confirmed, and it was distributed among ERP practitioners. Fuzzy rules, as the basis for determining the state of the company KPIs, were formed and entered into the Fuzzy system through MATLAB software.

A variety of software tools and systems for fuzzy logic calculations are currently available on the market (CubiCalc 2.0 RTC, CubiQuick, FIDE, Flex Tool, FuziCalc, FuzzyTECH, JFS, MATLAB Fuzzy Logic Toolbox, RuleMaker, etc.). Each of these products has its own strengths and weaknesses. As the software platform for our research we chose the MATLAB Fuzzy Logic Toolbox (FLT), in particular because of the integrated nature of the MATLAB environment, which provides functions, applications, and a simulation block for analyzing, designing, and simulating systems based on fuzzy logic, and which is widely used not only in academic and research institutions but by industrial enterprises as well.

To determine the value of each factor, a separate questionnaire was prepared to collect the related information on KPI values. The validity and reliability of this questionnaire were also confirmed, and it was distributed among managers and experts in the domain.

The means calculated from the questionnaire results were entered into the Fuzzy System (see Fig. 2). The final results were analyzed, and the efficiency of the implemented ERP project was determined by this software.

Fig. 2.
Identifying the membership functions for each input and output variable

The Fuzzy system is designed in MATLAB as follows: X, Y, Z, W are linguistic variables that represent the major organizational KPI groups, and E is the linguistic variable of the summarized project efficiency (see Fig. 3 and Fig. 4).

Fig. 3. The Fuzzy system of assessing the efficiency of the ERP implementation in MATLAB software

Fig. 4. Drawing a conclusion in Fuzzy system

4 Conclusions

The proposed approach and model for determining the efficiency level of an implemented ERP project with fuzzy logic can be used by ERP practitioners, project managers, and top management of enterprises that implement an ERP system, for advanced analysis of the actual project results. The fuzzy rules should be further validated and refined in consultation with ERP practitioners, and the results entered into the knowledge base of the Fuzzy system. Fuzzy logic computing combined with expert judgment aims to speed up decision-making on the project efficiency level and, at the same time, to provide a more accurate assessment.

References

1. Finney, S., Corbett, M.: ERP Implementation: a Compilation and Analysis of Critical Success Factors. Business Process Management Journal, 13(3), 329–347 (2007)
2. Allen, D., Ken, T., Havenhand, M.: ERP. Critical Success Factors: an Exploration of the Contextual Factors in Public Sector Institution. In: Proc. 35th Hawaii International Conference on System Sciences, pp. 244–247 (2002)
3. O'Leary, D.: ERP Systems: Modern Planning and Enterprise Resource Management. Select, Implement, Utilize. Vershina, Moscow (2004)
4. Savavko, M.: IS Fuzzy Expert. Publishing House of I. Franko Lviv National University, Lviv (2007)
5. Nozdrina, L.: Applying of Fuzzy Logic Modeling for the Assessment of ERP Projects Efficiency. In: Proc. 5th Int. Sci. Conf. Project Management: Status and Opportunities, pp.
1–2, NUS, Nikolaev (2009)
6. Gartner: Information Technology Research and Advisory Agency. http://www.gartner.com/technology/home.jsp
7. Panorama Consulting. Consulting Firm with Expertise in ERP Software, http://panorama-consulting.com
8. IDC. Global Provider of Market Intelligence, Advisory Services, and Events for the Information Technology, http://www.idc.com/home.jsp?t=1365517508962#.UWQk-5NTCjg
9. Austin, R. D.: Measuring and Managing Performance in Organizations. Dorset House Publishing (1996)

Appendix A. Example of ERP User Satisfaction Survey

Appendix B. Example of ERP Expert Judgment Questionnaire

Mathematical Model of Banking Firm as Tool for Analysis, Management and Learning

Victor Selyutin1,2 and Margarita Rudenko1

1 Research Institute of Mechanics and Applied Mathematics, vvs1812@gmail.com, ritusik@mail.ru
2 Economic Faculty of Southern Federal University, 220/1, av. Stachki, 344090, Rostov-on-Don, Russia

Abstract. An essential concern for banking firms is the problem of asset and liability management (ALM). In recent years many modelling tools have been proposed for solving this problem. We offer a novel approach to ALM based on transport equations for loan and deposit dynamics. Given the bank's initial state and various deposit inflow scenarios, the model allows simulations, including stress testing, and can be used for assessing liquidity risk, for examining loan issue decisions in order to choose a reasonable solution, and for learning purposes.

Keywords. Asset-liability management, Differential equations, Liquidity risk, Duration

Key terms. Banking, Mathematical Modelling, Decision making

1 Introduction

From the management point of view, a banking firm is a rather complex system.
This is caused by a considerable number of financial flows and funds that have various origins and differ in their dynamic and probabilistic characteristics, and at the same time form a unified system. Stable functioning of the system is provided by hierarchy, by external (prudential supervision) and internal regulators and restrictions, and by feedbacks.

Among the mathematical models of banking firms two basic groups can be distinguished: models of optimization of the asset portfolio (static, single-period and multi-period), which mainly use linear and dynamic programming [1-2], and models of asset and liability management (ALM), which use the methodology and technique of stochastic differential equations [3-5].

One of the problems solved by ALM models is the management of various risks (especially credit risk and interest-rate risk), including the problem of decreasing the default probability.

With the development of computer engineering, computer models of banks focused on planning problems and decision support systems began to appear from the mid-1970s. However, such projects had no further development [6-8].

We now turn to one possible approach to modelling a bank as a dynamic system, which can be called hybrid. The main tasks the developed model must solve are the analysis and management of liquidity and the stress testing of a bank. In addition, it can be used for optimization of the asset profile.

The aggregation of balance sheet items can vary according to the objectives of modelling and the principles of constructing the state variable vector. We will use the following simplified scheme (Table 1).

We ignore the fixed assets of the bank and take into account only financial flows. Obviously, the balance sheet equation holds:

<math>A = S + B + Q + X = Y + C + M = L, \qquad (1)</math>

where the equity (capital) of the bank C is a balancing variable.
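As a small illustration of the balance sheet equation (1), the sketch below computes equity C as the balancing variable from the other aggregated items. The class name, field names and the numeric values are hypothetical and chosen only for this example; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BalanceSheet:
    """Aggregated balance sheet items, named after the symbols in equation (1)."""
    reserves: float   # S
    bonds: float      # B
    shares: float     # Q
    loans: float      # X
    debt: float       # Y (deposits)
    interbank: float  # M (inter-bank credits received)

    def assets(self) -> float:
        # A = S + B + Q + X
        return self.reserves + self.bonds + self.shares + self.loans

    def equity(self) -> float:
        # C is the balancing variable: A = Y + C + M  =>  C = A - Y - M
        return self.assets() - self.debt - self.interbank


# Hypothetical figures, for illustration only.
sheet = BalanceSheet(reserves=10.0, bonds=15.0, shares=5.0,
                     loans=70.0, debt=80.0, interbank=8.0)
total_assets = sheet.assets()
equity = sheet.equity()
```

Treating C as a derived quantity rather than a stored field mirrors its role in the text as the item that closes the balance.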
For detailed modelling of credit risks, the loans issued can be divided into categories of debtors with different reliability. The division of deposits into time and demand deposits is necessary for calculating instant liquidity. For simplicity, it is ignored in the version of the model considered below.

{| class="wikitable"
|+ Table 1. The aggregated balance sheet of a commercial bank.
! Assets (A) !! Liabilities (L)
|-
| Loans issued (X): business; private customers (buyer's credits, mortgage etc.); other banks || Debt (Y): time deposits; on-demand deposits and current accounts
|-
| Securities: shares (Q); bonds (B) || Inter-bank credits (M)
|-
| Reserves (S): cash; rest fund, loan loss reserves etc. || Equity, including retained profit of last periods (C)
|}

Formally, three groups of operations can be distinguished in the balance sheet table:

* Reallocation of assets between separate items
* Reallocation of liabilities between separate items
* Identical change of assets and liabilities in one period

Although the bank simultaneously opens a position in liabilities when it grants a loan (opens a credit line), from the formal point of view this operation reduces to a reallocation of asset items.

Similarly, if a deposit remains unclaimed at the maturity date, it is either prolonged or transferred to demand deposits (with no interest accruing or with the minimum percentage) according to the contract conditions. In this case there is, in fact, a reallocation of liability items.

Finally, when interest on loans (or other types of income or expenses) is received (or paid), both assets and liabilities change at the same time, and the bank's own capital increases or decreases.

2 Model with Certain Terms of Loans and Attracted Funds

The main difficulty in modelling the dynamics of assets and liabilities is the need to take into account the terms of loans and deposits.
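Because of this, the state variables must depend on two parameters: the current time <math>t</math> and the current "age" <math>\tau</math>, or the remaining term to maturity <math>T-\tau</math>. That is why the dynamics of the issued loans can be described by the following transport equation:

<math>\frac{\partial x}{\partial t} + \frac{\partial x}{\partial \tau} = u(t,\tau) \qquad (2)</math>

In addition,

<math>X(t) = \int_0^T x(t,\tau)\,d\tau</math> is the total amount of loans issued,

<math>X^*(t) = \int_0^T x(t,\tau)\,e^{-\delta (T-\tau)}\,d\tau</math> is the present value of loans, and <math>T</math> is the term of loans.

The movement of time deposits is described similarly:

<math>\frac{\partial y}{\partial t} + \frac{\partial y}{\partial \tau} = v(t,\tau) \qquad (3)</math>

<math>Y(t) = \int_0^T y(t,\tau)\,d\tau</math> is the total amount of time deposits,

<math>Y^*(t) = \int_0^T y(t,\tau)\,e^{-\delta (T-\tau)}\,d\tau</math> is the present value of time deposits, and <math>T</math> is the term of deposits.

The variables <math>u(t,\tau)</math> and <math>v(t,\tau)</math> denote the flows of issued loans (temporary outflow of the bank's financial resources) and deposits (temporary inflow), distributed over time and taking into account amortization (interest payments or installment credits). Accordingly, the total inputs of loans <math>U(t)</math> and deposits <math>V(t)</math> (or their present values <math>U^*(t)</math> and <math>V^*(t)</math>) are described as:

<math>U(t) = \int_0^T u(t,\tau)\,d\tau, \qquad U^*(t) = \int_0^T u(t,\tau)\,e^{-\delta (T-\tau)}\,d\tau</math>

<math>V(t) = \int_0^T v(t,\tau)\,d\tau, \qquad V^*(t) = \int_0^T v(t,\tau)\,e^{-\delta (T-\tau)}\,d\tau</math>

The solution of equations (2)-(3) can be represented in closed form:

<math>x(t,\tau) = \int_0^t u(\theta,\ \tau - t + \theta)\,d\theta + \varphi(\tau - t)</math>

<math>y(t,\tau) = \int_0^t v(\theta,\ \tau - t + \theta)\,d\theta + \psi(\tau - t)</math>

where <math>\varphi(\tau)</math> and <math>\psi(\tau)</math> are the initial distributions of loans and deposits by "age", or it may be obtained by using the corresponding finite-difference equations.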
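The finite-difference route mentioned above can be sketched as follows. This is a minimal illustration, not the authors' program: it reads the transport equation (2) as ∂x/∂t + ∂x/∂τ = u(t, τ), assumes that no loans of age zero are on the books (x(t, 0) = 0, with all inflow entering through u), and uses a time step equal to the age step, so that each step shifts the age profile by one cell along the characteristic. The function names and the test scenario are hypothetical.

```python
import numpy as np


def step_transport(x, u_now, dt):
    """One upwind step of dx/dt + dx/dtau = u on an age grid with spacing dt."""
    x_new = np.empty_like(x)
    x_new[0] = 0.0                        # assumed boundary condition x(t, 0) = 0
    x_new[1:] = x[:-1] + dt * u_now[1:]   # shift by one age cell, add new inflow
    return x_new


def simulate(u, phi, T, t_end, n_age):
    """Integrate from the initial age profile phi(tau) up to time t_end."""
    dt = T / n_age
    tau = np.linspace(0.0, T, n_age + 1)
    x = np.asarray(phi(tau), dtype=float)
    for n in range(int(round(t_end / dt))):
        x = step_transport(x, u(n * dt, tau), dt)
    return tau, x


# Hypothetical scenario: empty initial portfolio, constant inflow density c.
c = 2.0
tau, x = simulate(u=lambda t, tau: np.full_like(tau, c),
                  phi=lambda tau: np.zeros_like(tau),
                  T=1.0, t_end=0.5, n_age=100)
# For this scenario the exact solution is x(t, tau) = c * min(t, tau).
expected = c * np.minimum(0.5, tau)
```

Choosing the time step equal to the age step makes the shift exact along characteristics; a different ratio would need a proper upwind discretization and a stability check.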
The dynamics of reserves (S) and equity (C) are described by equations that include stochastic terms accounting for the random nature of changes in the value of shares and of possible loan losses:

<math>dS = \left[-U(t) + V(t) + \rho_X X - \rho_Y Y + \rho_B B - \rho_M M - Z(t)\right]dt + \mu Q\,dt + \sigma Q\,dW_t - x_T\,dJ_t</math>

<math>dC = \left[\rho_X X - \rho_Y Y + \rho_B B - \rho_M M - Z(t)\right]dt + \mu Q\,dt + \sigma Q\,dW_t - x_T\,dJ_t</math>

where <math>dW_t</math> is the increment of a Wiener stochastic process; <math>dJ_t</math> is the increment of a compound Poisson process with exponentially distributed jump sizes (loan losses); <math>Z(t)</math> is operating expenses and dividend payments; <math>x_T(t)</math> is the repayment of loans at the maturity date; <math>\rho_X, \rho_Y, \rho_B, \rho_M</math> are, respectively, the interest on loans, the interest on deposits, the bond income and the cost of credits; <math>\mu</math> is the average portfolio return of trading securities; and <math>\sigma</math> is the volatility of the securities portfolio.

Investments in liquid assets, shares <math>Q(t)</math> and bonds <math>B(t)</math>, can be considered as control parameters and calculated from the asset structure chosen or planned by the bank, taking into account loan demand. Similarly, the volume of received loans <math>M(t)</math> can be selected depending on the bank's need for financial resources.

To model liquidity risk with changing interest rates, the equations of duration dynamics must be added to the equations of movement of assets and liabilities. If <math>r</math> is the annual interest rate, the Macaulay duration of an asset <math>x(t,\tau)</math> is defined by the expression

<math>D_x(t) = \frac{1}{X^*(t)} \int_0^T (T-\tau)\,x(t,\tau)\,e^{-\delta (T-\tau)}\,d\tau,</math>

where <math>\delta = \ln(1+r)</math>. The durations of the other financial flows <math>y(t,\tau)</math>, <math>u(t,\tau)</math>, <math>v(t,\tau)</math> are calculated similarly.

It can be shown that the dynamics of duration is described by any of the equations below, chosen according to the liquidity research task:

<math>\frac{dD_x}{dt} = \frac{U^*(t)\,D_u(t)}{X^*(t)} - 1 - \lambda(t)\,D_x + \delta D_x</math>

<math>\frac{dD_x}{dt} = \frac{U^*(t)}{X^*(t)}\left(D_u(t) - D_x\right) - 1 + \frac{x_T(t)}{X^*(t)}\,D_x</math>

<math>\frac{dD_x}{dt} = \left(\lambda(t) - \delta\right)\left[D_u(t) - D_x\right] - 1 + D_u(t)\,\frac{x_T(t)}{X^*(t)}</math>

where

<math>\lambda(t) = \frac{1}{X^*}\,\frac{dX^*}{dt}.</math>

Model (2)-(3) has been transformed into a system of difference equations and implemented as a computer program [9].
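The present value and Macaulay duration of a loan portfolio distributed by age can be approximated on a discrete age grid. The sketch below is illustrative only and rests on stated assumptions: cash flows of age τ are discounted over the remaining term T − τ with the continuous rate ln(1 + r), integrals are replaced by simple Riemann sums, and the function name and test profiles are hypothetical.

```python
import numpy as np


def present_value_and_duration(x, tau, T, r):
    """Approximate X*(t) and D_x(t) for an age profile x sampled on a uniform grid tau."""
    delta = np.log(1.0 + r)                    # continuous rate equivalent to annual rate r
    remaining = T - tau                        # remaining term to maturity
    weights = x * np.exp(-delta * remaining)   # discounted amounts
    d_tau = tau[1] - tau[0]
    pv = np.sum(weights) * d_tau
    duration = np.sum(remaining * weights) * d_tau / pv
    return pv, duration


tau = np.linspace(0.0, 2.0, 201)

# Hypothetical check 1: all loans share one age (0.5), so duration is T - 0.5.
x_point = np.zeros_like(tau)
x_point[50] = 1.0
_, d_point = present_value_and_duration(x_point, tau, T=2.0, r=0.1)

# Hypothetical check 2: uniform age profile and zero rate, so duration is T / 2.
x_flat = np.ones_like(tau)
_, d_flat = present_value_and_duration(x_flat, tau, T=2.0, r=0.0)
```

With a positive rate and a uniform profile the duration falls below T/2, because amounts closer to maturity carry more discounted weight.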
The user chooses one of two operating modes of the program: calculation with a predefined planning horizon, or calculation with possible correction of parameters, in which the user sets the physical speed of calculation.

The program is interactive: the user can change the values of some key parameters during the calculation without interrupting it. The key parameters are the fraction of cash invested in various kinds of assets, revenues (interest rates), the duration of demand deposits, credit demand, the inflow of deposits, and crediting scenarios (distribution of loans over time).

The screen displays the dynamics of cash inflows and outflows, a diagram of the change in durations of assets and liabilities, and the distributions of loans, deposits and input flow over time.

Stress testing is provided in the program. The user can choose the period of stress testing and stress scenarios such as a decrease in the inflow of deposits, a decrease in the duration of deposits (the deposit outflow scenario), and a decrease in the accessible volume of funds attracted on the interbank market.

3 Model with Fixed Terms of Lending and Borrowing

Model (2)-(3) presented above is rather difficult to implement numerically and does not allow some important facts to be considered, for example the dependence of interest rates on the term of lending or borrowing. Therefore we consider a simplified modification of the previous model under the assumption that the terms of loans and deposits are fixed.

Although the terms of loans (or deposits) can be arbitrary, the most typical terms can be fixed according to the classification used in bank reporting. Both loans and time deposits are structured by term as follows: up to 30 days, from 31 to 90 days, from 91 to 180 days, from 181 days to 1 year, from 1 year to 3 years, and over 3 years.
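The term classification above can be expressed as a small lookup. This is a sketch only: the function name and labels are hypothetical, and treating 1 year as 365 days and 3 years as 1095 days is an assumption made for the example.

```python
from bisect import bisect_left

# Upper bounds (inclusive, in days) of all buckets except the last, open-ended one.
# Assumption: 1 year = 365 days, 3 years = 1095 days.
BUCKET_UPPER_DAYS = [30, 90, 180, 365, 1095]
BUCKET_LABELS = [
    "up to 30 days",
    "31 to 90 days",
    "91 to 180 days",
    "181 days to 1 year",
    "1 to 3 years",
    "over 3 years",
]


def term_bucket(term_days: int) -> str:
    """Return the reporting bucket for a loan or deposit term given in days."""
    if term_days <= 0:
        raise ValueError("term must be a positive number of days")
    # bisect_left returns the index of the first upper bound >= term_days.
    return BUCKET_LABELS[bisect_left(BUCKET_UPPER_DAYS, term_days)]


examples = {d: term_bucket(d) for d in (30, 31, 365, 366, 2000)}
```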
Accordingly, several typical terms <math>T_k</math> can be established, and for each of them the term transactions are described by the first-order partial differential equation

<math>\frac{\partial x}{\partial t} + \frac{\partial x}{\partial \tau} = -a(\tau, x) \qquad (4)</math>

with the boundary condition <math>x(t,0) = u(t)</math> and the initial condition <math>x(0,\tau) = \varphi(\tau)</math>. The initial and boundary conditions should be consistent, that is, <math>u(0) = \varphi(0)</math>. Here <math>t</math> is the current time, <math>0 \le t < \infty</math>; <math>\tau</math> is the time elapsed since the settlement of the transaction (the "age" of a loan or deposit), <math>0 \le \tau \le T</math>; and <math>a(\tau, x)</math> is the "amortization" of a loan or deposit (inflation, installment credit etc.).

As in (2)-(3), the variable <math>x(t,\tau)</math> is a distributed variable characterizing some credit instruments accounted in assets or in liabilities (loans for a limited period, time deposits, interbank lending or borrowing, coupon bonds, or other assets and liabilities with a fixed term of repayment). Further it is assumed that

<math>a(\tau, x) = \alpha x, \qquad (5)</math>

i.e. credits are repaid in proportion to their volume with a coefficient <math>\alpha</math> that does not depend on age. Other schemes can also be used (when repayment does not begin at once and/or occurs in equal installments established in advance). It is easy to verify that the solution of equation (4) has the form of a travelling wave:

<math>x(t,\tau) = u(t-\tau)\,\exp(-\alpha\tau) \qquad (6)</math>

To make the initial and boundary conditions consistent for <math>t < T</math>, <math>u(t)</math> must be predefined on the interval <math>t \in [-T, 0)</math>. From (4)-(6) it follows that

<math>x(0,\tau) = \varphi(\tau) = u(-\tau)\,\exp(-\alpha\tau) \qquad (7)</math>

and, after replacing <math>\tau</math> by <math>-t</math>,

<math>u(t) = \varphi(-t)\,\exp(-\alpha t) \quad \text{for } -T \le t < 0. \qquad (8)</math>

The total value of the considered loan (or deposit) is obtained by integration over age:

<math>X(t) = \int_0^T x(t,\tau)\,d\tau \qquad (9)</math>

Substituting (6) into (9), we have

<math>X(t) = \int_0^T u(t-\tau)\,\exp(-\alpha\tau)\,d\tau \qquad (10)</math>

Integrating (4), we obtain the ordinary differential equation

<math>\frac{dX}{dt} = u(t) - \alpha X - x(t,T) = u(t) - \alpha X - u(t-T)\,\exp(-\alpha T) \qquad (11)</math>

Since a portfolio of assets or liabilities contains instruments with different terms of repayment, the scalar variable <math>X(t)</math> in (11) can be replaced with a vector.
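The step from the travelling-wave form to the ordinary differential equation can be checked numerically. The sketch below reads (10) as X(t) = ∫₀ᵀ u(t − τ) exp(−ατ) dτ and (11) as dX/dt = u(t) − αX − u(t − T) exp(−αT); these readings, the function names, and the exponential test inflow u(t) = exp(0.1 t) are assumptions made for illustration.

```python
import math


def loan_stock(u, alpha, T, t, n=5000):
    """X(t) from (10): midpoint-rule integral of u(t - tau) * exp(-alpha * tau) over [0, T]."""
    h = T / n
    return h * sum(u(t - (i + 0.5) * h) * math.exp(-alpha * (i + 0.5) * h)
                   for i in range(n))


def rhs_eq11(u, alpha, T, t, X):
    """Right-hand side of (11): new lending, minus amortization, minus loans reaching maturity."""
    return u(t) - alpha * X - u(t - T) * math.exp(-alpha * T)


# Hypothetical inflow, defined for negative t as well, as the text requires on [-T, 0).
g, alpha, T, t = 0.1, 0.05, 3.0, 1.0
u = lambda s: math.exp(g * s)

X = loan_stock(u, alpha, T, t)
# For this inflow the integral (10) has a closed form.
X_exact = math.exp(g * t) * (1.0 - math.exp(-(g + alpha) * T)) / (g + alpha)

# Central finite difference of X(t) versus the right-hand side of (11).
h = 1e-4
dX_dt = (loan_stock(u, alpha, T, t + h) - loan_stock(u, alpha, T, t - h)) / (2.0 * h)
residual = dX_dt - rhs_eq11(u, alpha, T, t, X)
```

A residual close to zero indicates that the quadrature of (10) and the balance in (11) agree for this inflow; it does not validate the model itself.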
The components of the vector are financial instruments with different terms of repayment:

<math>\frac{dX_k}{dt} = u_k(t) - \alpha_k X_k - x_k(t,T_k) = u_k(t) - \alpha_k X_k - u_k(t-T_k)\,\exp(-\alpha_k T_k) \qquad (12)</math>

For simplicity we further suppose <math>T_k = k</math>, where <math>k</math> is the term expressed in months.

Term instruments (issued loans, bonds, interbank credits, time deposits) are similar from the mathematical point of view, so we consider them within one construction and use the general notation <math>X_k</math> for term instruments in assets and <math>Y_k</math> for those in liabilities. Then the previous model can be presented as:

<math>\frac{dX_k}{dt} = u_k(t) - \alpha_k X_k - x_k(t,k) = u_k(t) - \alpha_k X_k - u_k(t-k)\,\exp(-\alpha_k k) \qquad (13)</math>

<math>dS_t = \mu S_t\,dt + \sigma S_t\,dW_t + f(t)\,dt \qquad (14)</math>

<math>\frac{dQ}{dt} = \sum_k \frac{dY_k}{dt} - \sum_k \frac{dX_k}{dt} + \frac{dZ}{dt} + \sum_k \beta_k X_k - \sum_k \gamma_k Y_k - g(t) - f(t) \qquad (15)</math>

<math>\frac{dY_k}{dt} = v_k(t) - \alpha_k Y_k - y_k(t,k) = v_k(t) - \alpha_k Y_k - v_k(t-k)\,\exp(-\alpha_k k) \qquad (16)</math>

<math>\frac{dZ}{dt} = w(t) - \frac{Z}{D_z} \qquad (17)</math>

where <math>w(t)</math> is the inflow of on-demand deposits; <math>v_k(t)</math> is the inflow of time deposits and borrowed funds; <math>f(t)</math> is the purchase (+) or sale (-) of trading securities; <math>g(t)</math> is the operating costs of the bank's activities; <math>\mu</math> is the securities portfolio return; <math>\sigma</math> is the volatility of the securities portfolio; <math>W_t</math> is a Wiener stochastic process; <math>\gamma_k</math> is the interest on time deposits and borrowed funds; <math>\beta_k</math> is the interest on issued loans; and <math>D_z</math> is the duration (characteristic turnover time) of on-demand deposits.

The equation of equity dynamics is easily obtained by differentiating the balance equality and substituting (13)-(17):

<math>\frac{dC}{dt} = \sum_k \beta_k X_k - \sum_k \gamma_k Y_k + \frac{dS_t}{dt} - f(t) - g(t) \qquad (18)</math>

For simplicity, this version of the model assumes complete withdrawal of deposits at the end of their term. However, it is easy to take into account the possibility of prolonging a deposit or transferring it to the category of on-demand deposits. It is assumed that dividends are not paid. Besides, credit risks (default risk or delay of payments) are not considered, although they can also be taken into account by introducing corresponding adjustments.
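The behaviour of the on-demand deposit equation (17) can be illustrated with an explicit Euler integration. The sketch reads (17) as dZ/dt = w(t) − Z/D_z; the function name, step size and constant inflow scenario are hypothetical and used only for this example.

```python
def simulate_demand_deposits(w, D_z, Z0, t_end, dt=0.01):
    """Explicit Euler integration of dZ/dt = w(t) - Z / D_z, starting from Z(0) = Z0."""
    Z = Z0
    for n in range(int(round(t_end / dt))):
        Z += dt * (w(n * dt) - Z / D_z)
    return Z


# Hypothetical scenario: constant inflow w = 4 and turnover time D_z = 0.5.
# With constant inflow the stock settles at the equilibrium level w * D_z.
Z_final = simulate_demand_deposits(w=lambda t: 4.0, D_z=0.5, Z0=0.0, t_end=10.0)
equilibrium = 4.0 * 0.5
```

The equilibrium w · D_z shows why D_z is called the characteristic turnover time: the shorter it is, the smaller the stock of on-demand deposits that a given inflow can sustain. Explicit Euler is stable here only while dt is well below D_z.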
It is assumed that interest on the attracted funds and the received credits is paid as it accrues. However, it is easy to set another scheme in which interest is accumulated on deposit accounts and paid at the end of the deposit term.

Let <math>\xi_k = X_k / X</math> and <math>\eta_k = Y_k / Y</math> be the structure of time loans and deposits. Besides, for simplicity we assume that there are no investments in trading securities. Then the dynamics of the capital is described by the equation

<math>\frac{dC}{dt} = X \sum_k \beta_k \xi_k - Y \sum_k \gamma_k \eta_k - g(t), \qquad (19)</math>

which gives a clear picture of the sensitivity of capital dynamics to changes in the main parameters of assets and liabilities. The main objective of shareholders and bank management is the increase in capital:

<math>\frac{dC}{dt} \to \max \qquad (20)</math>

subject to restrictions on financial resources and risks (credit and market risk, loss of liquidity, bankruptcy).

4 Conclusions

The approach to mathematical modelling of cash flows in the asset and liability accounts of a commercial bank based on partial differential equations is novel and has no analogues in the literature. At the same time, the approach is quite logical, as it reflects the process of change of assets simultaneously in time and in "age".

Depending on the particular theoretical or practical problem, the approach can be realized in various modifications, two of which are presented in this article.

As preliminary testing has shown, the computer program based on model (2)-(3) allows various simulations, including stress testing, and can be used for educational purposes to provide a better understanding of the dynamic processes taking place in a banking firm.

Further development of the proposed modelling approach is necessary, including improvement of the program tool and, as required, more detailed elaboration of the model, in order to use these models as part of a decision support system for asset and liability management in a commercial bank.
The modified model (13)-(18) has been proposed for these goals.

References

1. Chi, G., Dong, H., Sun, X.: Decision Making Model of Bank’s Assets Portfolio Based on Multi-period Dynamic Optimization. Systems Engineering – Theory & Practice, 27(2), 1–16 (2007)
2. Kruger, M.: A Goal Programming Approach to Strategic Bank Balance Sheet Management. Banking, Financial Services, and Insurance. In: Proc. SAS Global Forum, Paper 024–2011 (2011)
3. Kosmidou, K., Zopounidis, C.: Asset Liability Management Techniques. Handbook of Financial Engineering, pp. 281–300, Springer Science+Business Media, LLC (2008)
4. Mukuddem-Petersen, J., Petersen, M. A.: Bank Management via Stochastic Optimal Control. Automatica 42, 1395–1406 (2006)
5. Mulvey, J. M., Shetty, B.: Financial Planning via Multi-stage Stochastic Optimization. Computers & Operations Research 31, 1–20 (2004)
6. Solyankin, A. A.: Computerization of the Financial Analysis and Forecasting in Bank. FinStatInform, Moscow (1998) (in Russian)
7. Robinson, R. S.: BANKMOD: an Interactive Simulation Aid for Bank Financial Planning. J. Bank Res. 4(3), 212–224 (1973)
8. Moynihan, G. P., Purushothaman, P., McLeod, R. W., Nichols, W. G.: DSSALM: a Decision Support System for Asset and Liability Management. Decision Support Syst. 33(1), 23–38 (2002)
9. Alekseev, I. V., Selyutin, V. V.: Interactive Computer Model of Bank's Asset and Liability Dynamics. Terra Economicus 9(4), Part 2, 42–47 (2011) (in Russian)

2.2 1st International Workshop on Methods and Resources of Distance Learning (MRDL 2013)

Foreword

The 1st International Workshop on Methods and Resources for Distance Learning (MRDL) took place on June 19-22, 2013, in Kherson, Ukraine, in conjunction with the 9th International Conference on ICT in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer (ICTERI 2013).
Distance learning is an important application field that intensively uses information and communication technologies in education. The MRDL workshop brings together reports dealing with the problems of resource maintenance and with pedagogic and didactic methods of using distance learning technologies.

The scope of the MRDL workshop includes the following topics:

* Virtual laboratories: Mathematical and informational models and educational tools for virtual labs, covering in particular the distance learning courses in physics, chemistry, biology and other disciplines.
* Design and development of electronic learning tools and resources: Design, development and use of electronic learning resources. Compatibility and integration of electronic resources in distance learning systems.
* Computer-aided learning systems: Design, development and use of computer-aided learning systems.
* Pedagogical innovations in distance learning: Experience in developing and implementing new pedagogical methods of distance learning using ICT. Open distance courses and tutor training in the author's course content.
* Monitoring of learning quality: Methods and tools for the estimation of quality of knowledge in distance learning systems, testing, rating systems, feedback.
* Quality of electronic learning resources: Standards for electronic resources for distance learning. Modeling of and experience in using quality management systems for electronic learning resources.
* Information of teaching and educational institutions: Modeling and experience in the use of management systems in information processing and other management processes at educational institutions. Experience in the use of distance learning technologies in a traditional educational process.

A blind peer-review process by at least two reviewers with expertise in the area has been carried out. As a result, 22 submissions have been accepted as reports, 3 of which are presented in this edition.
We would like to thank the authors for their submissions and our Program Committee members for their reviews.

June 2013
Vladimir Kukharenko, Yulia Zaporozhchenko, Hennadiy Kravtsov

What Should be E-Learning Course for Smart Education

Natalia V. Morze1 and Olena G. Glazunova2

1 Borys Grinchenko Kyiv State University, n.morze@kmpu.edu.ua
2 National University of Life and Environmental Sciences of Ukraine, e_glazunova@yahoo.com

Abstract. The article deals with the problems of creating and using an e-learning course for smart education. It substantiates the structural features of the smart course, the ratio of form and content of its elements, and its properties: individual learning paths, content personification, the use of training elements with links to public information resources, interactive training elements, multimedia, and communication and cooperation elements.

Keywords. Smart education, e-learning course, informal learning, individual learning path, services of social networks, Content Learning Management Systems

Key terms. KnowledgeEvolution, KnowledgeManagementMethodology, Didactics, KnowledgeManagementProcess, ICTInfrastructure.

1 Introduction

As noted by sociologists, philosophers, IT specialists, educational specialists and others, the modern information society is gradually transforming into a Smart Society. This concept implies a new quality of society, in which a set of technological means, services and the Internet, used by trained people, leads to qualitative changes in the interaction of subjects that make it possible to obtain new effects: social, economic and other benefits for a better life [1].

During the formation of the Smart Society, the paradigm of education and educational technology naturally changes.
The task of training a specialist of a new format, successful and competent to work in the Smart Society, falls to new universities, Smart Universities, where the integration of technological innovations and the Internet can provide a new quality of the educational and scientific processes and of the results of training, scientific, innovation, educational, social and other activities.

The conceptual basis of the Smart University is a large number of different scientific sources, information and educational materials, and multimedia resources (audio, graphics, video) that can be easily and quickly designed, assembled into a certain set, and adjusted individually for each student according to his or her needs, the peculiarities of his or her educational activity, and the level of educational achievements.

2 Problem

It is obvious that as the Smart Society develops, the educational paradigm will also change. Smart Universities will perform new functions. Accordingly, the requirements for e-learning courses that meet students' needs for educational resources will change. Our task is to substantiate theoretically the properties of such an e-learning course, its structure and components, and to test the effectiveness of its use experimentally.

3 The Presentation of the Main Research and Explanation of Scientific Results

3.1 Characteristics of the Smart University

Five key characteristics of the Smart University can be distinguished: social orientation, mobility, accessibility, technological effectiveness and openness [2].

Social orientation consists in the personalization of education, the building of individual education cards (Smart-cards), the organization of efficient communication and collaboration in education, cooperation, the application of design and game techniques, communication via social network services, etc.

The second, equally important, feature of the Smart University is mobility.
Mobility should be understood not only in the narrow sense, as access to educational content through mobile devices and their use for research, payment transactions, feedback with the teacher or with representatives of the dean's office or departments, etc. [3]. Mobility also matters as the access of every student and teacher to educational services from any place and at any time.

Accessibility as a feature of the Smart University is characterized by a single point of entry to e-learning and scientific databases, the media library, information kiosks, online resources and the systems that control access to them, etc.

Technological effectiveness ensures the viability of the Smart University IT infrastructure by means of cloud technologies, innovative virtualization technologies and open interfaces, based on the principles of simplicity, modularity, scalability, etc.

Openness in the Smart University implies the availability of open repositories of educational materials for building e-learning courses and training students, and open access to scientific articles, research and its results [4].

3.2 Infrastructures of the Smart University

A modern university should have the appropriate infrastructure to support the requirements of Smart Education. In particular, the activity of the e-learning center, the multimedia center, scientific laboratories with the relevant open virtual environments and open resources, the library (including an electronic one with open access to its resources), multimedia classrooms and computer labs should be based on an advanced campus network with Internet access, including wireless access, a cloud infrastructure, technologies for mobile access to e-learning resources, and a system of distributed access to the resources.
The effective functioning of such a sophisticated infrastructure is impossible without a unified data processing center, from which materials are distributed to the structural units: institutions, faculties and departments, regional branches, dormitories, academic buildings, student centers, etc. (Fig. 1).

The effective operation of the Smart University makes it possible to accomplish the tasks not only of formal, but of nonformal and informal training as well. According to studies [5,6], nonformal and informal learning accounts for 70% of the total educational process and only 30% of learning is formal, in other words structured by years and semesters of training and by the learning plans and programs usually provided by the institution.

3.3 Features of the Smart Education

Smart Education sets a number of tasks for the teacher; the effectiveness of teaching and the students' motivation for nonformal and informal learning, which rests on the students' ability to study independently, depend on how well these tasks are performed. Nowadays it is almost impossible, especially in formal training, to interest the modern student, who has access to a large number of high-quality modern electronic materials easily found on the Internet, with conventional linear text (non-multimedia) materials that are merely presented in electronic format. We should create resources that integrate multimedia, text, feedback tools based on the teacher's specific individual recommendations, and external electronic resources, and that meet the individual needs and characteristics of the modern student, a regular user of Internet resources and social networks.
Therefore the integral components of the information and educational environment of the modern university should be: an institutional repository of knowledge with full-text electronic educational and scientific resources; an educational portal, which provides electronic support of all of a student's learning activities for each discipline in the form of e-learning courses with individual tasks and distinct and clear evaluation criteria that are implemented with the tools and methods of formative assessment; a video portal with multimedia resources for teaching and research activities; a wiki portal as an environment for teamwork and collaboration; online services based on Web 2.0 and Web 3.0 services and technologies, etc.

One of the main tendencies in the development of Smart Education is the openness of learning systems: making educational content openly available to students around the world, the development of open-source systems, and the development of knowledge sharing under the schemes "student-student", "teacher-teacher", "students-teacher" and "students-teachers" [7,8]. An important step in the development of the idea of massive open electronic courses was the adoption of the UNESCO declaration on global policy on open e-courses, which sets the tasks of developing standards for electronic courses, providing synergy in access to them, conducting educational seminars on the development and use of courses, fostering collaboration between scientists and teachers, and assuring the quality of education [9].

Fig. 1.
SMART University infrastructure

3.4 Properties of E-Learning Course for Smart-Education

Some scholars define an electronic course as a didactic computer environment that contains classified material from the relevant scientific and practical field of knowledge, combined by a single software shell, in which the following functional components are distinguished [10]:

* information and navigation: meaningful connections, annotation and course structure, information, a system of references, and a search system;
* informative: interrelated informative elements of the course, namely theory, practice, guidelines, additional materials, and information resources, including electronic and open ones;
* diagnostic: formative assessment tools in the form of clear evaluation criteria for all types of student activity, including self-assessment and mutual assessment; evaluation not only of students' academic achievements but also of the formation of 21st-century skills (solving problems, working in a team, communicating effectively, collaborating, etc.); and a testing system for current, intermediate and final control.

An electronic course for Smart education should provide flexible learning for students in an interactive learning environment that allows them to adapt quickly to the environment and to study in any place and at any time on the basis of free access to content all over the world.
In our opinion, the electronic course for Smart Education can be represented as a certain scenario or trajectory of educational events, that is, a way of working with electronic resources in the form of a knowledge map that leads to the achievement of the learning effect, and has the following properties:

* Flexibility: enabling rapid editing of resources and adjustments to the educational trajectory
* Availability of an individual learning scenario, in other words, the possibility to draw up an individual educational program for each student from the set of training elements
* Integration of training elements with other open information resources
* Focus on the learning needs of the student, the personification of content
* Interactivity of the learning elements of the course, the maximum use of multimedia technologies (videocasts, animation, video tutorials, screencasts, etc.)
* Feedback between the teacher and the student in the course
* Availability of training elements that ensure effective communication and cooperation of students among themselves and with the teacher, in particular based on the design technology [11]
* Availability of game educational elements
* Communication through modern services of social networks [12]

The creation of e-learning courses is usually carried out with the help of Content Learning Management Systems. To create an effective e-learning course for Smart education, one should use not only the available electronic resources of the information and educational environment of the University, but also open external information resources and Web services that will serve as sources of educational and informational materials for the electronic course and as means of communication and cooperation (Fig. 2). The information and educational environment of the university should be focused on solving the problem of the joint creation and use of academic knowledge for the needs of the students and teaching staff of the university.
On the one hand, the teacher adds his or her own academic resources to the information and educational environment, such as video clips and video tutorials posted on the educational video portal; on the other hand, he or she can use available public resources for creating an e-learning course. So, to create an electronic course it is sufficient for the teacher to update material that is available from other sources, present it in accordance with the above-mentioned properties and the criteria for evaluating its quality, add the necessary training elements of the course according to the adopted structure, develop an individual learning scenario for each student, and take into account the individual criteria for evaluating students' educational achievements and developed 21st-century skills.

3.5 Structure of E-Learning Course for Smart-Education

Analysis of papers devoted to the creation and use of e-learning courses [13, 14] led to the conclusion that the course structure should follow the modular principle of construction. When the content of an educational subject is structured into training modules, each module should consist of interconnected theoretical, empirical and practical components of the content, each of which carries out an independent function. Thus the educational discipline module is an information and didactic unit in which the approach to structuring the whole into parts is unified. It has a complex structure that includes the goal of its integral development, objectives, content and results, with the corresponding system of formative assessment.

Fig. 2.
Sources of electronic course formation for Smart education

Furthermore, the structure of the e-course for Smart education should provide:

* Tools to build an individual learning trajectory (prior surveys, questionnaires, tests, formative assessment tools, including checklists and tables of evaluation criteria, etc.)
* Multimedia presentations of a summarizing character, video resources, interactive electronic manuals, external Web resources with multimedia theoretical material
* Links to external public resources, including articles, conference proceedings, research materials, etc.
* Discussions on the forums, feedback with the teacher, webinars and other Web services
* Intermediate control elements during the lessons and formative evaluation instruments, final control in the form of control tasks and tests, an element of reflection

Each element of the training course must meet certain standards and be evaluated using criteria that are accepted at the level of the educational institution [15].

The approximate structure of the Smart course is shown in Figure 3. An example of a course topic, created in the CLMS Moodle environment, is presented on the training and information portal of NUBiP Ukraine (http://it.nubip.edu.ua/course/view.php?id=21).

Below we focus in more detail on the features of the e-learning course structure for Smart education.

Fig. 3. Structure of the electronic training course for Smart education

3.6 Formation of Individual Learning Trajectory

The modern student, who has formed basic IT competences, needs not only access to resources but, above all, a navigation knowledge map, a "guidebook" to the knowledge that can be found in the information space, as it is important to help the student find quality resources. This is a complex task for an untrained student.
Smart education using Smart courses of the new model is the most comfortable and modern teaching model for such cyber-students. Several approaches can be used to build the individual training trajectory of the student in the electronic course. One of them consists in a prior survey and testing of the students' competence in the educational material of the course and the preparation of an educational trajectory based on the results of that survey and the learning needs it identifies (Fig. 4). The survey relies on extensive use of formative assessment tools that provide for self-assessment and mutual assessment.

Fig. 4. Stages of the individual trajectory construction

During the experimental study of the introduction of the new model of e-learning course for students of the "Computer Sciences" specialty, a survey was conducted to assess their competencies in the subject on the scale: "have a good knowledge", "be partially familiar", "heard something", "not familiar". Then each student was offered a competence test on the training material that he/she "has a good knowledge of" or "is partially familiar" with. Based on the survey and testing results, an individual learning trajectory was built for each student or group of students; in other words, the sequence of learning elements of the course that the student should study was chosen. The Moodle platform that we used to create the course allows each training element to be made available to a particular group of students. Therefore each student or group of students receives an individual set of training elements of the course. The training course is adapted to the personal characteristics of each student, which makes it possible to implement a personally oriented approach and to develop an individual training program. At the same time the course itself does not change; what changes are the methods of presentation, the set of tasks to be performed, and the tools and methods of evaluation and control.
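The survey-and-test procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' implementation: the `TopicResult` record, the 0.7 pass mark and the names of the training elements are assumptions introduced for the example; only the four-point self-assessment scale is taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Self-assessment scale quoted in the survey description.
SCALE = (
    "have a good knowledge",
    "be partially familiar",
    "heard something",
    "not familiar",
)

# Hypothetical element names; a real course would use its own elements.
FULL_PATH = ["theory", "practical tasks", "intermediate control", "final control"]
REVIEW_PATH = ["summary presentation", "practical tasks", "final control"]
SHORT_PATH = ["final control"]


@dataclass(frozen=True)
class TopicResult:
    topic: str
    self_assessment: str          # one of SCALE
    test_score: Optional[float]   # share of correct answers in [0, 1], None if not tested


def plan_topic(result: TopicResult, pass_mark: float = 0.7) -> List[str]:
    """Choose the training elements for one topic (assumed rule, for illustration)."""
    if result.self_assessment not in SCALE:
        raise ValueError(f"unknown self-assessment: {result.self_assessment!r}")
    # Only students who claim some knowledge are tested, as in the text.
    claims_knowledge = result.self_assessment in SCALE[:2]
    confirmed = (
        claims_knowledge
        and result.test_score is not None
        and result.test_score >= pass_mark
    )
    if not confirmed:
        return list(FULL_PATH)
    if result.self_assessment == SCALE[0]:
        return list(SHORT_PATH)
    return list(REVIEW_PATH)


def build_trajectory(results: List[TopicResult], pass_mark: float = 0.7) -> Dict[str, List[str]]:
    """Map every topic of the course to the elements the student should study."""
    return {r.topic: plan_topic(r, pass_mark) for r in results}
```

In a Moodle course, per-topic lists of this kind could then decide which training elements are made available to which group of students.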
3.7 Presentation of Educational Material in the Theoretical Resources of Smart Course

A peculiarity of the new model of electronic course is also the diversity of the forms in which theoretical learning resources are presented. Besides the requirement that 60-70% of the theoretical material be delivered in multimedia interactive form, we also note the need to choose the method of delivery depending on the level at which the material is taught. The theoretical material in an electronic course can be delivered on the following four levels: phenomenological, analytical and synthetic, mathematical, axiomatic [16]. Each level has its peculiarities in the delivery of educational material (Fig. 5).

Fig. 5. Ratio of levels and methods of educational material presentation in the theoretical resources

The phenomenological level is characterized by a descriptive way of presenting educational material. Therefore, these materials should be delivered in the form of multimedia presentations and interactive electronic manuals with graphics, multimedia and video elements. The analytical and synthetic level is characterized by the need to present the theory of individual phenomena in natural logical language, which creates the background for forecasting phenomena and processes on a qualitative level. For this level, animation resources with elements of cognitive graphics should be prepared that can demonstrate the nature of the phenomenon and its dynamic changes. Video tutorials that explain and demonstrate the logic of the processes, as well as screencasts with sound, will also be effective. The mathematical level is characterized by the use of mathematical tools for modeling, theorem proving, examples of solving problems, etc. Therefore, conventional textbooks are not enough to deliver such material.
It is necessary to create resources in the form of video lectures and video lessons, and text resources should be reduced to the minimum amount, in the form of handbooks with basic rules, formulas, theorems, etc. Educational material of the axiomatic level can be presented in the form of video tutorials, e-manuals and multimedia presentations. It is also necessary to actively use links to external resources that cover material on the topic under consideration. Such resources add credibility to the course and allow students to familiarize themselves with additional sources of educational materials.

3.8 Presentation of Learning Tasks in the Smart Course

Another feature of the e-course for Smart education is the presence of elements for communication and cooperation between students as they perform tasks on mastering theoretical material, practical tasks, research projects, etc. Web 2.0 services, online services and social networks provide tools for organizing discussions, collaboration and counseling. These elements are embedded in the course directly through the platform that is used, or by reference to it. While performing tasks, students should use modern information and communication technologies effectively. Usually, for mastering the theoretical teaching material, students (Fig. 6) are offered tasks such as writing essays, compiling a related bibliography, writing summaries, paraphrasing a small amount of theoretical information in "question-answer" form, compiling a glossary of terms on a certain topic, performing descriptive works, writing instructions for the implementation of various operations, and plotting grid plans and schedules.

For such tasks to become interesting for students, it is necessary to use Internet resources and to present the result in electronic form using modern information technologies.
Tasks on mastering practical skills include solving problems, performing exercises, graphical works, practical works and calculation works, designing, modeling, compiling practical situations from the students' own experience and on the basis of practical training, and analyzing the activity of an enterprise. Students should be offered to solve such tasks using virtual laboratory workshops and specialized software.

Tasks on forming research activity include carrying out individual research tasks, writing term papers and graduation works, and participating in educational projects. Such tasks involve creative activity of the students, which should be carried out by means of modern computer technologies, teamwork and communication. One of their peculiarities is that they are assessed with formative assessment tools that clearly guide the student toward learning goals in all types of educational activity; these goals are stated specifically and clearly and should be achievable for each student.

Fig. 6. Types of tasks to master practical skills

Thus, the main features of the organization of students' practical work using an e-course in Smart education are the availability of tools for collaboration and communication and the combination of different information technologies in the performance of tasks. We should not forget, however, that tasks should have practical significance and contain detailed information on their implementation, evaluation criteria and support resources. We offer the following pattern for the formulation of a task in the e-course (Fig. 7).

3.9 Results of Experimental Research

In the course of the research we proposed a new model of electronic learning course for students of the shortened training period.
As a result, after questioning and testing, six groups of students were identified who studied along different educational trajectories, successfully completed the training course and demonstrated 13% better academic progress than the group of students who studied along a single training trajectory. At the same time, the participants of the experimental groups performed a larger volume of tasks for in-depth study and worked through extra theoretical study material by following the references to external information resources. In addition, the teachers participating in the experiment indicated that the presence of a distributed environment of open resources in the university allows e-learning courses to be created with much less effort and time than when a course is created from scratch. Teachers are able to use ready resources for creating elements of the course: presentations, video recordings, electronic versions of manuals and guidelines, a database of scientific publications, etc.

Fig. 7. Pattern for the formulation of the task in the e-course

4 Conclusions

Thus we note that an electronic course that has the properties required from the viewpoint of Smart education is an effective tool for nonformal and informal learning, in which the most motivated students are now interested in order to obtain high-quality knowledge and not only a diploma of higher education. For efficient organization of learning activity under the conditions of Smart education, a modern university should have a distributed information and educational environment that makes it possible to concentrate open electronic learning resources and to move knowledge into a distributed network, and should actively use Web 2.0 services, mobile technologies and a learning content management system for delivering knowledge to the students and for the interactive exchange of information, data and training materials with them.
In the future, this approach can be developed further through the joint development and use of an open educational content repository by universities, based on the technologies of Smart education.

References

1. Tikhomirov, N.V.: Global Strategy for the Development of Smart-Society. MESI is on a Smart-University, http://smartmesi.blogspot.com/2012/03/smart-smart.html (In Russian)
2. Measuring the Information Society 2012, Committed to connecting the world, http://www.itu.int/dms_pub/itu-d/opb/ind/D-IND-ICTOI-2012-SUM-PDF-R.pdf (In Russian)
3. Traxler, J.: The Learner Experience of Mobiles, Mobility and Connectedness. Evaluation of Learners’ Experiences of e-Learning Special Interest Group, http://www.helenwhitehead.com/elesig/ELESIG%20Mobilities%20ReviewPDF.pdf (2010)
4. McAuley, A., Stewart, B., Siemens, G., Cormier, D.: The MOOC Model for Digital Practice, http://www.elearnspace.org/Articles/MOOC_Final.pdf
5. Kuharenko V. M.: Formal, Informal, Informalne and Social Studies. In: Modern Educational Technology in Education, pp. 114–124 (2012) (In Russian)
6. Mapping Informal and Formal Learning Strategies to Real Work, http://performancexdesign.wordpress.com/2011/05/04/mapping-informal-and-formal-learning-strategies-to-real-work
7. Helmer, J.: A Pair of Key Trends for this Year Learning: MOOCs and OA, http://www.smart-edu.com/moocs-and-oa.html
8. Open Educational Resources, http://www.unesco.org/new/en/communication-and-information/access-to-knowledge/open-educational-resources/
9. Pawlowski, J.M., Hoel, T.: Towards a Global Policy for Open Educational Resources: The Paris OER Declaration and its Implications, White Paper, Version 0.2, Jyväskylä, Finland (2012)
10.
Bezdolny A.V.: Model of E-Learning Course as a Means of Organizing the Self-Training, http://cyberleninka.ru/article/n/model-elektronnogo-uchebnogo-kursa-kak-sredstva-organizatsii-samostoyatelnoy-podgotovki (In Russian)
11. Gnedkova O., Lyakutin V.: Methodological Recommendations of Internet-Services Usage in Distance Learning System “Kherson Virtual University”. Information Technologies in Education, 10, 183–187 (2011) (In Russian)
12. Ravenscroft, A.: Dialogue and Connectivism: A New Approach to Understanding and Promoting Dialogue-Rich Networked Learning. International Review of Research in Open and Distance Learning, 12(3), http://www.irrodl.org/index.php/irrodl/article/view/934
13. Osin A.V.: E-Learning Resources in a New Generation of Questions and Answers, http://www.ed.gov.ru/news/konkurs/5692#g10 (In Russian)
14. Mosher, B.: Five Myths About Informal Learning, http://www.smart-edu.com/stati-korporativnoe-obuchenie/pyat-mifov-o-neformalnom-obuchenii.html
15. Morze N. V., Glazunova E. G.: Quality Criteria for E-Learning Courses. Information Technologies in Education, 4, 63–76 (2009)
16. Deryabina G. I., Losev V. Yu.: Creating E-Learning Courses: Studies. Samara. Univers – groups (2006)

TIO – a Software Toolset for Mobile Learning in MINT Disciplines

Daniel Sitzmann1, Dietmar P.F. Möller1, Karsten Becker2 and Harald Richter3

1 University of Hamburg, MIN Faculty, FB Informatik, AB TIS, Building F, Vogt-Kölln-Str. 30, 22527 Hamburg, Germany, {sitzmann, dmoeller}@informatik.uni-hamburg.de
2 Hamburg-Harburg University of Technology, Institute of Computer Technology, Schwarzenbergstr. 95E, 21071 Hamburg, Germany, k.becker@tuhh.de
3 Clausthal University of Technology, Arnold-Sommerfeld-Str. 1, 38678 Clausthal, Germany, richter@tu-clausthal.de

Abstract. A web-based tool-set called TIO was created for mobile learning with emphasis on study courses in mathematics, informatics, natural sciences and technology (MINT).
Mobile learning is a variant of E-learning which is based on mobile user devices with Internet access. TIO was tested for numerous software and hardware configurations that users may have and proved to work technically. TIO consists of a modified version of the open-source E-learning system ILIAS and a tool set. The experience with TIO was that mobile learning is useful for MINT subjects provided that numerous end user devices are supported, and several text systems as well. We found that mobile learning is especially useful for professionals because they can learn in their free time in a flexible way. Finally, we found that an emotional component should be present to make mobile learning more lasting.

Keywords. Mobile learning tool, MINT, single-source publication, social media

Key terms. KnowledgeEvolution, KnowledgeManagementMethodology, Didactics, KnowledgeManagementProcess, ICTInfrastructure

1 Introduction

Mobile learning or M-learning is a new form of E-Learning. Mobile learning means that pupils, students, or professionals are learning via mobile devices such as notebooks, E-books, handhelds, PDAs, smartphones, tablet PCs, iPads, iPods or gaming consoles. The term MINT stands for mathematics, informatics, natural sciences and technology. The combination of both, i.e. the application of M-learning to MINT subjects, is not yet found in the literature but is addressed in this paper. Because there is no software that supports M-learning for MINT subjects, it had to be developed.

The scientific questions of this project are: what should a tool-set look like that optimally supports M-learning for MINT subjects, and what are the benefits and disadvantages of teaching MINT subjects by means of M-learning in general? The latter question will be answered in the future, because TIO can be used for a subsequent pedagogical evaluation of the effect and the adoption of M-learning in MINT subjects.
It is the technical basis and thus a prerequisite for assessing the benefits and disadvantages. The first research question is answered in the following, because a tool-set called “Technical Informatics Online” (TIO) is presented, which is a web-based software platform for computer-aided distance learning with an emphasis on MINT disciplines and on mobile learning. TIO is a front-end for authors of teaching material, and at the same time it is a user interface for learners. It provides both groups with spatial and temporal flexibility in creating content and in consuming it.

Basically, TIO is for editing teaching material, distributing it to various mobile end-user devices, managing study courses, and learning the course materials. TIO supports so-called single source publishing and serves as a social medium for its users in order to make learning a deeper and thus longer-lasting experience. Single source publishing allows the very same content to be created, maintained, retrieved and delivered for many heterogeneous end user devices while it is stored only once in one file. In order to achieve this, a TIO-internal XML-based data structure called “xml4tio” was defined that can be converted by TIO tools into various output formats which optimally support the respective end user device. Because these devices are very different in their capabilities, several presentation formats must be generated out of the same source file, depending on the user's preferences and device type.

TIO is based on a modified version of the open-source software ILIAS and on a TIO web application called TIOWA. It provides numerous E-learning features, including chat rooms and multiple-choice tests, as well as the management of the teaching content, the learning courses as a whole and their users.

The rest of the paper is organized as follows: chapter 2 describes the state-of-the-art in M-learning, single source publishing and social media.
In chapter 3, an overview of TIO is given that explains its software components and the technologies used. In chapter 4, the technical set-up of TIO is explained in more detail by means of block diagrams and XML data structures. The extensions made to ILIAS are also described briefly there. In chapter 5, a report about field tests of the tool-set is given. The paper ends with a conclusion, an outlook to future work and a reference list.

2 State-of-the-Art

2.1 M-Learning

M-learning is a modification of E-learning for the purpose of distance education and blended learning based on mobile end-user devices. An overview of M-learning can be found in [4], [5], [6] and [7]. Blended learning means that customers must show up in a classroom for about 20% of their time and are not allowed to study entirely remotely.

In theory, M-learning has several advantages: First, mobile devices are already widespread. Thus, little or no investment is needed from the user compared to a desktop PC. This is important for many young clients and for customers from developing countries. Second, mobile devices reflect the life style of the generation of 'digital natives' [8]. Thus the potential acceptance and use of M-learning may be higher than for classical learning styles. Third, mobile devices help professionals to consume content on top of their working hours for the purpose of life-long learning because of the access flexibility they get. They can use free time slots while travelling between work and home for learning, or weekends and holidays, in a very flexible way. Fourth, Internet access allows quicker distribution of content at significantly lower media costs than any paper-based distribution channel.
Fifth, if the sensors of the end-user device, such as GPS positioning and the tilt and acceleration meters, are engaged, then learning can adapt dynamically to the current location and situation of the learner.

In practice, M-learning faces several problems: First, small screen and key sizes hamper its applicability in all cases except notebooks. As a consequence, a sophisticated layout and formatting of teaching material becomes less important because such formats may not be displayed. Additionally, limited user input must be tolerated because of keyboard restrictions. Second, relatively slow Internet access and a limited battery life of the user device must be taken into account. Third, heterogeneous hardware and operating systems with no or small hard disks are common. Mobile devices have more options with respect to CPUs, RAM size, displays and operating systems compared to desktop PCs. This makes it very difficult to present and use teaching content equally on all devices.

Numerous E-learning platforms are already in use. An overview can be found in [9], for example. However, most of them are intended for use on desktop PCs and not for mobile devices. Furthermore, their features for presenting technical content with formulas, for visualizing simulation results and for accessing remote laboratories are mostly limited. Finally, Internet access via mobile devices is not explicitly supported. TIO addresses these problems and supports text types that occur frequently in MINT subjects.

2.2 Single Source Publishing

Single source publishing means that one internal storage format is used out of which diverse customer outputs can be created, such as html with various cascading style sheets and pdf. Single source publishing simplifies the updating of content and allows the same teaching material to be reused for many devices and in various contexts.
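The single-source idea can be illustrated with a small Python sketch: one XML source is stored once and rendered into two output formats on request. The element names (`unit`, `title`, `para`) are hypothetical and do not reproduce the xml4tio schema, and plain Python functions stand in here for the XSLT-based converters that the paper mentions for TIOWA.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; NOT the real xml4tio format.
SOURCE = """<unit>
  <title>Resistor networks</title>
  <para>Voltage equals current times resistance.</para>
  <para>Use SI units &amp; check the sign.</para>
</unit>"""


def to_html(root: ET.Element) -> str:
    """Render the unit as a minimal HTML fragment."""
    title = root.findtext("title", default="")
    parts = ["<h1>{}</h1>".format(escape(title, quote=False))]
    for para in root.findall("para"):
        parts.append("<p>{}</p>".format(escape(para.text or "", quote=False)))
    return "\n".join(parts)


def to_plain_text(root: ET.Element) -> str:
    """Render the unit as plain text, e.g. for a device with a very small screen."""
    lines = [root.findtext("title", default="").upper()]
    lines.extend(para.text or "" for para in root.findall("para"))
    return "\n".join(lines)


# One converter per output format; all of them read the same source.
RENDERERS = {"html": to_html, "text": to_plain_text}


def publish(source_xml: str, output_format: str) -> str:
    """Convert the single stored source into the requested output format."""
    try:
        render = RENDERERS[output_format]
    except KeyError:
        raise ValueError(f"unsupported output format: {output_format!r}") from None
    return render(ET.fromstring(source_xml))
```

Supporting a further device class then only requires adding one more renderer to `RENDERERS`; the stored source stays untouched.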
Traditionally, single source publishing is implemented by a 1:1 correlation between chapter and character formats in the source texts and, for example, the generated html code for the user. Additionally, source texts may be augmented with tags and comments that give meta information about the text. From this semantic data, converters can automatically create various output formats, provided that sufficient meta data exists.

TIO uses a combination of both. First, it converts teaching material that is formatted in a traditional way by chapters, subchapters and emphases such as underline, italic and bold into the common intermediate storage structure xml4tio. Subsequently, it converts xml4tio into the needed user formats. Finally, it uses ILIAS and a self-developed web application called TIOWA to disseminate the requested content in the desired format via mobile Internet.

2.3 Social Media/Web 2.0

Social media is a generalization of the term web 2.0. It denotes that users are not only consumers of content via download but also producers via upload. It furthermore denotes that users interact with each other on the web. Examples of popular social network services are Facebook, Twitter, YouTube and Wikipedia. TIO strives to be a social medium for distance learning in the MINT disciplines in order to make learning a deeper experience, because the distinction between learning and leisure then becomes more flexible, and because learning thereby gains an emotional component that exists in a classroom but not in a computer system.

3 Description of TIO

TIO combines E-learning via the Internet with mobile communication. It thus allows learning from anywhere and at any time. TIO is focused on, but not limited to, mobile user devices. It can also be used with desktop computers.

Content creation can be accomplished by authors via the commercial text system "Adobe Frame Maker" [1] if technical manuscripts have to be written.
A TIO tool converts these Frame Maker texts into xml4tio. If the editing of mathematical texts is required, then a self-developed tool called LearnDSL (Learn Domain Specific Language) can be used for the convenient writing of formulas in a style similar to LaTeX but more user-friendly. LearnDSL converts mathematical texts into xml4tio. Finally, Open Office Writer [2] is supported by TIO for all other use cases. A macro-based converter likewise transforms Open Office texts into xml4tio.

TIO is based on the widespread ILIAS E-learning platform [3], for which several adaptations and extensions were developed. ILIAS implies PHP [10] as programming language and SCORM [11] as content format. TIOWA in turn allows teaching material to be uploaded and converted from xml4tio into SCORM and html. TIOWA uses Ajax [12], which lets students interact with TIO as if it were a local application running on their hand-held device. Ajax employs several technologies such as Javascript [13], XML [14], HTML5 [15] and CSS3 [16]. An important part of TIOWA is a set of converters based on XSLT [17]. In principle, TIOWA and ILIAS could even be used independently of each other for the creation, storage and dissemination of teaching material. For example, TIOWA could also be connected to other learning management systems such as Moodle [24].

Aims and Features. Our extensions to ILIAS provide the following features:

First, authors can create teaching material in the text system that is preferable for them and their content. After content creation, TIO supports an automated conversion into the intermediate xml4tio file format. Xml4tio is an xml schema definition that is used to store content on a TIO server [25].
Second, the formatting instructions stored in xml4tio are reduced to a minimum because small screens cannot satisfactorily display sophisticated presentations such as those created in Power Point.

Third, the use of a common storage format has the potential to ease access to and maintenance of a large amount of teaching material, as is needed for Bachelor or Master courses, for example.

Fourth, TIO supports multiple end-user devices and operating systems by engaging browsers as the only software needed on the end-user device.

Fifth, TIO supports the learning process not only for full-time students but also for part-time customers, together with training on the job for life-long learning of professionals. Therefore, it can be used by universities and enterprises.

Sixth, the teaching material is structured by TIO into 'learn objects', which can be compared to book chapters but may contain audio, video, text, graphics, formulas, visualizations of simulations and access to remote laboratories, as well as exercises and multiple-choice tests for each topic.

Seventh, TIO can administer a large number of students as customers and a large number of courses as well because it uses ILIAS, which has proven to be a reliable system.

Eighth, most of the TIO portals are multilingual, currently in German and English, which also adds a transnational aspect.

4 TIO Set-up

TIO's set-up is based on three (virtual) servers, one for authors, one for students, and one for backup (Fig. 1).

Fig. 1. General hardware setup of TIO

All servers use Ubuntu Linux. On the author server, our TIOWA web application is the main software component, together with several converters between storage formats. TIOWA is the frame in which authors create, maintain and convert learn objects.
Additionally, a relational database is provided that handles authentication and authorization of authors and system administrators, and that maintains different versions of teaching objects. Teaching objects are stored as files in xml4tio. Fig. 2 shows the software set-up of the author server.

Fig. 2. Set-up of the author server

On the customer server, ILIAS is the software that handles student administration and examination, inter-student communication and content dissemination. Fig. 3 shows the block diagram of the customer server.

Fig. 3. Set-up of the customer server. Block labels recoverable from the figure: Customer Server; TIO-ILIAS Platform (Courses, Students, Tests, etc.); TIO Extensions (Multimedia, web 2.0, forums, notes, chats, questions, etc.); Mobile Devices; Audio & Video; Template Settings for usability, adaptivity, interaction; Database; xml4tio2scorm; User & Course Database; Display and search TIO content; Linked to/from Multimedia Data; Teaching Material in SCORM: HTML5/CSS for Stationary or Mobile Devices and PDF; To Customers.

The information about which student is enrolled in which course is stored in a MySQL database. Audio and video recordings of teaching objects and whole lectures are stored as standard files with references to xml4tio and SCORM data structures.

SCORM and xml4tio are both based on xml. SCORM is a collection of specifications and programming interfaces. It stores user preferences, records learning goals, logs learning progress and describes which resources a learn object has. A resource is a set of texts, pictures, audio files, videos and URLs that is organized as a tree. Each tree resembles the chapters and subchapters of a traditional lecture of 45-90 minutes and can be augmented by meta data that describe the learn object. This meta data makes it possible to search for content.
Furthermore, possible sequences can be specified in which students can consume learn objects. Finally, the path through a sequence of teaching objects can be declared as a function of the answers the user gives in multiple-choice tests.

The backup server provides for safe operation by automatically copying data from the author and customer servers. In the following, all described components are explained in more detail.

4.1 The TIOWA Web Application

For content creation, management and format conversion, TIOWA was programmed in PHP 5.4 and Javascript as a web application for the authors. It uses MySQL 5.6 [19] as database for authors and administration, jQuery and Ajax for easier communication with the user via dynamic web pages, and an Apache web server for page generation in HTML5 and CSS. TIOWA allows content to be uploaded from an author's computer to the TIO server, converted into xml4tio, updated and backed up, moved to the customer server and, in the final step, reformatted for display depending on the respective output device. For this purpose, TIOWA accesses xml4tio data, author/admin data and SCORM data. From the viewpoint of xml4tio, TIOWA is an application for single source publishing.

4.2 XML4TIO for Single Source Publishing

Xml4tio is an xml schema definition [25] and the core of TIO. For the conversion of teaching material into xml4tio, two files and one extra folder must be created: First, a 'container.xml' file is established that stores a description of the teaching material as meta data. In the case of Bachelor/Master (B/M) modules, the meta data are based on the Bologna module description [20]. Otherwise, the author must provide an arbitrary text as abstract. Additionally, container.xml binds together all teaching objects into lectures and all lectures about the same topic into one Bologna module.
Second, a 'content.xml' file must be created that stores the content of the Bologna module. The first layer of the syntax tree of xml4tio comprises the xml tags module, title, author, section and presentation unit. These tags can subsequently be specialized by additional tags and/or attributes in further layers. Especially remarkable is, for example, the 'media' tag, which can have 'picture', 'animation', 'applet', 'scene3d', 'sound', 'video' or 'experiment' as attribute. This range shows the capability of TIO in media presentations for MINT subjects.

4.3 Converters for Storing and Disseminating Teaching Material

There are converters from OO Writer, Adobe Frame Maker and LearnDSL into xml4tio. A reverse conversion for a so-called round trip is normally not possible because nearly all character and chapter formatting instructions are deleted during the conversion process. However, if authors limit themselves in their text system to the few formatting data that xml4tio has, then they can also perform a round trip. Additionally, there is a converter for transforming xml4tio into SCORM, html5 and pdf. These formats contain less formatting information than the authors' original documents.

The main difficulty the converters face stems from the fact that humans normally do not treat their lecturing materials as formal text. This means that, besides the character and chapter formats used, several hand-made changes exist in real-world lectures. Such documents cannot be converted. To detect these flaws, an xml4tio validator is provided that checks whether the lecture is structured according to the xml4tio schema definition.

Conversion of Teaching Material into XML4TIO. For the LearnDSL converter, custom software was written in Java. For OO Writer, PHP software was created that transforms valid OO texts into xml4tio.
A prerequisite for that is that a prescribed LearnDSL and OO character and chapter format catalogue is used without any manual additions or modifications. Furthermore, specific rules have to be obeyed with respect to the structure of the teaching material.

For the Frame Maker (FM) converter, basically the same restrictions hold as for the Writer converter. However, no software has to be programmed as in the LearnDSL or OO case. FM must only be switched from unstructured to structured mode. Then it can directly deliver the desired xml schema. FM uses as input for this schema a so-called Element Definition Document (EDD) file [21]. The EDD lets Frame Maker know which elements are allowed and which compositions of elements are legal. The EDD additionally prescribes how to format the elements. By means of a proper EDD, a structured text appears to the user of FM nearly as an unstructured text.

The EDD in turn can be created from a so-called document type definition (DTD) [22] in two steps: first, the DTD file must be manually created on the basis of the xml4tio schema definition. Then, the DTD can be imported into FM, and an EDD is created automatically by FM.

However, the resulting EDD file must be post-processed manually to define the appearance of chapter and character formats of the elements. Such format definitions are made by templates. The manual post-processing of the EDD file is also needed to provide so-called read/write rules, which help to convert FM documents into correct xml4tio by transforming FM elements into TIO tags and attributes.

After that, the EDD file must undergo an automatic transformation by XSLT to create proper URLs, names and paths for pictures and other multimedia content according to the xml4tio schema definition. For this purpose, an XSLT style sheet was created, and additional EDD rules were defined that call the style sheet.
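The kind of path normalisation described in the last step can be sketched as follows. The sketch uses Python's ElementTree rather than XSLT, so it is a stand-in for the technique and not the authors' style sheet. The `src` and `type` attribute names, the `media` base folder and the function name `normalise_media_paths` are assumptions made for illustration only.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import quote


def normalise_media_paths(xml_text, base="media"):
    """Rewrite every media reference to a relative, URL-safe path.

    Assumes (hypothetically) that media elements carry a 'src' attribute
    holding a local file path from the authoring system.
    """
    root = ET.fromstring(xml_text)
    for media in root.iter("media"):
        src = media.get("src")
        if src is None:
            continue
        # Keep only the file name, whatever path style the author's system used.
        name = src.replace("\\", "/").rsplit("/", 1)[-1]
        # Percent-encode characters such as spaces and prefix the base folder.
        media.set("src", posixpath.join(base, quote(name)))
    return ET.tostring(root, encoding="unicode")


# A Windows-style local path as an authoring tool might emit it (illustrative).
sample = '<section><media type="picture" src="C:\\lectures\\fig 1.png"/></section>'
result = normalise_media_paths(sample)
```

An XSLT style sheet would express the same rule declaratively as a template matching the media elements; the imperative form is used here only because it runs with the Python standard library.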
Finally, FM bundles the DTD together with the EDD, which contains format templates and read/write rules, into a so-called structured application definition. As soon as such a definition exists, the user can save any structured FM document as xml4tio simply by clicking the "save as XML" button, and the conversion is done.

Conversion of XML4TIO to SCORM Output. This converter is called 'xml4tio2scorm'. It is responsible for content presentation because the created SCORM file contains html for a browser, together with CSS formats and links to multimedia data such as pictures and graphics. However, the full potential of SCORM is not needed here. Only a TIO-specific subset is used, which is subsequently stored in ZIP format in order to save (virtual) disk space. xml4tio2scorm creates a table of contents for the selected study course, its html/CSS and pdf representations and an index for searching. This index is copied automatically as a file into the user space of ILIAS so that students can access it.

Depending on whether the target browser runs on a mobile or stationary device, and on the device's screen size, either the table of contents is presented together with a video recording of the lecture and the text of the teaching material, or, in the case of small screens, only one of these three streams is displayed, according to the user's wishes. Additionally, it is taken into account that mobile devices may have only limited bandwidth for Internet access. Because of that, offline browsing of previously downloaded teaching material is supported, together with pdf viewing.

4.4 Extensions to ILIAS

The following extensions to ILIAS were made: 1.) Students can highlight and comment on every line of the teaching material. These comments can be marked as private, visible to the public or open for discussion in a special forum.
This turns students into prosumers. 2.) Students can ask the author of the teaching material questions via the Internet by placing a question directly into the script at the proper line, which eases the communication between student and teacher/author significantly. 3.) Searching in the teaching material is possible by means of an index. 4.) Fonts, font sizes and colors can be configured individually to provide for better reception. 5.) A table of contents can be displayed that helps to get a better overview of the learning material. 6.) The SCORM output is adapted to mobile devices with individual screen sizes.

The extensions turn students into "prosumers" who contribute to their learning success through their own comments and hints in the teaching material that are visible to all others. The improved questioning and chat room feature allows a "learning adventure" and makes TIO an Internet-based social medium.

5 Software Tests and Practical Experiences

For testing the software, we received three virtual servers from Clausthal University of Technology, which also administered them, including backup and restart in case of a system crash. On these servers, three web portals were created: 1.) A portal under http://webadmin.ti-online.org for accessing the TIOWA web applications. This portal is for authors and admins. 2.) A portal for general information about the TIO project and for project protocols under http://ti-online.org. 3.) A portal for TIO students and other customers to download learning content under http://ilias.ti-online.org/. This portal is also for teachers to manage TIO students and courses. It is based on a modified version of ILIAS as described. All portals are maintained with Typo3 [23] as content management system; the last portal is available in German and English.

5.1 End-User Devices and Configurations

TIO was tested with the following end-user devices and configurations: 1.) Desktop PCs and notebooks with MS Windows XP/7/8 as operating systems, and with Firefox (>V2), Internet Explorer (>V6), Opera (>V8), Chrome (>V16) and Safari (>V5) as browsers. 2.) Desktop PCs and notebooks with Ubuntu, Debian and Suse Linux together with Opera and Firefox. 3.) Apple computers with Mac OS X (>V10.6), Safari and Firefox. 4.) Apple iPad with iOS (>V5) and diverse Android tablets (>V4.0). 5.) Apple iPhone (V4) and diverse Android smartphones (>V2.0). 6.) Blackberry and Apple iPod. This was considered a comprehensive selection.

5.2 Practical Experiences

In the years 2010-2013, teaching materials were created for TIO in Frame Maker, LearnDSL and Open Office Writer. They were used for education in Technical Informatics, Computer Organization and Computer Networks at the Universities of Hamburg and Clausthal. Additionally, some video recordings of the lectures, simulations and animations were added to the teaching materials. After every semester, feedback from the students was collected in written form and evaluated. Based on this feedback, parts of the software and the didactic presentation of the teaching material were improved.

6 Conclusion and Future Work

A web-based software platform called TIO was created for mobile learning, with emphasis on study courses in mathematics, informatics, natural sciences and technology. It was tested on numerous software and hardware configurations that users may have and proved to work technically. TIO consists of a modified version of the open-source E-learning system ILIAS and the TIO tool set. The ILIAS extensions were made to improve its usability for mobile devices and MINT subjects. While ILIAS is for students, TIOWA is for the authors for content creation, uploading and converting.
It stores the teaching material in an xml schema definition from which several output streams can be generated that depend on the end-user device and its screen size (single source publishing). Teaching material can contain texts with formulas, audio and video, animations and visualizations of simulation results. Finally, the xml4tio data structure also allows remote access to experimental labs located in a university for practical training. We believe that some of our implemented features will be useful for the generation of 'digital natives' who must pursue life-long learning.

The general experience is that mobile learning is useful for MINT subjects provided that diverse end-user devices are supported, because they differ a lot in their capabilities. Furthermore, it proved to be important to support various text systems such as Frame Maker, Open Office Writer and a variant of LaTeX because otherwise not enough authors can be found to develop teaching material. We also found that mobile learning is especially useful for professionals because they can now learn flexibly in empty time slots, such as on train trips between work and home. Finally, an emotional component should be present in mobile learning to make it a deeper and more lasting experience. Thus chat rooms and other features with which clients can communicate with each other in the style of social media were integrated. As a recommendation, we suggest combining mobile learning with classroom presence of approx. 10% of the study time to make the emotional component optimal.

In the future, TIO must be validated, tested and checked in practice over a longer period of time. Therefore, it will be used for teaching an online Master course in Technical Informatics and for training and certifying professionals for the purpose of life-long learning.
These real-world applications will make it possible to conduct a pedagogical evaluation of the benefits and disadvantages of M-learning for MINT subjects. From that experience, recommendations can be given to teachers and organizations about M-learning in general. The TIO tool set is the technical prerequisite for that.

References

1. Adobe Framemaker, http://www.adobe.com/products/framemaker.html
2. Open Office Writer, http://www.openoffice.org/
3. ILIAS Learning Management System, http://www.ilias.de/
4. Holzinger, A., Nischelwitzer, A., Meisenberger, M.: Lifelong-Learning Support by M-Learning: Example Scenarios, eLearning Magazine, 11, ACM, New York (2005)
5. Guy, R.: Mobile Learning: Pilot Projects and Initiatives, Informing Science Press (2010)
6. Kitchenham, A.: Models for Interdisciplinary Mobile Learning, Delivering Information to Students, IGI Global (2011)
7. Sampson, D. G., Isaias, P., Ifenthaler, D., Spector, J. M.: Ubiquitous and Mobile Learning in the Digital Age, Springer New York, Heidelberg, Dordrecht, London (2013)
8. http://en.wikipedia.org/wiki/Digital_native
9. http://www.techworld.com.au/article/223565/10_open_source_elearning_projects_watch/?
10. PHP - Hypertext Preprocessor, Web Scripting Language, http://php.net/
11. Sharable Content Object Reference Model (SCORM), http://scorm.com/scorm-explained/
12. Ajax (Asynchronous JavaScript and XML),
13. Javascript, http://www.w3.org/standards/webdesign/script.html
14. Introduction in Extensible Markup Language (XML), http://www.w3.org/XML/
15. World Wide Web Consortium. Standards, http://www.w3.org/standards
16. World Wide Web Consortium. Cascading Style Sheets, http://www.w3.org/Style/CSS/
17. World Wide Web Consortium. XSL Transformations, http://www.w3.org/TR/xslt
18. World Wide Web Consortium. XML Schema, http://www.w3.org/XML/Schema.html
19. MySQL 5.6 Reference Manual, http://dev.mysql.com/doc/refman/5.6/en/index.html
20. Bologna process, http://ec.europa.eu/education/policies/educ/bologna/bologna.pdf and http://www.ehea.info/
21. Element Definition Document, http://help.adobe.com/en_US/FrameMaker/8.0/help.html?content=Chap2-FrameMaker-Basics_098.html
22. Document Type Definition, http://help.adobe.com/en_US/FrameMaker/8.0/help.html?content=Chap2-FrameMaker-Basics_098.html
23. Typo3 - The Enterprise Open Source CMS, http://typo3.org/
24. Moodle - https://moodle.org/
25. XML4TIO, http://www.ti-online.org/XML4TIO

Holistic Approach to Training of ICT Skilled Educational Personnel

Mariya Shyshkina

Institute of Information Technologies and Learning Tools of the National Academy of Pedagogical Sciences of Ukraine
marple@ukr.net

Abstract. The article explores and estimates the possible pedagogical advantages and potential of cloud computing technology with the aim of increasing the organizational level, availability and quality of ICT-based learning tools and resources. A holistic model of a specialist is proposed, and the problems of developing a system of methodological and technological support for the elaboration of a cloud-based learning environment of an educational institution are considered.

Keywords. Learning environment, personnel training, cloud computing, holistic approach

Key terms. KnowledgeEvolution, KnowledgeManagementMethodology, Didactics, KnowledgeManagementProcess, ICTInfrastructure

1 Introduction

Since it is now impossible to introduce advanced ICT, and to manage this process, without mastering ICT and related pedagogical technologies, the main aim is to train ICT-skilled educational personnel. Cloud computing (CC) technology is intended to create a high-tech learning environment for an educational institution, enhancing multiple access to and joint use of educational resources at different levels and in different domains. On this basis it is possible to combine the corporate resources of the university with other on-line resources, adapted to learning needs, within a unified framework.
Cloud computing is used to supply resources and to support collaboration in the learning process, in particular by means of mobile services. It requires the development of new approaches and models for designing a learning environment. Among them are those based on a holistic approach to learning [1], [5], [7], [9].

To this end, the following should be created: a set of tools for collecting, elaborating and designing cloud-based learning resources; holistic models of the learning environment and specialist models; and a system of methodological and technological support for the development of a cloud-based learning environment of an educational institution.

The purpose of the article is to identify trends and conceptual models of educational personnel training within the cloud-based learning environment.

2 Problem Statement

The problem of training qualified educational management personnel, as well as teachers oriented towards ICT-based learning, can nowadays hardly be considered independently of the processes of innovative development of the educational space formed within the school, the region and the educational system of a country, or globally [1]. In this regard, there is a need for fundamental research focusing on the possible ways of developing the educational environment of educational institutions. It should take into account the trends in improving ICT facilities while searching for new engineering and technological solutions and new pedagogical and organizational models [1], [2]. The main focus is on shifting from the mass introduction of separate software products to an integrated and combined environment that supports distributed network services and cross-platform solutions.

Emerging technologies of information and communication networks open the way to implementing a holistic approach to the education and training of personnel.
A holistic approach focuses on combining science and practice, training and production, fundamental and applied knowledge, and technological competencies with social and humanitarian ones [5], [9]. Above all, it aims at developing the management skills of public administration in the educational field on the basis of a unified approach to learning design and management. This is a promising direction for the development of the field's human potential. Therefore, the innovative processes of organizing and developing the learning environment, and the search for new approaches and models for specialist education and training, become a matter of interest [11].

There is the problem of the availability of learning resources and of valuable ways of delivering them, so as to achieve the best pedagogical effect with their use and to gain the maximum learning potential of ICT. This issue may be partially solved if resources are delivered by means of cloud computing technology [2], [8], [12]. The main advantage of this technology is improved access to high-quality resources (and sometimes it is the only possible access to the necessary resources at all). The idea is simply to explore approaches for the modeling and estimation of CC-based learning process settings and valuable tools for its organization.

3 Education and Training of ICT-Skilled Management and Public Administration Personnel

Public administrators are public servants working in public institutions, departments and agencies [10]. Specifically, they are concerned with "planning, organizing, directing, coordinating, and controlling government operations" [6]. Public servants for education management form a specific sphere. For such personnel to be trained efficiently, novel approaches need to be developed, as this sphere is mostly concerned with multi-disciplinary knowledge and requires skills at the intersection of training, learning and management.
Because most pedagogical innovations are also based on ICT, the same need arises in the sphere of education management. There is a branch of pedagogical sciences dealing with the theoretical and methodological problems of ICT use in education, the psychological and pedagogical substantiation of these processes, and the elaboration of ICT tools and resources for ensuring the functioning and development of educational systems. So there should be specialized personnel to ensure the implementation, introduction and development of ICT-based learning technologies within the sector of public administration.

There is a significant need for IT-competent specialists in the sphere of public administration. Without ICT competence, or competence in ICT for learning, problems arise with their adaptation at the workplace, as does the necessity of additional and often profound training almost immediately after hiring. In some cases, future graduates' vague idea of the real problems and conditions of work with innovative ICT infrastructures and ICT-based tools leads to a lack of commitment to practical solutions of work situations and thus to a low level of innovative inclusion.

The formation of an innovative ICT infrastructure of the institution could solve some of the aforementioned problems [11]. Namely, it would bridge the gap between the process of training and the level of demand for its product. An environment would be created that brings together the learning resources of educational and industrial projects and covers different levels of training, including the training of both students and pedagogical management personnel.
Given the high rates of development of both the global ICT market for the education sector and the IT market of learning tools, and given that training professional staff for the domestic public administration and the IT-oriented sector of education management remains a continuing problem (such personnel are primarily prepared within higher and postgraduate schools, e.g. universities and advanced training schools), we conclude that modern approaches to the design of educational systems are a key point. The current state of skilled personnel in the management and public administration of education can hardly be regarded as fully satisfactory for the needs of innovative development of ICT-based learning, which requires a sufficient number of qualified professionals with an appropriate structure and quality of training. The system of training and retraining of employees for public administration has not been properly formed.

These problems should be considered within the context of the development of an institution's and a region's innovative environment, as well as at the national and international level [3], [11]. These processes have to do with the modernization of the learning environment from the perspective of emerging ICT. Thus the need arises to develop new models and approaches to personnel training that account for the modernization of ICT infrastructure and integrate resources of different levels and uses.

The introduction of innovations into the educational environment of a state or a region is closely tied to the development of human resources for the informatization of education [1]. It requires new types of skills and competencies, which graduates often lack. These skills include leadership, the ability to approach a problem holistically, and the ability to critically evaluate achievements and to self-assess [9], [11].
The lack of qualified personnel and the absence of a strategic approach to ICT infrastructure design are among the reasons why institutions of professional education lack unified high-tech solutions.

Nowadays the content and technological process of creating and using ICT products, and in particular electronic learning resources, requires fundamental background knowledge in both ICT and pedagogy. Current approaches to personnel training, however, do not sufficiently take into account the innovative changes of recent years in the ICT industry, nor the real needs regarding the extent of such training.

Outsourcing is considered a means of providing users with relevant cloud computing services: a service that a specific system requires to implement its core functions is offered and sold by another system external to it [3]. ICT outsourcing plays an important role in enhancing the scientific and technical level of the ICT systems of an educational institution as well as the efficiency of their operation and development. It is a market mechanism for incorporating the latest advances in the ICT sector and for satisfying user demand [2], [3].

The main problem in educational practice is the contradiction between, on the one hand, the objective need for continuous improvement of the software and hardware capacity of training computer complexes and, on the other, the personnel’s insufficient ability (in both qualitative and quantitative terms) to maintain, manage and develop their ICT systems appropriately. Hence the informatization of an educational institution in terms of cloud computing and ICT outsourcing will offer realistic solutions both for deepening informatization and for improving the educational performance of ICT and the use of information resources [2], [3].
The basic principles of such an introduction should be: a tight relationship of learning with training and methodological support for tutors; a focus on a specific educational task; modularity of learning; continuity of learning; and the sharing of experience through the formation of, and participation in, the activities of professional associations (including electronic ones) [2], [3]. In this process, electronic distance learning systems based on the principles of open education should be actively used, with the maximum possible use of cloud computing (CC) technology and outsourcing.

4 What Are the Advantages of a Cloud Computing Solution?

4.1 It Is a Cost-Effective Solution

Users can get (buy) products and services offered by the virtual supermarket of ICT according to their needs (individual or group, collective, corporate), and pay only for what has been bought and only for the actual time of use of the purchased product [3]. The offer covers e-transport, e-content, e-services, virtual e-tools, generic and subject-specific software applications and network platforms, that is, the full range of cloud services, along with services for the design and implementation of ICT systems and their fragments ordered by users, and their warranty and post-warranty service, maintenance, upgrade and improvement. This allows users to avoid regular updating and upgrading of the powerful general system software and hardware of their own ICT systems; to avoid a surplus of ICT products and spare parts that are used only from time to time, fragmentarily or not in full; to reduce the information security requirements of their own ICT systems; and to reduce the number of their ICT services and the requirements for the professional competence of their employees. As a result, users significantly reduce the overall costs of operating and developing their ICT systems and increase their social and economic return and efficiency [2], [3].
4.2 It Is a Flexible Solution for ICT Infrastructure

It is designed for increased flexibility and effective access to learning resources so as to build a unified and mobile infrastructure. On the basis of a CC infrastructure, all main aspects of a learner’s interaction may be handled on a unified basis. Following the approach introduced in [1], these include interactions between a learner and other learners, a learner and a teacher, a learner and a learning tool, a learner and an educational institution, and a learner and society. This leads to a learning environment organized on a unified basis, which enables collaboration between learners and a tutor, free and flexible access to resources, and learning activity with social inclusion into the environment of an educational institution and society. The ICT support of learning is realized by means of cloud services. It is designed for adaptation to the rapidly changing external and internal environment, to changing task and competence requirements, and to the development of modern pedagogical approaches.

In line with the principles of open education [1], there is a need to create an innovative learning environment that will form and develop the necessary professional skills. Among them are leadership skills, collaborative skills, critical thinking, and the ability to view a problem in a holistic manner. These skills are mostly in demand in the sphere of public administration of education, since in combination they make it possible to engage in a process of innovative development. This may be achieved on the basis of a holistic approach to specialist training, in which the planning, design, resource management, learning activity and monitoring of an organization are represented on a unified basis. It will be achieved through the unified development of different competencies: professional, fundamental, personal and technological.
5 A Holistic Model of a Specialist

A holistic approach to education treats learning processes as a unity of all main aspects of personality development, such as the mental, emotional and volitional ones. This is in tune with the meaning of the term “holistic” as completeness, which is impossible if any of its components is disregarded.

There are numerous investigations devoted to the problems of holistic learning development in different aspects, such as learning and teaching interaction, collaboration processes, and the engagement of both theory and practice to gain a comprehensive view of a subject [7], [9]. There are now important research trends concerning modern ICT. For example, a holistic view can be applied to the structure of a learning environment. Thus, the model of a learning environment developed in [1] reveals the main components and types of interactions within different learning process settings.

The notion of holistic learning also occurs in relation to personnel training, where it concerns different components and interactions within an educational organization. It may touch upon certain types of activity, collaboration and resource management processes, thus engaging the entire organization at all levels and developing a performance culture among personnel.

There are different ways to approach the peculiarities of specialist formation, namely in terms of personal or professional features. One concerns the modeling of professional competencies [5], especially in the sphere of educational management. Another concerns holistic models for developing leadership skills [4], [9], which relate more to personality traits.

The proposed approach is based on the holistic model of a specialist in the sphere of informatization of education presented in Fig. 1.
The model comprises Domain Competencies, which cover fundamental knowledge of educational management and modern learning technologies as well as ICT skills and the ability to use e-learning tools. There are also Personal Competencies, such as leadership skills, critical thinking, the capability to view a problem holistically, and the responsibility and activity of an individual. Professional skills include planning, design, resource management, cooperation and collaboration skills, performance skills, and the ability to monitor and self-evaluate.

Fig. 1. A holistic model of a specialist. [Figure panels: Levels of Education (Bachelor, Specialist, Magister, PhD, Dr.Sc; levels 5-9); Domain Competencies (fundamental knowledge of management; skills for modern educational technologies; ICT skills; skills for creating and use of e-learning tools); Personality Competencies (leader skills; critical thinking; holistic view of a problem; responsibility and activity); Professional Skills (planning; design; resources management; cooperation and collaboration; performance activities; monitoring and self evaluation).]

All the components of a specialist’s competencies, skills and knowledge are formed consistently across the main levels of education, which correspond to levels 5-9 of the National Qualification Framework.

A cloud computing solution is a reasonable way to support holistic learning settings, as it gives a platform for unified representation of, and access to, learning tools for different levels and domains of education as well as for different individuals and groups of users. Promising ways of assessing the quality of resources while building a holistic learning environment are:

A. Analysis of the most appropriate ways to use cloud computing technology to supplement and structure a collection of educational learning resources, to fill it with resources on this basis, and to organize multiple access to their use
B. Use of a certain set of educational resources for testing methods of evaluating the quality of their use within the cloud-based infrastructure of an organization
C. Recommendations on methods of replenishing the collection, its prototyping, and ways of structuring resources
D. Elaboration of requirements for the provision of electronic resources for collection replenishment
E. Analysis of cloud computing technology outsourcing for optimal selection and use of resource collections
F. Creation of recommendations for developers and of material for the replenishment and application of existing electronic learning resources
G. Development of recommendations for the dissemination and use of collections of electronic resources

6 Expected Impact and Social Results of the Project

An important step towards the wider application and introduction of new learning approaches, and towards gaining the greatest possible benefit from emerging technologies and ICT tools, is the modernization and upgrading of the ICT learning environment of educational institutions and an increase in the overall level of e-learning.

To achieve these goals, the main problem is to raise the ICT and professional competencies of the subjects of the learning process: managers, pedagogical personnel and staff, and also the personnel of ICT departments. People are the most valuable factor in empowering the development and formation of social and economic systems, and of educational systems in particular. They are the most important resource that should be involved in order to improve the quality of these social systems and to manage their purposeful and productive growth. For this reason the development of tools and resources to train teachers and staff is a critical point, because it concerns all levels of the functioning of educational systems.
The overall impact of implementing learning tools and techniques based on cloud computing is aimed at:

– Broader use of ICT in education, aiming at wider take-up by learners and teachers
– Effective public-private partnerships for the introduction and management of learning environment solutions
– More efficient introduction of ICT into the learning process through the exploitation of monitoring and assessment tools
– More timely and purposeful acquisition of skills and competences through ICT-based learning technologies, in educational establishments and public administrations
– Increased involvement with the adoption of digital learning technologies

An important step towards the wider application and introduction of new learning approaches, and towards gaining the greatest possible benefit from emerging technologies and ICT tools, is the modernization and upgrading of the ICT learning environment of educational institutions, the development of new learning approaches, and the creation of more advanced learning technologies [1], [2].

The formation of an innovative ICT infrastructure of the institution could solve some of the problems of developing highly skilled educational and management personnel, bridging the gap between the process of training and the level of demand for its product.

Due to the development of cloud computing technologies, the opportunities and functionality of collections of electronic learning resources, and access to them, have significantly increased. In this regard, cloud computing is a promising direction for the development of collections of electronic resources, as it allows the creation of a unified methodology on a single platform, a framework for development and testing, and the improvement and elaboration of integrated methods of quality assessment. This adds value to the available resources [2], [11].
The social results will help to modernize the learning environment of educational institutions and organizations, to increase the educational potential of ICT, and to add value to the best examples of available learning resources through flexible and learner-adaptive access to them.

At the same time, several aspects of the cloud-based learning architecture remain subject to further research. These include pedagogical and psychological support for the design and organization of an educational institution’s cloud infrastructure, the exploration of possible organizational structures that keep the learning environment functioning, and the teaching of educational managers and organizers and of pedagogical and technical staff to use new methods and approaches to learning based on cloud computing. There is therefore a need to create an educational and training support system for management personnel, teachers and learners.

The resulting instruments for elaborating collections of cloud-based learning resources and for developing the cloud-based learning environment of an educational institution might be used within different learning and organizational educational structures.

7 Analysis and Estimation of Prospective Ways of Development

The cloud-based learning infrastructure is to give the following opportunities:

– To combine the processes of development and use of electronic resources to support learner competencies
– To ensure a holistic approach to specialist education and training, combining both technological and social competences and the development of a learner’s critical skills
– To integrate the processes of training, retraining and advanced training at different levels of education by providing access to the electronic resources of a unified learning environment
– To solve or significantly mitigate the problems of bringing the electronic resources of the institution into a unified framework
– To give access to the best examples of electronic resources and services to those units or institutions that have no strong ICT support services for e-learning
– To provide invariant access to learning resources within the unified educational environment, depending on the purpose of study or the educational level of the student, enabling a person-oriented approach to learning
– To create conditions for a higher level of harmonization, standardization and quality of electronic resources, which may lead to the emergence of better examples of learning resources and to their more widespread use

8 Conclusion

CC technologies offer real advantages in providing more flexible, scalable and cost-effective solutions for access to learning resources, both within the learning environment of a university and within the learning environment of a whole region, and on the national and international scale. This helps to ensure joint use of, and wider participation in, learning courses by learners from different institutions where the necessary services are substantiated and supported. Since holistic approaches to cloud services development are already used in education, the challenge is to transfer this experience into a wider context.

The project is implemented within the framework of the joint research laboratory of Cloud Computing in Education of the Institute of Information Technologies and Learning Tools of NAPS of Ukraine (Kiev) and the Krivoy Rog State University (Krivoy Rog), www.ccelab.ho.ua.
References

1. Bykov, V.: Models of Organizational Systems of Open Education. Atika, Kyiv (2009) (in Ukrainian)
2. Bykov, V.: Cloud Computing Technologies, ICT Outsourcing, and New Functions of ICT Departments of Educational and Research Institutions. Information Technologies in Education, 10, 8–23 (2011) (in Ukrainian)
3. Bykov, V., Shyshkina, M.: Innovative Models of Education and Training of Skilled Personnel for High Tech Industries in Ukraine. Information Technologies in Education, 15, 19–29 (2013)
4. Candis Best, K.: Holistic Leadership: a Model for Leader-Member Engagement and Development. The Journal of Values Based Leadership, 4(1) (2011)
5. Cheetham, G., Chivers, G.: Towards a Holistic Model of Professional Competence. Journal of European Industrial Training, 20(5), 20–30 (1996)
6. Chapman, B., Mosher, F. C., Page, E. C.: Public Administration. Encyclopedia Britannica, http://www.britannica.com/EBchecked/topic/482290/public-administration
7. Forbes, S. H., Martin, R. A.: What Holistic Education Claims About Itself: an Analysis of Holistic Schools’ Literature. In: Proc. Annual Conf. American Education Research Association, San Diego, California (2004)
8. Zhang, Q., Cheng, L., Boutaba, R.: Cloud Computing: State-of-the-Art and Research Challenges. J. Internet Serv. Appl., 1, 7–18 (2010)
9. Quatro, S. A., Waldman, D. A., Galvin, B. M.: Developing Holistic Leaders: Four Domains for Leadership Development and Practice. Human Resource Management Review, 17, 427–441 (2007)
10. Kettl, D., Fessler, J.: The Politics of the Administrative Process. CQ Press, Washington D.C. (2009)
11. Shyshkina, M.: Innovative Technologies for Development of Learning Research Space of Educational Institution. Information Technologies and Society, 1, http://ifets.ieee.org/russian/depository/v16_i1/pdf/15.pdf (2013) (in Russian)
12. Sultan, N.: Cloud Computing for Education: A New Dawn? Int. J.
of Information Management, 30, 109–116 (2010)

2.3 2nd International Workshop on Algebraic, Logical, and Algorithmic Methods of System Modeling, Specification and Verification (SMSV 2013)

Foreword

It is our pleasure to offer you the selection of papers for the 2nd International Workshop on Algebraic, Logical, and Algorithmic Methods of System Modeling, Specification and Verification (SMSV 2013), which was co-located with the 9th International Conference on ICT in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer (ICTERI 2013), held in Kherson, Ukraine, on June 19-22, 2013.

Workshop SMSV 2013 is the successor of the International Workshop on Algebraic, Logical, and Algorithmic Methods of System Modeling, Specification and Verification (SMSV 2012), which was held in Kherson (Ukraine) on June 6-12, 2012. SMSV 2013 was organized by Kherson State University, Taras Shevchenko National University of Kyiv, and Paul Sabatier University of Toulouse within the framework of the cooperation agreement between the universities. The workshop attracted scientists from Austria, France, Algeria, Russia, and Ukraine.

The presented papers demonstrated interest in topics on different formal methods of system development, and we plan to organize such workshops on a regular basis. We hope that the presentations and discussions will help to identify topics of mutual interest that can serve as a basis for project proposals to be submitted to international scientific programs.

June, 2013
Vladimir Peschanenko
Mykola Nikitchenko
Martin Strecker

An Abstract Block Formalism for Engineering Systems*

Ievgen Ivanov1,2
1 Taras Shevchenko National University of Kyiv, Ukraine
2 Paul Sabatier University, Toulouse, France
ivanov.eugen@gmail.com

Abstract.
We propose an abstract block diagram formalism based on the notions of a signal as a time-varying quantity, a block as a signal transformer, a connection between blocks as a signal equality constraint, and a block diagram as a collection of interconnected blocks. It does not enforce implementation details (like internal state-space) or particular kinds of dynamic behavior (like alternation of discrete steps and continuous evolutions) on blocks and can be considered as an abstraction of block diagram languages used by engineering system designers. We study its properties and give general conditions for well-definedness of the operation of a system specified by a block diagram for each admissible input signal(s).

Keywords. block diagram, signal transformer, semantics, engineering system

Key Terms. Mathematical Model, Specification Process, Verification Process

1 Introduction

Many software tools for developing control, signal processing, and communication systems are based on block diagram notations familiar to control engineers. Examples include the system design software Simulink [1], Scicos [2], Dymola [3], SCADE [4], declarative synchronous languages [5], and some embedded programming languages [6, 7].

In such notations, a diagram consists of blocks (components) connected by links. Typically, blocks have input and output ports, and (directed) links connect output ports of one block with input ports of the same or another block. A block is interpreted as an operation which transforms input signals (i.e. time-varying quantities) flowing through its input ports into output signals.

* Part of this research has been supported by the project Verisync (ANR-10-BLAN-0310), France.

Wide applicability of block diagram notations makes them an interesting object of study from a theoretical perspective.
Classical control theory and signal processing already provide some degree of formal treatment of block diagrams [8, 9], but this is normally not sufficient to handle such aspects of modern system design languages as mixing of analog and discrete-time blocks, partially defined block operations, non-numeric data processing, etc. To take these issues into account, researchers developed formal semantics for various block diagram languages [10–13]. Some effort has been made to unify approaches taken by different engineering system modeling and analysis tools and make them interoperable by the use of exchange languages with well-defined semantics [14, 15], such as the Hybrid System Interchange Format (HSIF), which gives semantics of hybrid systems in terms of dynamic networks of hybrid automata [16, 17].

Although hybrid automata-based approaches like HSIF can be used to give semantics to block diagram languages [18] and have many advantages (e.g. availability of verification theory for hybrid automata), we consider them not entirely satisfactory from a theoretical standpoint for the following reasons:

– Semantics of a system component (block) is based on the notion of a dynamical system with an internal state, and so components with the same externally observable behavior can be semantically distinguishable. In our opinion, this is not needed for a high-level semantics which does not intend to describe details of the component’s physical/logical implementation.
– Semantics of a system has a computational nature: it describes a sequence of discrete steps, where a step may involve function computations, solving initial-value problems for differential equations (continuous evolution), etc. It may be adequate for certain classes of discrete-continuous systems, but it does not always capture the behavior of a physical realization of a system (and thus may conflict with the view of a system designer).
For example, a Zeno execution [17] of a hybrid automaton can be described as an infinite sequence of discrete steps which takes a bounded total time (but each step takes a non-zero time). This normally does not correspond to the behavior of a physical system described by the automaton. In many cases this is caused by system modeling simplifications. The conflict is usually resolved by applying a certain method of continuation of an execution beyond Zeno time (regularization, Filippov solution, etc.) [19]. But an extended execution is not, in fact, a sequence of discrete steps, as it resumes after the accumulation point. In our opinion, in the general case, the dynamic behavior of a system should not be restricted to a particular scheme like a sequence of discrete steps and continuous evolutions.

The goal of this paper is to introduce abstract formal models for blocks and block diagrams which overcome limitations of the classical control/signal-theoretic approach to them and do not enforce implementation details (like internal state-space) or particular kinds of dynamic behavior (like alternation of discrete steps and continuous evolutions) on blocks.

These models can be used to identify the most general properties of block diagram languages which are valid regardless of implementation details. In particular, in the paper we will give a general formulation and conditions for well-definedness of the operation of a system specified by a block diagram for each admissible input signal(s).

To achieve our goal, we will use a composition-nominative approach [20]. The main idea of this approach is that semantics of a system is constructed from semantics of components using special operations called compositions, and the syntactic representation of a system reflects this construction.

The paper is organized in the following way:

– In Section 2 we give definitions of the auxiliary notions which are used in the rest of the paper.
– In Section 3 we introduce abstract notions of a block, a connection, a block diagram, and a block composition. We show how they fit into a system design process and give conditions of well-definedness of the operation of a system specified by a block diagram for each admissible input signal(s).

2 Preliminaries

2.1 Notation

We will use the following notation: $\mathbb{N} = \{1, 2, 3, \ldots\}$, $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$, $\mathbb{R}_+$ is the set of nonnegative real numbers, $f : A \to B$ is a total function from $A$ to $B$, $f : A \tilde{\to} B$ is a partial function from $A$ to $B$, $2^A$ is the power set of a set $A$, and $f|_X$ is the restriction of a function $f$ to a set $X$. If $A$, $B$ are sets, then $B^A$ denotes the set of all total functions from $A$ to $B$.

For a function $f : A \tilde{\to} B$ the symbol $f(x)\downarrow$ ($f(x)\uparrow$) means that $f(x)$ is defined (respectively undefined) on the argument $x$. We denote the domain and range of a function as $\mathrm{dom}(f) = \{x \mid f(x)\downarrow\}$ and $\mathrm{range}(f) = \{y \mid \exists x\ f(x)\downarrow \wedge\ y = f(x)\}$ respectively. We will use the same notation for the domain and range of a binary relation: if $R \subseteq A \times B$, then $\mathrm{dom}(R) = \{x \mid \exists y\ (x, y) \in R\}$ and $\mathrm{range}(R) = \{y \mid \exists x\ (x, y) \in R\}$.

We will use the notation $f(x) \cong g(x)$ for the strong equality (where $f$ and $g$ are partial functions): $f(x)\downarrow$ iff $g(x)\downarrow$, and $f(x)\downarrow$ implies $f(x) = g(x)$. The symbol $\circ$ denotes a functional composition: $(f \circ g)(x) \cong g(f(x))$.

By $T$ we denote the (positive real) time scale $[0, +\infty)$. We assume that $T$ is equipped with a topology induced by the standard topology on $\mathbb{R}$. Additionally, we define the following class of sets:

$T_0 = \{\emptyset, T\} \cup \{[0, x) \mid x \in T \setminus \{0\}\} \cup \{[0, x] \mid x \in T\}$,

i.e. the set of (possibly empty, bounded or unbounded) intervals with left end 0.

2.2 Multi-valued functions

A multi-valued function [20] assigns one or more resulting values to each argument value. An application of a multi-valued function to an argument is interpreted as a nondeterministic choice of a result.

Definition 1 ([20]).
A (total) multi-valued function from a set $A$ to a set $B$ (denoted as $f : A \xrightarrow{tm} B$) is a function $f : A \to 2^B \setminus \{\emptyset\}$.

Thus the inclusion $y \in f(x)$ means that $y$ is a possible value of $f$ on $x$.

2.3 Named sets

We will use a simple notion of a named set to formalize an assignment of values to variable names in program and system semantics.

Definition 2 ([20]). A named set is a partial function $f : V \tilde{\to} W$ from a non-empty set of names $V$ to a set of values $W$.

In this definition both names and values are unstructured. A named set can be considered as a partial (“flat”) case of a more general notion of nominative data [20], which reflects hierarchical data organizations and naming schemes.

We will use a special notation for the set of named sets: ${}^V W$ denotes the set of all named sets $f : V \tilde{\to} W$ (this notation just emphasises that $V$ is interpreted as a set of names). We consider named sets equal if their graphs are equal.

An expression of the form $[n_1 \mapsto a_1, n_2 \mapsto a_2, \ldots]$ (where $n_1, n_2, \ldots$ are distinct names) denotes a named set $d$ such that the graph of $d$ is $\{(n_1, a_1), (n_2, a_2), \ldots\}$. A nowhere-defined named set is called an empty named set and is denoted as $[]$.

For any named sets $d_1$, $d_2$ we write $d_1 \subseteq d_2$ (named set inclusion) if the graph of the function $d_1$ is a subset of the graph of $d_2$. We extend the set-theoretical operations of union $\cup$, intersection $\cap$ and difference $\setminus$ to partial operations on named sets in the following way: the result of a union (intersection, difference) of named sets (the operation’s arguments) is a named set $d$ such that the graph of $d$ is the union (intersection, difference) of the graphs of the arguments (if such $d$ exists).

3 An Abstract Block Formalism

3.1 Abstract block

Let us introduce abstract notions of a signal as a time-varying quantity and a block as a signal transformer. We will use a real time scale for signals, but we will not require them to be continuous or real-valued.
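As a minimal sketch of the named-set notions of Section 2.3, assuming finite sets of names and values, a named set can be modelled as a Python dictionary whose items form its graph. The function names `ns_union`, `ns_intersection`, `ns_difference` and `ns_included` are hypothetical and are introduced only for illustration; `None` stands for an undefined result of the partial union operation.

```python
from typing import Any, Dict, Optional

# Assumption: a named set over finite names is a dict {name: value};
# its graph is the set of (name, value) pairs in dict.items().
NamedSet = Dict[str, Any]


def ns_included(d1: NamedSet, d2: NamedSet) -> bool:
    """Named set inclusion: the graph of d1 is a subset of the graph of d2."""
    return all(n in d2 and d2[n] == v for n, v in d1.items())


def ns_union(d1: NamedSet, d2: NamedSet) -> Optional[NamedSet]:
    """Union of graphs; undefined (None) if the union is not a function."""
    for name in d1.keys() & d2.keys():
        if d1[name] != d2[name]:
            return None  # one name would receive two different values
    return {**d1, **d2}


def ns_intersection(d1: NamedSet, d2: NamedSet) -> NamedSet:
    """Intersection of graphs; a subset of a graph is always a function."""
    return {n: v for n, v in d1.items() if n in d2 and d2[n] == v}


def ns_difference(d1: NamedSet, d2: NamedSet) -> NamedSet:
    """Difference of graphs: pairs of d1 that are not pairs of d2."""
    return {n: v for n, v in d1.items() if not (n in d2 and d2[n] == v)}
```

In this sketch `ns_union({"x": 1}, {"x": 2})` is undefined, because the union of the graphs contains both (x, 1) and (x, 2) and is therefore not the graph of a function, while intersection and difference are always defined.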
So the signals can be piecewise-constant as well and can be used to represent discrete evolutions.

Informally, a block (see Fig. 1) is a device which receives input signals and produces output signals. We call a collection of input (output) signals an input (resp. output) signal bunch. At each time moment (the value of) a given signal may be present or absent. In the general case, the presence of an input signal at a given time does not imply the presence of an output signal at the same or any other time moment.

A block can operate nondeterministically, i.e. for one input signal bunch it may choose an output signal bunch from a set of possible variants. However, for any input signal bunch there exists at least one corresponding output signal bunch (although the values of all signals in it may be absent at all times, which means that the block does not produce any output values).

Normally, a block processes the whole input signal bunch, and does or does not produce output values. However, in certain cases a block may not process the whole input signal bunch and may terminate at some time moment before its end. This situation is interpreted as an abnormal termination of a block (e.g. caused by an invalid input).

Fig. 1. An illustration of a block with input signals $x_1$, $x_2$ and output signals $y_1$, $y_2$. The plot displays example evolutions of input and output signals. The input and output signals are lumped into an input and output signal bunch respectively. Solid curves represent (present) signal values. Dashed horizontal segments indicate absence of a signal value. Dashed vertical lines indicate the right boundaries of the domains of signal bunches.

Let us give formal definitions. Let $W$ be a (fixed) non-empty set of values.

Definition 3.
(1) A signal is a partial function from $T$ to $W$ ($f : T \tilde{\to} W$).
(2) A $V$-signal bunch (where $V$ is a set of names) is a function $s : T \tilde{\to} {}^V W$ such that $\mathrm{dom}(s) \in T_0$.
The set of all V-signal bunches is denoted as Sb(V, W).
(3) A signal bunch is a V-signal bunch for some V.
(4) A signal bunch s is trivial if dom(s) = ∅, and is total if dom(s) = T. A trivial signal bunch is denoted as ⊥.
(5) For a given signal bunch s, a signal corresponding to a name x is a partial function $t \mapsto s(t)(x)$. This signal is denoted as s[x].
(6) A signal bunch $s_1$ is a prefix of a signal bunch $s_2$ (denoted as $s_1 \preceq s_2$), if $s_1 = s_2|_A$ for some $A \in T_0$.

Note that $\preceq$ on V-signal bunches is a partial order (for an arbitrary V). Later we will need generalized versions of the prefix relation for pairs and indexed families of pairs of signal bunches.

For any signal bunches $s_1, s_2, s_1', s_2'$ let us denote $(s_1, s_2) \preceq^{2} (s_1', s_2')$ iff there exists $A \in T_0$ such that $s_1 = s_1'|_A$ and $s_2 = s_2'|_A$.

For any indexed families of pairs of signal bunches $(s_j, s_j')_{j \in J}$ and $(s_j'', s_j''')_{j \in J}$ let us denote $(s_j, s_j')_{j \in J} \preceq^{J,2} (s_j'', s_j''')_{j \in J}$ iff there exists $A \in T_0$ such that $s_j = s_j''|_A$ and $s_j' = s_j'''|_A$ for all $j \in J$.

It is easy to check that $\preceq^{2}$ is a partial order on pairs of signal bunches and $\preceq^{J,2}$ is a partial order on J-indexed families of pairs of signal bunches.

A block has a syntactic aspect (e.g. a description in a specification language) and a semantic aspect: a partial multi-valued function on signal bunches.

Definition 4. (1) A block is an object B (syntactic aspect) together with an associated set of input names In(B), a set of output names Out(B), and a total multi-valued function $\mathrm{Op}(B) : \mathrm{Sb}(\mathrm{In}(B), W) \xrightarrow{tm} \mathrm{Sb}(\mathrm{Out}(B), W)$ (operation, semantic aspect) such that o ∈ Op(B)(i) implies dom(o) ⊆ dom(i).
(2) Two blocks $B_1, B_2$ are semantically identical if $\mathrm{In}(B_1) = \mathrm{In}(B_2)$, $\mathrm{Out}(B_1) = \mathrm{Out}(B_2)$, and $\mathrm{Op}(B_1) = \mathrm{Op}(B_2)$.
(3) An I/O pair of a block B is a pair of signal bunches (i, o) such that o ∈ Op(B)(i).
The set of all I/O pairs of B is denoted as IO(B) and is called the input-output (I/O) relation of B.

An inclusion o ∈ Op(B)(i) means that o is a possible output of a block B on the input i. For each input i there is some output o. The domain of o is a subset of the domain of i. If o becomes undefined at some time t, but i is still defined at t, we interpret this as an error during the operation of the block B (the block cannot resume its operation after t).

Definition 5. A block B is deterministic if Op(B)(i) is a singleton set for each In(B)-signal bunch i.

We interpret the operation of a block as a (possibly nondeterministic) choice of an output signal bunch corresponding to a given input signal bunch. However, we would also like to describe this choice as dynamic, i.e. that a block chooses the output signal values at each time t, and in doing so it cannot rely on the future values of the input signals (i.e. values of the input signals at times $t' > t$). If a block is deterministic, this requirement can be formalized in the same way as the notion of a causal (or nonanticipative) input-output system [26].

Definition 6. A deterministic block B is causal iff for all signal bunches $i_1, i_2$ and $A \in T_0$, $o_1 \in \mathrm{Op}(B)(i_1)$, $o_2 \in \mathrm{Op}(B)(i_2)$, the equality $i_1|_A = i_2|_A$ implies $o_1|_A = o_2|_A$.

This means that the value of the output signal bunch at time t can depend only on the values of the input signal at times ≤ t.

Some works in the domain of systems theory extend the notion of a causal (deterministic) system to nondeterministic systems. However, there is no unified approach to an extension of this kind. For example, in the work [21], a system, considered as a binary relation on (total) signals $S \subseteq A^{T} \times B^{T}$, where T is a time domain (Mesarovic time system, [22]), is "non-anticipatory" if it is a union of (graphs of) causal (non-anticipatory) selections from S, i.e. $S = \bigcup \{ f : \mathrm{dom}(S) \to \mathrm{range}(S) \mid f \subseteq S,\ f \text{ is causal} \}$.
In the work [23] the authors define another notion of a "non-anticipatory" or "causal" system in the nondeterministic case. In the theory developed in the work [24], the authors use a similar notion of a "precausal" system, which is also defined in [22], as a generalization of the notion of a causal system to the nondeterministic case.

In this work, we generalize the notion of a non-anticipatory system in the sense of [23] to blocks and call such blocks nonanticipative, and generalize the notion of a non-anticipatory system in the sense of [21] to blocks, but call such blocks strongly nonanticipative. We will show that a strongly nonanticipative block is nonanticipative.

We will consider the words "causal" and "nonanticipative" as synonyms when they are used informally, but we will distinguish them in the context of formal definitions to avoid a conflict with Definition 6. Note, however, that the notion of a strongly nonanticipative block defined below is very different from the notion of a "strictly causal" system, defined in some works [25] as a system which uses only past (but not current or future) values of the input signal(s) to produce a current value of the output signal(s).

Definition 7. A block B is nonanticipative if for each $A \in T_0$ and $i_1, i_2 \in \mathrm{Sb}(\mathrm{In}(B), W)$, if $i_1|_A = i_2|_A$, then $\{ o|_A \mid o \in \mathrm{Op}(B)(i_1) \} = \{ o|_A \mid o \in \mathrm{Op}(B)(i_2) \}$.

Definition 8. A block B is a sub-block of a block $B'$ (denoted as $B \trianglelefteq B'$) if $\mathrm{In}(B) = \mathrm{In}(B')$, $\mathrm{Out}(B) = \mathrm{Out}(B')$, and $\mathrm{IO}(B) \subseteq \mathrm{IO}(B')$.

Informally, a sub-block narrows the nondeterminism of a block.

Definition 9. A block B is strongly nonanticipative if for each (i, o) ∈ IO(B) there exists a deterministic causal sub-block $B' \trianglelefteq B$ such that $(i, o) \in \mathrm{IO}(B')$.

Informally, the operation of a strongly nonanticipative block B can be interpreted as a two-step process:
1. before receiving the input signals, the block B (nondeterministically) chooses a deterministic causal sub-block $B' \trianglelefteq B$ (response strategy);
2.
the block $B'$ receives input signals of B and produces the corresponding output signals (response) which become the output signals of B.

Intuitively, it is clear that in this scheme at any time the block B does not need knowledge of the future of its input signals in order to produce the corresponding output signals.

Lemma 1. If B is a deterministic block, then B is causal iff B is nonanticipative.

Proof. Follows immediately from Definition 6.

The following theorem gives a characterization of a nonanticipative block which does not rely on comparison of sets of signal bunches.

Theorem 1. A block B is nonanticipative iff the following holds:
(1) if $(i, o) \in \mathrm{IO}(B)$ and $(i', o') \preceq^{2} (i, o)$, then $(i', o') \in \mathrm{IO}(B)$;
(2) if $o \in \mathrm{Op}(B)(i)$ and $i \preceq i'$, then $(i, o) \preceq^{2} (i', o')$ for some $o' \in \mathrm{Op}(B)(i')$.

Proof. (1) Assume that (1) and (2) are satisfied. Assume that $A \in T_0$, $i_1, i_2 \in \mathrm{Sb}(\mathrm{In}(B), W)$, and $i_1|_A = i_2|_A$. Let $o \in \mathrm{Op}(B)(i_1)$. Then from assumption (1) we have $o|_A \in \mathrm{Op}(B)(i_1|_A)$, because $(i_1|_A, o|_A) \preceq^{2} (i_1, o)$. Moreover, $i_1|_A \preceq i_2$, because $i_1|_A = i_2|_A$. Thus $(i_1|_A, o|_A) \preceq^{2} (i_2, o')$ for some $o' \in \mathrm{Op}(B)(i_2)$ by assumption (2). It is not difficult to check that $o|_A \in \{ o''|_A \mid o'' \in \mathrm{Op}(B)(i_2) \}$. Because $i_1, i_2, A$ are arbitrary, B is nonanticipative by Definition 7.

(2) Assume that B is nonanticipative. Let us prove (1). Assume that $(i, o) \in \mathrm{IO}(B)$ and $(i', o') \preceq^{2} (i, o)$. Then $i' = i|_A$ and $o' = o|_A$ for some $A \in T_0$. Then $i'|_A = (i|_A)|_A = i|_A$, whence $o' = o|_A \in \{ o''|_A \mid o'' \in \mathrm{Op}(B)(i) \} = \{ o''|_A \mid o'' \in \mathrm{Op}(B)(i') \}$ by Definition 7. Then $o' = o''|_A$ for some $o'' \in \mathrm{Op}(B)(i')$. Moreover, $\mathrm{dom}(o'') \subseteq \mathrm{dom}(i') \subseteq A$. Thus $o' = o''$ and $(i', o') \in \mathrm{IO}(B)$.

Let us prove (2). Assume that $o \in \mathrm{Op}(B)(i)$ and $i \preceq i'$. Then $i = i'|_A$ for some $A \in T_0$. Then $i|_A = (i'|_A)|_A = i'|_A$, whence $o|_A \in \{ o''|_A \mid o'' \in \mathrm{Op}(B)(i) \} = \{ o''|_A \mid o'' \in \mathrm{Op}(B)(i') \}$ by Definition 7. Then $o|_A = o'|_A$ for some $o' \in \mathrm{Op}(B)(i')$.
Moreover, $\mathrm{dom}(o) \subseteq \mathrm{dom}(i) \subseteq A$, whence $o = o|_A = o'|_A$. Thus $(i, o) \preceq^{2} (i', o')$.

Theorem 2. (About strongly nonanticipative block)
(1) If a block B is strongly nonanticipative, then it is nonanticipative.
(2) There exists a nonanticipative block which is not strongly nonanticipative.

Proof (Sketch). (1) Assume that B is strongly nonanticipative. Let $\mathcal{R}$ be the set of all relations $R \subseteq \mathrm{IO}(B)$ such that R is an I/O relation of a nonanticipative block. For each $R \in \mathcal{R}$ let us define a block $B_R$ such that $\mathrm{IO}(B_R) = R$, $\mathrm{In}(B_R) = \mathrm{In}(B)$, $\mathrm{Out}(B_R) = \mathrm{Out}(B)$. Let $\mathcal{B} = \{ B_R \mid R \in \mathcal{R} \}$. Then each element of $\mathcal{B}$ is nonanticipative. From Definition 9 and Lemma 1 we have $\mathrm{IO}(B) \subseteq \bigcup \mathcal{R} = \bigcup_{B' \in \mathcal{B}} \mathrm{IO}(B')$. On the other hand, $\mathrm{IO}(B') \subseteq \mathrm{IO}(B)$ for any $B' \in \mathcal{B}$, so $\mathrm{IO}(B) = \bigcup_{B' \in \mathcal{B}} \mathrm{IO}(B')$. It is easy to see from Theorem 1 that a (nonempty) union of I/O relations of nonanticipative blocks is an I/O relation of a nonanticipative block. Thus B is nonanticipative.

(2) Assume that $W = \mathbb{R}$. Let $f : \mathbb{R} \to \mathbb{R}$ be a function that is discontinuous at some point (e.g. a signum function). Let us define a block B such that In(B) = {x} and Out(B) = {y} for some names x, y, and for each $i \in \mathrm{Sb}(\mathrm{In}(B), \mathbb{R})$, Op(B)(i) is defined as follows:
• if dom(i[x]) = T and $\lim_{t \to +\infty} i[x](t)$ exists and is finite, then Op(B)(i) is the set of all {y}-signal bunches o such that dom(o) = dom(o[y]) = T and $\lim_{t \to +\infty} o[y](t) = f\left(\lim_{t \to +\infty} i[x](t)\right)$;
• otherwise, Op(B)(i) is the set of all {y}-signal bunches o such that $\mathrm{dom}(o) = \mathrm{dom}(o[y]) = \bigcup \{ A \in T_0 \mid A \subseteq \mathrm{dom}(i[x]) \}$.

Obviously, in this definition Op(B)(i) ≠ ∅ (because $T_0$ is closed under unions) and dom(o) ⊆ dom(i) for each o ∈ Op(B)(i). So B is indeed a block. Using Theorem 1 it is not difficult to show that B is nonanticipative.

Suppose that B has a deterministic causal sub-block $B'$. Let $a \in \mathbb{R}$ and $a_k \in \mathbb{R}$, k = 1, 2, ... be a sequence such that $\lim_{k \to \infty} a_k = a$. Let us show that $\lim_{k \to \infty} f(a_k) = f(a)$.
Let us define sequences $i_k \in \mathrm{Sb}(\{x\}, \mathbb{R})$, $o_k \in \mathrm{Sb}(\{y\}, \mathbb{R})$, and $t_k \in T$, k = 1, 2, ... by induction as follows. Let $i_1(t) = [x \mapsto a_1]$ for all t ∈ T, let $o_1$ be the unique member of $\mathrm{Op}(B')(i_1)$, and let $t_1 = 0$. If $i_1, i_2, \ldots, i_k$ are already defined, let $i_{k+1}(t) = i_k(t)$ if $t \in [0, t_k]$, and $i_{k+1}(t) = [x \mapsto a_{k+1}]$ if $t \in T \setminus [0, t_k]$. Let $o_{k+1}$ be the unique member of $\mathrm{Op}(B')(i_{k+1})$. Because $B' \trianglelefteq B$, we have $\mathrm{dom}(o_{k+1}) = \mathrm{dom}(o_{k+1}[y]) = T$ and $\lim_{t \to +\infty} o_{k+1}[y](t) = f\left(\lim_{t \to +\infty} i_{k+1}[x](t)\right) = f(a_{k+1})$. Then let

$$t_{k+1} = 1 + \max\left\{ t_k,\ \inf\left\{ \tau \in T \;\middle|\; \sup\{ |o_{k+1}[y](t) - f(a_{k+1})| \mid t \ge \tau \} \le \frac{1}{k+1} \right\} \right\}.$$

We have defined sequences $i_k, o_k, t_k$. The sequence $t_k$, k = 1, 2, ... is strictly increasing and unbounded from above, and $t_1 = 0$. Let i be an {x}-signal bunch such that dom(i) = T, $i(t_1) = i_1(t_1)$, and $i(t) = i_{k+1}(t)$ if $t \in (t_k, t_{k+1}]$, $k \in \mathbb{N}$, and let o be the (unique) member of $\mathrm{Op}(B')(i)$.

We have $i_{k+1}[x](t) = a_{k+1}$ for all k = 1, 2, ... and $t > t_k$. Then $i[x](t) \in \{ a_{k+1}, a_{k+2}, \ldots \}$ for all $k \in \mathbb{N}$ and $t > t_k$. For each $\varepsilon > 0$ there exists $k \in \mathbb{N}$ such that $|a_{k'} - a| < \varepsilon$ for all $k' \ge k$, whence $|i[x](t) - a| < \varepsilon$ for all $t > t_k$. Thus $\lim_{t \to +\infty} i[x](t) = a$. Then dom(o) = dom(o[y]) = T and $\lim_{t \to +\infty} o[y](t) = f(a)$, because $B' \trianglelefteq B$.

On the other hand, $i_{k+1}|_{[0,t_k]} = i_k|_{[0,t_k]}$ for all $k \in \mathbb{N}$. Because $t_k$ is an increasing sequence, we have $i_{k'}|_{[0,t_k]} = i_k|_{[0,t_k]}$ for all k and $k' \ge k$. Besides, $i|_{(t_k,t_{k+1}]} = i_{k+1}|_{(t_k,t_{k+1}]}$ for all $k \in \mathbb{N}$, whence $i|_{(t_k,t_{k+1}]} = i_{k'}|_{(t_k,t_{k+1}]}$ for all $k' \ge k+1$. Also, $i_k(t_1) = i_1(t_1)$ for all $k \in \mathbb{N}$. Then $i|_{[0,t_k]} = i|_{\{t_1\} \cup (t_1,t_2] \cup \ldots \cup (t_{k-1},t_k]} = i_k|_{[0,t_k]}$ for all k = 2, 3, ..., whence $o|_{[0,t_k]} = o_k|_{[0,t_k]}$, because $B'$ is causal. Then $o(t_k) = o_k(t_k)$ for all k = 2, 3, ..., and from the definition of $t_k$ we have $|o[y](t_k) - f(a_k)| = |o_k[y](t_k) - f(a_k)| \le \frac{1}{k}$ for all k = 2, 3, .... This implies that $\lim_{k \to \infty} f(a_k) = f(a)$, because $\lim_{t \to +\infty} o[y](t) = f(a)$. We conclude that f is sequentially continuous and thus is continuous.
This contradicts our choice of f as a discontinuous function. Thus B has no deterministic causal sub-blocks. Consequently, B is not strongly nonanticipative, though it is nonanticipative.

The proof of this theorem gives a reason why Definition 9 better captures an intuitive idea of causality than Definition 7. Consider, for example, the block B constructed in the proof of item (2) of Theorem 2, when f is the signum function (i.e., f(0) = 0, f(x) = 1 if x > 0, and f(x) = −1 if x < 0). Then B outputs a signal which converges to 1 (as t → +∞) whenever the input signal converges to a positive number (as t → +∞). Moreover, it outputs a signal which converges to 0 whenever the input signal converges to 0. This implies that when the block receives a decreasing positive input signal which tends to 0, it decides to output values which are close to 0 starting from some time t. Intuitively, after reading the input signal until time t, the block decides that 0 is a more likely limit of the input signal than a positive value, but such a decision cannot be based on the past values of the input signal, so it requires some knowledge of the future of the input signal. These informal observations are captured by the fact that B has no deterministic causal sub-blocks.

In the rest of the paper we will focus on strongly nonanticipative blocks, as more adequate models of (real-time) information processing systems.

Consider an example of a strongly nonanticipative block. Let u, y be names. Assume that $W = \mathbb{R}$.

Example 1. Let B be a block such that In(B) = {u}, Out(B) = {y}, and for each i, $\mathrm{Op}(B)(i) = \{ o_1(i), o_2(i) \}$, where $o_1(i), o_2(i) \in \mathrm{Sb}(\mathrm{Out}(B), W)$ are signal bunches such that
– $\mathrm{dom}(o_1(i)) = \mathrm{dom}(o_2(i)) = \mathrm{dom}(i)$;
– $o_1(i)(t) = [y \mapsto i[u](t)]$ for all t ∈ dom(i);
– $o_2(i)(t) = [y \mapsto 2\,i[u](t)]$ for all t ∈ dom(i).
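As an illustration only, Example 1 can be sketched on a finite discrete time scale. The discretization (times 0, 1, 2, with initial segments playing the role of the domains in $T_0$), the helper names `restrict`, `gain`, `op`, `all_inputs`, `is_causal`, and the contrasting `last_value` block are assumptions made for this sketch; the formalism itself works over a real time scale, so a brute-force check of this kind is not a proof.

```python
from itertools import product

HORIZON = 3  # assumed finite time scale {0, 1, 2}; the paper uses real time


def restrict(bunch, n):
    """Restrict a signal bunch (dict: time -> named set) to times {0, ..., n-1}."""
    return {t: v for t, v in bunch.items() if t < n}


def gain(k):
    """Deterministic block o(t) = [y -> k * i[u](t)] with dom(o) = dom(i)."""
    return lambda i: {t: {"y": k * v["u"]} for t, v in i.items()}


B1, B2 = gain(1), gain(2)  # the two candidate outputs o1(i), o2(i) of Example 1


def op(i):
    """Op(B)(i) = {o1(i), o2(i)}, listed as [o1(i), o2(i)]."""
    return [B1(i), B2(i)]


def last_value(i):
    """Hypothetical contrast block: every output value copies the LAST input value."""
    if not i:
        return {}
    final = i[max(i)]["u"]
    return {t: {"y": final} for t in i}


def all_inputs(values, horizon=HORIZON):
    """All {u}-signal bunches whose domain is an initial segment of the time scale."""
    return [
        {t: {"u": v} for t, v in enumerate(vals)}
        for n in range(horizon + 1)
        for vals in product(values, repeat=n)
    ]


def is_causal(block, inputs, horizon=HORIZON):
    """Brute-force check of Definition 6 over the given inputs and initial segments."""
    return all(
        restrict(block(i1), n) == restrict(block(i2), n)
        for i1, i2 in product(inputs, repeat=2)
        for n in range(horizon + 1)
        if restrict(i1, n) == restrict(i2, n)
    )


INPUTS = all_inputs([0, 1, 3])
print(op({0: {"u": 3}, 1: {"u": 1}}))
print(is_causal(B1, INPUTS), is_causal(B2, INPUTS), is_causal(last_value, INPUTS))
```

On this finite sample both gain blocks pass the check, while `last_value` fails it: two inputs that agree at time 0 but end with different values produce different outputs at time 0.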
Informally, B is a gain block with a slope which is either 1 or 2 during the whole duration of the block's operation.

Obviously, B satisfies Definition 4(1). Let us check that it is strongly nonanticipative. For j = 1, 2 let $B_j \trianglelefteq B$ be a sub-block such that $\mathrm{Op}(B_j)(i) = \{ o_j(i) \}$ for all $i \in \mathrm{Sb}(\mathrm{In}(B), W)$ (i.e. $B_1$ always selects $o_1(i)$ from Op(B)(i) and $B_2$ always selects $o_2(i)$). The blocks $B_1, B_2$ are deterministic and it is easy to see that they are causal. Obviously, each I/O pair (i, o) ∈ IO(B) belongs either to $\mathrm{IO}(B_1)$ or to $\mathrm{IO}(B_2)$, so B is strongly nonanticipative.

Now let us consider an example of a block which is not nonanticipative.

Example 2. Let $B'$ be a block such that $\mathrm{In}(B') = \{u\}$, $\mathrm{Out}(B') = \{y\}$, and
– $\mathrm{Op}(B')(i) = \{ o_1 \}$, where $\mathrm{dom}(o_1) = \mathrm{dom}(i)$ and $o_1(t) = [y \mapsto 1]$ for all t ∈ dom(i), if dom(i[u]) = T;
– $\mathrm{Op}(B')(i) = \{ o_2 \}$, where $\mathrm{dom}(o_2) = \mathrm{dom}(i)$ and $o_2(t) = [y \mapsto 0]$ for all t ∈ dom(i), otherwise.

Informally, the block $B'$ decides whether its input signal u is total. It is easy to see that $B'$ indeed satisfies Definition 4(1), but condition (1) of Theorem 1 is not satisfied, because $(i, o) \in \mathrm{IO}(B')$, where $i(t) = [u \mapsto 0]$ for all t ∈ T, $o(t) = [y \mapsto 1]$ for all t ∈ T, and $(i|_{[0,1]}, o|_{[0,1]}) \preceq^{2} (i, o)$, but $(i|_{[0,1]}, o|_{[0,1]}) \notin \mathrm{IO}(B')$. So $B'$ is not nonanticipative. Informally, the reason is that at each time t the current value of y depends on the entire input signal.

3.2 Composition of blocks

By connecting inputs and outputs of several (strongly nonanticipative) blocks one can form a larger block, a composition of blocks (see Fig. 2). We assume that an output can be connected to several inputs, but each input can be connected to no more than one output. Unconnected inputs and outputs of constituent blocks become inputs and outputs of the composition.
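The wiring rules just stated can be sketched in a few lines. This is only an illustration under assumptions: the function name `composition_interface`, the representation of blocks as a dictionary of (input names, output names) pairs, and the single-block feedback example are hypothetical and chosen for this sketch. Ports are written as pairs (block index, name).

```python
def composition_interface(blocks, connections):
    """Compute the inputs and outputs of a composition from its wiring.

    blocks: dict mapping a block index j to (set of input names, set of output names).
    connections: set of ((j, x), (j2, x2)) pairs meaning that output x of block j
    is connected to input x2 of block j2.
    """
    v_in = {(j, x) for j, (ins, _outs) in blocks.items() for x in ins}
    v_out = {(j, x) for j, (_ins, outs) in blocks.items() for x in outs}

    # A connection must lead from an existing output to an existing input.
    for source, target in connections:
        if source not in v_out or target not in v_in:
            raise ValueError(f"unknown port in connection {source} -> {target}")

    # Each input may be connected to no more than one output.
    targets = [target for _source, target in connections]
    if len(targets) != len(set(targets)):
        raise ValueError("an input is connected to more than one output")

    # Unconnected inputs and outputs become those of the composition.
    sources = {source for source, _target in connections}
    return v_in - set(targets), v_out - sources


# Hypothetical example: one block with a feedback connection from y2 to x2.
blocks = {"B": ({"x1", "x2"}, {"y1", "y2"})}
connections = {(("B", "y2"), ("B", "x2"))}
inputs, outputs = composition_interface(blocks, connections)
print(sorted(inputs), sorted(outputs))
```

In this example only x1 remains an input and only y1 remains an output of the composition; an output feeding several inputs is accepted, while two outputs feeding the same input are rejected.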
Connections are interpreted as signal equality constraints and they always relate an output of some block ("source") with an input of the same or another block ("target"). We represent connections in graphical form (as in Fig. 2) as arrows connecting blocks.

Fig. 2. An informal illustration of a block composition. Three blocks are composed to form a larger block (a dashed rectangle). Solid arrows denote connections between blocks. Dashed arrows denote unconnected inputs/outputs of the blocks 1, 2, 3 (which become the inputs/outputs of the dashed block).

Definition 10. (1) A block diagram is a pair $((B_j)_{j \in J}, \rightarrowtail)$ of an indexed family of blocks $(B_j)_{j \in J}$ and an injective binary relation $\rightarrowtail \subseteq V_{out} \times V_{in}$, which is called an interconnection relation, where

$$V_{in} = \bigcup_{j \in J} \{j\} \times \mathrm{In}(B_j), \qquad V_{out} = \bigcup_{j \in J} \{j\} \times \mathrm{Out}(B_j).$$

(2) A block diagram $((B_j)_{j \in J}, \rightarrowtail)$ is called regular if each $B_j$, j ∈ J, is strongly nonanticipative.

Fig. 3. A strongly nonanticipative block B is composed to obtain a block $B'$ (a dashed rectangle). A solid loop arrow denotes a connection between an output and an input of B. Dashed arrows denote unconnected inputs and outputs of B which become the inputs and outputs of $B'$.

Note that a diagram may consist of an infinite set of blocks. A relation $(j, x) \rightarrowtail (j', x')$ means that the output x of the j-th block is connected to the input $x'$ of the $j'$-th block.

A block diagram is only a syntactic aspect of a block composition. We define semantics of a block composition only for strongly nonanticipative blocks. To describe it informally consider Fig. 3. The connection between y2 and x2 means a signal equality constraint. The block B chooses (nondeterministically) some deterministic sub-block $B_0 \trianglelefteq B$.
When a signal starts flowing into the input x1, the block $B'$ tries to choose the initial values for the signals of y1, y2, x2 so that they satisfy the operation of the block $B_0$ ($\mathrm{Op}(B_0)$) and the signals of x2 and y2 have the same values. If such initial values do not exist, the block $B'$ terminates (the output signal bunch is nowhere defined). Otherwise, $B'$ continues to operate in a similar way until either the signals of y1, y2, x2 cannot be continued, or the input signal (x1) ends.

Definition 11. Let $((B_j)_{j \in J}, \rightarrowtail)$ be a regular block diagram. A block B is a composition of $(B_j)_{j \in J}$ under the interconnection relation $\rightarrowtail$, if
– $\mathrm{In}(B) = \left( \bigcup_{j \in J} \{j\} \times \mathrm{In}(B_j) \right) \setminus \mathrm{range}(\rightarrowtail)$,
– $\mathrm{Out}(B) = \left( \bigcup_{j \in J} \{j\} \times \mathrm{Out}(B_j) \right) \setminus \mathrm{dom}(\rightarrowtail)$,
– Op(B)(i) is the set of all Out(B)-signal bunches o such that there exist deterministic causal sub-blocks $B_j' \trianglelefteq B_j$, j ∈ J, and an indexed family $(i_j, o_j)_{j \in J} \in X_m(i)$ such that
(1) $\mathrm{dom}(o) = \mathrm{dom}(o_j)$ for all j ∈ J,
(2) $o[(j, x)] = o_j[x]$ for all (j, x) ∈ Out(B),
where $X_m(i)$ is the set of $\preceq^{J,2}$-maximal elements of X(i), and X(i) is the set of all indexed families of pairs of signal bunches $u = (i_j, o_j)_{j \in J}$ such that
(3) $\mathrm{dom}(i_j) = \mathrm{dom}(o_j) = \mathrm{dom}(i_{j'}) = \mathrm{dom}(o_{j'}) \subseteq \mathrm{dom}(i)$ for all $j, j' \in J$,
(4) $i_j[x] = i|_{\mathrm{dom}(i_j)}[(j, x)]$ for each (j, x) ∈ In(B),
(5) $(i_j, o_j) \in \mathrm{IO}(B_j')$ for each j ∈ J,
(6) $(j, x) \rightarrowtail (j', x')$ implies $o_j[x] = i_{j'}[x']$.

In this definition, $i_j$ and $o_j$ denote the input and output signal bunches of the j-th block. The set $X_m(i)$ contains maximally extended (in the sense of the relation $\preceq^{J,2}$) indexed families of signal bunches defined on a subset of the domain of i (the input signal bunch of B) which satisfy the constraints imposed by the interconnection relation. Any such family gives a possible output of B for the given i by condition (2), i.e. the output signals of B are obtained from the output signals of the sub-blocks $B_j'$.

It is clear that any two compositions of $(B_j)_{j \in J}$ under $\rightarrowtail$ are semantically identical.

Lemma 2.
(Continuity of the operation of a deterministic causal block). Let B be a deterministic causal block. Let $c \subseteq \mathrm{Sb}(\mathrm{In}(B), W)$ be a non-empty $\preceq$-chain, $i^*$ be its supremum (in the sense of $\preceq$), and $o^* \in \mathrm{Op}(B)(i^*)$. Then $o^*$ is a supremum of $\bigcup_{i \in c} \mathrm{Op}(B)(i)$ (in the sense of $\preceq$).

Proof. Follows from Definition 6.

Theorem 3. Let $((B_j)_{j \in J}, \rightarrowtail)$ be a regular block diagram. Then
(1) A composition of $(B_j)_{j \in J}$ under $\rightarrowtail$ exists.
(2) If B is a composition of $(B_j)_{j \in J}$ under $\rightarrowtail$, then B is strongly nonanticipative.

The proof follows from Lemma 2 and Definition 11.

3.3 Specification and implementation

Above we have considered a block as an abstract model of a real component. However, it can also be considered as a specification of requirements for a component. Let $B^{spec}$, $B^{impl}$ be two strongly nonanticipative blocks. Let us call them a specification block and an implementation block respectively.

Definition 12. $B^{impl}$ is a refinement of $B^{spec}$ if $B^{impl}$ is a sub-block of $B^{spec}$.

I.e. an implementation should have the same input and output names as a specification, and for each input, an output of an implementation should be one of the possible outputs of a specification.

We generalize this to diagrams. Let $D = ((B_j)_{j \in J}, \rightarrowtail)$ and $D' = ((B_j')_{j \in J'}, \rightarrowtail')$ be regular block diagrams.

Definition 13. D is a refinement of $D'$ if $J = J'$, $B_j$ is a refinement of $B_j'$ for each j ∈ J, and the relations $\rightarrowtail$ and $\rightarrowtail'$ coincide.

Theorem 4. (Compositional refinement) Let B be a composition of $(B_j)_{j \in J}$ under $\rightarrowtail$ and $B'$ be a composition of $(B_j')_{j \in J'}$ under $\rightarrowtail'$. If D is a refinement of $D'$, then B is a refinement of $B'$.

The proof follows from Definitions 9 and 11, and transitivity of the sub-block relation.

This theorem can be considered as a foundation of a modular approach [29, 30] to system design:
1. Create specifications of the system components ($B_j'$, $j \in J'$) and connect them ($\rightarrowtail'$), as if they were real components.
2.
Analyze a composition of specifications ($B'$) to ensure that any of its implementations ($B'' \trianglelefteq B'$) satisfies the requirements for the final system.
3. Create an implementation ($B_j$) for each specification ($B_j'$).
4. Connect implementations (according to $\rightarrowtail'$). Then the composition of implementations (B) is a final system which satisfies the design requirements.

We consider steps 1 and 3 domain- and application-specific. The conclusion of step 4 is addressed in Theorem 4. Step 2 requires some verification method which depends on the nature of the requirements.

One of the most basic and common requirements is that the operation of $B'$ is defined on all input signal bunches which are possible in the context of a specific application of this composition. This trivially holds because of Theorem 3 and our definition of a block. However, $B'$ may terminate abnormally on some or all input signal bunches of interest (as we have noted, we interpret the situation when $o \in \mathrm{Op}(\tilde{B})(i)$ and dom(o) ⊂ dom(i) for some block $\tilde{B}$ as abnormal termination of $\tilde{B}$ on i). So the requirement can be reformulated as follows: $B'$ never terminates abnormally on any input signal bunch from a given set IN (this implies that the same property holds for B). We will call this property well-definedness of the operation of $B'$ on IN and study it in the next subsection.

3.4 Well-definedness of the operation of a composition of blocks

Let B be a block and IN be some set of In(B)-signal bunches.

Definition 14. The operation of B is well-defined on IN if dom(i) = dom(o) for each i ∈ IN and o ∈ Op(B)(i).

Let $D = ((B_j)_{j \in J}, \rightarrowtail)$ be a regular block diagram and B be a composition of $(B_j)_{j \in J}$ under $\rightarrowtail$. Let $\mathcal{F}$ be the set of all families of blocks of the form $(B_j')_{j \in J}$, where for each j ∈ J, $B_j'$ is a deterministic causal sub-block of $B_j$.
For each In(B)-signal bunch i and a family of blocks $F = (B_j')_{j \in J}$ let $X^{F}(i)$ be the set of all indexed families of pairs of signal bunches $u = (i_j, o_j)_{j \in J}$ which satisfy conditions (3)-(6) of Definition 11 (for B and $(B_j')_{j \in J}$). Let $X^{F}_m(i)$ denote the set of all $\preceq^{J,2}$-maximal elements of $X^{F}(i)$. For any indexed family of signal bunches $u = (i_j, o_j)_{j \in J}$ let O(u) denote the set of all Out(B)-signal bunches o which satisfy conditions (1)-(2) of Definition 11. For any $u = (i_j, o_j)_{j \in J} \in X^{F}(i)$, the domains of $i_j, o_j$ for all j ∈ J coincide. Denote by cdom(u) this common domain (we assume cdom(u) = T if J = ∅).

From Definition 11 we have $\mathrm{Op}(B)(i) = \bigcup_{F \in \mathcal{F}} \bigcup_{u \in X^{F}_m(i)} O(u)$ for each i. Then from Definition 14 we get the following simple criterion:

Theorem 5. The operation of B is well-defined on IN iff for each i ∈ IN, $F \in \mathcal{F}$, and $u \in X^{F}(i)$, if cdom(u) ⊂ dom(i), then $u \notin X^{F}_m(i)$.

This criterion means that B is well-defined if each $u \in X^{F}(i)$, the common domain of which does not cover dom(i), is extendable to a larger $u' \in X^{F}(i)$ (in the sense of $\preceq^{J,2}$). We will call it a local extensibility criterion, because, basically, to prove well-definedness, we only need to show that the members of the family u can be continued onto a time segment $[0, \sup \mathrm{cdom}(u) + \varepsilon]$ for some small $\varepsilon > 0$ (under the constraints imposed by the interconnection relation $\rightarrowtail$). Locality is especially useful when a block diagram contains "delay" blocks (possibly working as variable delays), because constraints imposed by connections between blocks reduce over small time intervals.

A drawback of this criterion is that it requires checking local extensibility of signal bunches satisfying the I/O relations ($\mathrm{IO}(B_j')$) of arbitrarily chosen deterministic causal sub-blocks $B_j' \trianglelefteq B_j$ (condition (5) of Definition 11), which are not explicitly expressed in terms of the I/O relations of $B_j$, j ∈ J.
For this reason, we seek a condition of well-definedness in terms of the I/O relations of the constituents of the composition ($\mathrm{IO}(B_j)$, j ∈ J). Let X(i) denote the set $X^{F}(i)$, where $F = (B_j)_{j \in J}$. Note that F may not be a member of $\mathcal{F}$.

Theorem 6. The operation of B is well-defined on IN iff for each i ∈ IN and u ∈ X(i) there exists $u' \in X(i)$ such that $u \preceq^{J,2} u'$ and $\mathrm{cdom}(u') = \mathrm{dom}(i)$.

The proof follows from Theorem 5 and Definition 9.

References

1. Simulink - Simulation and Model-Based Design, http://www.mathworks.com/products/simulink
2. Campbell, S.L., Chancelier, J.-P., Nikoukhah, R.: Modeling and Simulation in Scilab/Scicos with ScicosLab 4.4. Springer (2010)
3. Multi-Engineering Modeling and Simulation - Dymola, http://www.3ds.com/products/catia/portfolio/dymola
4. SCADE Suite, http://www.esterel-technologies.com/products/scade-suite
5. Caspi, P., Pilaud, D., Halbwachs, N., Plaice, J.: LUSTRE: A declarative language for programming synchronous systems. In: 14th Annual ACM Symp. on Principles of Programming Languages, Munich, Germany, pp. 178-188 (1987)
6. Henzinger, T., Horowitz, B., Kirsch, C.: Giotto: A Time-Triggered Language for Embedded Programming. First International Workshop on Embedded Software, EMSOFT'01, pp. 166-184 (2001)
7. Lublinerman, R., Tripakis, S.: Modular Code Generation from Triggered and Timed Block Diagrams. In: IEEE Real-Time and Embedded Technology and Applications Symposium, pp. 147-158 (2008)
8. Sontag, E.D.: Mathematical Control Theory: Deterministic Finite Dimensional Systems. Second Edition, Springer, New York (1998)
9. Proakis, J., Manolakis, D.: Digital Signal Processing: Principles, Algorithms and Applications, 4th ed. Pearson (2006)
10. Tiwari, A.: Formal semantics and analysis methods for Simulink Stateflow models. Unpublished report, SRI International (2002)
11. Bouissou, O., Chapoutot, A.: An operational semantics for Simulink's simulation engine. LCTES 2012, pp.
129-138 (2012)
12. Agrawal, A., Simon, G., Karsai, G.: Semantic translation of Simulink/Stateflow models to hybrid automata using graph transformations. Electronic Notes in Theoretical Computer Science 109, 43-56 (2004)
13. Marian, N., Ma, Y.: Translation of Simulink Models to Component-based Software Models. In: 8th Int. Workshop on Research and Education in Mechatronics, 14-15 June 2007, Tallinn University of Technology, Estonia (2007)
14. Pinto, R., Sangiovanni-Vincentelli, A., Carloni, L.P., Passerone, R.: Interchange formats for hybrid systems: Review and proposal. In: HSCC 05: Hybrid Systems Computation and Control. Springer-Verlag, pp. 526-541 (2005)
15. Beek, D.A., Reniers, M.A., Schiffelers, R.R., Rooda, J.E.: Foundations of a Compositional Interchange Format for Hybrid Systems. In: HSCC'07, pp. 587-600 (2007)
16. Henzinger, T.: The theory of hybrid automata. In: IEEE Symposium on Logic in Computer Science, pp. 278-292 (1996)
17. Goebel, R., Sanfelice, R., Teel, R.: Hybrid dynamical systems. In: IEEE Control Systems Magazine 29, 29-93 (2009)
18. Schrammel, P., Jeannet, B.: From hybrid data-flow languages to hybrid automata: a complete translation. In: HSCC 2012, pp. 167-176 (2012)
19. Camlibel, M.K., Heemels, A.J., van der Schaft, A.J., Schumacher, J.M.: Solution concepts for hybrid dynamical systems. In: Proc. IFAC 15th Triennial World Congress, Barcelona, Spain (2002)
20. Nikitchenko, N.S.: A composition nominative approach to program semantics. Technical report IT-TR 1998-020, Technical University of Denmark, 103 p. (1998)
21. Windeknecht, T.G.: Mathematical systems theory: Causality. Mathematical systems theory 1, pp. 279-288 (1967)
22. Mesarovic, M.D., Takahara, Y.: Abstract systems theory. Springer, Berlin Heidelberg New York, 439 p. (1989)
23. Foo, N., Peppas, P.: Realization for Causal Nondeterministic Input-Output Systems. Studia Logica 67, pp. 419-437 (2001)
24. Lin, Y.: General systems theory: A mathematical approach. Springer, 382 p.
(1999)
25. Matsikoudis, E., Lee, E.: On Fixed Points of Strictly Causal Functions. Technical report UCB/EECS-2013-27, EECS Department, University of California, Berkeley (2013)
26. Willems, J.: Paradigms and puzzles in the theory of dynamical systems. In: IEEE Transactions on Automatic Control 36, pp. 259-294 (1991)
27. Willems, J.: On Interconnections, Control, and Feedback. In: IEEE Transactions on Automatic Control 42, pp. 326-339 (1997)
28. Willems, J.: The behavioral approach to open and interconnected systems. In: IEEE Control Systems Magazine, pp. 46-99 (2007)
29. Baldwin, C.Y., Clark, K.B.: Design Rules, Volume 1: The Power of Modularity. MIT Press (2000)
30. Tripakis, S., Lickly, B., Henzinger, T., Lee, E.: On relational interfaces. Proceedings of EMSOFT'2009, pp. 67-76 (2009)
31. Ivanov, Ie.: A criterion for global-in-time existence of trajectories of non-deterministic Markovian systems. Communications in Computer and Information Science 347, pp. 111-130, Springer (2012)
32. Carloni, L.P., Passerone, R., Pinto, A.: Languages and Tools for Hybrid Systems Design. Foundations and Trends in Design Automation 1, 1-204 (2006)

Multilevel Environments in Insertion Modeling System

Dmitriy M. Klionov¹

¹ Kherson State University, 40 rokiv Zhovtnya st. 27, Kherson, Ukraine
soulslayermaster@gmail.com

Abstract. The goal of this paper is to show that the Insertion Modeling System [1], developed by A.A. Letichevsky of the department 100/105 of the Glushkov Institute of Cybernetics, National Academy of Science of Ukraine, Kyiv, Ukraine, can be used as an instrument for the modeling and analysis of complex distributed systems, such as client-server architectures. Insertion Modeling [1] is based on the interactions of environments and agents inserted into those environments. Agents have different behaviors represented as Behavior Algebras, and can also be environments themselves, having other agents with different behaviors inserted into them.
The definition of multilevel environments was first given in the paper [1], and was slightly extended in the following papers.

Keywords. Insertion modeling, multilevel environments, compatibility relation, client-server architecture

Key terms. Computation, Model, Insertion Modeling

1 Introduction

Insertion modeling is a technology for specification and verification of complex distributed systems based on the interactions of agents and environments. Agents and environments are models of real-world entities or of components of complex systems at different levels of abstraction, which interact with one another by means of insertion functions. An environment considered as an agent can itself be inserted into other environments. In order to model complex systems that consist of many components with a hierarchical structure, the notion of multilevel environments, with agents that are able to move from one environment to another, is required. The notion of mobility of such agents is based on an approach recently favored in declarative mobile language design, namely the use of mobile calculi that extend or modify the π-calculus [10] with new features, including mechanisms for encryption and security. Calculi of this kind include, among others, the Spi Calculus [6] and the Ambient Calculus [7]. In addition, there is a broader body of work favoring declarative approaches, including work in the field of coordination languages. There has also been a great expansion of the capabilities and security of agent-based languages such as OAA [10] and D'Agents [13].

According to the Ambient Calculus [7], devised by Luca Cardelli, the main difficulty of mobile computation on the Web is not mobility itself but the handling of administrative domains.
In the early days of the Internet one could rely on a flat name space given by IP addresses; knowing the IP address of a computer would very likely allow one to talk to that computer in some way. This is no longer the case: firewalls partition the Internet into administrative domains that are isolated from each other except for rigidly controlled pathways. System administrators enforce policies about what can move through firewalls and how.

The client-server model is the prevalent approach in computer networking. The model assigns one of two roles to the computers in a network: a client or a server. A server is a computer system that selectively shares its resources; a client is a computer or computer program that initiates contact with a server in order to make use of a resource. Data, CPUs, printers, and data storage devices are some examples of resources. This model can be represented as a set of administrative domains with defined access rules, or as an architectural design pattern, such as the three-tier pattern. Both representations are presented in this paper in terms of insertion modeling.

2 Insertion Modeling System

The insertion modeling system is an environment for developing insertion machines and performing experiments with them. An insertion model of a system represents the system as a composition of an environment and the agents inserted into it by means of the insertion function. Conversely, the whole system, considered as an agent, can be inserted into another environment. In this case we speak of the internal and external environments of a system. Agents inserted into the internal environment of a system can themselves be environments with respect to their internal agents. In this case we speak of a multilevel structure of an agent or environment, and of high-level and low-level environments. Agents and environments have a set of actions and a set of behaviors (processes) defined in a behavior algebra.
Two sets of actions, a set of environment actions and a set of agent actions, define the type of an environment. If an agent is to be inserted into an environment, at least one of its actions must be allowed by this environment. So the set of agent actions defines the type of environments the agent can be inserted into, and the environment’s set of allowed agent actions defines the type of agents that can be inserted into it. Such a relation between types of agents and environments is called the compatibility relation [2]; it defines a directed graph. When an agent is inserted into some environment, it is able to move to another environment if it is compatible with that environment. For example, rule (1) shows an agent u that moves to the external environment E from the environment R into which it is currently inserted.

$$\frac{u \xrightarrow{\text{move } e} u'}{E[R[u]] \xrightarrow{\text{move\_up}(r \to e)} E[u', R[\,]]}\quad P(E, R, u, \text{move } e) \qquad (1)$$

Here e and r are the names of the environments, and R[] denotes the environment R with no agents currently inserted into it. Insertion occurs only if the predicate P is true; in the general case P may depend only on the types of agents and environments. This example rule shows a “one step” movement of the agent u. If the new state of u has the same type as u, and the types of the environments E and R have not changed either, rule (1) can be considered commutative. “Long range” movements can also be defined recursively, for any set of environments between E and R.

3 Insertion Models of Client-Server Architecture

3.1 Domain Model

This model describes a client-server architecture as a set of administrative domains that have certain access rules. Each of these domains is represented by an environment in IMS. Agents are messages that travel over these domains, trying to access a certain protected area of some administrative domain of the server. As an example we take our website apsystem.or.ua. It is shown in the figure below.
The top-most environment E represents some network (a local-area network or the Internet), with the environment of apsystem itself and a set of clients C1, C2, … , Cn inserted into the network. Client environments create agents and send them over the network in order to gain access to some function of apsystem, if they have the required permission, or to the domain of another client. One of the clients can represent a villain (hacker), whose goal is to find all possible security risks and ways of attacking a certain security protocol. In order to access an administrative domain and to be authorized on a server, the client has to show that it knows some secret, which is known only to the client and the server (or to two clients that want to exchange some data) and which is not transferred over the network. This key is used to encode messages (transferred by the agents). When an agent tries to move into the environment of an administrative domain, the key is used to decode the message; if decoding is possible, the agent is inserted into the environment and proceeds further. There are many ways of generating such a secret.

Fig. 1. The domain model of client server

This model uses the standard Needham–Schroeder public key protocol [21]. Each client and server has a secret key, which is used to decode messages encoded with the corresponding public key. When an agent gets inside an administrative domain (apsystem, for example), it has to obtain permission to act inside it. The message transferred by this agent contains information about the access rights of the client who sent it. This data is used to move further. When an agent reaches some function (“download_paper”, for example) that it has permission for, it is sent back by the server to the client. Account environments, which are inserted into the clients and into the top-most environment of the server, store all the information required for authorization at the corresponding client or server.
The tables below show all types of environments and agents. The types of the client environments and of the top-most environment of the server are identical. In general, the client differs from the server only in the environments inside it, which require the action authorized_move.

Fig. 2. Compatibility graph for the client-server domain model

Vertices represent agents and environments, and edges represent the compatibility relation. The directions mean, for example, that the authorized agent can be inserted into the account environments, the server function environments, and the client and server environments.

Interactions with agents:

$$\frac{u \xrightarrow{\text{send } a} u'}{E[C[AC[u]], A[\,]] \xrightarrow{\text{send } a} E[C[AC[\,]], A[u']]}\quad P(E, C, AC, A, u, \text{send } a) \qquad (2)$$

In rule (2), send a means that client C sends the message u, with the appropriate account AC, to the server A[] over the network E, where a is the name of the server A[]. The notation A[] means that no agents have been inserted into this environment.

Table 1. Actions of agents and environments in domain model

{| class="wikitable"
! Agent / environment type !! Attributes !! Actions
|-
| Simple message
| mb: message body, the actual information carried by this agent; sender: the name of the one who sent this message; enc_key: the key used by the encryption algorithm
| send a: makes the agent move to the server environment named a; access d: the agent tries to authorize in order to enter the environment named d, which is inside the server environment
|-
| Authorized message
| mb: message body, the actual information carried by this agent; role: defines the access level of this information
| auth_move d: “authorized move” to some internal environment of the server named d; get_data(x): the agent shares the data it carries; invoke(x): invokes the main function of a server function environment, where x is the access level of the authorized agent, and receives as an answer either the result of executing the function or the “access denied” message; done(x): required to check whether the result the agent carries is equivalent to the expected result
|-
| Clients and the top-most environment of the server. Allowed actions: send, access, authmove
| Secretkey: an integer value of the client’s secret key, used by the Needham-Schroeder algorithm; Nounce: a place for random numbers
| allow(y): the environment checks the incoming message from the server, where y is the secret key used to decode the information from that message
|-
| Accounts. Allowed actions: access, authmove, send, get_data(y), done(z)
| server: the name of the server the account belongs to; role: an integer value that represents the role of this account at the server; publickey: the public key of the server, used by the Needham-Schroeder algorithm; secret: the secret that will be obtained by the Needham–Schroeder algorithm
| update(x): the account updates its data about the secret keys used in the Needham-Schroeder algorithm; check_goal(x): checks whether the result brought by the message is equal to the expected result x; create(r,t): the environment creates an agent named r of type t
|-
| Environments that represent server functions (download_paper, upload_paper). Allowed actions: authmove, invoke
| permission: an integer value indicating the permissions required to access the function
| check_permission u: checks whether the access level of agent u is appropriate for performing the action; if it is, the result is delta, if not, the agent receives a message that it has no rights to perform the function of this environment
|-
| E. Allowed actions: send a
|
|
|}

$$\frac{A \xrightarrow{\text{allow}(y)} A',\quad u \xrightarrow{\text{access } ca} u'}{A[u, CA[\,], D[\,]] \xrightarrow{\text{access } ca} A'[CA[u'], D[\,]]}\quad P(A, CA, u, \text{access } ca) \qquad (3)$$

An agent u tries to gain access to the server A[]; A tries to authorize it using the secret y. If the authorization succeeds, u enters the appropriate account on the server, CA, whose name is ca.
$$\frac{CA \xrightarrow{\text{create}(r,t)} CA',\quad u \xrightarrow{\text{get\_data}(x)} u'}{A[CA[u], D[\,]] \xrightarrow{\text{create}(r,t)} A[CA'[u', r], D[\,]]}\quad P(CA, u, t, \text{get\_data}(x), \text{create}(r,t)) \qquad (4)$$

The account environment CA creates a new agent named r, whose type is t. It carries all the data received from u by the action get_data(x), where x is that data. This rule creates an agent of type authorized_agent, but it can create an agent of any type that is compatible with this environment.

$$\frac{r \xrightarrow{\text{authmove } d} r'}{A[CA[u, r], D[\,]] \xrightarrow{\text{authmove } d} A[CA[u], D[r']]}\quad P(A, CA, D, r, \text{authmove } d) \qquad (5)$$

The authorized agent r moves to the environment D[], which represents one of the server functions.

$$\frac{D \xrightarrow{\text{check\_permission } r} D',\quad r \xrightarrow{\text{invoke}} r'}{D[r] \xrightarrow{\text{invoke}} D'[r']}\quad P(D, r, \text{check\_permission } r, \text{invoke}) \qquad (6)$$

The agent r invokes the main function of D[]; depending on the result of check_permission r, the result of this invocation may differ.

$$\frac{AC \xrightarrow{\text{check\_goal}(x)} AC',\quad r \xrightarrow{\text{done}(y)} r'}{AC[r] \longrightarrow AC'[\,]}\quad P(AC, r, \text{check\_goal}(x), \text{done}(y)) \qquad (7)$$

When an agent comes back to the client that sent it, the client checks the message it carried; if it matches the required result, this is successful termination.

These rules only work if both the client and the server share a secret known only to them. In order to generate such a secret safely, the Needham–Schroeder public key algorithm is used. Usually the Needham–Schroeder protocol requires a second server that hosts all the public keys, but for simplicity we assume that all clients and servers know all the public keys. If the secret has already been created, then it is used instead of the public key and secret key for encoding and decoding messages. The protocol runs as follows:

1. First we check whether the secret exists for an account A. If not, we send a message to the server A[] by rule (2) and set the value of the agent’s attribute mb to N1, a simple random number.
2. Then the server A[] uses rule (3) to decode the message using the secret key of server A[], if the secret has not been created yet.
3. Then the server replies to client C by rule (2); the value of mb is set to (N1,N2), where N1 is the random number created by client C and N2 is a new random number.
4. If the first part of mb is equal to the random number that was generated before, then C can take the pair (N1,N2) as a secret for the account A.
5. Then C sends a message containing N2 to A[]. When A receives it, it also takes the pair (N1,N2) as a secret for account C.

In order to verify this protocol, one of the clients has to take the role of a villain whose goal is to be authorized as another client from the network, using in this case a man-in-the-middle attack [22].

3.2 Insertion Model of Three-Tier Architecture

Unlike the previous model, this one focuses on the actual behavior of data packages, represented by agents, inside the server environment, which is divided into three layers according to the three-tier architecture. An example model of a server hosting two sites, apsystem and unarea, is presented.

Fig. 3. Insertion model of three-tier client-server architecture

Their frontends are located inside the presentation tier.

$$E[Pr[aps[\,], un[\,], App[Php[aps\_l[\,]], Py[un\_l[\,]], Data[mysql[aps\_d[\,], un\_d[\,]]]]]] \qquad (8)$$

(8) is the state of the environment in this example: E is the top-most environment; Pr is the presentation tier; aps is the apsystem frontend; un is the unarea frontend; App is the application tier; Php and Py contain all sites developed in PHP and Python respectively; aps_l and un_l are the logic of apsystem and unarea; Data is the data tier; aps_d and un_d are the databases of apsystem and unarea. The user works only with the frontend. This means that the incoming agent is compatible only with environments of the presentation layer. An agent inserted into one of the frontends carries one request.

Table 2.
Types of agents and environments

{| class="wikitable"
! Agents / environments types !! Actions
|-
| User request
| execute(x): executes the request brought by the user, where x is the request data; User_move d: the user agent moves to the environment named d
|-
| Script request. Allowed actions: execute(x), User_move d
| execute_script(y): executes the request brought by the script, where y is the request data; Script_move d: the script agent moves to the environment named d
|-
| Data base request. Allowed actions: execute_script(y), Script_move d
| Execute_query(z): executes the request brought by the data base agent, where z is the request data; Data_base_move d: the data base agent moves to the environment named d
|-
| Environments of the Presentation tier. Allowed actions: execute(x), User_move d, Script_move d
| Create(r,t): creates an agent named r of type t
|-
| Environments of the Application tier. Allowed actions: execute_script(x), Script_move d, Data_base_move d
| Create(r,t): creates an agent named r of type t
|-
| Environments of the Data tier. Allowed actions: execute_query(x), Script_move d, Data_base_move d
|
|}

Interaction with environments:

In rule (9), u gets inside Pr using user_move pr, where pr is the name of the environment Pr, if P allows this (a simple one-step insertion). In the same way u gets inside the environment aps[], using in(aps). This shows how a user goes to some website (apsystem in our case) in order to download a paper, for example. To do so, the user has to load a web page that has the required link to the paper.

$$\frac{u \xrightarrow{\text{user\_move } pr} u'}{E[u, Pr[APS[\,], App[APS\_l[\,], Data[mysql[APS\_d[\,]]]]]] \xrightarrow{\text{user\_move } pr} E[Pr[u', APS[\,], App[APS\_l[\,], Data[mysql[APS\_d[\,]]]]]]}\quad P(Pr, u, \text{user\_move } pr) \qquad (9)$$

The link to the paper is stored in the site’s data base, which is inside the Data tier, and the rules for extracting this data and displaying it are inside the Application layer.
So the frontend part (the environment of apsystem in our case) creates a new agent that is compatible only with this environment and with the corresponding environments of the Application layer:

$$\frac{u \xrightarrow{\text{execute}(x)} u'}{E[Pr[APS[u], Q]] \xrightarrow{\text{execute}(x)} E[Pr[APS[u'], Q]]}\quad P(APS, u, \text{execute}(x)) \qquad (10)$$

Here we check whether u is able to execute its request x. If it succeeds, the result is DELTA; if it is not able to, we check whether there are any environments inside aps that are compatible with u, go inside them, and try the same rule again. Q is used for simplicity; it denotes all the remaining environments that are not involved in the current rule. If there are no such environments, or if even after insertion into such an environment u is still unable to solve(x), then we create a new agent r that will get the necessary data from the application tier.

$$\frac{APS \xrightarrow{\text{create}(r,t)} APS'}{E[Pr[APS[u], Q]] \xrightarrow{\text{create}(r,t)} E[Pr[APS'[u, r], Q]]}\quad P(APS, t, \text{create}(r,t)) \qquad (11)$$

Note: r has to be created inside the environment into which u is currently inserted.

$$\frac{r \xrightarrow{\text{execute\_script}(y)} r'}{E[Pr[APS[u], App[PHP[APS\_l[r]]]]] \xrightarrow{\text{solve}(y)} E[Pr[APS[u], App[PHP[APS\_l[r']]]]]}\quad P(APS\_l, r, \text{execute\_script}(y)) \qquad (12)$$

The agent r moves to that environment (APS_l). It should first go to the environment PHP, which is the top environment of all sites based on PHP, and then move to the environment APS_l. The rule for its movement is similar to the movement of a user request. The execution of a script request is different:

$$\frac{r \xrightarrow{\text{script\_move } aps} r'}{E[Pr[APS[u], App[PHP[APS\_l[r]]]]] \xrightarrow{\text{script\_move } aps} E[Pr[APS[u, r'], App[PHP[APS\_l[\,]]]]]}\quad P(PR, App, PHP, APS, APS\_l, r, \text{script\_move } aps) \qquad (13)$$

If the execution succeeds, then r moves back to the environment that created it. If not, then the environment APS_l creates a new agent, which is the query for the data tier. The rules are similar.
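The movement rules of this section can be illustrated with a small executable sketch. It is not part of the Insertion Modeling System: the classes `Agent` and `Env`, the helpers `move` and `move_along`, and the concrete action sets are hypothetical names chosen for illustration. The predicate P is assumed here to reduce to the compatibility check of Section 2 (at least one agent action is allowed by the target environment), and the nesting of environments follows state (8) restricted to apsystem.

```python
class Agent:
    """A mobile agent; `actions` is the set of action names it can perform."""

    def __init__(self, name, actions):
        self.name = name
        self.actions = frozenset(actions)


class Env:
    """An environment that may contain agents and nested environments."""

    def __init__(self, name, allowed, parent=None):
        self.name = name
        self.allowed = frozenset(allowed)  # agent actions allowed here
        self.parent = parent
        self.children = []
        self.agents = []
        if parent is not None:
            parent.children.append(self)

    def compatible(self, agent):
        # Assumed stand-in for predicate P: at least one action is allowed.
        return bool(agent.actions & self.allowed)


def move(agent, source, target):
    """One-step move between an environment and its parent or child."""
    adjacent = target.parent is source or source.parent is target
    if agent not in source.agents or not adjacent:
        return False
    if not target.compatible(agent):
        return False
    source.agents.remove(agent)
    target.agents.append(agent)
    return True


def move_along(agent, path):
    """'Long range' movement as a chain of one-step moves; stops on failure."""
    return all(move(agent, a, b) for a, b in zip(path, path[1:]))


# Action sets loosely based on Table 2 (lower-cased names are an assumption).
USER = {"execute", "user_move"}
SCRIPT = {"execute_script", "script_move"}
PRES = {"execute", "user_move", "script_move"}               # presentation tier
APPL = {"execute_script", "script_move", "data_base_move"}   # application tier

E = Env("e", {"user_move"})
Pr = Env("pr", PRES, E)
aps = Env("aps", PRES, Pr)
App = Env("app", APPL, Pr)
Php = Env("php", APPL, App)
aps_l = Env("aps_l", APPL, Php)

u = Agent("u", USER)
E.agents.append(u)
user_ok = move_along(u, [E, Pr, aps])            # rule (9), applied twice

r = Agent("r", SCRIPT)                           # created inside aps, rule (11)
aps.agents.append(r)
script_ok = move_along(r, [aps, Pr, App, Php, aps_l])

v = Agent("v", USER)
Pr.agents.append(v)
blocked = not move(v, Pr, App)                   # a user request cannot enter App
```

In this sketch a failed step leaves the agent in the last environment it reached, which mirrors the idea that each one-step rule fires only when its predicate holds.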
4 Conclusions

The client-server model can be considered the prevalent approach in computer networking and is one of the best examples of complex distributed systems. Two examples of insertion models of client-server architecture are presented in this paper: the domain model, a set of administrative domains with pre-defined access rules; and the three-tier architecture, a client-server architecture in which the presentation, the application processing, and the data management functions are logically separated. Both of these insertion models, with multilevel environments and mobile agents, can be extended later to more complicated applications, such as the verification of cryptographic protocols, problem solving, constraint propagation, and cognitive architectures.

References

1. Letichevsky, A. A.: Insertion Modeling. Control Systems and Computers, 6, 3-14 (2012)
2. Letichevsky, A.: Algebra of Behavior Transformations and its Applications. In: Kudryavtsev, V. B., Rosenberg, I. G. (eds.) Structural Theory of Automata, Semigroups, and Universal Algebra. NATO Science Series II. Mathematics, Physics and Chemistry, vol. 207, pp. 241-272, Springer (2005)
3. Baranov, S., Jervis, C., Kotlyarov, V., Letichevsky, A., Weigert, T.: Leveraging UML to Deliver Correct Telecom Applications. In: L. Lavagno, G. Martin, and B. Selic (eds.) UML for Real: Design of Embedded Real-Time Systems. Kluwer Academic Publishers, Amsterdam (2003)
4. Letichevsky, A.A., Kapitonova, J., Letichevsky, A. Jr., Volkov, V., Baranov, S., Kotlyarov, V., Weigert, T.: Basic Protocols, Message Sequence Charts, and the Verification of Requirements Specifications. Computer Networks, (47), 662–675 (2005)
5. Kapitonova, J., Letichevsky, A., Volkov, V., Weigert, T.: Validation of Embedded Systems. In: R. Zurawski (ed.) The Embedded Systems Handbook. CRC Press, Miami (2005)
6. Abadi, M., Gordon, A.D.: A Calculus for Cryptographic Protocols: the spi Calculus. In: Proc. 4th ACM Conference on Computer and Communications Security, pp. 36-47 (1997)
7. Cardelli, L., Gordon, A.: Mobile Ambients. In: Nivat, M. (ed.) Proc. FoSSaCs’98: Foundations of Software Science and Computational Structures, LNCS 1378, pp. 140–155, Springer-Verlag (1999)
8. Lange, D.B., Oshima: Programming and Deploying Java Mobile Agents with Aglets. Addison-Wesley (1998)
9. Kotz, D., Gray, R.S.: Mobile Agents and the Future of the Internet. ACM Operating Systems Review 33(3), 7–13 (1999)
10. Milner, R., Parrow, J., Walker, D.: A Calculus of Mobile Processes (Parts I and II). Information and Computation 100, pp. 1-77 (1992)
11. Martin, D., Cheyer, A., Morgan, D.: The Open Agent Architecture: A Framework for Building Distributed Software Systems. Applied Artificial Intelligence, 12, 91–128 (1999)
12. Gray, R.S., Kotz, D., Cybenko, G., Rus, D.: D’Agents: Security in a Multiple-Language, Mobile-Agents System. In: Vigna G. (ed.) Mobile Agents and Security, LNCS 1419, pp. 154–187, Springer-Verlag (1998)
13. Milner, R.: A Calculus of Communicating Systems, LNCS 92, Springer-Verlag (1980)
14. Milner, R.: Communication and Concurrency. Prentice Hall (1989)
15. Milner, R.: The Polyadic π-Calculus: a Tutorial. Tech. Rep. ECS–LFCS–91–180, Laboratory for Foundations of Computer Science, Department of Computer Science, University of Edinburgh, UK (1991)
16. Park, D.: Concurrency and Automata on Infinite Sequences. In: LNCS 104. Springer-Verlag (1981)
17. Roggenbach, M., Majster-Cederbaum, M.: Towards a Unified View of Bisimulation: a Comparative Study. TCS, 238, 81–130 (2000)
18. ITU-T. Z.120 Recommendation Z.120 (11/99): Languages for telecommunications applications – Message Sequence Charts (MSC) (1999)
19. ITU-T. Z.100 Recommendation Z.100 – Specification and Description Language (SDL) (1999)
20. Rutten, J.: Coalgebras and Systems. TCS, 249
21. Needham, R., Schroeder, M.: Using Encryption for Authentication in Large Networks of computers. Comm. ACM, 21(12), 993–999 (1978)
22. Lowe, G.: An Attack on the Needham-Schroeder Public Key Authentication Protocol. Information Processing Letters, 56(3), 131–136 (1995)
23. Eckerson, W.: Three Tier Client/Server Architecture: Achieving Scalability, Performance, and Efficiency in Client Server Applications. Open Information Systems. 3(20), 10, 1 (1995)

Clocks Model for Specification and Analysis of Timing in Real-Time Embedded Systems

Iryna Zaretska¹, Galyna Zholtkevych¹, Grygoriy Zholtkevych¹ and Frédéric Mallet²

¹ V.N. Karazin Kharkiv National University, School of Mathematics and Mechanics, 4, Svobody Sqr., 61022, Kharkiv, Ukraine
zar@univer.kharkov.ua, {galyna,g}.zholtkevych@gmail.com
² Université Nice Sophia Antipolis, AOSTE Team Project (INRIA/I3S), INRIA Sophia Antipolis Méditerranée, 2004 rte des Lucioles (Lagrange L-043) BP93, F-06902 Sophia Antipolis Cedex, France
frederic.mallet@unice.fr

Abstract. Problems concerning formal semantics for the Clock Constraint Specification Language (CCSL) are considered in the paper. CCSL is intended for describing logical time models for real-time embedded systems, and the language is a part of the UML profile for MARTE. There exist two approaches to introducing a denotational semantics for CCSL. A pure relational subset of CCSL is defined in the paper. The notion of a time structure with clocks is introduced to refine the description of the denotational semantics for this CCSL subset, which the authors call RCCSL. Semantic properties of RCCSL have been studied. A theorem about the coincidence of the semantics of RCCSL for the two approaches is proved.

Keywords. Embedded system, real-time system, time modelling, time structure, clock constraint, formal specification

Key terms.
ConcurrentComputation, FormalMethod, SpecificationProcess, VerificationProcess, MathematicalModeling

1 Introduction

Nowadays, the growing use of distributed real-time systems (including embedded systems) [4] is a developing trend in Information and Communication Technology. There are two reasons for this growth: first, the physical limit of processor acceleration has been reached, and, second, the use of mobile and cloud technologies is expanding explosively. The impossibility of further over-clocking a processor leads to the use of multi-core systems, which are parallel and distributed. A complex consisting of a computational cloud and an ensemble of mobile devices is a parallel and distributed system too. Moreover, its structure is not fixed. Each of these cases requires different kinds of multiprocessing architectural and software solutions [3]. Therefore, ensuring the correct operation of such systems requires more research in the area.

Mathematical modelling of systems makes it possible to develop formal specifications and methods for their analysis as a basis for constructing trustworthy systems. There are many approaches to modelling multiprocessor systems. First of all, the following should be mentioned: CSP of C.A.R. Hoare [8], the π-calculus of R. Milner [11], the abstract state machine model [5], and process algebra [14].

This paper is devoted to formal methods for an important subclass of multiprocessing distributed systems, namely, real-time embedded (RTE) systems. These methods are closely connected with the UML profile for MARTE (Modelling and Analysis of Real-Time and Embedded systems) [2, 15]. In the context of the MARTE approach, UML [16, 17] is used to build engineering models of a system under development. But the UML notation does not support a detailed description of the interactions that join components into a united RTE system.
A very common way to specify conditions for system integrity is through the Object Constraint Language (OCL) [12]. However, the OCL standard provides no facilities for specifying temporal constraints. The Clock Constraint Specification Language (CCSL) [2] was defined in an annex of MARTE as a way to build logical and temporal constraints on model elements.

CCSL is intended to describe the temporal ordering of interactions between components of a distributed software system. It focuses on the ordering of event occurrences, but not on their chronometric characteristics. It relies on a logical time model inspired by the work on synchronous systems and their polychronous extensions.

The denotational semantics for the basic constructions of CCSL is given in [10]. It is based on the notion of a time structure with clocks. Another approach [1] defines an operational way to compute runs for CCSL specifications.

The main contribution of this paper is a demonstration that the relation of semantic consequence based on time structures as models of constraints and the relation of semantic consequence based on time structures associated with runs are only equivalent for a subset of CCSL, which we call RCCSL.

2 Syntax of Pure Relational CCSL

In this paper we restrict ourselves to a very simple sublanguage of CCSL, which we call the pure relational CCSL (RCCSL). The syntax of this subset is given here using EBNF [6].

 clock constraint = clock relation, {’,’, clock relation};
 clock relation = clock reference, sign of clock relation, clock reference;
 sign of clock relation = ’subclocking’ | ’exclusion’ | ’coincidence’ | ’cause’ | ’precedence’;
 clock reference = ? any element of clock set ?;

Below we use the following notation for the symbols of clock relations (see Table 1).

Table 1.
Symbols of clock relations

{| class="wikitable"
! Relation name !! Relation symbol
|-
| subclocking || $\subset$
|-
| exclusion || $\#$
|-
| coincidence || $=$
|-
| cause || $\preccurlyeq$
|-
| precedence || $\prec$
|}

These five binary relations on a clock set $C$ are determined as logical primitives for CCSL in [1].

Defining a semantics for RCCSL is one of the objectives of the paper. Following [10], we define the denotational meaning of a set of clock constraints as some class of time structures expanded by a classification of event occurrences. The next section is devoted to describing such structures.

3 Time Structure with Clocks

Let us consider a set of event occurrences, denoted below by $I$. Elements of the set $I$ are called instants. Some pairs of instants are ordered in time: $i_1 \preccurlyeq i_2$ denotes the fact "an instant $i_1$ causes an instant $i_2$" or, equivalently, "an instant $i_1$ cannot occur later than an instant $i_2$", where $i_1, i_2 \in I$. This relation is called 'cause'. It is natural to suppose that cause is a pre-order. As is known [7, section 1.3], each pre-order can be decomposed uniquely into the union of two relations such that the former is a strict order (denoted below by '$\prec$' and called precedence) and the latter is an equivalence (denoted below by '$\equiv$' and called coincidence). These relations are connected by the following property:

for any instants $i_1, i_1', i_2, i_2' \in I$, the validity of $i_1 \equiv i_1'$, $i_2 \equiv i_2'$, and $i_1 \prec i_2$ implies the truth of $i_1' \prec i_2'$. (1)

Moreover, if we have a strict order and an equivalence on the same set and these relations satisfy (1), then their union is a pre-order.

Now we can introduce the notion of a time structure to formalise our understanding of a set of instants.

Definition 1. Let $(I, \preccurlyeq)$ be a pair of a set and a pre-order on this set respectively. Denote by $\prec$ the strict order corresponding to the pre-order $\preccurlyeq$.
The pair $(I, \preccurlyeq)$ is called a time structure if the following property (the property of cause finiteness [13]) holds:

the set $\{i' \in I \mid i' \prec i\}$ is finite for all $i \in I$. (2)

Definition 1 is based on the corresponding definition in [10]. One can compare it with the definition of a time structure in [13]. The difference consists in the possibility of modelling the coincidence of instants.

Note that Definition 1 specifies the set of instants and some time relations on it, but it does not determine any classification of instants according to their sources. Therefore, following [2], we introduce such a classification by adding a finite set of instant sources called clocks and by mapping the set of instants into this clock set.

Definition 2. Let $(I, \preccurlyeq)$ be a time structure, $C$ be a finite set of clocks, and $\pi : I \to C$ be a map. Then the quadruple $(I, \preccurlyeq, C, \pi)$ is called a time structure with clocks if the following property holds:

for any clock $c \in C$ and $i_1, i_2 \in \pi^{-1}(c)$, the validity of $i_1 \neq i_2$ implies the truth of $i_1 \prec i_2 \lor i_2 \prec i_1$, (3)

i.e. $\pi^{-1}(c)$ is linearly ordered by the restriction of the cause.

If $c \in C$, then the set $\pi^{-1}(c)$ is usually denoted by $I_c$. It can be considered as an event stream generated by the source associated with the clock $c$.

From Definition 1 and Definition 2 the following fact follows immediately.

Proposition 1. Let $(I, \preccurlyeq, C, \pi)$ be a time structure with clocks. Then
1. $I_c$ is well-ordered by the strict order $\prec$ for all $c \in C$;
2. the ordinal type of $I_c$ for any $c \in C$ is less than or equal to $\omega$, where $\omega$ is the first infinite ordinal.

Proof. Firstly, note that property (3) implies that $I_c$ is linearly ordered for an arbitrary $c \in C$. Further, suppose that $A$ is some non-empty subset of $I_c$ for an arbitrary $c \in C$, and $i$ is some element of $A$. If for all $i' \in A$ the statement $i \prec i' \lor i = i'$ is true, then $\inf A = i \in A$. If there exists $i' \in A$ such that $i' \prec i$, then the set $A(i) = \{i' \in A \mid i' \prec i\}$ is not empty. It is evident that $A(i) = A \cap \{i' \in I_c \mid i' \prec i\}$.
This equality and the property of cause finiteness (2) imply the finiteness of $A(i)$. So, taking into account property (3), we can conclude that $A(i)$ is a finite linearly ordered set. Hence, there exists $i^* \in A(i)$ such that $i^* = \inf A(i)$. It is evident that $\inf A = \inf A(i) = i^* \in A(i) \subset A$. Thus, $\inf A \in A$ and $I_c$ is well-ordered.

The supposition that the ordinal type of $I_c$ for some $c \in C$ is greater than $\omega$ is inconsistent with the property of cause finiteness (2). $\square$

Corollary 1. Any instant $i \in I$ is uniquely determined by the pair $(\pi(i), \mathrm{idx}(i))$, which is an element of the set $C \times \mathbb{N}$. Here, $\mathrm{idx}$ is a map from $I$ into $\mathbb{N}$ such that $\mathrm{idx}(i) = |\{i' \in I_{\pi(i)} \mid i' \prec i\}| + 1$, where the number of elements in a set $A$ is denoted by $|A|$.

The designation $\mathcal{T}_C$ is used below to refer to the class of time structures with $C$ as the set of clocks.

Remark 1. One can show that this class is a set, but we do not do so in this paper.

4 Denotational Semantics for RCCSL

Usually, a denotational semantics can be considered as the theory of models for the corresponding language. We shall use time structures with clocks as models for describing the meaning of clock constraints.

4.1 Some General Notes

In the process of specifying interactions between components of distributed parallel systems, one can identify a class of event occurrences of the same type with the set of instants of some clock. Such an identification is provided by fixing a set of clocks $C$ and describing the rules of interaction of the system components. These rules divide the set $\mathcal{T}_C$ into two subsets: the subset of time structures satisfying the constraints and the subset of time structures contradicting them.

Taking into account the specification of RCCSL, one can say that a clock constraint is a finite set of clock relations. If the set of clock relations determining the constraint is denoted by $\mathcal{C}$, then the fact "the time structure $T \in \mathcal{T}_C$ satisfies the constraint $\mathcal{C}$" can be written as $T \models \mathcal{C}$.
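Definition 2 and Corollary 1 can be checked on a small finite example. The sketch below is only an illustration under stated assumptions: the time structure is derived from a hypothetical finite run in which each step lists the clocks that tick, instants of the same step are taken to coincide, and instants of an earlier step precede those of a later one. The names `run`, `pi`, `coincides`, `precedes`, `causes` and `idx` are chosen here and do not come from the paper.

```python
from itertools import combinations

# Assumed example run: step k is the set of clocks that tick at step k.
run = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "c"}]

# Instants are pairs (clock, step).
instants = [(c, k) for k, step in enumerate(run) for c in sorted(step)]


def pi(i):
    """The map from instants to clocks."""
    return i[0]


def coincides(i1, i2):
    """Coincidence: both instants belong to the same step."""
    return i1[1] == i2[1]


def precedes(i1, i2):
    """Precedence: i1 belongs to a strictly earlier step than i2."""
    return i1[1] < i2[1]


def causes(i1, i2):
    """Cause: the union of precedence and coincidence."""
    return precedes(i1, i2) or coincides(i1, i2)


def idx(i):
    """Corollary 1: number of same-clock instants preceding i, plus one."""
    return sum(1 for j in instants if pi(j) == pi(i) and precedes(j, i)) + 1


# Property (3): distinct instants of one clock are comparable by precedence.
linear = all(precedes(i, j) or precedes(j, i)
             for i, j in combinations(instants, 2) if pi(i) == pi(j))

# Corollary 1: the pair (pi(i), idx(i)) identifies the instant i.
keys = {(pi(i), idx(i)) for i in instants}
unique = len(keys) == len(instants)
```

Cause finiteness (2) holds trivially here because the set of instants is finite; an infinite run would need a lazy representation instead of a list.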
More precisely, T ⊨ 𝒞 means that for each clock relation r in 𝒞 the clause T ⊨ r is true. Further, for a constraint 𝒞, ⟦𝒞⟧ denotes the set {T ∈ T_C | T ⊨ 𝒞}.
The first important problem is the consistency problem for a constraint. Its rigorous formulation is as follows.
Problem 1 (Consistency Problem). For a constraint 𝒞 check that the set ⟦𝒞⟧ is not empty.
The second important problem is the semantic consequence problem for constraints. Its rigorous formulation is as follows.
Problem 2 (Semantic Consequence Problem). For a constraint 𝒞 and a clock relation r check that ⟦𝒞⟧ ⊂ ⟦r⟧ (or, in another notation, 𝒞 ⊨ r).
Below we use the notation {𝒞} for the set of clock relations that form the constraint 𝒞. It is easy to see that the following properties of the relationship ⊨ are true.
Proposition 2. The following properties are satisfied:
1. if a constraint 𝒞 and a clock relation r satisfy the condition r ∈ {𝒞} then 𝒞 ⊨ r;
2. if constraints 𝒞₁ and 𝒞₂ and a clock relation r are such that 𝒞₁ ⊨ r′ for all r′ ∈ {𝒞₂} and 𝒞₂ ⊨ r, then 𝒞₁ ⊨ r is true.
Proof is omitted. □
To complete the definition of the denotational semantics for RCCSL we should determine the meaning of the basic clock relations.

4.2 Subclocking
This relation is intended for specifying a requirement to synchronize each instant of one clock with some instant of another clock. In this case the first clock is called a subclock of the second clock. More precisely, let c′, c″ ∈ C and T ∈ T_C; then T ⊨ c′ ⊂ c″ means that there exists a strictly monotonic map h : I_{c′} → I_{c″} such that i ≡ h(i) for any i ∈ I_{c′}.
Proposition 3 (Trivial Subclocking). For each c ∈ C the clause ⊨ c ⊂ c is true.
Proof is trivial. □
Proposition 4 (Transitivity Law for Subclocking). For each c′, c″, c‴ ∈ C the clause c′ ⊂ c″, c″ ⊂ c‴ ⊨ c′ ⊂ c‴ is true.
Proof.
Let h_{c″c′} : I_{c′} → I_{c″} and h_{c‴c″} : I_{c″} → I_{c‴} be strictly monotonic maps providing the validity of the clauses T ⊨ c′ ⊂ c″ and T ⊨ c″ ⊂ c‴ respectively for some T. It is easy to see that the map h_{c‴c″} ∘ h_{c″c′} provides the validity of the clause T ⊨ c′ ⊂ c‴. □

4.3 Exclusion
This relation is used for specifying the mutual exclusion of two events. More formally, let c′, c″ ∈ C and T ∈ T_C; then T ⊨ c′ # c″ means that for any i′ ∈ I_{c′}, i″ ∈ I_{c″} the coincidence i′ ≡ i″ is false.
Proposition 5 (Irreflexivity Law for Exclusion). For each c ∈ C the equality ⟦c # c⟧ = ∅ is true.
Proof is trivial. □
Proposition 6 (Symmetry Law for Exclusion). For each c′, c″ ∈ C the clause c′ # c″ ⊨ c″ # c′ is true.
Proof is trivial. □

4.4 Coincidence
This relation describes synchronization of two event sources. More precisely, let c′, c″ ∈ C and T ∈ T_C; then T ⊨ c′ = c″ means that there exists a strictly monotonic bijection h : I_{c′} → I_{c″} such that i ≡ h(i) for any i ∈ I_{c′}.
Proposition 7 (Trivial Coincidence). For each c ∈ C the clause ⊨ c = c is true.
Proof is trivial. □
Proposition 8 (Symmetry Law for Coincidence). For each c′, c″ ∈ C the clause c′ = c″ ⊨ c″ = c′ is true.
Proof. Let h : I_{c′} → I_{c″} be a strictly monotonic bijection providing the validity of the clause T ⊨ c′ = c″ for some T, and let h⁻¹ be its inverse map. Suppose that i′, i″ ∈ I_{c″}, i′ ≺ i″, and h⁻¹(i′) ⊀ h⁻¹(i″); then either h⁻¹(i′) = h⁻¹(i″) or h⁻¹(i″) ≺ h⁻¹(i′). But the first alternative contradicts the bijectivity of h, and the second alternative together with the strict monotonicity of h implies i″ ≺ i′. The last clause contradicts the irreflexivity of the precedence relation. These contradictions show that h⁻¹ is a strictly monotonic map. Further, for any i ∈ I_{c″} we have that h⁻¹(i) ∈ I_{c′} and h⁻¹(i) ≡ h(h⁻¹(i)) = i. Thus, the clause T ⊨ c″ = c′ is true. □
Proposition 9 (Transitivity Law for Coincidence).
For each c′, c″, c‴ ∈ C the clause c′ = c″, c″ = c‴ ⊨ c′ = c‴ is true.
Proof is similar to the proof of Proposition 4. □

4.5 Cause
This relation is intended for specifying that each instant of one clock is caused by an instant of another clock. More precisely, let c′, c″ ∈ C and T ∈ T_C; then T ⊨ c′ ≼ c″ means that there exists a strictly monotonic map h : I_{c″} → I_{c′} such that h(i) ≼ i for any i ∈ I_{c″}.
Proposition 10 (Trivial Cause). For each c ∈ C the clause ⊨ c ≼ c is true.
Proof is trivial. □
Proposition 11 (Transitivity Law for Cause). For each c′, c″, c‴ ∈ C the clause c′ ≼ c″, c″ ≼ c‴ ⊨ c′ ≼ c‴ is true.
Proof is similar to the proof of Proposition 4. □

4.6 Precedence
This relation is a stronger variant of the cause relation. Namely, let c′, c″ ∈ C and T ∈ T_C; then T ⊨ c′ ≺ c″ means that there exists a strictly monotonic map h : I_{c″} → I_{c′} such that h(i) ≺ i for any i ∈ I_{c″}.
Proposition 12 (Irreflexivity Law for Precedence). For each c ∈ C the equality ⟦c ≺ c⟧ = ∅ is true.
Proof is trivial. □
Proposition 13 (Transitivity Law for Precedence). For each c′, c″, c‴ ∈ C the clause c′ ≺ c″, c″ ≺ c‴ ⊨ c′ ≺ c‴ is true.
Proof is similar to the proof of Proposition 4. □

4.7 Interdependencies Laws for the Basic Relations
Above we considered properties of each basic relation separately, without looking at interdependencies between the relations. Such interdependencies are considered below. The following lemma is needed to ground them.
Lemma 1. Let (X, ≤) be a well-ordered set and φ : X → X be a strictly monotonic map such that φ(x) ≤ x for all x ∈ X. Then φ is the identity map.
Proof. One can prove the lemma by transfinite induction. □
Proposition 14 (Interdependencies Laws for the Basic Relations).
1. For each c′, c″ ∈ C the clause c′ ⊂ c″, c″ ⊂ c′ ⊨ c′ = c″ is true.
2.
For each c′, c″ ∈ C the clock relations c′ ⊂ c″ and c′ # c″ are inconsistent, i.e. ⟦c′ ⊂ c″, c′ # c″⟧ = ∅.
3. For each c′, c″ ∈ C the clause c′ ⊂ c″ ⊨ c″ ≼ c′ is true.
4. For each c′, c″ ∈ C the clause c′ ≼ c″, c″ ≼ c′ ⊨ c′ = c″ is true.
Proof. 1) For any T ∈ T_C the validity of the assertion "T ⊨ c′ = c″ implies T ⊨ c′ ⊂ c″" is evident. Let us check the validity of the inverse assertion. Denote the strictly monotonic maps that provide, for some T ∈ T_C, the validity of T ⊨ c′ ⊂ c″ and T ⊨ c″ ⊂ c′ by h_{c″c′} : I_{c′} → I_{c″} and h_{c′c″} : I_{c″} → I_{c′} respectively. We claim that they are mutually inverse. Indeed, for any i ∈ I_{c′} we have the coincidences i ≡ h_{c″c′}(i) and h_{c″c′}(i) ≡ h_{c′c″}(h_{c″c′}(i)). These coincidences and the Transitivity Law for Coincidence (see Proposition 9) provide the validity of the coincidence i ≡ h_{c′c″}(h_{c″c′}(i)). Taking into account that both i and h_{c′c″}(h_{c″c′}(i)) are elements of I_{c′} and the fact that the restriction of ≼ to I_{c′} is a strict order (see Proposition 1), one can derive the equality i = h_{c′c″}(h_{c″c′}(i)). The equality i = h_{c″c′}(h_{c′c″}(i)) for all i ∈ I_{c″} is derived similarly. Thus, h_{c″c′} is a bijection.
2) Proof is trivial.
3) Proof is trivial.
4) Indeed, let h_{c″,c′} : I_{c′} → I_{c″} and h_{c′,c″} : I_{c″} → I_{c′} be strictly monotonic maps providing, for some T ∈ T_C, the validity of the clauses T ⊨ c″ ≼ c′ and T ⊨ c′ ≼ c″ respectively. Then the map φ = h_{c′,c″} ∘ h_{c″,c′} : I_{c′} → I_{c′} is strictly monotonic and satisfies the condition φ(i) ≼ i. Therefore, applying Lemma 1, we conclude that φ and the identity map are equal. □

5 Runs and Chronometers
Following [1], in this section we introduce the notion of a run for a set of clocks. We use this notion to define a behavioural model for the set of clocks.
Definition 3 (see [1]).
Let C be a finite set of clocks. Then any map r : ℕ → 2^C such that r(t) = ∅ implies r(t′) = ∅ for all t′ > t is called a run for C.
This definition means that if r is a run then at the (global) time t all the clocks of the set r(t), and only they, are triggered. For each run r one can construct a quadruple T[r] = (I_r, ≼, C, π_r) in the following way:
– I_r = {(c, t) ∈ C × ℕ | c ∈ r(t)};
– (c′, t′) ≼ (c″, t″) if and only if t′ ≤ t″;
– π_r(c, t) = c for all (c, t) ∈ I_r.
Proposition 15. T[r] is a time structure with clocks for any given run r.
Proof. It is proved by directly checking properties (2) and (3). □
Hence, we can define the semantic relationship between a run r and a constraint 𝒞 as follows: r ⊨ 𝒞 if and only if the clause T[r] ⊨ 𝒞 is true. Also, we can introduce the relationship 𝒞₁ ⊨_run 𝒞₂ as an abbreviation of the sentence "for any r such that r ⊨ 𝒞₁ the relationship r ⊨ 𝒞₂ is valid".
Proposition 15 suggests that a run carries more information than a time structure because a run depends on global time. A refinement and substantiation of this hypothesis is discussed below. The notion of a chronometer is introduced to specify dependences between time structures and runs.
Definition 4. Let T = (C, I, ≼, π) be a time structure with clocks and χ : I → ℕ be a map such that the following assertions are true:
for any i′, i″ ∈ I the coincidence i′ ≡ i″ implies χ(i′) = χ(i″); (4)
for any i′, i″ ∈ I the strict precedence i′ ≺ i″ implies χ(i′) < χ(i″); (5)
for any t, t′ ∈ ℕ the validity of the clauses t ∈ χ(I) and t′ < t implies the truth of the clause t′ ∈ χ(I). (6)
Then χ is called a chronometer on T [9].
Example 1. Let C be a finite set of clocks and r be a run for C. Then it is evident that the map χ∗ : I_r → ℕ determined by the equality χ∗(c, t) = t is a chronometer.
Hence, Example 1 shows that each time structure generated by a run has a native chronometer χ∗.
Proposition 16.
Let T be a time structure with clocks and χ : I → ℕ be a chronometer. Then the map r[T, χ] : ℕ → 2^C defined by the formula
r[T, χ](t) = π(χ⁻¹(t)) (7)
is a run.
Proof. To prove the proposition we should show that r[T, χ](t) = ∅ for some t ∈ ℕ implies r[T, χ](t′) = ∅ for any t′ ≥ t. Suppose that there exist t₁ and t₂ such that t₁ < t₂, π(χ⁻¹(t₁)) = ∅, but π(χ⁻¹(t₂)) ≠ ∅. From this assumption one can derive that χ⁻¹(t₁) = ∅ and χ⁻¹(t₂) ≠ ∅. Hence, t₁ ∉ χ(I) and t₂ ∈ χ(I). We have obtained a contradiction to condition (6) of Definition 4. □
The following property holds for the chronometer χ∗ from Example 1.
Proposition 17. Let r be a run for a clock set C. Then the following equality holds:
r[T[r], χ∗] = r. (8)
Let T = (C, I, ≼, π) be a time structure with clocks and χ : I → ℕ be a chronometer on T. Then the map χ̂ : I → C × ℕ defined by χ̂(i) = (π(i), χ(i)) is a map onto I_{r[T,χ]} such that any coincidence i′ ≡ i″ implies the coincidence χ̂(i′) ≡ χ̂(i″) in T[r] and any precedence i′ ≺ i″ implies the precedence χ̂(i′) ≺ χ̂(i″) in T[r], where r = r[T, χ].
Proof. Indeed, r[T[r], χ∗](t) = π_r(χ∗⁻¹(t)) = π_r({(c, t) ∈ I_r}) = π_r({(c, t) ∈ C × ℕ | c ∈ r(t)}) = r(t).
Further, (c, t) ∈ I_{r[T,χ]} if and only if c ∈ r[T, χ](t). It is easy to see that the last clause is equivalent to the existence of i ∈ I such that c = π(i) and t = χ(i), i.e. it is equivalent to (c, t) = χ̂(i). If i′ ≡ i″ then χ(i′) = χ(i″) by the definition of a chronometer, hence χ̂(i′) ≡ χ̂(i″). Similarly, if i′ ≺ i″ then χ(i′) < χ(i″), therefore χ̂(i′) ≺ χ̂(i″). □
Proposition 18. There exists only one chronometer on T[r] for any run r.
Proof. For any run r there exists the chronometer χ∗ on T[r]. Let χ be another chronometer on T[r]. For (c′, t), (c″, t) ∈ I_r, using (4) we have χ(c′, t) = χ(c″, t).
Hence, taking into account Definition 3, one can obtain that χ(c, t) = τ(t), where τ is a strictly monotonic function from α into α for some cardinal α ≤ ω. Thus, τ is the identity function and χ = χ∗. □
Hence, a chronometer exists on a time structure associated with a run. We claim that a chronometer exists on any time structure with clocks.
The following binary relation ⊲ on a time structure with clocks will be used for describing an algorithm that calculates timestamps for instants. More precisely, if i′, i″ ∈ I then i′ ⊲ i″ means that for all i ∈ I the validity of the clause i ≺ i″ & i′ ≼ i implies the truth of the coincidence i ≡ i′. It is easily seen that if i₁ ≡ i′₁, i₂ ≡ i′₂, and i₁ ⊲ i₂ then i′₁ ⊲ i′₂.
Now we can construct an algorithm that calculates timestamps for instants on an arbitrary time structure with clocks. This Algorithm 1 is a generalization of Lamport's algorithm [9].

Algorithm 1: Computing timestamp for an instant
input : T = (C, I, ≼, π) is a time structure with clocks, i is an element of I
output: timestamp for the instant i
begin
    count ← 1; D ← ∅; W ← ∅;    // initializing work variables
    while i ∉ D do               // main loop
        W₊ ← {j ∈ I | j ∉ D & idx(j) = count};
        W₊ ← W₊ ∪ {j ∈ I | j ∉ D & (∃j′ ∈ W₊) j′ ≡ j};
        W ← W ∪ W₊;
        D₊ ← {j ∈ W | (∀j′ ∈ I)(j′ ⊲ j ⇒ j′ ∈ D)};
        D ← D ∪ D₊;
        W ← W \ D₊;
        count ← count + 1;
    end
    return count − 1;            // number of the iteration at which i entered D
end

Theorem 1 (existence of a chronometer). Let T be a time structure with clocks and χ₀ : I → ℕ be the function calculated by Algorithm 1. Then χ₀ is a chronometer on T.
Proof. One can see that Algorithm 1 builds two sequences of sets
D₀ ⊂ D₁ ⊂ D₂ ⊂ · · · ⊂ D_n ⊂ . . .
W₀, W₁, W₂, . . . , W_n, . . .
in accordance with the following computational scheme:
W₀ = ∅
D₀ = ∅
W_{n+1} = (W_n ∪ {j ∈ I | (∃j′ ∈ I)(j′ ≡ j & idx(j′) = n + 1)}) \ D_n
D_{n+1} = D_n ∪ {j ∈ W_{n+1} | (∀j′ ∈ I)(j′ ⊲ j ⇒ j′ ∈ D_n)}
and maps an instant i ∈ I into χ₀(i) = inf{n ∈ ℕ | i ∈ D_n}.
First, note that the supposition that χ₀ is only partially defined implies the existence of an infinite sequence i₁ ⊳ i₂ ⊳ . . . . But this contradicts the cause finiteness property (2).
Second, by the construction of D_n, the validity of i′ ≡ i″ implies the truth of the following statement: i′ ∈ D_n if and only if i″ ∈ D_n. Hence, we obtain that i′ ≡ i″ implies χ₀(i′) = χ₀(i″).
Further, similar reasoning provides the validity of the following statement: i′ ≺ i″ implies χ₀(i′) < χ₀(i″).
Finally, the simple inequality idx(i) ≤ χ(i), which is correct for any i ∈ I and any chronometer χ on T, provides the validity of property (6). □
Corollary 2. There exists a chronometer on an arbitrary time structure with clocks.

6 Equivalence of Semantics for RCCSL Determined by Relations ⊨ and ⊨_run
In this section the notion of a chronometer is used to prove the theorem about equivalence of the relationships ⊨ and ⊨_run. The theorem is the main result of the paper. Taking into account the theorem, one can confine oneself to checking semantic consequence by using runs. This opens a way to constructing an operational semantics of RCCSL that is equivalent to the denotational semantics defined above.
We need two lemmas to prove the main theorem. Let us use the notation i₁ ∥ i₂ for instants i₁ and i₂ such that i₁ ⋠ i₂ & i₂ ⋠ i₁.
Lemma 2. Let T = (C, I, ≼, π) be a time structure with clocks and i₁, i₂ be instants such that the clause i₁ ∥ i₂ is true. Then there exists a chronometer χ on T satisfying the condition χ(i₁) < χ(i₂).
Proof. Let us consider the quadruple T′ = (C, I, ≼′, π) such that i′ ≺′ i″ is valid if one of the following conditions is true:
1. i′ = i₁ and i″ = i₂;
2. i′ ≺ i″;
3.
i′ ≺ i₁ and i₂ ≺ i″;
and i′ ≼′ i″ if and only if i′ ≡ i″ or i′ ≺′ i″. It is easily seen that the relation ≼′ is a pre-order. Moreover, it satisfies properties (2) and (3). Hence, T′ is a time structure with clocks. Using Corollary 2 we obtain that there exists a chronometer χ on T′. But then χ is a chronometer on T and χ(i₁) < χ(i₂) is true. □
Corollary 3. Let T = (C, I, ≼, π) be a time structure with clocks and i′, i″ ∈ I be instants. Then
1. i′ ≺ i″ is valid if and only if for any chronometer χ on T the inequality χ(i′) < χ(i″) is true;
2. i′ ≡ i″ is valid if and only if for any chronometer χ on T the equality χ(i′) = χ(i″) is true.
Lemma 3. Let T = (C, I, ≼, π) be a time structure with clocks, ∗ be an arbitrary sign of a clock relation, and c′ and c″ be clocks. Then T ⊨ c′ ∗ c″ if and only if r[T, χ] ⊨ c′ ∗ c″ for any chronometer χ on T.
Proof. It is evident that T ⊨ c′ ∗ c″ implies r[T, χ] ⊨ c′ ∗ c″ for any chronometer χ on T. Hence, we need to prove the inverse statement.
1) Suppose that r[T, χ] ⊨ c′ ⊂ c″ for any chronometer χ on T. Then for any i ∈ I_{c′} and for each chronometer χ there exists an instant i_χ ∈ I_{c″} such that χ(i) = χ(i_χ). Denote by X the set formed by all i_χ. It is a nonempty subset of I_{c″}. Suppose that there exist at least two different elements in the set X. Let us denote them by i_χ₁ and i_χ₂. Taking into account the linearity of the order on I_{c″} and i_χ₁ ≠ i_χ₂, one can suppose that i_χ₁ ≺ i_χ₂. Therefore χ₁(i) = χ₁(i_χ₁) < χ₁(i_χ₂). Thus, one of two cases holds: i ≺ i_χ₂ or i ∥ i_χ₂. But in the first case we obtain the inequality χ₂(i) < χ₂(i_χ₂), which contradicts the choice of i_χ₂. Hence, i ∥ i_χ₂ is true. Similarly, one can obtain that i ∥ i_χ₁ is true. Therefore, we have proved that |X| > 1 implies i ∥ i_χ for all i ∈ I_{c′} and any chronometer χ. Let i∗ = inf_{χ ∈ X} i_χ; then i ∥ i∗ and χ(i∗) ≤ χ(i_χ) = χ(i).
This is a contradiction because Lemma 2 provides the existence of some chronometer χ′ such that χ′(i∗) > χ′(i). Hence, X contains only one element, which we denote by h(i). By construction we have χ(i) = χ(h(i)) for any chronometer χ. The last property implies the strict monotonicity of h and the coincidence i ≡ h(i). Therefore, T ⊨ c′ ⊂ c″.
2 and 3) Suppose that r[T, χ] ⊨ c′ ∗ c″ for any chronometer χ on T; then it is evident that T ⊨ c′ ∗ c″, where ∗ is # or =.
4 and 5) Suppose that r[T, χ] ⊨ c′ ∗ c″ for any chronometer χ on T, where ∗ is ≼ or ≺. Similarly to the first case, one can derive that T ⊨ c′ ∗ c″ is true. □
Theorem 2 (about equivalence of semantics). Let C be an arbitrary finite set of clocks and 𝒞₁ and 𝒞₂ be RCCSL constraints. Then 𝒞₁ ⊨ 𝒞₂ is true if and only if 𝒞₁ ⊨_run 𝒞₂ is true.
Proof. One can easily see that the theorem is a direct consequence of Lemma 3. □

7 Conclusion
In the paper we have considered the pure relational subset of CCSL (RCCSL) and have introduced a semantics for it by using a class of mathematical objects that we call time structures with clocks.
We have studied semantic properties of RCCSL (see Propositions 3 – 9). We hope that these properties can serve as the background for an axiomatic basis for analysing relational clock constraints.
Further, we have introduced the notions of a run and a chronometer. This allowed us to study interrelations between time structures and runs and to introduce an alternative semantics that is closer to the operational approach than the denotational semantics discussed earlier.
Finally, the main theorem about the equivalence of these two semantics (see Theorem 2) has been proved.
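As an informal illustration of the run-based semantics of Section 5, the following Python sketch represents a finite run as a list of sets of clock names and checks the five basic relations on the time structure T[r] it generates. The names `Run`, `is_run`, `ticks` and the five predicates are our own, hypothetical choices and not part of RCCSL. For cause and precedence the sketch compares the k-th tick of one clock with the k-th tick of the other; this index-wise witness for the map h is an assumption of the sketch, not a construction taken from the text.

```python
from typing import List, Set

# Hypothetical encoding: run[t - 1] plays the role of r(t); r(t) is empty beyond the list.
Run = List[Set[str]]


def is_run(run: Run) -> bool:
    """Definition 3: once r(t) is empty, every later r(t') is empty as well."""
    first_empty = next((k for k, s in enumerate(run) if not s), len(run))
    return all(not s for s in run[first_empty:])


def ticks(run: Run, c: str) -> List[int]:
    """Global times of the instants of clock c in T[r], in increasing order."""
    return [t for t, active in enumerate(run, start=1) if c in active]


def subclock(run: Run, c1: str, c2: str) -> bool:
    """c1 ⊂ c2: every instant of c1 coincides with an instant of c2."""
    return set(ticks(run, c1)) <= set(ticks(run, c2))


def exclusion(run: Run, c1: str, c2: str) -> bool:
    """c1 # c2: no instant of c1 coincides with an instant of c2."""
    return not (set(ticks(run, c1)) & set(ticks(run, c2)))


def coincidence(run: Run, c1: str, c2: str) -> bool:
    """c1 = c2: the clocks tick at exactly the same global times."""
    return ticks(run, c1) == ticks(run, c2)


def cause(run: Run, c1: str, c2: str) -> bool:
    """c1 ≼ c2: the k-th tick of c1 exists and is not later than the k-th tick of c2."""
    a, b = ticks(run, c1), ticks(run, c2)
    return len(a) >= len(b) and all(x <= y for x, y in zip(a, b))


def precedence(run: Run, c1: str, c2: str) -> bool:
    """c1 ≺ c2: as cause, but the k-th tick of c1 is strictly earlier."""
    a, b = ticks(run, c1), ticks(run, c2)
    return len(a) >= len(b) and all(x < y for x, y in zip(a, b))


if __name__ == "__main__":
    r = [{"a", "b"}, {"b"}, {"a", "b"}, {"c"}]
    print(is_run(r), subclock(r, "a", "b"), cause(r, "b", "a"))
```

On this encoding the interdependency law c′ ⊂ c″ ⊨ c″ ≼ c′ of Proposition 14 can be observed directly: whenever `subclock(r, x, y)` holds, `cause(r, y, x)` holds as well.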
We are planning to continue our research in the following areas:
– building an axiomatic theory of the semantic consequence for RCCSL constraints;
– extending the results to complete CCSL;
– studying an operational semantics of CCSL and specifying its interrelations with the denotational semantics.

References
1. André, C.: Syntax and Semantics of the Clock Constraint Specification Language (CCSL). Technical report, RR-6925, INRIA (2009), http://hal.inria.fr/inria-00384077/en/
2. André, C., Mallet, F., de Simone, R.: The Time Model of Logical Clocks available in the OMG MARTE profile. In: Shukla, S.K., Talpin, J.-P. (eds.) "Synthesis of Embedded Software: Frameworks and Methodologies Correctness by Construction", pp. 201–227. Springer Science+Business Media, LLC New York (2010)
3. Baer, J.-L.: Multiprocessing Systems. IEEE Trans. on Computers. 12, vol. C-25, 1271–1277 (1976)
4. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the internet of things. In: Proceedings of the first edition of the MCC workshop on Mobile cloud computing, pp. 13–16. ACM New York, NY, USA (2012)
5. Börger, E., Stärk, R.: Abstract State Machines: A Method for High-Level System Design and Analysis. Springer-Verlag, Berlin Heidelberg (2003)
6. Information technology – Syntactic metalanguage – Extended BNF. ISO/IEC 14977:1996(E)
7. Harzheim, E.: Ordered Sets. Springer Science+Business Media, Inc. New York (2005)
8. Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall International (1985)
9. Lamport, L.: Time, Clocks, and the Ordering of Events in a Distributed System. Comm. ACM. 7, vol. 12, 558–565 (1978)
10. Mallet, F.: Logical Time @ Work for the Modeling and Analysis of Embedded Systems, Habilitation thesis. LAMBERT Academic Publishing (2011)
11. Milner, R.: Communicating and Mobile Systems: The Pi Calculus. Cambridge University Press, Cambridge (1999)
12.
Information technology – Object Management Group – Object Constraint Language (OCL). ISO/IEC 19507:2012(E)
13. Nielsen, M., Plotkin, G., Winskel, G.: Petri nets, event structures and domains. Theor. Comp. Sc. 1, vol. 13, 85–108 (1981)
14. Process Algebra for Parallel and Distributed Processing. Alexander, M., Gardner, W. (eds), CRC Press (2009)
15. UML Profile for MARTE: Modeling and Analysis of Real-Time Embedded Systems. OMG (2011), http://www.omg.org/spec/MARTE/1.1/pdf/
16. OMG Unified Modeling Language™ (OMG UML), Infrastructure. OMG (2011), http://www.omg.org/spec/UML/2.4.1/Infrastructure
17. OMG Unified Modeling Language™ (OMG UML), Superstructure. OMG (2011), http://www.omg.org/spec/UML/2.4.1/Superstructure

Specializations and Symbolic Modeling
Vladimir Peschanenko¹, Anton Guba² and Constantin Shushpanov³
¹ Kherson State University, 27, 40 rokiv Zhovtnya str., Kherson, 73000 Ukraine, vladimirius@gmail.com
² Glushkov Institute of Cybernetics of NAS of Ukraine, 40, Glushkova ave., Kyiv, 03680, antonguba@ukr.net
³ LLC «Information Software Systems», 15, Bozhenko str., Kyiv, 03680 Ukraine, costa@iss.org.ua

Abstract. We present a technique that splits first-order logic formulae into parts, which makes it possible to use special algorithms for satisfiability checking and predicate transformation; these special algorithms are the specializations. We give a mathematical description of the algorithm for constructing specializations and a few particular approaches to them, which speed up the modeling of industrial models. We prove the correctness of the satisfiability and predicate transformer functions. We consider forward and backward applicability of basic protocols during symbolic modeling and verification. We introduce examples for each specialization. We report experiments with typical real examples.
Keywords. Symbolic modeling, satisfiability, predicate transformer
Key terms.
FormalMethod, MathematicalModeling, SoftwareComponent, VerificationProcess

1 Introduction
The technique of symbolic verification of requirement specifications of software systems has shown good results in the automatic detection of reachability of deadlocks and of violations of user-defined properties [1]. Previous works [2-4] considered symbolic models of systems, that is, transition systems whose symbolic states are represented by formulae of first-order logic. The transition relation between the formulae is determined and labelled by basic protocols, which are considered as actions performed in the system. A basic protocol is a formula of dynamic logic ∀x(α(x, a) → ⟨P(x, a)⟩ β(x, a)) and it describes local properties of the system in terms of the pre- and postconditions α and β. Both are formulae of first-order multisorted logic interpreted on a data domain, P is a process, represented by means of an MSC diagram, that describes the reaction of the system triggered by the precondition, x is a set of typed data variables, and a is a set of environment attributes. The general theory of basic protocols is presented in [5].
A transition is considered as an operator in the space of postcondition formulae. As the operator transforms one formula into another, the term "predicate transformer" is used in [6]. Thus, to compute transitions between the states of such models, basic protocols are interpreted as predicate transformers: for a given symbolic state of the system and a given basic protocol, the direct predicate transformer generates the next symbolic state as its strongest postcondition, and the backward predicate transformer generates the previous symbolic state as its weakest precondition.
These concepts have been implemented in the VRS (Verification of Requirement Specifications) system [7] and in the IMS (Insertion Modeling System) [8].
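The strongest-postcondition and weakest-precondition reading of a basic protocol can be illustrated on a small explicit-state model. The sketch below is not how VRS or IMS work: symbolic states are replaced by plain Python sets of concrete states, and the names `forward`, `backward`, `pre` and `step` are hypothetical stand-ins for the two predicate transformers, the precondition α and the assignment part of the postcondition β.

```python
from typing import Callable, FrozenSet, Iterable

# Assumption of this sketch: a state is just the value of one integer attribute j.
State = int


def forward(states: Iterable[State],
            pre: Callable[[State], bool],
            step: Callable[[State], State]) -> FrozenSet[State]:
    """Strongest postcondition: images of the given states that satisfy the precondition."""
    return frozenset(step(s) for s in states if pre(s))


def backward(universe: Iterable[State],
             post_states: Iterable[State],
             pre: Callable[[State], bool],
             step: Callable[[State], State]) -> FrozenSet[State]:
    """Weakest precondition inside `universe`: guarded states whose image lies in post_states."""
    targets = frozenset(post_states)
    return frozenset(s for s in universe if pre(s) and step(s) in targets)


if __name__ == "__main__":
    guard = lambda j: j > 0        # hypothetical precondition
    update = lambda j: j + 1       # hypothetical assignment j := j + 1
    nxt = forward({1, 2, 3}, guard, update)
    print(sorted(nxt), sorted(backward(range(6), nxt, guard, update)))
```

Working with formulae instead of enumerated sets is what makes the symbolic approach scale; the explicit sets here only serve to make the direction of each transformer visible.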
A number of papers with novel and very efficient techniques for computing satisfiability using SAT/SMT have been published in recent years, and some very efficient SMT tools are now available (e.g., BarceLogic [9], CVCLite/CVC/CVC4 [10,11,12], DLSAT [13], haRVey [14], MathSAT [15], SDSAT [16], TSAT++ [17], UCLID [18], Yices [19], Verifun [20], Zapato [21], Z3 [22]). A number of benchmarks, mostly derived from verification problems, are available at the SMT-LIB [23]. Workshops devoted to SMT and official competitions on SMT tools are run yearly.
All these tools can be configured with the help of many parameters, which switch particular techniques, tactics and heuristics on or off in order to gain performance. In the paper [24] the algorithm configuration problem is stated as follows: given an algorithm, a set of parameters for the algorithm, and a set of input data, find parameter values under which the algorithm achieves the best possible performance on the input data. This makes it possible to tune an algorithm automatically for performance on formulae of some theory.
Usually during the modeling of real projects we deal with complex environment states and simple formulae of basic protocols (pre- and postconditions). This means that we should check the satisfiability of the conjunction of the environment state and the precondition formula and transform this whole big formula with the help of the predicate transformer [6]. Obviously, manipulating whole formulae is not required in most cases.
For example, let i, j : int, f : int → int be attributes and f (i ) 0 f (0) 5 j 0 be an environment state, and 1 j : j 1 be a basic protocol. Let us apply this basic protocol to the environment state. First, the satisfiability of the conjunction of the basic protocol precondition and the environment state should be checked: f (i ) 0 f (0) 5 j 0 . This check should use the notion of functional symbols: (i = 0) → (f(i) = f(0)).
After that we should apply the basic protocol postcondition to the conjunction of the environment state and the precondition (see section Application of Basic Protocol):
(v : int)( f (i ) 0 f (0) 5 v 0 ( j v 1)) f (i ) 0 f (0) 5 (v : int)(v 0 ( j v 1)) f (i ) 0 f (0) 5 j 1
It is known that the basic protocol changes attribute j only (see the section about predicate transformers). This means that we could apply the basic protocol to the small part of the environment state that depends on j, rather than to the whole environment state formula. In this example it could be j 0 only. If there are no predicates in the project which compare values of attribute j with values of other attributes, then we can use special theories for manipulating such formulae. In this example numeric intervals could be used to represent the values of attribute j. We call such special theories specializations of the sat and pt functions according to our general algorithm.
So, the main goal of this paper is to present a mathematical description of the algorithm for constructing specializations and a few particular approaches to specialization which speed up the modeling of industrial models.
This paper is a continuation of [25], where only concrete values as a kind of specialization were described.
In Section 2 we describe the process of forward application of a basic protocol with the help of the satisfiability check and the forward predicate transformer. In Section 3 we present the applicability of basic protocols using satisfiability and the backward predicate transformer. The specializations by memory usage and functional symbols are proposed in Section 4. The results of experiments are discussed in Section 5. In Section 6 we summarize the advantages of using specializations and outline what could be done in the near future.
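The idea of handling the part of the state that depends only on j with numeric intervals can be sketched as follows. This is a minimal illustration under stated assumptions, not the specialization implemented by the authors: the class `Interval` and its methods are hypothetical, and because the comparison operators of the example did not survive in the text above, the sketch simply assumes that the relevant part of the state is j > 0 (that is, j ≥ 1 over the integers), that the precondition is trivially true, and that the protocol performs j := j + 1.

```python
from dataclasses import dataclass
from typing import Optional


def _pick(a: Optional[int], b: Optional[int], choose) -> Optional[int]:
    """Combine two optional bounds; None stands for 'no bound on this side'."""
    if a is None:
        return b
    if b is None:
        return a
    return choose(a, b)


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lo, hi]; a None bound means unbounded."""
    lo: Optional[int] = None
    hi: Optional[int] = None

    def is_sat(self) -> bool:
        """Interval counterpart of sat: the interval contains at least one integer."""
        return self.lo is None or self.hi is None or self.lo <= self.hi

    def meet(self, other: "Interval") -> "Interval":
        """Conjunction of two interval constraints on the same attribute."""
        return Interval(_pick(self.lo, other.lo, max), _pick(self.hi, other.hi, min))

    def shift(self, k: int) -> "Interval":
        """Interval counterpart of pt for the assignment j := j + k."""
        return Interval(None if self.lo is None else self.lo + k,
                        None if self.hi is None else self.hi + k)


if __name__ == "__main__":
    state_j = Interval(lo=1)          # assumed reading of the state: j > 0
    precondition = Interval()         # assumed precondition: true
    d = state_j.meet(precondition)
    if d.is_sat():
        print(d.shift(1))             # assumed update: j := j + 1
```

The point of the design is that neither `is_sat` nor `shift` ever looks at the rest of the environment state, so the cost of applying the protocol no longer depends on the size of the whole formula.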
2 Forward Application of Basic Protocol
Let S(a) be an environment state and ∀x(α(x, a) → ⟨P(x, a)⟩ β(x, a)) be a basic protocol, where x are the parameters of the basic protocol and a are the attributes of the model, and let D(x, a) = S(a) ∧ α(x, a) be the conjunction of the environment state and the precondition of the basic protocol.
At the first step of the application of a basic protocol, the satisfiability of the conjunction of the environment state and the precondition of the basic protocol is checked: sat(D(x, a)). If the formula is unsatisfiable, then the basic protocol is not applicable to the environment state S(a). Otherwise, the process P(x, a) is run and then the forward predicate transformer is applied: pt(D(x, a), β(x, a)).
The process P(x, a) is not considered in the paper, because the specialization aims to speed up the functions sat and pt.

2.1 Satisfiability
The satisfiability checking function sat is based on the Shostak method, adapted to a combination of numerical, symbolic and enumerated data types. Here an attribute expression is an attribute or a functional symbol with parameters. If all of the attribute expressions that are free in the formula S are simple, then for satisfiability checking it is sufficient to prove the validity of the closed formula ∃(a, x) D(x, a), where a is the set of all simple attributes which occur in S and x is the set of parameters of the basic protocol. For attribute expressions with parameters (including access functions to the elements of arrays), the Ackermann reduction of the uninterpreted functional symbols is used.
The Shostak method consists of the following. An expression of the form f(x) is called a functional expression if f is an attribute and x is a list of its parameters.
First, superpositions of functional expressions are eliminated by successively replacing every internal occurrence of f(x) by a new variable y, bound by an existential quantifier, and adding the equation y = f(x) to the formula. For example, the formula P(f(g(x))) is replaced by the formula ∃y(y = g(x) ∧ P(f(y))).
After all such replacements there are no complex functional expressions left in the formula. Further, for every attribute expression f of functional type all its occurrences f(x₁), ..., f(x_n) with different parameters x₁, ..., x_n are considered. Each occurrence f(x_i) is replaced by a variable y_i, bound by an existential quantifier, and the substitutive equations (x_i = x_j) → (y_i = y_j) are added. Now the formula contains only simple attributes, and the method considered in [26] is used.

2.2 Forward Predicate Transformer
In the general case, the postcondition has the form β(x, a) = R(x, a) ∧ C(x, a), where R = (r₁ := t₁ ∧ r₂ := t₂ ∧ ...) is a conjunction of assignments and C(x, a) is the formula part of the postcondition. We will consider three sets of functional expressions (attributes are considered as functional expressions of arity 0): r, s and z. The set r = (r₁, r₂, ...) consists of the left parts of the assignments, and also of other functional expressions that recursively depend on the left parts. In other words, r contains the left parts of the assignments and, if some functional expression f is included in this set, then all functional expressions in which f occurs are also included in r. The set s = (s₁, s₂, ...) consists of the functional expressions which have external occurrences (not in arguments of functional expressions) in the formula part C of the postcondition, but do not coincide with expressions from the set r. Finally, the set z = (z₁, z₂, ...)
consists of functional expressions which have external occurrences in the formula $D$, in the right parts of assignments, and internal occurrences (in the arguments of functional expressions) in the functional expressions of the formula part $C$ of the postcondition and of the left parts of assignments, but which are not included in the two other sets (including the parameters of the basic protocol). Now, considering the formulae from which the postcondition and the formula $D$ are constructed as functions of the external occurrences of the elements of these sets, we get a presentation of the postcondition in the following form:

$\beta(r,s,z) = (r_1(r,s,z) := t_1(r,s,z) \wedge r_2(r,s,z) := t_2(r,s,z) \wedge \ldots) \wedge C(r,s,z)$.

The predicate transformer is determined by the following formula:

$pt(D(r,s,z), \beta(r,s,z)) = q_1 \vee q_2 \vee \ldots$, where

$q_i = \exists(u,v)(D(u,v,z) \wedge R(u,v,z) \wedge E_i(u,v,z) \wedge C(r,s,z))$,

$R(u,v,z) = (r_1(u,v,z) = t_1(u,v,z)) \wedge (r_2(u,v,z) = t_2(u,v,z)) \wedge \ldots$

The formula $R(u,v,z)$ is the quantifier-free part of the assignment formula. The set of variables $u$ (respectively $v$) contains a new variable for each attribute expression from the set $r$ (respectively $s$). The function pt replaces the attribute expressions from $r$ ($s$) with the variables from $u$ ($v$) in the corresponding parts of the formula.

Each disjunctive member $q_i$ corresponds to one possible identification of the functional expressions occurring in the formulae $\beta(x,a)$, and $E_i(u,v,z)$ is a set of equalities and inequalities corresponding to this identification. To describe the construction of $E_i(u,v,z)$ we consider the set $M$ of all pairs of functional expressions of the form $(f(k), f(l))$, $k = (k_1, k_2, \ldots)$, $l = (l_1, l_2, \ldots)$, where $f(k)$ is chosen from the set $z$ and $f(l)$ from the sets $r$ and $s$. These functional expressions shall be equal if their arguments were equal before application of basic protocol.
Let us choose an arbitrary subset $N \subseteq M$ (including the empty set). For every pair $(f(k), f(l)) \in N$ we consider the conjunction of equalities $k = l$, that is, $(k_1 = l_1 \wedge k_2 = l_2 \wedge \ldots)$. We unite all such conjunctions into one and add to it, conjunctively, the negations of all equalities that correspond to the pairs not included in the set $N$. We denote the obtained formula by $G_i(r,s,z)$. If this formula is satisfiable, then the choice is successful. In that case $f(k)$ is not independent and shall change its value, because $G_i(r,s,z)$ is true. Thus, $f(k)$ shall change its value in the same way as $f(l)$. Set $E_i(r,s,z) = G_i(r,s,z) \wedge H_i(z,u,v)$, where $H_i(z,u,v)$ is a conjunction of equalities $f(k) = w$ if a variable $w$ corresponds to $f(l)$. Thus, if $f(k)$ coincides with several functional expressions, it does not matter which variable is chosen (transitivity of equality) [6].

3 Backward Application of Basic Protocol

Let $S(a)$ be an environment state after the application of the basic protocol $\forall x(\alpha(x,a) \to \langle P(x,a)\rangle\,\beta(x,a))$, where $x$ are the parameters of the basic protocol, $a$ are the attributes of the model, $\beta(x,a) = R(x,a) \wedge C(x,a)$, where $R = (r_1 := t_1 \wedge r_2 := t_2 \wedge \ldots)$ is a conjunction of assignments and $C$ is the formula part of the postcondition, and $D(x,a) = S(a) \wedge C(x,a)$ is the conjunction of the environment state and the formula part of the postcondition of the basic protocol.

3.1 Satisfiability

At the first step of applying a basic protocol in backward mode, the satisfiability of the conjunction of the environment state and the formula part of the postcondition is checked: $sat(D(x,a))$. If the formula is unsatisfiable, then the basic protocol is not applicable to the environment state $S(a)$. Otherwise, the process $P(x,a)$ is run and then the backward predicate transformer is applied: $pt^{-1}(D(x,a), \beta(x,a))$.
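The construction of the identification formulae $G_i$ from Section 2.2, which is reused by the backward predicate transformer, can be sketched as a plain enumeration. The following Python fragment is only an illustration under stated assumptions: arguments are variable names ranging over an infinite domain such as the integers, satisfiability of $G_i$ is decided with a small union-find structure rather than with the sat function of the paper, and the name `identification_cases` is hypothetical.

```python
from itertools import combinations


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def identification_cases(pairs):
    """Return every subset N of the pair list M whose formula G is satisfiable.

    pairs: list of (k, l), where k and l are equal-length tuples of variable
    names, i.e. the argument lists of f(k) and f(l).
    Assumption: variables range over an infinite domain, so distinct
    equivalence classes can always be given distinct values.
    """
    indices = range(len(pairs))
    cases = []
    for size in range(len(pairs) + 1):
        for chosen in combinations(indices, size):
            parent = {}
            for k, l in pairs:
                for name in k + l:
                    parent.setdefault(name, name)
            # Equalities k = l for the pairs placed in N.
            for idx in chosen:
                k, l = pairs[idx]
                for a, b in zip(k, l):
                    parent[find(parent, a)] = find(parent, b)
            # Every pair outside N needs at least one argument that differs.
            satisfiable = all(
                any(find(parent, a) != find(parent, b)
                    for a, b in zip(*pairs[idx]))
                for idx in indices if idx not in chosen
            )
            if satisfiable:
                cases.append(set(chosen))
    return cases


# The pair (f(i), f(j)) yields two cases: i != j (empty N) and i = j.
single = identification_cases([(("i",), ("j",))])
# Three mutually comparable occurrences f(i), f(j), f(k).
triple = identification_cases([
    (("i",), ("j",)),
    (("i",), ("k",)),
    (("j",), ("k",)),
])
```

For three occurrences only five of the eight subsets survive, because choosing two of the equalities forces the third by transitivity.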
3.2 Backward Predicate Transformer

A backward predicate transformer considers three sets of functional expressions $r$, $s$ and $z$, as the forward one does. The postcondition of the basic protocol is represented by the following formula:

$\beta(r,s,z) = (r_1(r,s,z) := t_1(r,s,z) \wedge r_2(r,s,z) := t_2(r,s,z) \wedge \ldots) \wedge C(r,s,z)$

A backward predicate transformer is determined by the following formula:

$pt^{-1}(D(r,s,z), \beta(r,s,z)) = q_1^{-1} \vee q_2^{-1} \vee \ldots$, where

$q_i^{-1} = \exists(u,v)(D(u,v,z) \wedge R(u,r,s,z) \wedge E_i(u,v,z)) \wedge \alpha(r,s,z)$,

$R(u,r,s,z) = (u_1(r,s,z) = t_1(r,s,z)) \wedge (u_2(r,s,z) = t_2(r,s,z)) \wedge \ldots$, $u = \{u_1, u_2, \ldots\}$.

Each disjunctive member $q_i^{-1}$ corresponds to one possible identification of the functional expressions occurring in the formulae $\beta(x,a)$ and the environment state $S(a)$, where $E_i(u,v,z)$ are the sets of equalities and inequalities corresponding to this identification. The formula $E_i(u,v,z)$ is built in the same way as in the forward predicate transformer [27].

4 Specialization

We propose to use two types of specializations:

1. Specialization by memory usage
2. Specialization by functional symbol

4.1 Specialization by memory usage

Let $a_1, a_2$ be sets of attributes from the initial environment state such that $a_1 \cap a_2 = \emptyset$ and $a_1 \cup a_2 = a$. Let $S(a) = S_1(a_1) \wedge S_2(a_2)$ be the environment state and $B(x,a) = \forall x(\alpha_1(x_1,a_1) \wedge \alpha_2(x_2,a_2) \to \langle P(x,a)\rangle\,\beta_1(x_1,a_1) \wedge \beta_2(x_2,a_2))$ be the basic protocol, where $x_1 \cap x_2 = \emptyset$ and $x_1 \cup x_2 = x$. If $B(x,a) = \forall x(\bigvee_i \alpha_i(x,a) \to \langle P(x,a)\rangle\,\beta(x,a))$, then $sat(S(a) \wedge (\bigvee_i \alpha_i(x,a))) = \bigvee_i sat(S(a) \wedge \alpha_i(x,a))$ and $pt(S(a) \wedge (\bigvee_i \alpha_i(x,a)), \beta(x,a)) = \bigvee_i pt(S(a) \wedge \alpha_i(x,a), \beta(x,a))$. So, in what follows we consider only basic protocols of the form $B(x,a)$.

4.2 Theorem 1

$sat(S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2)) = sat(S_1(a_1) \wedge \alpha_1(x_1,a_1)) \wedge sat(S_2(a_2) \wedge \alpha_2(x_2,a_2))$

Proof. The function sat builds a closed formula. So,

$sat(S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2)) = \exists(v_1,v_2,x_1,x_2)(S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2))$,

where $v_1, v_2$ are the variables generated for the attribute expressions which depend on the attributes $a_1, a_2$. It is known that $a_1 \cap a_2 = \emptyset$, $a_1 \cup a_2 = a$, $x_1 \cap x_2 = \emptyset$, $x_1 \cup x_2 = x$. This means that the scope of the quantifiers can be narrowed:

$\exists(v_1,v_2,x_1,x_2)(S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2)) = \exists(v_1,x_1)(S_1(a_1) \wedge \alpha_1(x_1,a_1)) \wedge \exists(v_2,x_2)(S_2(a_2) \wedge \alpha_2(x_2,a_2)) = sat(S_1(a_1) \wedge \alpha_1(x_1,a_1)) \wedge sat(S_2(a_2) \wedge \alpha_2(x_2,a_2))$

The theorem is proved.

This theorem means the following:

1. If $S(a) = S_1(a_1) \wedge S_2(a_2)$, $\alpha(a,x) = \alpha_1(x_1,a_1)$ and $S(a)$ is satisfiable, then to check the satisfiability of $S(a) \wedge \alpha(x,a)$ it is enough to check the satisfiability of the conjunction $S_1(a_1) \wedge \alpha_1(x_1,a_1)$. Checking the satisfiability of $S_2(a_2)$ is not required.
2. The checks of the individual parts $sat(S_i(a_i) \wedge \alpha_i(x_i,a_i))$ can be done concurrently.

This case is easily generalized to the case of $a_1, \ldots, a_n$: if it is possible to build subsets $a_i^1, a_i^2 \subseteq a_i$ with $a_i^1 \cap a_i^2 = \emptyset$, $a_i^1 \cup a_i^2 = a_i$ and to split the environment state and the basic protocol according to Theorem 1, then $sat(\bigwedge_i (S_i(a_i) \wedge \alpha_i(x_i,a_i))) = \bigwedge_i sat(S_i(a_i) \wedge \alpha_i(x_i,a_i))$. So, whenever we speak below about such a pair of sets $a_i^1, a_i^2 \subseteq a_i$ with $a_i^1 \cap a_i^2 = \emptyset$, $a_i^1 \cup a_i^2 = a_i$, we understand that the statement is applicable to $n$ sets as well. Let us see how the forward and backward predicate transformers can be applied.

4.3 Theorem 2

For forward application of a basic protocol it is true that:

$pt(S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2), \beta_1(x_1,a_1) \wedge \beta_2(x_2,a_2)) = pt(S_1(a_1) \wedge \alpha_1(x_1,a_1), \beta_1(x_1,a_1)) \wedge pt(S_2(a_2) \wedge \alpha_2(x_2,a_2), \beta_2(x_2,a_2))$

Proof.
The function pt builds the sets $r$, $s$, $z$ from the postcondition $\beta_1(x_1,a_1) \wedge \beta_2(x_2,a_2)$ and the formula $S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2)$, where $r$ is the set of attribute expressions from the left parts of the assignments of the postcondition, $s$ is the set of attribute expressions from the formula part of the postcondition, and $z$ is the set of the other attribute expressions from the formula and the postcondition. We know that the sets of attribute expressions from the pairs $S_1(a_1) \wedge \alpha_1(x_1,a_1)$, $\beta_1(x_1,a_1)$ and $S_2(a_2) \wedge \alpha_2(x_2,a_2)$, $\beta_2(x_2,a_2)$ do not intersect. This means that we can split each of the sets $r$, $s$, $z$ into subsets $r = r_1 \cup r_2$, $s = s_1 \cup s_2$, $z = z_1 \cup z_2$ with $r_1 \cap r_2 = \emptyset$, $s_1 \cap s_2 = \emptyset$, $z_1 \cap z_2 = \emptyset$, because $a_1 \cap a_2 = \emptyset$. Let us write the formula which is built by the pt function. Let $D(a,x) = D_1(x_1,a_1) \wedge D_2(x_2,a_2)$, $D_1(x_1,a_1) = S_1(a_1) \wedge \alpha_1(x_1,a_1)$, $D_2(x_2,a_2) = S_2(a_2) \wedge \alpha_2(x_2,a_2)$ and $\beta_1(x_1,a_1) = R_1(r_1,s_1,z_1) \wedge C_1(r_1,s_1,z_1)$, $\beta_2(x_2,a_2) = R_2(r_2,s_2,z_2) \wedge C_2(r_2,s_2,z_2)$. So, the general formula of the predicate transformer is the following:

$\bigvee_i q_i = \exists(u,v)(D(u,v,z) \wedge R(u,v,z) \wedge (\bigvee_i E_i(u,v,z)) \wedge C(r,s,z))$,

where $R(u,v,z) = R_1(u_1,v_1,z_1) \wedge R_2(u_2,v_2,z_2)$ and $C(r,s,z) = C_1(r_1,s_1,z_1) \wedge C_2(r_2,s_2,z_2)$, because $\beta(a,x) = \beta_1(x_1,a_1) \wedge \beta_2(x_2,a_2)$.

Let us write in detail how to obtain $\bigvee_i E_i(u,v,z)$. It is known that $r_1 \cap r_2 = \emptyset$, $s_1 \cap s_2 = \emptyset$, $z_1 \cap z_2 = \emptyset$. To build such a disjunction we should take into account all pairs of functional attribute expressions from the sets $r$, $s$ and $z$. This means that each such pair should be in the set of attributes $(r_1 \cup s_1; z_1)$ or $(r_2 \cup s_2; z_2)$.
So, $\bigvee_i E_i(u,v,z) = (\bigvee_{i_1} E_{i_1}(u_1,v_1,z_1)) \wedge (\bigvee_{i_2} E_{i_2}(u_2,v_2,z_2))$.

Let us consider the formula of the predicate transformer:

$pt(S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2), \beta_1(x_1,a_1) \wedge \beta_2(x_2,a_2)) = \bigvee_i q_i$
$= \exists(u,v)(D(u,v,z) \wedge R(u,v,z) \wedge (\bigvee_i E_i(u,v,z)) \wedge C(r,s,z))$
$= \exists(u_1,u_2,v_1,v_2)(D_1(u_1,v_1,z_1) \wedge D_2(u_2,v_2,z_2) \wedge R_1(u_1,v_1,z_1) \wedge R_2(u_2,v_2,z_2) \wedge (\bigvee_{i_1} E_{i_1}(u_1,v_1,z_1)) \wedge (\bigvee_{i_2} E_{i_2}(u_2,v_2,z_2)) \wedge C_1(r_1,s_1,z_1) \wedge C_2(r_2,s_2,z_2))$
$= \exists(u_1,v_1)(D_1(u_1,v_1,z_1) \wedge R_1(u_1,v_1,z_1) \wedge (\bigvee_{i_1} E_{i_1}(u_1,v_1,z_1)) \wedge C_1(r_1,s_1,z_1)) \wedge \exists(u_2,v_2)(D_2(u_2,v_2,z_2) \wedge R_2(u_2,v_2,z_2) \wedge (\bigvee_{i_2} E_{i_2}(u_2,v_2,z_2)) \wedge C_2(r_2,s_2,z_2))$
$= pt(D_1(x_1,a_1), \beta_1(x_1,a_1)) \wedge pt(D_2(x_2,a_2), \beta_2(x_2,a_2))$

The theorem is proved.

4.4 Theorem 3

For backward mode it is true that:

$pt^{-1}(S_1(a_1) \wedge C_1(r_1,s_1,z_1) \wedge S_2(a_2) \wedge C_2(r_2,s_2,z_2), \beta_1(x_1,a_1) \wedge \beta_2(x_2,a_2)) = pt^{-1}(S_1(a_1) \wedge C_1(r_1,s_1,z_1), \beta_1(x_1,a_1)) \wedge pt^{-1}(S_2(a_2) \wedge C_2(r_2,s_2,z_2), \beta_2(x_2,a_2))$

Proof. $R(u,v,z) = R(u_1,v_1,z_1) \wedge R(u_2,v_2,z_2)$ and $C(r,s,z) = C_1(r_1,s_1,z_1) \wedge C_2(r_2,s_2,z_2)$, because $\beta(a,x) = \beta_1(a_1,x_1) \wedge \beta_2(a_2,x_2)$. From the previous theorem, $\bigvee_i E_i(u,v,z) = (\bigvee_{i_1} E_{i_1}(u_1,v_1,z_1)) \wedge (\bigvee_{i_2} E_{i_2}(u_2,v_2,z_2))$. Then

$pt^{-1}(S(a) \wedge C(r,s,z), \beta(r,s,z)) = \bigvee_i q_i^{-1}$
$= \exists(u,v)(S(a) \wedge C(r,s,z) \wedge R(u,r,s,z) \wedge \bigvee_i E_i(u,v,z)) \wedge \alpha(r,s,z)$
$= \exists(u_1,u_2,v_1,v_2)(S_1(r_1,s_1,z_1) \wedge S_2(r_2,s_2,z_2) \wedge C_1(r_1,s_1,z_1) \wedge C_2(r_2,s_2,z_2) \wedge (\bigvee_{i_1} E_{i_1}(u_1,v_1,z_1)) \wedge (\bigvee_{i_2} E_{i_2}(u_2,v_2,z_2))) \wedge \alpha_1(r_1,s_1,z_1) \wedge \alpha_2(r_2,s_2,z_2)$
$= \exists(u_1,v_1)(S_1(r_1,s_1,z_1) \wedge C_1(r_1,s_1,z_1) \wedge (\bigvee_{i_1} E_{i_1}(u_1,v_1,z_1))) \wedge \alpha_1(r_1,s_1,z_1) \wedge \exists(u_2,v_2)(S_2(r_2,s_2,z_2) \wedge C_2(r_2,s_2,z_2) \wedge (\bigvee_{i_2} E_{i_2}(u_2,v_2,z_2))) \wedge \alpha_2(r_2,s_2,z_2)$
$= pt^{-1}(S_1(a_1) \wedge C_1(r_1,s_1,z_1), \beta_1(x_1,a_1)) \wedge pt^{-1}(S_2(a_2) \wedge C_2(r_2,s_2,z_2), \beta_2(x_2,a_2))$

The theorem is proved.
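The decomposition stated by Theorems 1 and 2 can be illustrated on explicit finite state sets. The sketch below is not the symbolic algorithm of the paper: it assumes a small hypothetical integer domain, models sat as non-emptiness of a state set and pt as the image of a state set under the assignments, and uses a made-up second protocol part (`j := 2 * j`) purely for illustration.

```python
from itertools import product

DOMAIN = range(-3, 4)  # hypothetical small domain for the attributes i and j


def states(pred, n_vars):
    """All tuples over DOMAIN that satisfy the predicate."""
    return {s for s in product(DOMAIN, repeat=n_vars) if pred(*s)}


def sat(state_set):
    """Explicit-state stand-in for sat: is there at least one state?"""
    return len(state_set) > 0


def pt(state_set, assign):
    """Explicit-state stand-in for pt: image of the set under the assignment."""
    return {assign(*s) for s in state_set}


# Part 1 uses only attribute i: S1 = (i < 2), alpha1 = (i > 0), i := i + 1.
d1 = states(lambda i: i < 2 and i > 0, 1)
post1 = pt(d1, lambda i: (i + 1,))

# Part 2 uses only attribute j: S2 = (j >= 0), alpha2 = (j != 1), j := 2 * j.
d2 = states(lambda j: j >= 0 and j != 1, 1)
post2 = pt(d2, lambda j: (2 * j,))

# Monolithic computation over both attributes at once.
d = states(lambda i, j: (i < 2 and i > 0) and (j >= 0 and j != 1), 2)
post = pt(d, lambda i, j: (i + 1, 2 * j))

# Recombination of the two independently computed parts.
split_sat = sat(d1) and sat(d2)
split_post = {a + b for a in post1 for b in post2}
```

Because the two parts share no attributes, the monolithic results coincide with the recombined ones, and the two parts could be computed concurrently.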
Theorem 2 and Theorem 3 mean that:

1. The functions $pt$ and $pt^{-1}$ can be applied separately and concurrently.
2. If the postcondition contains $\beta_1(x_1,a_1)$ only, then

$pt(S_1(a_1) \wedge \alpha_1(x_1,a_1) \wedge S_2(a_2) \wedge \alpha_2(x_2,a_2), \beta_1(x_1,a_1)) = S_2(a_2) \wedge \alpha_2(x_2,a_2) \wedge pt(S_1(a_1) \wedge \alpha_1(x_1,a_1), \beta_1(x_1,a_1))$

$pt^{-1}(S_1(a_1) \wedge C_1(r_1,s_1,z_1) \wedge S_2(a_2) \wedge C_2(r_2,s_2,z_2), \beta_1(x_1,a_1)) = S_2(a_2) \wedge C_2(r_2,s_2,z_2) \wedge pt^{-1}(S_1(a_1) \wedge C_1(r_1,s_1,z_1), \beta_1(x_1,a_1))$

So, the functions $sat(D_i(x_i,a_i))$ and $pt(D_i(x_i,a_i), \beta_i(x_i,a_i))$ are called specializations, because special theories can be used to implement them.

5 Examples of Usage of Specializations

5.1 Examples of Specializations by Memory Usage

Example 1. Concrete values. Let $S(a) = (i = 2) \wedge S(a/i)$ be an environment state, where $i : int$ and $a/i$ is the set of all attributes in the model except $i$, and let $b = \forall x((i > 0) \to (i := i + 1))$. To apply this basic protocol we should check the satisfiability of the formula $sat((i = 2) \wedge (i > 0)) = 1$, and the postcondition should be applied to $(i = 2)$: $\exists v((v = 2) \wedge (v > 0) \wedge (i = v + 1)) = (i = 3)$. For such examples a direct translation into C++ can be used instead of special theories. It works much faster because it requires no additional checking, only direct translation into C++ code and its compilation.

Example 2. Let $S(a) = (i < 2) \wedge S(a/i)$ be an environment state, where $i : int$ and $a/i$ is the set of all attributes in the model except $i$, and let $b = \forall x((i > 0) \to (i := i + 1))$. To apply this basic protocol we should check the satisfiability of the formula $sat((i < 2) \wedge (i > 0)) = 1$, and the postcondition should be applied to $(i < 2)$: $\exists v((v < 2) \wedge (v > 0) \wedge (i = v + 1)) = (i > 1) \wedge (i < 3)$. For such examples numerical intervals can be used. So, $S(a) = (i \in (-\infty; 2)) \wedge S(a/i)$, $b = \forall x((i \in (0; +\infty)) \to (i := i + 1))$. Satisfiability checking then amounts to intersecting two numerical intervals: $i \in (-\infty; 2) \wedge i \in (0; +\infty) \Rightarrow i \in (0; 2) \Rightarrow i \in [1; 1]$ for integers.
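The interval computation of Example 2, together with the effect of the assignment $i := i + 1$ on the resulting interval, can be sketched in a few lines of Python. This is a minimal illustration assuming closed integer intervals represented as `(lo, hi)` pairs with infinite bounds from `math.inf`; the helper names are hypothetical and do not come from the authors' implementation.

```python
import math


def open_to_closed_int(lo, hi):
    """Convert the open real interval (lo; hi) to a closed integer interval."""
    new_lo = -math.inf if lo == -math.inf else math.floor(lo) + 1
    new_hi = math.inf if hi == math.inf else math.ceil(hi) - 1
    return (new_lo, new_hi)


def intersect(a, b):
    """Intersect two closed intervals; None stands for 'unsatisfiable'."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def shift(interval, delta):
    """Image of an interval under the assignment i := i + delta."""
    return (interval[0] + delta, interval[1] + delta)


state = open_to_closed_int(-math.inf, 2)  # i in (-inf; 2)
guard = open_to_closed_int(0, math.inf)   # i in (0; +inf)
enabled = intersect(state, guard)         # satisfiability check
after = shift(enabled, 1) if enabled is not None else None
```

An empty intersection plays the role of an unsatisfiable formula, so no general decision procedure or quantifier elimination is needed for attributes handled this way.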
Application of pt creates the following formula: $\exists v((v \in [1; 1]) \wedge (i = v + 1)) \Rightarrow i - 1 \in [1; 1] \Rightarrow i \in [2; 2]$. This approach works faster than general satisfiability checking and quantifier elimination. It can be used for all numeric and enumerated types.

5.2 Examples of Specializations by Functional Symbol

It is not always possible to represent the environment state and the basic protocols in the form $S(a) = S_1(a_1) \wedge S_2(a_2)$ and $B(a,x) = \forall x(\alpha_1(x_1,a_1) \wedge \alpha_2(x_2,a_2) \to \langle P(x,a)\rangle\,\beta_1(x_1,a_1) \wedge \beta_2(x_2,a_2))$, where $a_1 \cap a_2 = \emptyset$, $a_1 \cup a_2 = a$, $x_1 \cap x_2 = \emptyset$, $x_1 \cup x_2 = x$. One such situation occurs when the value of a functional attribute expression and its parameter have different types and belong to different subsets $a_i$. For example, let the attributes $i, j : int$ and $f : int \to T$ be defined, where $T : (c_1, c_2, c_3)$ is an enumerated type with three enumerated constants $c_1, c_2, c_3$. Then the formula $(f(i) = c_1) \wedge i > 0$ can be represented with specializations as follows: $(f : v_1 = f(i)) \wedge (v_1 = c_1) \wedge (i > 0)$. Let $b = 1 \to (f(j) := c_2)$ be a basic protocol. Its specialized representation is $b = 1 \to (f : v_1 = f(j)) \wedge (v_1 := c_2) \wedge 1$. Such data structures have to be merged for the pt function, which should consider all pairs of functional attribute expressions from the sets $r$, $s$ and $z$: $(f : v_1 = f(i)) \cup (f : v_1 = f(j)) = (f : v_1 = f(i), v_2 = f(j))$. After that the basic protocol should be transformed into the following form: $b = 1 \to (f : v_2 = f(j)) \wedge (v_2 := c_2) \wedge 1$. Two possible combinations have to be taken into account: $(i = j) \vee (i \neq j)$.
So, we obtain:

$pt((f : v_1 = f(i), v_2 = f(j)) \wedge (v_1 = c_1) \wedge i > 0,\ v_2 := c_2)$
$= (f : v_1 = f(i), v_2 = f(j)) \wedge (\exists v((i = j) \wedge (v = c_1) \wedge (v_2 = c_2)) \vee \exists v((i \neq j) \wedge (v_1 = c_1) \wedge (v_2 = c_2))) \wedge i > 0$
$= (f : v_1 = f(i), v_2 = f(j)) \wedge (v_2 = c_2) \wedge i > 0 \wedge (i = j) \vee (f : v_1 = f(i), v_2 = f(j)) \wedge (v_1 = c_1) \wedge (v_2 = c_2) \wedge i > 0 \wedge (i \neq j)$
$= (f : v_1 = f(i)) \wedge (v_1 = c_2) \wedge i > 0 \wedge (i = j) \vee (f : v_1 = f(i), v_2 = f(j)) \wedge (v_1 = c_1) \wedge (v_2 = c_2) \wedge i > 0 \wedge (i \neq j)$

Let $S(a) = F(f_1, f_2, \ldots, v_1, v_2, a_1, a_2) \wedge S_1(a_1) \wedge S_2(a_2)$ be an environment state, where $f_1 \neq f_2 \neq \ldots$ are the names of the functional symbols, $v_1, v_2$ are the variables for each functional attribute expression from the sets $a_1, a_2$ correspondingly, and

$F(f_1, f_2, \ldots, v_1, v_2, a_1, a_2) = (f_1 : v_1^1 = f_1(t_{11}^1, t_{12}^1, \ldots), v_1^2 = f_1(t_{11}^2, t_{12}^2, \ldots), \ldots, f_2 : v_2^1 = f_2(t_{21}^1, t_{22}^1, \ldots), v_2^2 = f_2(t_{21}^2, t_{22}^2, \ldots), \ldots)$,

where $v_1^1, v_1^2 \in a_{f_1}$, $v_2^1, v_2^2 \in a_{f_2}, \ldots$ are variables of the type of the functional names $f_1, f_2, \ldots$ for each attribute expression, and $a_{f_i}$ is the set of attributes $a_j$ such that $f_i \in a_j$. The corresponding arguments of all functional expressions with the same name are in one specialization, and Shostak's method can be applied to each right part of an equation in $F$.

Let $S(a) = F(f_1, f_2, \ldots, v_1, v_2, a_1, a_2) \wedge S_1(a_1) \wedge S_2(a_2)$ and $b(a) = \forall x(F_b(f_1, f_2, \ldots, v_1, v_2, a_1, a_2, x_1, x_2) \wedge \alpha_1(v_1, x_1, a_1) \wedge \alpha_2(v_2, x_2, a_2) \to \langle P(a,x)\rangle\,\beta_1(v_1, x_1, a_1) \wedge \beta_2(v_2, x_2, a_2))$.

5.3 Theorem 4

$sat(S(a) \wedge \alpha(x,a)) = sat(\bigwedge_{(i,k,l)} ((f_i(t_{i1}^k, t_{i2}^k, \ldots) = f_i(t_{i1}^l, t_{i2}^l, \ldots)) \to (v_i^k = v_i^l)) \wedge S_1(v_1, a_1) \wedge \alpha_1(v_1, x_1, a_1) \wedge S_2(v_2, a_2) \wedge \alpha_2(v_2, x_2, a_2)) = \bigvee_i sat(q_i \wedge S_i(v_i, a_i) \wedge \alpha_i(v_i, a_i, x_i))$,

where $f_i(t_{i1}^k, t_{i2}^k, \ldots) = f_i(t_{i1}^l, t_{i2}^l, \ldots)$ is the equality of the arguments of the functional attribute expressions.

Proof. Let us define $F(f_1, f_2, \ldots, v_1, v_2, a_1, a_2, x_1, x_2) = F(f_1, f_2, \ldots, v_1, v_2, a_1, a_2) \cup F_b(f_1, f_2, \ldots, v_1, v_2, a_1, a_2, x_1, x_2)$.
We combine all equations with the same name of functional symbol $f_i$ and rename the variables of the equations that come from the basic protocol after this union. After that we obtain the sets of variables $v_1, v_2$ and a new basic protocol $b(a) = \forall x(F_b(f_1, f_2, \ldots, v_1, v_2, a_1, a_2, x_1, x_2) \wedge \alpha_1(v_1, x_1, a_1) \wedge \alpha_2(v_2, x_2, a_2) \wedge \ldots \to \langle P(a,x)\rangle\,\beta_1(v_1, x_1, a_1) \wedge \beta_2(v_2, x_2, a_2))$. For satisfiability checking we should add the corresponding implication for each pair of equations from $F(f_1, f_2, \ldots, v_1, v_2, a_1, a_2, x_1, x_2)$ with the same name of functional symbol $f_i$:

$\bigwedge_{(i,k,l)} ((f_i(t_{i1}^k, t_{i2}^k, \ldots) = f_i(t_{i1}^l, t_{i2}^l, \ldots)) \to (v_i^k = v_i^l)) = \bigwedge_{(i,k,l)} (\neg(f_i(t_{i1}^k, t_{i2}^k, \ldots) = f_i(t_{i1}^l, t_{i2}^l, \ldots)) \vee (v_i^k = v_i^l))$

The left and right parts of each equation, and the negations of the equations, are in the same specialization. This means that we can build a disjunction of conjunctions here. Each conjunct in this disjunction is a $q_i$ that fits one form of our specializations. So, satisfiability can be checked in the following form: $\bigvee_i sat(q_i \wedge S_i(v_i, a_i) \wedge \alpha_i(v_i, a_i, x_i))$.

The theorem is proved.

5.4 Theorem 5

$pt(S(a) \wedge \alpha(x,a), \beta(x,a)) = \bigvee_i (pt(E_1^i(v_1, x_1, a_1), S_1(v_1, a_1) \wedge \alpha_1(v_1, x_1, a_1), \beta_1(x_1, a_1)) \wedge pt(E_2^i(v_2, x_2, a_2), S_2(v_2, a_2) \wedge \alpha_2(v_2, x_2, a_2), \beta_2(x_2, a_2)))$,

where $pt(E_j^i(v_j, x_j, a_j), S_j(v_j, a_j) \wedge \alpha_j(v_j, x_j, a_j), \beta_j(x_j, a_j)) = \bigvee_k q_k$,

$q_k = \exists(u,v)(S_j(u,v,z) \wedge \alpha_j(u,v,z) \wedge R(u,v,z) \wedge E_j^i(u,v,z) \wedge E_k(u,v,z) \wedge C(r,s,z))$.

Proof. The sets $v_1, v_2$ are built in the same way as in Theorem 4. Let us consider the general formula of the predicate transformer:

$pt(D(r,s,z), \beta(r,s,z)) = \exists(u,v)(D(u,v,z) \wedge R(u,v,z) \wedge \bigvee_i E_i(u,v,z) \wedge C(r,s,z))$.

The coefficient $\bigvee_i E_i(u,v,z)$ is a disjunction of conjunctions of all possible matchings of the functional attribute expressions from the sets $r$, $s$ and $z$. So, we can present it as a conjunction of two disjunctions: $\bigvee_i E_i(u,v,z) = (\bigvee_k E_k(u,v,z)) \wedge (\bigvee_l E_l(u,v,z))$, where $\bigvee_k E_k(u,v,z)$ is the disjunction for the matchings of the functional attribute expressions whose parameters and values are from different sets $a_j$, and $\bigvee_l E_l(u,v,z)$ is the disjunction for the matchings of the other functional attribute expressions. Each conjunct of such a disjunction can be considered as a conjunction which depends on different sets of memory $a_j$. This means that the disjunction of conjunctions $\bigvee_k E_k(u,v,z)$ can be prepared early, before calling any pt function, without the corresponding substitution of $x$, $y$. So, $\bigvee_k E_k(u,v,z) = \bigvee_k E_1^k(v_1, x_1, a_1) \wedge E_2^k(v_2, x_2, a_2)$. The disjunction $\bigvee_l E_l(u,v,z)$ can be presented in the same way. So, the theorem is proved.

5.5 Theorem 6

$pt^{-1}(S(a) \wedge C(r,s,z), \beta(x,a)) = \bigvee_i (pt^{-1}(E_1^i(v_1, x_1, a_1), S_1(v_1, a_1) \wedge C_1(v_1, x_1, a_1), \beta_1(x_1, a_1)) \wedge pt^{-1}(E_2^i(v_2, x_2, a_2), S_2(v_2, a_2) \wedge C_2(v_2, x_2, a_2), \beta_2(x_2, a_2)))$,

where $pt^{-1}(E_j^i(v_j, x_j, a_j), S_j(v_j, a_j) \wedge C_j(v_j, x_j, a_j), \beta_j(x_j, a_j)) = \bigvee_k q_k$,

$q_k = \exists(u,v)(S_j(v_j, a_j) \wedge C_j(v_j, x_j, a_j) \wedge R(u,r,s,z) \wedge E_j^i(u,v,z) \wedge E_k(u,v,z)) \wedge \alpha_j(r,s,z)$.

This theorem can be proved in the same way as Theorem 5.

6 Experiments

In this section we present some results from our test suites. All experiments are divided into several groups. We compare the modeling time of the satisfiability function and the predicate transformer presented in Section 2 with the modeling time of the same algorithms using specializations.

The first group of experiments refers to specialization by memory usage. The projects contain formulae in which some attributes have only concrete values. Let us present one typical real example. This example has a functional attribute of symbolic type with integer parameters, simple enumerated and simple integer attributes.
All of these integer attributes are initialized with concrete values and keep concrete values at all times during trace generation (basic protocols do not change them to symbolic ones). The other attributes are symbolic. We provide a specialization for the attributes that are always concrete. With this specialization by concrete values, the modeling time for this example is more than 3 times shorter than without it. Of course, the speedup depends on the project: the more concrete attributes there are, the greater the speedup. In [25] it was shown that the speedup can reach thousands of times.

The second group of experiments also refers to specialization by memory usage, but not to concrete values. The examples from this group have enumerated attributes and integer attributes. Some of the integer attributes share memory, and some of them are independent. First, we split the formulae into two parts according to attribute types: an enumerated part and an integer part. For the enumerated part we use bitsets, and for the integer part the common Presburger algorithm. The speedup was about 5-7%. After that we specialize the integer part: we consider the attributes whose memory is independent and obtain a speedup of roughly an order of magnitude.

The results of comparing the modeling time of the general satisfiability functions with that of the functions with specialization are given in Table 6.

Table 6. Results of experiments

Group of tests                            General algorithm    With specializations
1 (memory usage/concrete values)          930 sec              300 sec
2 (splitting by types)                    300 sec              280 sec
3 (memory usage/independent memory)       300 sec              33 sec

7 Conclusions

Symbolic modeling is a powerful technique for automated checking of the reachability of deadlocks and of violations of user-defined properties. The main complexity of the reachability problem lies in the complexity of the satisfiability and predicate transformer functions.
There are many SMT-based techniques that speed up satisfiability checking for formulae belonging to some particular theory. We propose a technique that speeds up classical symbolic modeling when formulae can be split into several parts and special theories can be used to manipulate these parts; such parts are called specializations. The mathematical description of the algorithm for constructing specializations is provided and the correctness of such specializations is proved.

Specializations by memory usage and by functional symbols are considered, and examples of each are given.

Our nearest plans are to investigate additional kinds of specialization, because the more specializations we have, the more speedup we obtain.

References

1. Symbolic Modeling, http://en.wikipedia.org/wiki/Model_checking
2. Letichevsky, A., Gilbert, D.: A Model for Interaction of Agents and Environments. In: Bert, D., Choppy, C., Moses, P. (eds.) Recent Trends in Algebraic Development Techniques. LNCS 1827, pp. 311–328. Springer Verlag, Berlin Heidelberg (1999)
3. Letichevsky, A.: Algebra of Behavior Transformations and its Applications. In: Kudryavtsev, V. B., Rosenberg, I. G. (eds.) Structural Theory of Automata, Semigroups, and Universal Algebra, NATO Science Series II. Mathematics, Physics and Chemistry, vol. 207, pp. 241–272. Springer Verlag, Berlin Heidelberg (2005)
4. Letichevsky, A., Kapitonova, J., Kotlyarov, V., Letichevsky Jr., A., Nikitchenko, N., Volkov, V., Weigert, T.: Insertion Modeling in Distributed System Design. Problems of Programming, (4), 13–39 (2008)
5. Letichevsky, A., Kapitonova, J., Volkov, V., Letichevsky Jr., A., Baranov, S., Kotlyarov, V., Weigert, T.: System Specification with Basic Protocols. Cybernetics and System Analysis, (4), 3–21 (2005)
6. Letichevsky, A. A., Godlevsky, A. B., Letichevsky Jr., A. A., Potienko, S. V., Peschanenko, V. S.: Properties of Predicate Transformer of VRS System.
Cybernetics and System Analysis, (4), 3–16 (2010)
7. Letichevsky, A., Kapitonova, J., Letichevsky Jr., A., Volkov, V., Baranov, S., Kotlyarov, V., Weigert, T.: Basic Protocols, Message Sequence Charts, and the Verification of Requirements Specifications. In: ISSRE 2004, WITUL (Workshop on Integrated reliability with Telecommunications and UML Languages), Rennes, 4 November (2005)
8. Letichevsky, A., Letychevskyi, O., Peschanenko, V.: Insertion Modeling System. In: Clarke, E. M., Virbitskaite, I., Voronkov, A. (eds.) PSI 2011. LNCS 7162, pp. 262–274. Springer Verlag, Berlin Heidelberg (2011)
9. Bofill, M., Nieuwenhuis, R., Oliveras, A., Rodríguez-Carbonell, E., Rubio, A.: The Barcelogic SMT Solver. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS 5123, pp. 294–298. Springer Verlag, Berlin Heidelberg (2008)
10. Barrett, C., Berezin, S.: CVC Lite: A New Implementation of the Cooperating Validity Checker. In: Alur, R., Peled, D. A. (eds.) CAV'04. LNCS 3114, pp. 515–518. Springer Verlag, Berlin Heidelberg (2004)
11. Barrett, C., Tinelli, C.: CVC3. In: Damm, W., Hermanns, H. (eds.) CAV'07. LNCS 4590, pp. 298–302. Springer Verlag, Berlin Heidelberg (2007)
12. Barrett, C., Conway, C. L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV'11. LNCS 6806, pp. 171–177. Springer Verlag, Berlin Heidelberg (2011)
13. Cotton, S., Asarin, E., Maler, O., Niebert, P.: Some Progress in Satisfiability Checking for Difference Logic. In: Proc. FORMATS-FTRTFT (2004)
14. Déharbe, D., Ranise, S.: Bdd-Driven First-Order Satisfiability Procedures (extended version). Research report 4630, LORIA (2002)
15. Bozzano, M., Bruttomesso, R., Cimatti, A., Junttila, T., van Rossum, P., Schulz, S., Sebastiani, R.: An Incremental and Layered Procedure for the Satisfiability of Linear Arithmetic Logic. In: Halbwachs, N., Zuck, L. D. (eds.) TACAS'05. LNCS 3440, pp.
317–333. Springer Verlag, Berlin Heidelberg (2005)
16. Ganai, M. K., Talupur, M., Gupta, A.: SDSAT: Tight Integration of Small Domain Encoding and Lazy Approaches in a Separation Logic Solver. In: Hermanns, H., Palsberg, J. (eds.) TACAS 2006. LNCS 3920, pp. 135–150. Springer Verlag, Berlin Heidelberg (2006)
17. Audemard, G., Bertoli, P. G., Cimatti, A., Kornilowicz, A., Sebastiani, R.: A SAT based Approach for Solving Formulas over Boolean and Linear Mathematical Propositions. In: Voronkov, A. (ed.) CADE 2002. LNCS (LNAI) 2392, pp. 195–210. Springer Verlag, Berlin Heidelberg (2002)
18. Bryant, R. E., Lahiri, S. K., Seshia, S. A.: Modeling and Verifying Systems using a Logic of Counter Arithmetic with Lambda Expressions and Uninterpreted Functions. In: Brinksma, E., Larsen, K. G. (eds.) CAV'02. LNCS 2404, pp. 78–92. Springer Verlag, Berlin Heidelberg (2002)
19. Dutertre, B., de Moura, L.: A Fast Linear-Arithmetic Solver for DPLL(T). In: Ball, T., Jones, R. B. (eds.) CAV'06. LNCS 4144, pp. 81–94. Springer Verlag, Berlin Heidelberg (2006)
20. Walther, C., Schweitzer, S.: About veriFun. In: Baader, F. (ed.) CADE'03. LNCS 2741, pp. 322–327. Springer Verlag, Berlin Heidelberg (2003)
21. Ball, T., Cook, B., Lahiri, S. K., Zhang, L.: Zapato: Automatic Theorem Proving for Predicate Abstraction Refinement. In: Alur, R., Peled, D. A. (eds.) CAV'04. LNCS 3114, pp. 457–461. Springer Verlag, Berlin Heidelberg (2004)
22. de Moura, L., Bjørner, N.: Z3: An Efficient SMT Solver. In: Ramakrishnan, C. R., Rehof, J. (eds.) TACAS'08. LNCS 4963, pp. 337–340. Springer Verlag, Berlin Heidelberg (2008)
23. Barrett, C., de Moura, L., Ranise, S., Stump, A., Tinelli, C.: The SMT-LIB Initiative and the Rise of SMT. In: Barner, S., Harris, I. (eds.) HVC 2010. LNCS 6504, pp. 3–3. Springer Verlag, Berlin Heidelberg (2010)
24. Hutter, F., Hoos, H. H., Leyton-Brown, K., Stuetzle, T.: ParamILS: an Automatic Algorithm Configuration Framework.
JAIR, 36, 267–306 (2009)
25. Peschanenko, V. S., Guba, A. A., Shushpanov, C. I.: Mixed Concrete-Symbolic Predicate Transformer. Bulletin of Taras Shevchenko National University of Kyiv, Series Physics & Mathematics, 2 (2013) (in press)
26. Barrett, C., Sebastiani, R., Seshia, S., Tinelli, C.: Satisfiability Modulo Theories. Frontiers in Artificial Intelligence and Applications, 185, 825–885 (2009)
27. Godlevsky, A. B.: Predicate Transformers in the Context of Symbolic Modeling of Transition Systems. Cybernetics and System Analysis, 4, 91–99 (2010)

On a Dynamic Logic for Graph Rewriting *

Mathias Winckel and Ralph Matthes

Institut de Recherche en Informatique de Toulouse (IRIT)
{winckel, matthes}@irit.fr

Abstract. Initially introduced by P. Balbiani, R. Echahed and A. Herzig, this dynamic logic is useful for talking about properties of termgraphs and for characterizing transformations on these graphs. The deterministic labelled graphs for which the logical framework is designed are also presented. This logic has been the starting point of a formal development, using the Coq proof assistant, to design a logical and algorithmic framework useful for verifying and proving graph rewriting. The formalization allowed us to identify some ambiguities in the concepts involved. The formalization itself is not the topic here, but rather the clear view that the formal work gave us, so the results are expressed using the original mathematical objects of this logic. Some problems of this logic are demonstrated, relative to the representation of graph rewriting. Some are minor issues, but others are far more important for the adequation between the formulas about graph rewriting and the actual rewriting systems. Since these problems invalidate some of the resulting propositions, solutions are given to re-establish the logical characterization of graph rewriting, which was the initial purpose.

Keywords. Dynamic Logic, Graph Rewriting, Adequation Issues

Key terms.
Formal Method, Model

1 Introduction

Nearly every field of computer science uses graphs to represent data or the behavior of systems. To obtain a higher level of dynamics, rewriting is then used, in a more or less formal way, to manipulate such objects. In some fields, such as Formal Methods and Model-Driven Engineering, graphs are one of the main tools, so having advanced methods to express properties of these graphs and to reason about them is of great pertinence.

Modal logic allows relational properties to be expressed naturally and, with Kripke semantics, is closely linked to graphs, which are its models. Yet, instead of discussing graphs and rewriting using hyper-graphs as models, the transformed graphs could directly be these models. Dynamic logic as defined by D. Harel offers, through its modalities, the possibility to express relations on models when they have some desired properties. In this spirit, "A Dynamic Logic for Termgraph Rewriting" [1] was introduced during ICGT 2010 by P. Balbiani, R. Echahed and A. Herzig as a suitable dynamic logic to describe graphs and transformations on these graphs.

* This work has been funded by the CLIMT project (ANR-11-BS02-016 of the French Agence Nationale de la Recherche).

The termgraph lifts the concept of term to the more general one of graph. It was initially introduced to represent terms while offering a simpler way to talk about recursion or sharing of sub-terms, which a tree-like structure does not easily allow.

Computer science involves data structures that are usually represented syntactically as simple terms. It is interesting to develop such a tool for more than just a graphical representation: using graph rewriting, research could be done on term rewriting with this rich and powerful layer of graph-structure language.

In the following, in Section 2 we will present the type of graphs we use, then in Section 3 the dynamic logic for graph rewriting, its syntax and its semantics.
In Section 4 the rewriting system is presented, together with propositions for talking about it logically, but there are issues in expressing these concepts with the logic as originally introduced. Graph homomorphisms, rewriting steps and their application to a graph, with the definition of rewriting rules, matching of rules and normal forms, actually lead to some divergences between actual rewriting and its translation into the logic. Section 4 also discusses these original issues and proposes some solutions for such a use.

2 Termgraph and Rooted Termgraph

Figure 1 illustrates the representation of terms of some typical λ-calculus as graphs: a classical tree-like term in (a), an example of argument sharing in (b), and finally a representation of a fixpoint operator in (c), usually denoted Y, for a function f [2]. One can easily unfold the recursion in the latter graph and get Y f = f (Y f), which is what is expected from a fixpoint operator.

[Figure 1: (a) term: (λx.x) y; (b) sharing: F x x; (c) recursor: Y f]

Fig. 1. Examples of term representation

The graphs are deterministic and have labelled nodes and edges [3], but for convenience edge labels will be called features. A linear graph grammar is defined over a set of nodes N, a set of labels Ω and a set of features F [1], by the following rules:

Node ::= n : ω (f1 ⇒ Node, ... , fk ⇒ Node) | n : • | n
with n ∈ N, ω ∈ Ω and f1, ... , fk ∈ F
TermGraph ::= Node | Node + TermGraph

The first rule allows one to define a node with its label and its direct sons, a node without a label, or just a reference to an existing node; the second rule defines multipart termgraphs.

In a less syntactic and maybe more semantic manner, a termgraph is defined as well by a structure.
(N, E, LN, LE, S, T) with
– N a finite set of nodes
– E a finite set of edges
– LN a partial function from N to Ω, associating a node with its label
– LE a total function from E to F, associating an edge with its feature
– S a source function from E to N
– T a target function from E to N

It is assumed that the previous definitions respect a determinism condition defined as

∀ e1, e2 ∈ E, S(e1) = S(e2) ∧ LE(e1) = LE(e2) → e1 = e2

Rooted TermGraphs. In the following, for the needs of the logic, the graphs are an extension of the termgraph with a specific node designated as its root:

(N, E, LN, LE, S, T, r) with the root r ∈ N

3 Dynamic Logic for Graph Rewriting

A modal logic is a propositional logic extended with one or several modality operators. Dynamic logic [4] is a multi-modal logic whose modalities are actions, with the possibility to express a choice, an iteration or a sequence of actions. Actions can be defined to express graph transformations by miscellaneous modalities.

3.1 Syntax of the Dynamic Logic

For given countable sets F and Ω of features and labels (their respective elements being usually denoted a, b, ... and ω, π, ...), the rules defining the formulas and the actions of the syntax are the following [1].

For an action α:

α ::= a, for the navigation by a ∈ F from the root
| U, for changing the root to any node in the graph
| n | n, for the creation of a node n, possibly setting it as the root
| φ?, for the verification of the validity of a formula φ at the root
| (ω :=l φ) | (ω :=g φ), for labeling a node with a label ω ∈ Ω if a formula φ is valid, locally for the root or globally for any node
| (a + (φ, ψ)) | (a − (φ, ψ)), for adding or removing edges with the feature a between nodes verifying a formula φ and those verifying a formula ψ
| (α1 ; α2) | (α1 ∪ α2) | α1∗, for the definition of a sequence, a choice or an iteration over actions α1 and α2
For a formula φ:

φ ::= ω | ⊥ | ¬φ | φ ∨ ψ | [α]φ

Having ω as a formula means that node labels can be atomic formulas. Intuitively, [α]φ means that after any execution of an action α, the formula φ holds. The propositional part of this dynamic logic being classical, further symbols can be defined as usual: conjunction, implication and equivalence are respectively φ ∧ ψ ≡ ¬(¬φ ∨ ¬ψ), φ → ψ ≡ ¬φ ∨ ψ and φ ↔ ψ ≡ (φ → ψ) ∧ (ψ → φ), for given formulas φ and ψ. One more modality can be defined, for a given action α and formula φ, as ⟨α⟩φ ≡ ¬[α]¬φ. Intuitively, it holds when some execution of the action α makes φ hold.

3.2 Semantics of the Dynamic Logic

A semantics can be defined for this dynamic logic using rooted termgraphs as models and verifying on them the properties expressed by the formulas, the sets of labels and features of the logic being used to define these models [1].

Models. The rooted termgraphs used here differ from the previous ones. The labeling function for nodes now ranges over the power set of Ω, so it is typed LN : N → P(Ω), of which the former definition is a particular case.

Interpretation of Actions and Formulas. Before defining a satisfiability relation, interpretation functions over the models are needed. IG is defined as
– IG(a) = {e | e ∈ E and LE(e) = a}, the set of a edges.
– IG(ω) = {n | n ∈ N and ω ∈ LN(n)}, the set of nodes having the ω label.
And, for any a ∈ F, RG is a family of binary relations defined as
– RG(a) = {(n1, n2) | ∃e ∈ IG(a), S(e) = n1 and T(e) = n2}

Because of the dependency between formulas and actions, the satisfiability relation ⊨ for a formula F and a rooted termgraph G requires an inductive definition on F, dependent on a relation G −→α G′ for every action α.
– G ⊨ ω iff n0 ∈ IG(ω), n0 being the root of G, interpreting ω in G.
– G ⊭ ⊥
– G ⊨ ¬φ iff G ⊭ φ
– G ⊨ φ ∨ ψ iff G ⊨ φ or G ⊨ ψ
– G ⊨ [α]φ iff for any rooted termgraph G′, if G −→α G′ then G′ ⊨ φ

with G −→α G′ the binary relation between two termgraphs G and G′ considering the action α. It is defined inductively on α, with G = (N, E, LN, LE, S, T, r) and G′ = (N′, E′, LN′, LE′, S′, T′, r′), but only the definitions for the useful cases will be introduced. The definitions are declarative, so they should be read as conditions for a correct relation between G and G′ and not as a way to compute a model G′ from a model G.

A notation ⟦e⟧G is introduced for a graph G and an edge e of G, to express (S(e), LE(e), T(e)). This reduces ambiguity in the following definitions, the set E containing only identifiers of edges for the functions LE, S and T, and not tuples of this information. Taking E as a set of tuples instead would not make much sense: for example, when redirecting edges, the set E would stay the same, and so would the tuples in it, but the target of an edge would be changed, and thus the functions would associate with a tuple information which is no longer the content of that tuple.

For convenience, another notation G[m] is introduced for a node m of a graph G = (N, E, LN, LE, S, T, r), to express the graph (N, E, LN, LE, S, T, m) or, in simpler terms, G with its root changed to m.

G −→a G′ iff
– N′ = N, E′ = E, LN′ = LN, LE′ = LE, S′ = S and T′ = T, to express that almost all parts of G′ are the same.
– (n0, n0′) ∈ RG(a), n0 and n0′ being the roots of G and G′, to express the possibility to navigate from the root of G to the root of G′.

G −→U G′ iff
– N′ = N, E′ = E, LN′ = LN, LE′ = LE, S′ = S and T′ = T, to express that nothing characterizes the root of G′ while the other parts are the same.
G −→(ω:=g φ) G′ iff
– N′ = N, E′ = E, LE′ = LE, S′ = S, T′ = T and r′ = r
– for any m ∈ N, if G[m] ⊨ φ then LN′(m) = LN(m) ∪ {ω} else LN′(m) = LN(m) \ {ω}, expressing the addition of the label ω to any node m of the graph satisfying the formula φ, and its deletion from any other node.

G −→φ? G′ iff
– N′ = N, E′ = E, LN′ = LN, LE′ = LE, S′ = S, T′ = T and r′ = r
– G ⊨ φ, to express the formula φ being valid for G.

G −→(ω:=l φ) G′ iff
– N′ = N, E′ = E, LE′ = LE, S′ = S, T′ = T and r′ = r
– if G ⊨ φ then LN′(r) = LN(r) ∪ {ω} else LN′(r) = LN(r) \ {ω}, to express the addition of the label ω to the root or its deletion from the root.

G −→(a+(φ,ψ)) G′ iff
– N′ = N and LN′ = LN
Considering the set of candidate edges C = {(ns, a, nt) | ns ∈ N and nt ∈ N such that G[ns] ⊨ φ and G[nt] ⊨ ψ}, edges are added only between nodes validating the formulas φ and ψ.
– E′ ⊃ E, and for all e ∈ E, ⟦e⟧G′ = ⟦e⟧G, characterizing an addition without any loss.
– for all p ∈ C, ∃e ∈ E′, ⟦e⟧G′ = p and for all e ∈ E′ \ E, ⟦e⟧G′ ∈ C, expressing that candidate edges are added in E′ but nothing else.

G −→(a−(φ,ψ)) G′ iff
– N′ = N and LN′ = LN
Considering the set of deleted edges D = {e | e ∈ E such that ⟦e⟧G = (ns, a, nt) with ns and nt such that G[ns] ⊨ φ and G[nt] ⊨ ψ}.
– E′ = E \ D, characterizing deletion of edges only between nodes validating formulas φ and ψ.
– for all e ∈ E′, LE′(e) = LE(e), S′(e) = S(e), T′(e) = T(e) and r = r′.

G −→α;β G′ iff
– there exists a rooted termgraph G′′ such that G −→α G′′ and G′′ −→β G′.

G −→α∗ G′ iff
– there is a sequence of rooted termgraphs (G(0), ..., G(k)) with G(0) = G, G(k) = G′ and for all i ∈ {0, ..., k − 1}, G(i) −→α G(i+1).

The semantics of the remaining actions n, n and α1 ∪ α2 are given in the original paper. At this point, the logic makes it possible to characterize classes of graphs through the satisfiability of formulas by these graphs as models.
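As a rough illustration of the semantics above, the following Python sketch evaluates formulas on a small rooted termgraph for a fragment of the actions (navigation, U, test, global labeling and sequence). It is only a sketch under simplifying assumptions that are not part of the logic: edges are stored as (source, feature, target) triples rather than as identifiers, formulas and actions are encoded as tagged tuples, and the names Graph, step, sat and dia as well as the example model are hypothetical.

```python
from typing import NamedTuple

class Graph(NamedTuple):
    nodes: frozenset   # N
    edges: frozenset   # (source, feature, target) triples, a simplification of E
    labels: frozenset  # (node, label) pairs, encoding LN : N -> P(Omega)
    root: str          # r

def step(g, alpha):
    """All graphs G' related to G by the action alpha (fragment only)."""
    kind = alpha[0]
    if kind == "nav":    # a: follow an a edge from the root
        return [g._replace(root=t) for (s, a, t) in g.edges
                if s == g.root and a == alpha[1]]
    if kind == "U":      # U: move the root to any node
        return [g._replace(root=n) for n in g.nodes]
    if kind == "test":   # phi?: stay in place if phi holds at the root
        return [g] if sat(g, alpha[1]) else []
    if kind == "setg":   # (omega :=g phi): relabel every node globally
        omega, phi = alpha[1], alpha[2]
        kept = {(n, l) for (n, l) in g.labels if l != omega}
        added = {(n, omega) for n in g.nodes if sat(g._replace(root=n), phi)}
        return [g._replace(labels=frozenset(kept | added))]
    if kind == "seq":    # alpha ; beta
        return [h2 for h1 in step(g, alpha[1]) for h2 in step(h1, alpha[2])]
    raise ValueError(f"action not covered by this sketch: {kind}")

def sat(g, phi):
    """Satisfaction of phi by G, evaluated at the root of G."""
    kind = phi[0]
    if kind == "lab":
        return (g.root, phi[1]) in g.labels
    if kind == "bot":
        return False
    if kind == "not":
        return not sat(g, phi[1])
    if kind == "or":
        return sat(g, phi[1]) or sat(g, phi[2])
    if kind == "box":    # [alpha] phi: phi holds after every execution of alpha
        return all(sat(h, phi[2]) for h in step(g, phi[1]))
    raise ValueError(f"unknown formula: {kind}")

def dia(alpha, phi):
    """Diamond modality, defined as not [alpha] not phi."""
    return ("not", ("box", alpha, ("not", phi)))

# Hypothetical model: root n0 labelled f, with a 1 edge to n1 labelled g.
g = Graph(frozenset({"n0", "n1"}),
          frozenset({("n0", "1", "n1")}),
          frozenset({("n0", "f"), ("n1", "g")}),
          "n0")

# Some 1 edge from the root reaches a node labelled g.
print(sat(g, dia(("nav", "1"), ("lab", "g"))))
# Every node is labelled f or g.
print(sat(g, ("box", ("U",), ("or", ("lab", "f"), ("lab", "g")))))
```

The edge actions and iteration are left out on purpose; with triples as edges, an addition that breaks determinism would have to be rejected explicitly, which is exactly the kind of condition the declarative definitions leave implicit.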
So far everything works well, but issues arise when dealing with rewriting and with the propositions made to talk about graph rewriting.

4 Actual Rewriting, and its Issues

The approach here is an algorithmic one: transformation actions are defined within a rewriting system and can be applied to a graph, alone or sequentially. They form rewriting rules that can be applied if an instance of the rule is found in a given graph. Such an instance can be defined by a graph morphism which embeds the graph domain of the rule into the graph to which the rule is applied.

4.1 Homomorphism of Graphs

A homomorphism of labelled graphs h : G → G′ can be defined, given two rooted termgraphs G = (N, E, LN, LE, S, T, r) and G′ = (N′, E′, LN′, LE′, S′, T′, r′), with only a function hn : N → N′ preserving the labeling of nodes, but equally preserving the source and target functions for the edges, and thus the labeling of edges.

So ∀m ∈ N, LN′(hn(m)) = LN(m) is mandatory and then, because of the determinism condition satisfied by the graphs, there is no ambiguity on the conditions for the edges of the codomain graph G′. For any e of E, it only requires one existing e′ of E′ verifying S′(e′) = hn(S(e)), LE′(e′) = LE(e) and T′(e′) = hn(T(e)), that is, with corresponding source and target while preserving the edge labeling, which is also mandatory for such a homomorphism. There is no specific condition for the correspondence of the roots of the two termgraphs.

Examples from the original paper are presented in Figure 2 to display some homomorphisms: morphisms h2 and h3 between three graphs B1, B2 and B3, displaying the association of their nodes.

(a) termgraphs B1, B2 and B3, each with nodes n0 : f and n1 : g and edges with the features 1, a and b; B1 also has n2 : • and n3 : •, B2 has n2 : 0 and n3 : •, and B3 has n2 : 0.
(b) homomorphisms h2 : B1 −→ B3 and h3 : B2 −→ B3, both mapping n0 ↦ n0, n1 ↦ n1, n2 ↦ n2 and n3 ↦ n2.

Fig. 2.
Examples of termgraph homomorphisms

Existence of Graph Homomorphisms and Particularity of such Homomorphisms

In the original paper, a way to talk about homomorphisms of graphs using the logic is proposed: a formula expresses this concept and is defined for a graph G = (N, E, LN, LE, S, T, r). An action αG and a formula φG state that there is a homomorphism from G to a graph G′ if and only if G′ ⊨ ⟨αG⟩φG [1]. For this, consider N = {n0, ..., nN−1}, with N being the number of nodes of G and n0 being its root, and a sequence P = {π0, ..., πN−1} of distinct elements of Ω, each πi identifying the node ni.

The action αG is defined
– with βG = (π0 :=g ⊥) ; ... ; (πi :=g ⊥) ; ... ; (πN−1 :=g ⊥), characterizing the elimination of any label πi.
– with, for 0 ≤ i ≤ N − 1, γG^i = (¬π0 ∧ ... ∧ ¬πi)? ; (πi :=l ⊤) ; U, characterizing the labeling with πi of a node not already labelled with πk for k ≤ i.
– and finally αG = βG ; γG^0 ; ... ; γG^(N−1), sequencing these actions.

Then, the formula φG is defined
– with, for 0 ≤ i ≤ N − 1, if LN(ni) ≠ ∅ then ψG^i = ⟨U⟩(πi ∧ LN(ni)) else ψG^i = ⊤, characterizing the preservation of the labeling of the node identified as the image of ni by the label πi, and requiring nothing if this node was not labelled.
– with, for 0 ≤ i, j ≤ N − 1, if ∃e ∈ E such that S(e) = ni and T(e) = nj then ζG^(i,j) = ⟨U⟩(πi ∧ ⟨LE(e)⟩πj) else ζG^(i,j) = ⊤, characterizing the existence of edges corresponding to the ones of G.
– and finally φG = ψG^0 ∧ ... ∧ ψG^(N−1) ∧ ζG^(0,0) ∧ ... ∧ ζG^(N−1,N−1), verifying all these formulas.

This definition assumes a subset P of Ω of fresh labels not already used in G, which therefore do not have to be preserved by the homomorphism in G′ and do not erase information when used in βG. Such labels, dedicated to the identification of the elements of N, could be assumed to be part of the set Ω by definition.
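The homomorphism conditions of Section 4.1 can be checked by brute force on small graphs. The Python sketch below enumerates all node maps between two graphs shaped like B1 and B3 of Figure 2 and reports which of them are homomorphisms and whether they are injective. It is an illustration only: the exact edges of B1 and B3 are an assumption read from the figure, graphs are encoded as hypothetical (nodes, labels, edges) triples with edges as (source, feature, target), and unlabelled nodes are taken to impose no label condition because LN is partial.

```python
from itertools import product

def is_homomorphism(h, src, dst):
    """Check a node map h against the conditions of Section 4.1."""
    _, src_labels, src_edges = src
    _, dst_labels, dst_edges = dst
    # A labelled node must be sent to a node with the same label
    # (assumption: unlabelled nodes are unconstrained).
    if any(dst_labels.get(h[m]) != lab for m, lab in src_labels.items()):
        return False
    # Every edge needs an image with mapped source and target and same feature.
    return all((h[s], a, h[t]) in dst_edges for (s, a, t) in src_edges)

def homomorphisms(src, dst):
    """Enumerate all homomorphisms by trying every node map."""
    src_nodes, dst_nodes = sorted(src[0]), sorted(dst[0])
    for image in product(dst_nodes, repeat=len(src_nodes)):
        h = dict(zip(src_nodes, image))
        if is_homomorphism(h, src, dst):
            yield h

def is_injective(h):
    """True when no two nodes share the same image."""
    return len(set(h.values())) == len(h)

# Assumed shapes: n0 : f --1--> n1 : g, and n1 has an a edge and a b edge.
B1 = ({"n0", "n1", "n2", "n3"},
      {"n0": "f", "n1": "g"},
      {("n0", "1", "n1"), ("n1", "a", "n2"), ("n1", "b", "n3")})
B3 = ({"n0", "n1", "n2"},
      {"n0": "f", "n1": "g", "n2": "0"},
      {("n0", "1", "n1"), ("n1", "a", "n2"), ("n1", "b", "n2")})

found = list(homomorphisms(B1, B3))
print(found)                              # the node maps that are homomorphisms
print([is_injective(h) for h in found])   # injectivity of each of them
```

Under these assumptions the only homomorphism found sends both n2 and n3 to n2, so a formula that marks each node of the source graph with a distinct label cannot be satisfied by B3 for this morphism.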
This definition of P, identifying each node of G differently, and the way it is used in the formula require more of a homomorphism than what the homomorphism definition implies. Actually, the examples of Figure 2, which do not contradict the definition, are not injective morphisms. In the formula ⟨αG⟩φG, matched nodes are labelled so as to be identified with only one node of G, as the test (¬π0 ∧ ... ∧ ¬πi)? expresses. Therefore, a homomorphism h : G −→ G′ must be injective for the formula to be satisfied, which is a necessary condition on the models. Such injectivity is a common feature of graph morphisms in many graph rewriting frameworks, but it is not mandatory to formally define such a system using injective matching only, and the initial definition of graph homomorphism should specify whether this particularity is actually required.

4.2 Rewriting Step and Translation in Logic

Rewriting is applied to a graph G = (N, E, LN, LE, S, T, r) to obtain a graph G′ = (N′, E′, LN′, LE′, S′, T′, r′); the result of the application of an action α to a graph G is denoted α[G]. Sequences of actions can be built, being either empty or the concatenation of an action α and another sequence, and the application of a sequence ∆ to a graph G is denoted ∆[G] [1].
– if ∆ is the empty sequence then ∆[G] = G
– if ∆ = α; ∆′ is a concatenation, with a sequential operator ";", then ∆[G] = ∆′[α[G]]
For a morphism h and a sequence ∆, h(∆) denotes the sequence obtained by substituting in ∆ any node n by h(n).

The original paper proposes a way to talk about sequences of actions, by describing the sequence of rewriting actions as a sequence of logical actions.
Thus, after the translation of an action α of the rewriting system, the relation −→tr(α) introduced with the semantics of the logic relates models, the second potentially being the result of the application of the action to the first one. The translation of a sequence made of a given action a and another sequence ∆ is the sequence of translations αa;∆ = αa ; α∆. Assuming that any action has a translation entirely independent of the rest of the sequence, the translation order does not actually matter, but a sequential translation conveys the right idea.

For the rewriting actions:

n >>a m is a local redirection. An outgoing edge from a node n with the feature a is modified to point to a node m.
– N′ = N, E′ = E, LN′ = LN, LE′ = LE, S′ = S and r = r′: the nodes, the edges, their labeling, the source of the edges and the root are the same.
– for e ∈ E such that S′(e) = n and LE′(e) = a, T′(e) = m: the target of the only wanted edge is changed to m. ∀e′ ∈ E′, if e′ ≠ e then T′(e′) = T(e′): the target function does not change for the other edges.

To begin with, here is the given formula for the local redirection.
– α(n >>a m) = (a − (πn, ⊤)) ; (a + (πn, πm))

One can notice that this formula depicts a transformation that deletes an edge labelled a between a node n and any other node, determinism dictating the existence of at most one such edge. Above all, an a edge between this node n and a node m is then added at the end. Looking at the definition of the local redirection, E′ = E specifies that no edge is actually added, and the other item clearly specifies that only an existing e of E with source n and feature a has its target changed to m. A difference thus arises between the rewriting of a graph and the models linked by the actions of the translation of the rewriting action, as shown in Figure 3, which is a counterexample to the adequacy between this translation and rewriting.
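The mismatch described above can be reproduced on a two-node graph. The following Python sketch compares the rewriting step n >>a m with the effect of the translation (a − (πn, ⊤)) ; (a + (πn, πm)), and with a variant that first marks the nodes that really have an outgoing a edge. It is a simplified illustration: edges are (source, feature, target) triples, identification labels are strings of the form pi_x, and all function and variable names are hypothetical.

```python
def pi(labels, name):
    """Predicate: the node carries the identification label pi_<name>."""
    return lambda x: f"pi_{name}" in labels.get(x, set())

def rewrite(edges, n, a, m):
    """Rewriting step n >>a m: retarget the existing a edge of n, if any."""
    return {(s, f, m if (s, f) == (n, a) else t) for (s, f, t) in edges}

def original(nodes, labels, edges, n, a, m):
    """Effect of (a - (pi_n, top)) ; (a + (pi_n, pi_m))."""
    is_n, is_m = pi(labels, n), pi(labels, m)
    kept = {(s, f, t) for (s, f, t) in edges if not (f == a and is_n(s))}
    return kept | {(s, a, t) for s in nodes for t in nodes
                   if is_n(s) and is_m(t)}

def marked_variant(nodes, labels, edges, n, a, m):
    """Same, but edges are added only from nodes that had an a edge."""
    is_n, is_m = pi(labels, n), pi(labels, m)
    # Mark the nodes satisfying pi_n that can navigate along an a edge.
    marked = {x for x in nodes
              if is_n(x) and any((s, f) == (x, a) for (s, f, _) in edges)}
    kept = {(s, f, t) for (s, f, t) in edges if not (f == a and is_n(s))}
    return kept | {(s, a, t) for s in marked for t in nodes if is_m(t)}

nodes = {"n", "m", "k"}
labels = {"n": {"pi_n"}, "m": {"pi_m"}}

no_edge = set()                    # n has no a edge to redirect
with_edge = {("n", "a", "k")}      # n has an a edge pointing to k

for edges in (no_edge, with_edge):
    print(sorted(rewrite(edges, "n", "a", "m")),
          sorted(original(nodes, labels, edges, "n", "a", "m")),
          sorted(marked_variant(nodes, labels, edges, "n", "a", "m")))
```

When n has no a edge, rewriting leaves the graph unchanged while the first translation creates an edge from n to m; the marked variant agrees with rewriting in both situations.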
Fig. 3. Local Redirection Difference: (a) rewrite rule (=⇒) over the nodes n : πn and m : πm; (b) models (−→) over the nodes n : [πn] and m : [πm], in which an a edge appears

The problem is that the logical actions do not require the existence of the redirected edge, which is obviously mandatory to rewrite it. We propose, as a correction for this problem, the formula:
– α(n >>a m) = (λn :=g ⊥) ; (λn :=g πn ∧ ⟨a⟩⊤) ; (a − (πn, ⊤)) ; (a + (λn, πm))

The added actions mark the node n with a label λn, assumed fresh for the graph, when it is possible to navigate from it along an a edge to some node; the translation then deletes the edge and adds the new one only from a node carrying this mark. If no edge was removed, no mark is found on the node n and no edge is added, which restores the adequacy between rewriting and logical translation.

n >> m is a global redirection. The target of every edge pointing to the node n is modified to make the edge point to the node m.
– N′ = N, E′ = E, LN′ = LN, LE′ = LE and S′ = S: the nodes, the edges, their labeling and the source of the edges are the same.
– for all e ∈ E, if T(e) = n then T′(e) = m else T′(e) = T(e): the targets of the only wanted edges are changed to m, the others do not change.
– if n = r then r′ = m else r′ = r: the root changes if it is the former target of the changed edges.

The translation of the global redirection starts with a formula to globally redirect to a node m every a edge initially pointing to a node n:
– αg(n >>a m) = (λa :=g ⊥) ; (λa :=g ⟨a⟩πn) ; (a − (⊤, πn)) ; (a + (λa, πm))

This formula does not suffer from the same problem, the second action marking with λa any node from which n is reachable by an edge with the feature a. The action adding the edges therefore does not wrongly add an edge that did not exist and was not removed, since the label λa ensures that these nodes are subject to redirection.
However, it is implicitly required that this λa be a label dedicated to this action, so as not to interfere with any other formula using this label for another purpose. The full translation of the rewriting step is finally the sequence of the previously defined formulas, for every feature of the graph:
– α(n >> m) = ;a∈F αg(n >>a m)

Finally comes the last rewriting action, the node labeling.

n : ω (f1 ⇒ n1, ... , fk ⇒ nk) is a node definition or labeling. It adds a node n if it does not already belong to the graph, or modifies an already existing one. It assigns the label ω and defines the edges e1 to ek going out of this node n, respectively pointing to the nodes n1 to nk with the features f1 to fk, according to the following definition:
– N′ = N ∪ {n, n1, ..., nk}: nodes of the rule which are not already included in G are added.
– LN′(n) = ω and ∀m ∈ N \ {n}, LN′(m) = LN(m): n is labelled with ω and the other nodes keep the same labeling.
Considering the newly defined edges Enew = {ei | S′(ei) = n, LE′(ei) = fi and T′(ei) = ni} with 1 ≤ i ≤ k:
– E′ = E ∪ Enew: new edges are added to the already existing ones.
– ∀ei ∈ Enew, LE′(ei) = fi: the features of the new edges are defined; and ∀e ∉ Enew, LE′(e) = LE(e): the features of the other edges do not change.
– ∀ei ∈ Enew, S′(ei) = n and ∀e ∉ Enew, S′(e) = S(e): the same is done for the source function.
– ∀ei ∈ Enew, T′(ei) = ni and ∀e ∉ Enew, T′(e) = T(e): the same is done for the target function.
– r′ = r: the root remains the same.

And the formula initially given as its translation:
– α(n : ω (f1 ⇒ n1, ... , fk ⇒ nk)) = U ; πn? ; (ω :=l ⊤) ; (f1 + (πn, πn1)) ; ... ; (fk + (πn, πnk))

A first issue can be raised regarding the end of the formula: it carelessly adds outgoing edges to the node n, thereby not ensuring determinism.
The result is that no model can be related as resulting from the application of this action, whereas what was intended was a relation between models displaying a redirection of these edges. We propose the following correction, using as previously the idea of edge redirection by deletion of the former edge and addition of the redirected one.
– α(n : ω (f1 ⇒ n1, ..., fk ⇒ nk)) = U ; πn? ; (ω :=l ⊤) ; (f1 − (πn, ⊤)) ; ... ; (fk − (πn, ⊤)) ; (f1 + (πn, πn1)) ; ... ; (fk + (πn, πnk))

Now a model related by the relation of the corresponding formula for this action can exist, and it will be a deterministic graph with redirected edges. However, a problem remains. The transformation expressed by the formula translates the labeling of a node by positioning the root on a node n, identified by πn, labeling it with ω and finally changing the edges going out of this node n. As displayed in Figure 4, this again raises a major difference between the graph obtained by rewriting and the models related by the relation for the modality of the logical action. The behavior is not the same because the formula only requires the addition of a label to the possibly multiple existing labels of a node, while the rewriting action requires its replacement as a unique label. To better understand the extent of this problem, the following shows how much of an issue it becomes for the characterization of rewriting systems.

(a) rewrite rule: n : l =⇒ n : ω
(b) models: n : [l] −→ n : [ω, l]

Fig. 4. Node Labeling Difference

4.3 Rewriting Rules and Rewriting System Characterization

Rewriting rules. A rewriting rule is expressed as a graph L and a sequence of actions ∆ to apply, and is denoted (L, ∆). A graph G is said to be rewritten as a graph G′ if there exists a homomorphism h : L → G such that h(∆)[G] = G′, which is denoted G →L,∆ G′.
A simple yet useful example can be defined, representing an action on a simple on-off switch of an electrical network.

Fig. 5. Left-hand side L of a rule, with nodes pos : •, neg : •, int : Off, off : SwitchOff and on : SwitchOn, edges with the feature e, and edges with the features on and off between the two switch positions

As displayed in Figure 5, on the left of L there is a positive terminal of the circuit and on the right a negative one. The edges labelled e are for the electrical circuit, and the switch is currently set to Off. The two positions SwitchOn and SwitchOff are linked to each other to avoid any mismatch if another switch is part of the network, even though a one-way link alone could be enough.

∆switchOn = (int : On ( ) ; int >>e on)

Fig. 6. Sequence ∆ of a rule

The sequence for this rule, displayed in Figure 6, is made of an action that marks the switch node with On without adding any extra edge, and an action that redirects to the node on the edge e going from this node to the Off position of the switch, to get a graph representing a closed electrical circuit, displayed in Figure 7 as the result on the graph L.

Fig. 7. Result of applying ∆ to L, where the node int is labelled On

Rewriting System Characterization and the Labeling Problem

Using the previous example of a switch activation, a graph G is displayed in Figure 8. It is a simple electrical network having only a generator and a switch, and is very similar to the left-hand side of the rule (L, ∆switchOn) previously defined.

Fig. 8. Graph G, with the same nodes as L except pos : DC+ and neg : DC−

The normal form of the graph G with respect to the rule (L, ∆switchOn) satisfies the formula ψ := [U]¬Off: its switch being activated and thus labelled with On, no node is labelled with Off anymore.
At the end of the original paper, Proposition 4 [1], which defines a way to talk about a normal form of a graph G satisfying ψ, says that this should be equivalent to the following satisfaction:

G ⊨ [(αL ; φL? ; α∆switchOn)∗]([αL ; φL?]⊥ → ψ)

To explain this proposition, one can think of a normal form with respect to a rule as the result of successive matchings and applications of this rule. When the rule cannot be matched anymore, the formula ψ, which characterizes the normal forms, must hold. In the example with the rule (L, ∆switchOn), every switch must be activated and the rule should then no longer be matchable, because there is no label Off left, nor any e edge pointing to the right position. But because of the difference between rewriting and its translation into the logic, when every e edge of the switch is redirected in the models, no more matching is available but the label Off is still there, along with the label On, labeling the node int. The formula ψ is thus not satisfied: the graph G and the formula ψ are a counterexample to this proposition.

First, one can notice that, relying on the definition of the models, the logic makes use of labels to mark and reason, but without any distinction between the ones being actual parts of the graph and the ones serving the logic. Even if the problem appears with the semantics, the solution does not lie there: a change of the model definition that distinguishes graph from logic, for example by simply splitting this information into two layers, does not mean that the same syntax of the logic can express this distinction. The problem comes from the translation of the label definition of a node, so it seems sound to complete this translation in order to recover the correspondence between rewriting and relation on models.
But the rules and the matching use labels to logically identify the nodes, so one should be careful with erasing: erasing everything is not an option here. Actually, the syntax of the logic explicitly needs the label to delete, with a global or local labeling action, and the syntax of the rewriting action currently does not say which label will be erased. This is where the problem lies: there is only the identifier of a node and the new label. Translating the action is possible if more information is given, so either a new label definition action is introduced, or the syntax of the logic is changed to express actions on a new type of model.

But before any syntactic change, of rewriting actions or of the logic, it is interesting to consider the action in the context of a graph. Thanks to the function LN : N −→ Ω of the graph, and having already the node n of the action, the labeling function can give the label to remove. The action to translate does not carry this information by itself, but rewriting actions are more generally defined for a specific graph. Thus, lifting the translation into formulas to the level of the rule gives the required information. A rule provides a graph as left-hand side, hence a function LN : N −→ Ω that can be used to erase a label explicitly, as was implicitly expected by the rewriting action. For a graph G = (N, E, LN, LE, S, T, r) and a node labeling action n : ω (f1 ⇒ n1, ..., fk ⇒ nk), we give as a correct translation the formula
– α(G, n : ω (f1 ⇒ n1, ..., fk ⇒ nk)) = U ; πn? ; (LN(n) :=l ⊥) ; (ω :=l ⊤) ; (f1 − (πn, ⊤)) ; ... ; (fk − (πn, ⊤)) ; (f1 + (πn, πn1)) ; ... ; (fk + (πn, πnk))

However, the application of the rule to the left-hand side results in another graph with another labeling function, and the translation of a sequence cannot remain as previously proposed. Every node labeling action should be translated depending on the result of the previously applied actions.
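The effect of the two labeling translations on the switch example can be sketched in a few lines of Python. Only the label part of the models is represented, as a dictionary from nodes to sets of labels; the edge actions are ignored. The encoding, the pi_x identification labels and the function names are assumptions made for this illustration, and the last function threads the labeling function through a sequence in the way the final remark above suggests.

```python
def original_label_translation(labels, n, omega):
    """Label part of (omega :=l top): omega is added, nothing is erased."""
    out = {k: set(v) for k, v in labels.items()}
    out[n].add(omega)
    return out

def corrected_label_translation(labels, lhs_label, n, omega):
    """Label part of (LN(n) :=l bot) ; (omega :=l top)."""
    out = {k: set(v) for k, v in labels.items()}
    out[n].discard(lhs_label[n])   # erase the label given by the graph
    out[n].add(omega)              # then add the new one
    return out

def translate_sequence(labels, lhs_label, actions):
    """Translate labeling actions one by one, updating the graph labels."""
    current = dict(lhs_label)
    for n, omega in actions:
        labels = corrected_label_translation(labels, current, n, omega)
        current[n] = omega         # labeling function of the rewritten graph
    return labels

def psi(labels):
    """[U] not Off: no node of the model is labelled Off."""
    return all("Off" not in ls for ls in labels.values())

# Labels of the switch graph, with hypothetical identification labels.
lhs_label = {"int": "Off", "on": "SwitchOn", "off": "SwitchOff"}
model = {"int": {"Off", "pi_int"},
         "on": {"SwitchOn", "pi_on"},
         "off": {"SwitchOff", "pi_off"}}

print(psi(original_label_translation(model, "int", "On")))
print(psi(corrected_label_translation(model, lhs_label, "int", "On")))
print(translate_sequence(model, lhs_label, [("int", "On"), ("int", "Off")])["int"])
```

With the first translation the node int ends up carrying both Off and On, so ψ fails; with the second one Off is erased while the identification label is kept. The sequence example switches the node on and off again, which only works because the second action erases the label On set by the first.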
The translation of a sequence made of an action a and another sequence ∆ is a sequence of sequential translations:

α(G, (a ; ∆)) = α(G, a) ; α(a[G], ∆)

One could notice that this translation is still done linearly through the sequence, but this definition could actually be slightly changed because every action except node labeling can be translated independently. Only node labeling depends on a context; more precisely, conflicts arise only when editing the same node, having to erase a label which was defined just before in the sequence.

5 Conclusion

This paper provided an introduction to a dynamic logic defined by P. Balbiani, R. Echahed and A. Herzig. Originally, a formalization was carried out using the Coq proof assistant, and with it a study of this logic. During this work some mistakes were spotted. While some of them are just minor-looking imprecisions, when working with formal logic it is relevant to point out such ambiguities if the logic can become incoherent under a wrong interpretation.

Other mistakes are not only due to interpretation and, as demonstrated, they make it possible to find counterexamples to propositions and thus to establish an incoherence with the main goal of this logic, which is to talk about rewriting systems. Solutions to these issues are proposed; they restore the adequacy between actual rewriting and relations on models of the logic.

A first lead towards another technical problem is the preservation of the determinism of graphs, which currently only seems to be assumed and can be broken by the actions of the logic, leading to the impossibility of relating a model to another because of the determinism assumed by definition. It could be made more explicit in the semantics of rewriting, and currently the definition of one of the rewriting actions allows adding edges that break this condition.
The logic can express and handle this during the translation of these rewriting actions, which looks like an interesting improvement of the logical framework.

The models are defined with a multi-labeling function on nodes, which already causes some differences when translating rewriting into the logic, because the graphs of the initial rewriting system have a unique-labeling function. It is not totally clear whether the differences stop there, and further uses of the logic with regard to rewriting need to be studied to be sure of its correct use. It also seems interesting to study the reverse translation, from logic to rewriting, to get a useful and complete logical framework, in the spirit of the Curry-Howard correspondence with terms as proofs.

References

1. Balbiani, P., Echahed, R., Herzig, A.: A dynamic logic for termgraph rewriting. In: Ehrig, H., Rensink, A., Rozenberg, G., Schürr, A. (eds.) ICGT. LNCS, vol. 6372, pp. 59–74. Springer, Heidelberg (2010)
2. Ariola, Z.M., Klop, J.W.: Lambda calculus with explicit recursion. Inf. Comput. 147(2), 154–233 (1997)
3. Barendregt, H., van Eekelen, M., Glauert, J., Kennaway, J., Plasmeijer, M., Sleep, M.: Term graph rewriting. In: de Bakker, J., Nijman, A., Treleaven, P. (eds.) PARLE Parallel Architectures and Languages Europe. LNCS, vol. 259, pp. 141–158. Springer, Heidelberg (1987)
4. Harel, D., Kozen, D., Tiuryn, J.: Dynamic logic. Handbook of Philosophical Logic, MIT Press. 497–604 (1984)

Logical Foundations for Reasoning about Transformations of Knowledge Bases

Mohamed Chaabani (1), Rachid Echahed (2) and Martin Strecker (3)

(1) LIMOSE, University of Boumerdès, Algeria
(2) Laboratoire d'Informatique de Grenoble, http://membres-liglab.imag.fr/echahed/
(3) Université de Toulouse / IRIT, http://www.irit.fr/~Martin.Strecker/

Abstract.
This paper is about transformations of knowledge bases with the aid of an imperative programming language which is non-standard in the sense that it features conditions (in loops and selection statements) that are description logic (DL) formulas, and a non-deterministic assignment statement (a choice operator given by a DL formula). We sketch an operational semantics of the proposed programming language and then develop a matching Hoare calculus whose pre- and post-conditions are again DL formulas. A major difficulty resides in showing that the formulas generated when calculating weakest preconditions remain within the chosen DL fragment. In particular, this concerns substitutions whose result is not directly representable. We therefore explicitly add substitution as a constructor of the logic and show how it can be eliminated by an interleaving with the rules of a traditional tableau calculus.

Keywords. Description Logic, Graph Transformation, Programming Language Semantics, Tableau Calculus

Key terms. MathematicalModel, SoftwareSystem, KnowledgeRepresentation

Preliminary workshop version of a paper to be presented at DL 2013. Part of this research has been supported by the Climt project (ANR-11-BS02-016).

1 Introduction

Knowledge bases (KBs) are specific forms of graph structures that are subject to change because the world they describe changes. The question explored by this paper is: what is an adequate formalism for describing these changes, and how can one reason about the effects of changes?

Reasoning about graph transformations in full generality is hard [7]. Some decidable logics for graph transductions are known, such as MSO [6], but they are descriptive, applicable to a limited number of graphs and often do not match an algorithmic notion of transformation.
Some implementations of verification environments for pointer-manipulating programs exist [9], but they often impose severe restrictions on the kind of graphs that can be manipulated, such as having a clearly identified spanning tree.

In [4], the authors have introduced a dynamic logic which is very expressive. It has been designed to describe different kinds of elementary knowledge base transformations (addition of new items, addition and deletion of links, etc.). It also allows one to specify advanced properties of graph structures which go beyond the mu-calculus or MSO logics. Unfortunately, the expressive power of that logic has a price: the logic is undecidable.

The purpose of the present paper is to identify a programming language together with a logic such that reasoning about the transformations of the KB is decidable. The transformations themselves are not encoded in the logic itself (as in [4]) but in a dedicated imperative language for which we develop a Hoare-style calculus.

Work on KB updates [8] seems to approach the problem from the opposite direction: add facts to a KB and transform the KB at the same time such that certain formulas remain satisfied. In our approach, the modification of the KB is exclusively specified by the program.

The work described in this paper is ongoing, and some results are still preliminary. Based on previous work [5], we are in the process of coding the formalism described here in the Isabelle proof assistant [10]. Parts of the coding in this paper are inspired by formalizations in the Isabelle distribution and by [11]. The formal development accompanying this paper will be made available on the web (see footnote 4), which should also be consulted for proofs.

Before starting with the formal development, let us give an example of the kind of program (see Fig. 1) that we would like to write. Assume a knowledge base with objects of class A and B, and a relation r.
The node n is initially connected to at least 3 objects of class A, and all objects it is connected to are of class B. Because the number of connections to A is too large, we execute a loop that selects an A-object (let’s call it a) that n is connected to, and deletes the r-connection between n and a. To compensate, we select an object b of class B and connect n to b. We stop as soon as the number of A-connections of n has reached 2, which is one of the post-conditions we can ascertain.

 vars n, a, b;
 /* Pre: n : (≥ 3 r A) ⊓ (∀ r B) */
 while ( n : (> 2 r A) ) do {
   /* Inv: n : (≥ 2 r A) ⊓ (∀ r B) */
   select a sth a : A ∧ (n r a);
   delete(n r a);
   select b sth b : B;
   add(n r b)
 }
 /* Post: n : (= 2 r A) ⊓ (∀ r B) */

Fig. 1. An example program

Footnote 4: http://www.irit.fr/~Martin.Strecker/Publications/dl_transfo2013.html

2 Logic

Our logic is a three-tier framework, the first level being DL concepts (“TBox”), the second level facts (“ABox”, instances of concepts), the third level formulas (Boolean combinations of facts and a simple form of quantification).

Concepts: We concentrate on a DL featuring concepts with simple roles and number restrictions, similar to ALCN [2]. For c being the type of concept names and r the type of role names, the data type C of concepts can be defined inductively by:

 C ::= c              (atomic concept)
     | ¬C             (negation)
     | C ⊓ C          (conjunction)
     | C ⊔ C          (disjunction)
     | (≥ n r C)      (at least)
     | (< n r C)      (fewer than)
     | C[r := RE]     (explicit substitution)

Adding a universal concept ⊤ and an empty concept ⊥ would not add expressivity, as they are equivalent to (≥ 0 r C) respectively (< 0 r C) for arbitrary r and C, and we will use them as shortcuts. We also write (∃ r C) for (≥ 1 r C) and (∀ r C) for (< 1 r (¬C)).
The last constructor, explicit substitution [1], is a particularity of our framework, required for a lazy elimination of substitutions that replace, in a concept C, a role name r by a role expression RE. If i is the type of individual variable names, the type RE is defined by

 RE ::= r             (atomic role)
      | r − (i, i)    (deletion of relation instance)
      | r + (i, i)    (insertion of relation instance)

Please note that concepts implicitly depend on the types c, r and i, which we assume mutually disjoint. A substitution can therefore never affect an individual variable.

A set-theoretic semantics is provided by a domain ∆ and an interpretation function I mapping c to a set of individuals (subsets of ∆), r to a binary relation of individuals (subsets of ∆ × ∆), and i to individual elements of ∆.

For interpretation of concepts C, negation is inductively interpreted as complement, concept conjunction as intersection and disjunction as union. I(≥ n r C) = {x | card{y | (x, y) ∈ I(r) ∧ y ∈ I(C)} ≥ n}, and analogously for I(< n r C). Here, card is the cardinality of finite sets (and 0 otherwise).

For interpretation of role expressions RE, we define I(r − (i1, i2)) = I(r) − {(I(i1), I(i2))}, and I(r + (i1, i2)) = I(r) ∪ {(I(i1), I(i2))}.

Interpretation update I^[r:=rl] modifies the interpretation I at relation name r to relation rl, thus I^[r:=rl](r) = rl and I^[r:=rl](r′) = I(r′) for r′ ≠ r. With this, we can define the semantics of explicit substitution by I(C[r := RE]) = I^[r:=I(RE)](C).

Facts: Facts make assertions about an instance being an element of a concept, and about being in a relation. In DL parlance, facts are elements of an ABox.
The type of facts is defined as follows:

 fact ::= i : C        (instance of concept)
        | i r i        (instance of role)
        | i (¬r) i     (instance of role complement)
        | i = i        (equality of instances)
        | i ≠ i        (inequality of instances)

The interpretation of a fact is a truth value, defined by:
– I(i : C) = (I(i) ∈ I(C))
– I(i1 r i2) = ((I(i1), I(i2)) ∈ I(r)) and I(i1 (¬r) i2) = ((I(i1), I(i2)) ∉ I(r))
– I(i1 = i2) = (I(i1) = I(i2)) and I(i1 ≠ i2) = (I(i1) ≠ I(i2))

Please note that since concepts are closed by complement, facts are closed by negation (the negation of a fact is again representable as a fact), and this is the main motivation for introducing the constructors “instance of role complement” and “inequality of instances”.

Formulas: A formula is a Boolean combination of facts. We also allow quantification over individuals i (but not over relations or concepts), and, again, have a constructor for explicit substitution.

 form ::= ⊥ | fact | ¬form | form ∧ form | form ∨ form | ∀i.form | ∃i.form | form[r := RE]

The extension of interpretations from facts to formulas is standard; the interpretation of substitution in formulas is entirely analogous to that in concepts. As usual, a formula that is true under all interpretations is called valid.

When calculating weakest preconditions (in Sect. 4), we obtain formulas which essentially contain no existential quantifiers; we keep them as a constructor because they can occur as intermediate results of computations. We say that a formula is essentially universally quantified if ∀ only occurs below an even and ∃ only below an odd number of negations. For example, ¬(∃x. x : C ∧ ¬(∀y. y : D)) is essentially universally quantified.

Implication f1 −→ f2 is the abbreviation for ¬f1 ∨ f2, and ite(c, t, e) the abbreviation for (c −→ t) ∧ (¬c −→ e), not to be confused with the if-then-else statement presented in Sect. 3.
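The set-theoretic semantics of concepts and role expressions given in this section can be tried out on small finite structures. The following Python fragment is only a sketch under stated assumptions: the tuple encoding of concepts (`"atom"`, `"not"`, `"and"`, `"or"`, `"ge"`, `"lt"`, `"subst"`) and of role expressions (`"role"`, `"minus"`, `"plus"`), as well as the function names `interp` and `interp_role`, are illustrative choices and not part of the paper's formalization. The interpretation I is assumed to be a single dictionary over the mutually disjoint concept, role and individual names, and the domain is assumed finite.

```python
def interp_role(re, I):
    """Relation denoted by a role expression: r, r - (i1, i2) or r + (i1, i2)."""
    if re[0] == "role":
        return set(I[re[1]])
    op, r, i1, i2 = re
    pair = (I[i1], I[i2])
    return set(I[r]) - {pair} if op == "minus" else set(I[r]) | {pair}


def interp(concept, dom, I):
    """Subset of the finite domain `dom` denoted by `concept` under I."""
    tag = concept[0]
    if tag == "atom":
        return set(I[concept[1]])
    if tag == "not":                      # complement
        return set(dom) - interp(concept[1], dom, I)
    if tag == "and":                      # intersection
        return interp(concept[1], dom, I) & interp(concept[2], dom, I)
    if tag == "or":                       # union
        return interp(concept[1], dom, I) | interp(concept[2], dom, I)
    if tag in ("ge", "lt"):               # number restrictions
        _, n, r, c = concept
        members = interp(c, dom, I)

        def count(x):
            return sum(1 for (a, b) in I[r] if a == x and b in members)

        if tag == "ge":
            return {x for x in dom if count(x) >= n}
        return {x for x in dom if count(x) < n}
    if tag == "subst":                    # I(C[r := RE]) = I^[r:=I(RE)](C)
        _, c, r, re = concept
        return interp(c, dom, {**I, r: interp_role(re, I)})
    raise ValueError(f"unknown concept constructor: {tag}")
```

The `"subst"` case mirrors the interpretation update: the role expression is evaluated in the current interpretation, and the concept body is then evaluated with the role name rebound to that relation.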
3 Programming Language

The programming language is an imperative language manipulating relational structures. Its distinctive features are conditions (in conditional statements and loops) that are restricted DL formulas, in the sense of Sect. 2. It has a non-deterministic assignment statement that selects an element satisfying a formula. Traditional types (numbers, inductive types) are not provided. In this paper, we only consider a core language with traditional control flow constructs, but without procedures. Also, it is only possible to modify a relational structure, but not to “create objects” (with a sort of new statement) or to “deallocate” them. These constructs are left for further investigation.

3.1 Syntax

The type of statements is defined by:

 stmt ::= Skip                          (empty statement)
        | select i sth form             (assignment)
        | delrel(i r i)                 (delete arc in relation)
        | insrel(i r i)                 (insert arc in relation)
        | stmt ; stmt                   (sequence)
        | if form then stmt else stmt
        | while form do stmt

3.2 Semantics

The semantics is a big-step semantics with rules of the form (st, σ) ⇒ σ′ expressing that executing statement st in state σ produces a new state σ′. The rules of the semantics are given in Fig. 2. Beware that we overload logical symbols such as ∃, ∧ and ¬ for use in the meta-syntax and as constructors of form.

The state space σ is in fact identical to an interpretation function I as introduced in Sect. 2, and it is only in keeping with traditional notation in semantics that we use the symbol σ. We may therefore write σ(b) to evaluate the condition b (a formula) in state σ.

Most of the rules are standard, apart from the fact that we do not use expressions, but formulas as conditions. The auxiliary function delete_edge modifies the state σ by removing an r-edge between the elements represented by v1 and v2. With the update function for interpretations introduced in Sect. 2, one defines

 delete_edge v1 r v2 σ = σ^[r:=σ(r)−{(σ(v1),σ(v2))}]

and similarly

 generate_edge v1 r v2 σ = σ^[r:=σ(r)∪{(σ(v1),σ(v2))}]

 (Skip)     (Skip, σ) ⇒ σ
 (Seq)      from (c1, σ) ⇒ σ″ and (c2, σ″) ⇒ σ′ infer (c1; c2, σ) ⇒ σ′
 (EDel)     from σ′ = delete_edge v1 r v2 σ infer (delrel(v1 r v2), σ) ⇒ σ′
 (EGen)     from σ′ = generate_edge v1 r v2 σ infer (insrel(v1 r v2), σ) ⇒ σ′
 (SelAssT)  from ∃vi.(σ′ = σ^[v:=vi] ∧ σ′(b)) infer (select v sth b, σ) ⇒ σ′
 (IfT)      from σ(b) and (c1, σ) ⇒ σ′ infer (if b then c1 else c2, σ) ⇒ σ′
 (IfF)      from ¬σ(b) and (c2, σ) ⇒ σ′ infer (if b then c1 else c2, σ) ⇒ σ′
 (WT)       from σ(b), (c, σ) ⇒ σ″ and (while b do c, σ″) ⇒ σ′ infer (while b do c, σ) ⇒ σ′
 (WF)       from ¬σ(b) infer (while b do c, σ) ⇒ σ

Fig. 2. Big-step semantics rules

The statement select v sth F(v) selects an element vi that satisfies formula F, and assigns it to v. For example, select a sth a : A ∧ (a r b) selects an element a that is an instance of concept A and is r-related with a given element b.

select is a generalization of a traditional assignment statement. There may be several instances that satisfy F, and the expressiveness of the logic might not suffice to distinguish them. In this case, any such element is selected, non-deterministically.

Let us spell out the precondition of (SelAssT): Here, σ^[v:=vi] is an interpretation update for individuals, modifying σ at individual name v ∈ i with an instance vi ∈ ∆, similar to the interpretation update for relations seen before. We therefore pick an instance vi, check whether the formula b would be satisfied under this choice, and if it is the case, keep this assignment.

In case no satisfying instance exists, the semantics blocks, i.e. the given state does not have a successor state, which can be considered as an error situation. Some alternatives to this design choice can be envisaged: We might treat a select v sth F(v) with unsatisfiable F as equivalent to a Skip. This would give us a choice of two rules, one in which the precondition of rule (SelAssT) is satisfied, and one in which it is not. As will be seen in Sect.
4, this would introduce essentially existentially quantified variables in our formulas when computing weakest preconditions and lead us out of the fragment that we can deal with in our decision procedure. Alternatively, we could apply an extended type check verifying that select-predicates are always satisfiable, and thus ensure that type-correct programs do not block. This is the alternative we prefer; details still have to be worked out.

4 Weakest Preconditions

We compute weakest preconditions wp and verification conditions vc. Both take a statement and a DL formula as argument and produce a DL formula. For this purpose, while loops have to be annotated with loop invariants, and the while constructor becomes: while {form} form do stmt. Here, the first formula (in braces) is the invariant, the second formula the loop condition. The two functions are defined by primitive recursion over statements, see Fig. 3.

 wp(Skip, Q) = Q
 wp(delrel(v1 r v2), Q) = Q[r := r − (v1, v2)]
 wp(insrel(v1 r v2), Q) = Q[r := r + (v1, v2)]
 wp(select v sth b, Q) = ∀v.(b −→ Q)
 wp(c1; c2, Q) = wp(c1, wp(c2, Q))
 wp(if b then c1 else c2, Q) = ite(b, wp(c1, Q), wp(c2, Q))
 wp(while{iv} b do c, Q) = iv

 vc(Skip, Q) = ⊤
 vc(delrel(v1 r v2), Q) = ⊤
 vc(insrel(v1 r v2), Q) = ⊤
 vc(select v sth b, Q) = ⊤
 vc(c1; c2, Q) = vc(c1, wp(c2, Q)) ∧ vc(c2, Q)
 vc(if b then c1 else c2, Q) = vc(c1, Q) ∧ vc(c2, Q)
 vc(while{iv} b do c, Q) = (iv ∧ ¬b −→ Q) ∧ (iv ∧ b −→ wp(c, iv)) ∧ vc(c, iv)

Fig. 3. Weakest preconditions and verification conditions

Without going further into program semantics issues, let us only state the following soundness result that relates the operational semantics and the functions wp and vc:

Theorem 1 (Soundness). If vc(c, Q) is valid and (c, σ) ⇒ σ′, then σ(wp(c, Q)) implies σ′(Q).
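The defining equations of wp and vc in Fig. 3 translate almost literally into a recursive program. The following Python fragment is a minimal sketch, not the Isabelle development mentioned in the paper: statements and formulas are encoded as nested tuples, and the constructor tags (`"skip"`, `"select"`, `"delrel"`, `"insrel"`, `"seq"`, `"if"`, `"while"`, `"subst"`, `"minus"`, `"plus"`, ...) are assumptions made for this illustration. Since the formula grammar only provides ⊥, the constant ⊤ is represented here as ¬⊥.

```python
TOP = ("not", ("bot",))          # assumed encoding of the formula "true"


def imp(a, b):
    """Implication a --> b, the abbreviation for (not a) or b."""
    return ("or", ("not", a), b)


def ite(c, t, e):
    """ite(c, t, e), the abbreviation for (c --> t) and (not c --> e)."""
    return ("and", imp(c, t), imp(("not", c), e))


def wp(c, q):
    """Weakest precondition of statement c for postcondition q (Fig. 3)."""
    tag = c[0]
    if tag == "skip":
        return q
    if tag == "delrel":
        _, v1, r, v2 = c
        return ("subst", q, r, ("minus", r, v1, v2))
    if tag == "insrel":
        _, v1, r, v2 = c
        return ("subst", q, r, ("plus", r, v1, v2))
    if tag == "select":
        _, v, b = c
        return ("all", v, imp(b, q))
    if tag == "seq":
        return wp(c[1], wp(c[2], q))
    if tag == "if":
        _, b, c1, c2 = c
        return ite(b, wp(c1, q), wp(c2, q))
    if tag == "while":               # ("while", invariant, condition, body)
        return c[1]
    raise ValueError(f"unknown statement: {tag}")


def vc(c, q):
    """Verification condition of statement c for postcondition q (Fig. 3)."""
    tag = c[0]
    if tag in ("skip", "delrel", "insrel", "select"):
        return TOP
    if tag == "seq":
        return ("and", vc(c[1], wp(c[2], q)), vc(c[2], q))
    if tag == "if":
        return ("and", vc(c[2], q), vc(c[3], q))
    if tag == "while":
        _, iv, b, body = c
        exit_ok = imp(("and", iv, ("not", b)), q)
        preserved = imp(("and", iv, b), wp(body, iv))
        return ("and", ("and", exit_ok, preserved), vc(body, iv))
    raise ValueError(f"unknown statement: {tag}")
```

Note that `wp` only builds the explicit substitution `Q[r := ...]` for edge deletion and insertion; eliminating it is the task of the decision procedure of Sect. 5.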
What is more relevant for our purposes is the structure of the formulas generated by wp and vc, because it has an impact on the decision procedure. Besides the notion of essentially universally quantified introduced in Sect. 2, we need the notion of quantifier-free formula: a formula not containing a quantifier. By extension, we say that a statement is quantifier-free if all of its formulas are quantifier-free. By induction on c, one shows:

Lemma 1 (Universally quantified). Let Q be essentially universally quantified and c be a quantifier-free statement. Then wp(c, Q) and vc(c, Q) are essentially universally quantified.

5 Decision Procedure

5.1 Overview

We present a decision procedure for verifying the validity of essentially universally quantified formulas. As seen in Lemma 1, this is the format of formulas extracted by wp and vc, and as motivated by the soundness result (Theorem 1), validity of verification conditions is a precondition for ensuring that a program executes according to its specification.

Given an essentially universally quantified formula e, the rough lines of the procedure for determining that e is valid are spelled out in the following.

Getting rid of quantifiers:
1. Convert e to an equivalent prenex normal form p, which will consist of a prefix of universal quantifiers, and a quantifier-free body: ∀x1 . . . xn. b
2. p is valid iff its universal closure ucl(p) (universal abstraction over all free variables of p) is.
3. Show the validity of ucl(p) by showing the unsatisfiability of ¬ucl(p).
4. ¬ucl(p) has the form ¬∀v1 . . . vk, x1 . . . xn. b. Pull negation inside the universal quantifier prefix, remove the resulting existential quantifier prefix, and show unsatisfiability of ¬b with the aid of an extended tableau method.

Computation of prenex normal forms is standard. Care has to be taken to avoid capture of free variables, by renaming bound variables.
Free variables are defined as usual; the free variables of a substitution f[r := r − (v1, v2)] are those of f and in addition v1 and v2 (similarly for edge insertion). We illustrate the problem with the following program fragment prg:

 select a sth a : A; select b sth b r a; select a sth a r b

For a given post-condition Q, we obtain wp(prg, Q) = ∀a. a : A −→ ∀b. (b r a) −→ ∀a. (a r b) −→ Q, whose prenex normal form ∀a1, b, a2. (a1 : A −→ (b r a1) −→ (a2 r b) −→ Q) contains more logical variables than prg contains program variables.

Extended tableau method – prerequisites: The tableau method takes a quantifier-free formula f and proves its unsatisfiability or displays a model. We aim at reusing existing tableau methods (such as [3]) as much as possible. The difficulty consists in getting rid of the substitution constructor. Substitution is compatible with the constructors of formulas:

Lemma 2 (Substitution in formulas).
 ⊥[r := re] = ⊥
 (¬f)[r := re] = (¬f[r := re])
 (f1 ∧ f2)[r := re] = (f1[r := re] ∧ f2[r := re])
 (f1 ∨ f2)[r := re] = (f1[r := re] ∨ f2[r := re])

The case of formulas which are facts, missing in Lemma 2, will be dealt with separately. This is due to the fact that substitution is not compatible with concepts, as will be seen in Sect. 5.2: For a given concept C, there is not necessarily a substitution-free concept C′ equivalent to C[r := re]. However, substitutions can be eliminated from facts, by the equations given in Sect. 5.2.

We will refer to the equations in Lemma 2 and those in Sect. 5.2 as substitution elimination rules. We say that a substitution in a formula is visible if one of these rules is applicable; and that it is hidden if none of these rules is applicable. For example, the substitution in (x : (C1 ⊓ C2))[r := re] is visible; it is hidden in (x : (C1[r := re] ⊓ C2[r := re])) and only becomes visible after application of an appropriate tableau rule, for example of the system ALCN.
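Read from left to right, the equations of Lemma 2 describe how a substitution is pushed through the propositional structure of a quantifier-free formula until it reaches the facts. The following Python fragment sketches only this part of substitution elimination; the tuple encoding of formulas and the names `push_through_connectives` and `_distribute` are assumptions of this illustration, and the rules for facts from Sect. 5.2 are deliberately left out, so substitutions simply stay attached to facts.

```python
def _distribute(form, r, re):
    """Apply [r := re] to a formula whose inner substitutions already sit on facts."""
    tag = form[0]
    if tag == "bot":                          # bot[r := re] = bot
        return form
    if tag == "not":                          # (not f)[r := re] = not (f[r := re])
        return ("not", _distribute(form[1], r, re))
    if tag in ("and", "or"):                  # distribute over both operands
        return (tag, _distribute(form[1], r, re), _distribute(form[2], r, re))
    # Facts, and substitutions already attached to facts, keep the substitution.
    return ("subst", form, r, re)


def push_through_connectives(form):
    """Push all substitutions of a quantifier-free formula down to its facts."""
    tag = form[0]
    if tag in ("bot", "fact"):
        return form
    if tag == "not":
        return ("not", push_through_connectives(form[1]))
    if tag in ("and", "or"):
        return (tag,
                push_through_connectives(form[1]),
                push_through_connectives(form[2]))
    if tag == "subst":
        _, body, r, re = form
        return _distribute(push_through_connectives(body), r, re)
    raise ValueError(f"unexpected constructor in quantifier-free formula: {tag}")
```

After this pass, every remaining substitution is wrapped directly around a fact, which is where the equations of Sect. 5.2 take over.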
To describe our procedure, we introduce the following terminology: An ABox is a finite set of facts (interpreted as the conjunction of its facts), and a tableau a finite set of ABoxes (interpreted as a disjunction of its ABoxes). We need the following functions:
– push_subst takes a formula and applies substitution elimination rules as far as possible;
– form_to_tab converts to disjunctive normal form and then performs the obvious translation to a tableau;
– tab_to_form takes a tableau and constructs the corresponding formula.

Extended tableau method – procedure: Our method is parameterized by the following interface of an implementation of your favorite tableau calculus:
– a transition system T =⇒ T′, defining a one-step transformation of a tableau T to a tableau T′.
– a function sat which checks, for tableaux T that are irreducible wrt. =⇒, whether T is satisfiable.

From this, we construct a restricted relation T =⇒r T′, which is the same as =⇒ provided that T does not contain visible substitutions:

 from T =⇒ T′ and no visible subst in T infer T =⇒r T′

We also define a relation =⇒s that pushes substitutions until they become hidden:

 from T contains visible subst and T′ = form_to_tab(push_subst(tab_to_form(T))) infer T =⇒s T′

From these, we define the relation =⇒sr = (=⇒r ∪ =⇒s). The extended tableau algorithm takes a formula f and computes a Tf such that form_to_tab(f) (=⇒sr)∗ Tf. The result of the algorithm is sat(Tf).

The following lemmas show that =⇒sr is a correct and complete algorithm for deciding the satisfiability of formulas with substitution provided =⇒ is for substitution-free formulas.

Lemma 3 (Termination). =⇒sr is well-founded provided =⇒ is.

To show termination of the extended algorithm, define
– the substitution size of a formula or fact as the sum of the term sizes below its substitutions.
– the substitution size of a tableau as the multiset of the substitution sizes of its facts.
Note that application of =⇒s leads to a reduction of the substitution size. For a well-founded measure m of =⇒, construct a well-founded measure of =⇒sr as the lexicographic order of the substitution size and m.

Lemma 4 (Confluence). =⇒sr is confluent provided =⇒ is.

=⇒sr has no other critical pairs than =⇒.

Lemma 5 (Satisfiability). =⇒sr preserves satisfiability provided =⇒ does.

The three auxiliary functions used for defining =⇒s preserve satisfiability.

5.2 Elimination of Substitutions

We now show how substitutions can be pushed into facts. The constructors equality and inequality are easiest to handle:
– (x = y)[r := re] reduces to (x = y)
– (x ≠ y)[r := re] reduces to (x ≠ y)

For positive resp. negative instances of roles, we have:
– (x r y)[r := r − (v1, v2)] reduces to (¬((x = v1) ∧ (y = v2))) ∧ (x r y)
– (x (¬r) y)[r := r − (v1, v2)] reduces to ((x = v1) ∧ (y = v2)) ∨ (x (¬r) y)
– (x r y)[r := r + (v1, v2)] reduces to ((x = v1) ∧ (y = v2)) ∨ (x r y)
– (x (¬r) y)[r := r + (v1, v2)] reduces to (¬((x = v1) ∧ (y = v2))) ∧ (x (¬r) y)

whereas substitutions (x r y)[r′ := re] and (x (¬r) y)[r′ := re] for r ≠ r′ are the identity.
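The reduction rules for equalities and role instances listed above are simple enough to be written as one function. The following Python fragment is a sketch under assumptions: facts are encoded as tuples `("eq", x, y)`, `("neq", x, y)`, `("rel", x, r, y)` and `("nrel", x, r, y)`, role expressions as `("minus", r, v1, v2)` and `("plus", r, v1, v2)`, and the name `subst_simple_fact` is hypothetical. It further assumes that the role expression modifies the very role name being substituted, which is the only form generated by wp. Facts of the form x : C are not handled here.

```python
def subst_simple_fact(fact, r_subst, re):
    """Eliminate the substitution [r_subst := re] from an equality or role fact.

    Returns a substitution-free formula built from ("fact", ...), "not",
    "and" and "or".
    """
    kind = fact[0]
    if kind in ("eq", "neq"):
        return ("fact", fact)                # individuals are never affected
    _, x, r, y = fact
    if r != r_subst:
        return ("fact", fact)                # substitution on another role: identity
    op, _, v1, v2 = re
    same_edge = ("and", ("fact", ("eq", x, v1)), ("fact", ("eq", y, v2)))
    if (kind, op) in (("rel", "minus"), ("nrel", "plus")):
        # The edge (v1, v2) is excluded, everything else is unchanged.
        return ("and", ("not", same_edge), ("fact", fact))
    # ("rel", "plus") and ("nrel", "minus"): the edge (v1, v2) is an extra way
    # to make the fact true.
    return ("or", same_edge, ("fact", fact))
```

The four role cases collapse into two branches because deleting an edge from r acts on positive r-facts exactly as inserting it acts on negative ones, and vice versa.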
For facts of the form x : C, where C is a concept, we have the cases:
– (x : ¬C)[r := re] reduces to x : (¬C[r := re])
– (x : C1 ⊓ C2)[r := re] reduces to x : (C1[r := re] ⊓ C2[r := re])
– (x : C1 ⊔ C2)[r := re] reduces to x : (C1[r := re] ⊔ C2[r := re])
– (x : (≥ n r C))[r′ := re], for r′ ≠ r, reduces to x : (≥ n r C[r′ := re]), and similarly when replacing ≥ by <
– (x : (≥ n r C))[r := r − (v1, v2)] reduces to ite((x = v1) ∧ (v2 : (C[r := r − (v1, v2)])) ∧ (v1 r v2), (x : (≥ (n + 1) r (C[r := r − (v1, v2)]))), (x : (≥ n r (C[r := r − (v1, v2)])))) and similarly when replacing ≥ by <
– (x : (≥ (n + 1) r C))[r := r + (v1, v2)] reduces to ite((x = v1) ∧ (v2 : (C[r := r + (v1, v2)])) ∧ (v1 (¬r) v2), (x : (≥ n r (C[r := r + (v1, v2)]))), (x : (≥ (n + 1) r (C[r := r + (v1, v2)])))) and similarly when replacing ≥ by <
– (x : (≥ 0 r C))[r := r + (v1, v2)] reduces to ⊤
– (x : (< 0 r C))[r := r + (v1, v2)] reduces to ⊥
– Pathological case (x : C[sbst1])[sbst2]: lift inner substitution to (x : C)[sbst1][sbst2], then apply the above.

6 Conclusions

This paper proposes a language for rewriting knowledge bases, and methods for reasoning about the correctness of these programs, by means of a Hoare-style calculus. DL formulas are directly integrated into the statements of the programming language. The verification conditions extracted from these programs have been shown to be decidable, by a modular extension of existing tableau algorithms.

The work described here is still preliminary, in several respects, and the following points indicate directions for future investigations:
– We are in the process of coding the theory in the Isabelle proof assistant. Some parts of the proofs of Sect. 4 and most of Sect. 5.1 still have to be done. The purpose is to obtain a framework that will allow us to experiment more easily with variations of the logic.
– We have currently focused on the logic ALCN.
It is interesting to consider both less expressive logics (which offer more space for optimizations) and more expressive logics (to explore decidability questions). The process described in Sect. 5.1 is rather generic, but it remains to be seen whether more expressive DLs, featuring more complex role expressions, can be accommodated.
– In any case, the proof procedure sketched in Sect. 5 is of theoretical rather than practical value; an efficient implementation should not convert between formulas and tableaux as indiscriminately as suggested there, but apply propagation of substitutions locally.
– In a similar vein, it would be interesting to implement a transformation engine on the basis of the language described here, also with the purpose of evaluating the practical expressiveness of the language on larger examples.

References

1. Abadi, M., Cardelli, L., Curien, P.L., Lévy, J.J.: Explicit substitutions. Journal of Functional Programming 1(4), 375–416 (October 1991)
2. Baader, F., Sattler, U.: Expressive number restrictions in description logics. Journal of Logic and Computation 9(3), 319–350 (1999)
3. Baader, F., Sattler, U.: Tableau algorithms for description logics. In: Dyckhoff, R. (ed.) Automated Reasoning with Analytic Tableaux and Related Methods, Lecture Notes in Computer Science, vol. 1847, pp. 1–18. Springer Berlin / Heidelberg (2000)
4. Balbiani, P., Echahed, R., Herzig, A.: A dynamic logic for termgraph rewriting. In: 5th International Conference on Graph Transformations (ICGT). Lecture Notes in Computer Science, vol. 6372, pp. 59–74. Springer (2010)
5. Chaabani, M., Mezghiche, M., Strecker, M.: Vérification d’une méthode de preuve pour la logique de description ALC. In: Ait-Ameur, Y. (ed.) Proc. 10ème Journées Approches Formelles dans l’Assistance au Développement de Logiciels (AFADL). pp. 149–163 (Jun 2010)
6.
Courcelle, B., Engelfriet, J.: Graph structure and monadic second-order logic, a language theoretic approach. Cambridge University Press (2011)
7. Immerman, N., Rabinovich, A., Reps, T., Sagiv, M., Yorsh, G.: The boundary between decidability and undecidability for transitive-closure logics. In: Marcinkowski, J., Tarlecki, A. (eds.) Computer Science Logic, Lecture Notes in Computer Science, vol. 3210, pp. 160–174. Springer Berlin / Heidelberg (2004)
8. Liu, H., Lutz, C., Milicic, M., Wolter, F.: Foundations of instance level updates in expressive description logics. Artificial Intelligence 175(18), 2170–2197 (2011)
9. Møller, A., Schwartzbach, M.I.: The pointer assertion logic engine. In: PLDI. pp. 221–231 (2001)
10. Nipkow, T., Paulson, L., Wenzel, M.: Isabelle/HOL. A Proof Assistant for Higher-Order Logic, Lecture Notes in Computer Science, vol. 2283. Springer Berlin / Heidelberg (2002)
11. Schirmer, N.: Verification of Sequential Imperative Programs in Isabelle/HOL. Ph.D. thesis, Technische Universität München (2006)

Program Algebras with Monotone Floyd-Hoare Composition

Andrii Kryvolap1, Mykola Nikitchenko1 and Wolfgang Schreiner2

1 Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
krivolapa@gmail.com, nikitchenko@unicyb.kiev.ua
2 Johannes Kepler University, Linz, Austria
Wolfgang.Schreiner@risc.jku.at

Abstract. In the paper special program algebras of partial predicates and functions are described. Such algebras form a semantic component of a modified Floyd-Hoare logic constructed on the base of a composition-nominative approach. According to this approach, Floyd-Hoare assertions are presented with the help of a special composition called Floyd-Hoare composition. Monotonicity and continuity of this composition are proved. The language of the modified Floyd-Hoare logic is described. Further, the inference rules for such logic are studied, their soundness conditions are specified. The logic constructed can be used for program verification.
Keywords. Program algebra, program logic, composition-nominative approach, partial predicate, soundness

Key terms. FormalMethod, VerificationProcess

1 Introduction

Program logics are the main formalisms used for proving assertions about program properties. The well-known Floyd-Hoare logic [1, 2] is an example of such logics. Semantically, this logic is defined for the case of total predicates and functions, though programs can be partial. In this case assertions can be presented with the help of a special composition over total predicates and functions called Floyd-Hoare composition (FH-composition). However, a straightforward extension of classical Floyd-Hoare logic to partial predicates and functions meets some difficulties. The first one is that the classical FH-composition will not be monotone. Monotonicity means that the result of the mapping evaluation remains the same on extended data, if it was evaluated on the initial data. This important property grants the possibility to reason about the correctness of the program based on the correctness of its approximations.
This could be achieved by adding proper restrictions to the inference rules of the classical Floyd-Hoare logic that fail to be correct. It should be also shown that by weakening additional restrictions we obtain a system of the inference rules that is not sound. This will prove that restric- tions are necessary. The rest of the paper is structured as follows. In Section 2 we describe program al- gebras of quasiary predicates and functions on different levels of abstraction, define a modified Floyd-Hoare composition and specify the syntax for the modified logic. In Section 3 we prove the main properties of this composition. In Section 4 we study the soundness of the system of inference rules for the introduced program algebras. Fi- nally, we formulate conclusions in Section 5. 2 Quasiary Program Algebras To modify the classical Floyd-Hoare logic for partial quasiary mappings, we will use semantic-syntactic scheme [3-5]. This means that we will first define the semantics in the form of classes of quasiary program algebras. Then the language of the logic will be defined as well as the interpretation mappings. p To emphasize a mapping’s partiality/totality we write the sign for partial t mappings and the sign for total mappings. Given an arbitrary partial mapping p : D D , d D, S D, S D we write: – (d) to denote that is defined on d; – (d)= d to denote that is defined on d with a value d ; – (d) to denote that is undefined on d; – μ[ S ] {μ(d ) | μ(d ) , d S } to denote the image of S under ; – μ 1[ S ' ] {d | μ(d ) , μ(d ) S ' } to denote the preimage (inverse image) of S under . 2.1 Classes of quasiary mappings Let V be a set of names (variables). Let A be a set of basic values. Given V and A, the class VA of nominative sets is defined as the class of all partial mappings from V to A, Program Algebras with Monotone Floyd-Hoare Composition 535 p thus, VA=V A. Informally speaking, nominative sets represent states of vari- ables. 
Though nominative sets are defined as mappings, we follow mathematical traditions and also use a set-like notation for these objects. In particular, the notation $d = [v_i \mapsto a_i \mid i \in I]$ describes a nominative set $d$ where $v_i \mapsto a_i \in_n d$ means that $d(v_i)$ is defined and its value is $a_i$ ($d(v_i) = a_i$). The main operation for nominative sets is the binary total overriding operation $\nabla : {}^{V}\!A \times {}^{V}\!A \xrightarrow{t} {}^{V}\!A$ defined by the formula

$d_1 \nabla d_2 = [v \mapsto a \mid v \mapsto a \in_n d_2 \lor (v \mapsto a \in_n d_1 \land \neg\exists a' (v \mapsto a' \in_n d_2))]$.

Intuitively, given $d_1$ and $d_2$ this operation yields a new nominative set which consists of named pairs of $d_2$ and those pairs of $d_1$ whose names do not occur in $d_2$.

Let $Bool = \{F, T\}$ be the set of Boolean values. Let $Pr^{V,A} = {}^{V}\!A \xrightarrow{p} Bool$ be the set of all partial predicates over ${}^{V}\!A$. Such predicates are called partial quasiary predicates. Let $Fn^{V,A} = {}^{V}\!A \xrightarrow{p} A$ be the set of all partial functions from ${}^{V}\!A$ to $A$. Such functions are called partial quasiary ordinary functions. Here ‘ordinary’ means that the range of such functions is the set of basic values $A$. Let $FPrg^{V,A} = {}^{V}\!A \xrightarrow{p} {}^{V}\!A$ be the set of all partial functions from ${}^{V}\!A$ to ${}^{V}\!A$. Such functions are called bi-quasiary functions.

Quasiary predicates represent conditions which occur in programs, quasiary ordinary functions represent the semantics of program expressions, and bi-quasiary functions represent program semantics. The terms ‘partial’ and ‘ordinary’ are usually omitted. In general, elements of $Pr^{V,A}$, $Fn^{V,A}$, and $FPrg^{V,A}$ are called quasiary mappings.

2.2 Hierarchy of program algebras and logics

Based on algebras with three carriers ($Pr^{V,A}$, $Fn^{V,A}$, and $FPrg^{V,A}$) we can define logics of three types:
– Pure quasiary predicate logics based on algebras with one sort: $Pr^{V,A}$
– Quasiary predicate-function logics based on algebras with two sorts: $Pr^{V,A}$ and $Fn^{V,A}$
– Quasiary program logics based on algebras with three sorts: $Pr^{V,A}$, $Fn^{V,A}$, and $FPrg^{V,A}$

For logics of pure quasiary predicates we identify renominative, quantifier, and quantifier-equational levels.
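The nominative sets and the overriding operation of Section 2.1 have a direct counterpart in most programming languages. As a small illustration, and under the assumption that a nominative set is modelled by a Python dictionary in which an absent key stands for an undefined name, overriding can be sketched as follows; the function name `override` is chosen here for illustration only.

```python
def override(d1, d2):
    """Overriding of nominative sets modelled as dicts.

    The result contains all named pairs of d2 together with those pairs
    of d1 whose names do not occur in d2. Both arguments are left unchanged.
    """
    return {**d1, **d2}


# Example: the value of y is taken from the second argument.
state = override({"x": 1, "y": 2}, {"y": 3, "z": 4})
```

Because later entries win in a dictionary display, the pairs of the second argument take precedence, which is exactly the asymmetry in the defining formula.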
Renominative logics [3] are the most abstract among the above-mentioned logics. The main new compositions for these logics are the compositions of renomination (renaming) of the form $R^{v_1,\ldots,v_n}_{x_1,\ldots,x_n} : Pr^{V,A} \xrightarrow{t} Pr^{V,A}$. Intuitively, given a quasiary predicate $p$ and a nominative set $d$, the value of $R^{v_1,\ldots,v_n}_{x_1,\ldots,x_n}(p)(d)$ is evaluated in the following way: first, a new nominative set $d'$ is constructed from $d$ by changing the values of the names $v_1, \ldots, v_n$ in $d$ to the values of the names $x_1, \ldots, x_n$ respectively; then the predicate $p$ is applied to $d'$. The obtained value (if it was evaluated) is the result of $R^{v_1,\ldots,v_n}_{x_1,\ldots,x_n}(p)(d)$. For this composition we will also use the simplified notation $R^{\bar v}_{\bar x}$. The basic compositions of renominative logics are $\lor$, $\neg$, and $R^{\bar v}_{\bar x}$. Note that renomination (primarily in its syntactical aspects) is widely used in classical logic, lambda-calculus, and specification languages like Z-notation, B, TLA, RAISE, ASM, etc.

At the quantifier level, all basic values can be used to construct different nominative sets to which quasiary predicates can be applied. This allows one to introduce the compositions of quantification of the form $\exists x$ in the style of Kleene's strong quantifiers. The basic compositions of logics of the quantifier level are $\lor$, $\neg$, $R^{\bar v}_{\bar x}$, and $\exists x$.

At the quantifier-equational level, new possibilities arise for equating and differentiating values with special 0-ary compositions of the form $=_{xy}$ called equality predicates. The basic compositions of logics of the quantifier-equational level are $\lor$, $\neg$, $R^{\bar v}_{\bar x}$, $\exists x$, and $=_{xy}$.

All specified logics (renominative, quantifier, and quantifier-equational) are based on algebras that have only one sort: a class of quasiary predicates.

For quasiary predicate-function logics we identify the function level and the function-equational level.
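The renomination composition described above can be illustrated by a short sketch. It assumes dicts for nominative sets and `None` for an undefined predicate value. One detail is not fixed by the informal description and is a labeled assumption here: when some $x_i$ has no value in $d$, the name $v_i$ is made undefined in $d'$.

```python
def renominate(p, vs, xs):
    """Return the predicate R^{vs}_{xs}(p) for a partial predicate p."""
    def renamed(d):
        d_new = dict(d)
        for v, x in zip(vs, xs):
            if x in d:
                d_new[v] = d[x]       # all values are read from the original d
            else:
                d_new.pop(v, None)    # assumption: v becomes undefined in d'
        return p(d_new)
    return renamed

def x_less_than_y(d):
    """Partial predicate x < y; None when x or y has no value."""
    if "x" not in d or "y" not in d:
        return None
    return d["x"] < d["y"]

# Simultaneous renomination: x takes the value of y and y the value of x.
swapped = renominate(x_less_than_y, ["x", "y"], ["y", "x"])
```

Because every new value is read from the original `d`, the two renamings in `swapped` act simultaneously rather than one after the other.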
At the function level, we have extended capabilities for the formation of new arguments of functions and predicates. In this case it is possible to introduce the superposition compositions $S_F^{\bar v}$ and $S_P^{\bar v}$ (see [4, 5]), which formalize substitution of functions into a function and into a predicate respectively. Also special null-ary denomination parametric compositions (functions) $'x$ are introduced. The introduction of such functions allows one to model renomination compositions with the help of superpositions. The basic compositions of logics of the function level are $\lor$, $\neg$, $S_F^{\bar v}$, $S_P^{\bar v}$, $\exists x$, and $'x$.

At the function-equational level, a special equality composition $=$ can be introduced additionally. The basic compositions of logics of the function-equational level are $\lor$, $\neg$, $S_F^{\bar v}$, $S_P^{\bar v}$, $\exists x$, $'x$, and $=$. At this level different classes of first-order logics can be presented. This means that two-sorted algebras (with sets of predicates and functions as sorts and the above-mentioned compositions as operations) form a semantic base for first-order CNL.

The level of program logics is quite rich. Investigation of such logics is a special challenge; here we will study semantic properties of a modified Floyd-Hoare logic. To define such logics we should first define program algebras with program compositions as their operations. Such compositions correspond to the main structures of programs. In the simplest case they are:
– The parametric assignment composition $AS^x : Fn^{V,A} \xrightarrow{t} FPrg^{V,A}$
– The composition of sequential execution $\bullet : FPrg^{V,A} \times FPrg^{V,A} \xrightarrow{t} FPrg^{V,A}$
– The conditional composition $IF : Pr^{V,A} \times FPrg^{V,A} \times FPrg^{V,A} \xrightarrow{t} FPrg^{V,A}$
– The cyclic composition (loop) $WH : Pr^{V,A} \times FPrg^{V,A} \xrightarrow{t} FPrg^{V,A}$

Additionally we need compositions that describe properties of the programs. The Floyd-Hoare composition $FH : Pr^{V,A} \times FPrg^{V,A} \times Pr^{V,A} \xrightarrow{t} Pr^{V,A}$ is the most important of them.
Its formal definition is given in the next subsection.

2.3 Formal definition of a Floyd-Hoare composition

The required definition stems from the treatment of Floyd-Hoare assertions with total predicates (see, for example, [6]). Namely, an assertion $\{p\}f\{q\}$ is said to be valid if and only if

for all $d$ from ${}^{V}\!A$: if $p(d) = T$ and $f(d)\downarrow = d'$ for some $d'$, then $q(d') = T$.   (1)

Note that we do not make a distinction between a formula and its interpretation. Thus, we treat, say, $p$ as a formula in the assertion $\{p\}f\{q\}$ and as a predicate of the program algebra. Definition (1) permits us to treat $\{p\}f\{q\}$ as a predicate because it is a pointwise definition. Rewriting this definition for different cases we get the following matrices (Table 1) specifying the logical values of $\{p\}f\{q\}$ for an arbitrary $d$.

Table 1. Logical values of {p}f{q} for total predicates.

a) f(d) is defined

| p(d) \ q(f(d)) | F | T |
|---|---|---|
| F | T | T |
| T | F | T |

b) f(d) is undefined

| p(d) | {p}f{q}(d) |
|---|---|
| F | T |
| T | T |

Our aim is to extend the notion of assertion validity to partial predicates. But first we should admit that the presented definition is not monotone with respect to extension. Indeed, consider informally the following assertion: {T} while T do skip {F}. This Floyd-Hoare triple is true on all data, because the infinite loop is undefined on all data, and thus the condition of validity for this assertion is satisfied on all data. Now consider the triple {T} skip {F}, which is false on all data. However, the mapping 'skip' is an extension of 'while T do skip'. Thus, monotonicity fails for the case when $p(d) = T$ and $f(d)$ is undefined, so the value for this case should be changed. To define a monotone interpretation of Floyd-Hoare triples for partial predicates we should replace the question marks in Table 2 with proper values.

Table 2. Logical values of {p}f{q} for partial predicates, where the question marks represent values that should be changed to proper Boolean values.

a) f(d) is defined

| p(d) \ q(f(d)) | F | T | undefined |
|---|---|---|---|
| F | T | T | ? |
| T | F | T | ? |
| undefined | ? | ? | ? |

b) f(d) is undefined

| p(d) | {p}f{q}(d) |
|---|---|
| F | T |
| T | ? |
| undefined | ? |

To define such an interpretation we adopt the following requirements:
– Monotonicity of a composition on all its arguments
– Maximal definiteness of the obtained predicates (we call this requirement Kleene's principle)

We use the techniques for non-deterministic semantics described in [7]. We treat the case when a predicate is 'undefined' as the non-deterministic values T and F. Thus, we can use the matrices from Table 1 to evaluate a set of Boolean values for every case. The obtained results are presented in Table 3.

Table 3. Logical values of {p}f{q} for partial predicates presented as sets of Boolean values.

a) f(d) is defined

| p(d) \ q(f(d)) | {F} | {T} | {F,T} |
|---|---|---|---|
| {F} | {T} | {T} | {T} |
| {T} | {F} | {T} | {F,T} |
| {F,T} | {F,T} | {T} | {F,T} |

b) f(d) is undefined

| p(d) | {p}f{q}(d) |
|---|---|
| {F} | {T} |
| {T} | {F,T} |
| {F,T} | {F,T} |

Now, replacing the non-deterministic result {F, T} by 'undefined', we get the final results (Table 4).

Table 4. Logical values of {p}f{q} for partial predicates.

a) f(d) is defined

| p(d) \ q(f(d)) | F | T | undefined |
|---|---|---|---|
| F | T | T | T |
| T | F | T | undefined |
| undefined | undefined | T | undefined |

b) f(d) is undefined

| p(d) | {p}f{q}(d) |
|---|---|
| F | T |
| T | undefined |
| undefined | undefined |

The obtained matrices define an interpretation of $\{p\}f\{q\}$ for partial predicates. As was said earlier, we formalize such triples as a Floyd-Hoare composition $FH : Pr^{V,A} \times FPrg^{V,A} \times Pr^{V,A} \xrightarrow{t} Pr^{V,A}$ ($p, q \in Pr^{V,A}$, $f \in FPrg^{V,A}$, $d \in {}^{V}\!A$):

$$FH(p, f, q)(d) = \begin{cases} T, & \text{if } q(f(d))\downarrow = T \text{ or } p(d)\downarrow = F,\\ F, & \text{if } p(d)\downarrow = T \text{ and } q(f(d))\downarrow = F,\\ \text{undefined} & \text{in other cases.} \end{cases}$$

2.4 Formal definition of program algebra compositions

In the previous subsection the formal definition of the FH-composition was presented. In this subsection we give brief definitions of the other compositions (see details in [3-5]).
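The Floyd-Hoare composition of Section 2.3 can be sketched directly from its case definition. The sketch assumes that an undefined value of a predicate or of a program is represented by `None`; the names `fh`, `skip` and `loop_forever` are hypothetical and serve only to replay the 'while T do skip' example.

```python
T, F, U = True, False, None          # U stands for "undefined"

def fh(p, f, q):
    """FH(p, f, q) as a partial predicate; f(d) is None when f is undefined on d."""
    def triple(d):
        out = f(d)
        post = q(out) if out is not None else U    # q(f(d)), undefined if f(d) is
        if p(d) is F or post is T:
            return T
        if p(d) is T and post is F:
            return F
        return U
    return triple

always_true = lambda d: T
always_false = lambda d: F
skip = lambda d: d
loop_forever = lambda d: None        # semantics of `while T do skip`

d = {"x": 0}
on_loop = fh(always_true, loop_forever, always_false)(d)
on_skip = fh(always_true, skip, always_false)(d)
```

With this definition the triple for the infinite loop evaluates to undefined rather than to T, so extending the program from `loop_forever` to `skip` only refines the value (undefined becomes F) instead of flipping it from T to F.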
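Propositional compositions are defined as follows ($p, q \in Pr^{V,A}$, $d \in {}^{V}\!A$):

$$(p \lor q)(d) = \begin{cases} T, & \text{if } p(d)\downarrow = T \text{ or } q(d)\downarrow = T,\\ F, & \text{if } p(d)\downarrow = F \text{ and } q(d)\downarrow = F,\\ \text{undefined} & \text{in other cases;} \end{cases} \qquad (\neg p)(d) = \begin{cases} T, & \text{if } p(d)\downarrow = F,\\ F, & \text{if } p(d)\downarrow = T,\\ \text{undefined} & \text{if } p(d)\uparrow. \end{cases}$$

The unary parametric composition of existential quantification $\exists x$ with the parameter $x \in V$ is defined by the following formula ($p \in Pr^{V,A}$, $d \in {}^{V}\!A$):

$$(\exists x\, p)(d) = \begin{cases} T, & \text{if there exists } b \in A \text{ such that } p(d \nabla x \mapsto b)\downarrow = T,\\ F, & \text{if } p(d \nabla x \mapsto a)\downarrow = F \text{ for each } a \in A,\\ \text{undefined} & \text{in other cases.} \end{cases}$$

Here $d \nabla x \mapsto a$ is a shorter form of $d \nabla [x \mapsto a]$.

Parametric n-ary superpositions with $\bar x = (x_1, \ldots, x_n)$ as the parameter are defined by the following formulas ($f, g_1, \ldots, g_n \in Fn^{V,A}$, $p \in Pr^{V,A}$, $d \in {}^{V}\!A$):

$$(S_F^{\bar x}(f, g_1, \ldots, g_n))(d) = f(d \nabla [x_1 \mapsto g_1(d), \ldots, x_n \mapsto g_n(d)]),$$
$$(S_P^{\bar x}(p, g_1, \ldots, g_n))(d) = p(d \nabla [x_1 \mapsto g_1(d), \ldots, x_n \mapsto g_n(d)]).$$

The null-ary parametric denomination composition with the parameter $x \in V$ is defined by the following formula ($d \in {}^{V}\!A$): $'x(d) = d(x)$.

The binary equality composition $=$ is defined as follows ($f, g \in Fn^{V,A}$, $d \in {}^{V}\!A$):

$$(f = g)(d) = \begin{cases} T, & \text{if } f(d)\downarrow,\ g(d)\downarrow, \text{ and } f(d) = g(d),\\ F, & \text{if } f(d)\downarrow,\ g(d)\downarrow, \text{ and } f(d) \neq g(d),\\ \text{undefined} & \text{in other cases.} \end{cases}$$

The identical program composition $id \in FPrg^{V,A}$ is the simplest one: $id(d) = d$ ($d \in {}^{V}\!A$).

The assignment composition is defined as follows ($f \in Fn^{V,A}$, $d \in {}^{V}\!A$): $AS^x(f)(d) = d \nabla [x \mapsto f(d)]$.

Sequential execution is introduced in the ordinary way ($fs_1, fs_2 \in FPrg^{V,A}$, $d \in {}^{V}\!A$): $(fs_1 \bullet fs_2)(d) = fs_2(fs_1(d))$. Note that we define $\bullet$ by commuting the arguments of the conventional functional composition: $fs_1 \bullet fs_2 = fs_2 \circ fs_1$.

The conditional composition depends on the value of its first argument, which is the condition itself ($p \in Pr^{V,A}$, $fs_1, fs_2 \in FPrg^{V,A}$, $d \in {}^{V}\!A$):

$$IF(p, fs_1, fs_2)(d) = \begin{cases} fs_1(d), & \text{if } p(d)\downarrow = T,\\ fs_2(d), & \text{if } p(d)\downarrow = F,\\ \text{undefined} & \text{in other cases.} \end{cases}$$

The cycle is defined by the following formulas ($p \in Pr^{V,A}$, $fs \in FPrg^{V,A}$, $d \in {}^{V}\!A$): $WH(p, fs)(d) = d_n$, where $d_0 = d$, $d_1 = fs(d_0)$, …, $d_n = fs(d_{n-1})$, and moreover $p(d_0)\downarrow = T$, $p(d_1)\downarrow = T$, …, $p(d_{n-1})\downarrow = T$, $p(d_n)\downarrow = F$.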
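The program compositions defined above can be sketched as higher-order functions. The sketch assumes dicts for nominative sets and `None` for an undefined result. Two points are assumptions of the illustration rather than statements of the paper: an assignment is taken to be undefined when the assigned expression is undefined, and the loop uses an artificial step bound `fuel`, so that divergence is reported as `None` after that many iterations.

```python
def assign(x, f):
    """AS^x(f)(d) = d ∇ [x ↦ f(d)]; assumed undefined when f(d) is undefined."""
    def run(d):
        value = f(d)
        return None if value is None else {**d, x: value}
    return run

def seq(fs1, fs2):
    """(fs1 • fs2)(d) = fs2(fs1(d))."""
    def run(d):
        mid = fs1(d)
        return None if mid is None else fs2(mid)
    return run

def if_then_else(p, fs1, fs2):
    """IF(p, fs1, fs2): undefined when the condition is undefined."""
    def run(d):
        cond = p(d)
        if cond is True:
            return fs1(d)
        if cond is False:
            return fs2(d)
        return None
    return run

def while_do(p, fs, fuel=10_000):
    """WH(p, fs); `fuel` is an artificial bound used only to cut off divergence."""
    def run(d):
        for _ in range(fuel):
            cond = p(d)
            if cond is False:
                return d            # p(d_n) = F: the loop terminates with d_n
            if cond is not True:
                return None         # undefined condition
            d = fs(d)
            if d is None:
                return None         # undefined loop body
        return None
    return run

# s := 0; while n > 0 do (s := s + n; n := n - 1)
positive_n = lambda d: d["n"] > 0 if "n" in d else None
body = seq(assign("s", lambda d: d["s"] + d["n"]),
           assign("n", lambda d: d["n"] - 1))
total = seq(assign("s", lambda d: 0), while_do(positive_n, body))
result = total({"n": 4})
```

The example program sums the numbers from `n` down to 1, which gives a concrete state transformer on which triples can later be evaluated.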
This means that we have defined the following quasiary program algebra:

$$QPA(V, A) = \langle Pr^{V,A}, Fn^{V,A}, FPrg^{V,A};\ \lor, \neg, S_F^{\bar v}, S_P^{\bar v}, {'x}, \exists x, =, id, AS^x, \bullet, IF, WH, FH \rangle.$$

This algebra is the main object of our investigation.

2.5 Formal definition of program algebra terms

Terms of the algebra $QPA(V, A)$ defined over sets of predicate symbols $Ps$, function symbols $Fs$, program symbols $Prs$, and variables $V$ specify the syntax (the language) of the logic. We now give inductive definitions for terms $Tr(Ps, Fs, Prs, V)$, formulas $Fr(Ps, Fs, Prs, V)$, program texts $Pt(Ps, Fs, Prs, V)$, and Floyd-Hoare assertions $FHFr(Ps, Fs, Prs, V)$.

First we define terms:
– if $f \in Fs$ then $f \in Tr(Ps, Fs, Prs, V)$
– if $v \in V$ then $'v \in Tr(Ps, Fs, Prs, V)$
– if $f \in Fs$, $t_1, \ldots, t_n \in Tr(Ps, Fs, Prs, V)$, and $v_1, \ldots, v_n \in V$ are distinct variables then $S_F^{\bar v}(f, t_1, \ldots, t_n) \in Tr(Ps, Fs, Prs, V)$

Then we define program texts:
– $id \in Pt(Ps, Fs, Prs, V)$
– if $p \in Prs$ then $p \in Pt(Ps, Fs, Prs, V)$
– if $v \in V$ and $t \in Tr(Ps, Fs, Prs, V)$ then $AS^v(t) \in Pt(Ps, Fs, Prs, V)$
– if $p_1, p_2 \in Pt(Ps, Fs, Prs, V)$ then $p_1 \bullet p_2 \in Pt(Ps, Fs, Prs, V)$
– if $p_1, p_2 \in Pt(Ps, Fs, Prs, V)$ and $b \in Fr(Ps, Fs, Prs, V)$ then $IF(b, p_1, p_2) \in Pt(Ps, Fs, Prs, V)$
– if $p \in Pt(Ps, Fs, Prs, V)$ and $b \in Fr(Ps, Fs, Prs, V)$ then $WH(b, p) \in Pt(Ps, Fs, Prs, V)$

Finally, formulas and Floyd-Hoare triples are defined:
– if $p \in Ps$ then $p \in Fr(Ps, Fs, Prs, V)$
– if $\Phi \in Fr(Ps, Fs, Prs, V)$ then $\neg\Phi \in Fr(Ps, Fs, Prs, V)$
– if $t_1, t_2 \in Tr(Ps, Fs, Prs, V)$ then $t_1 = t_2 \in Fr(Ps, Fs, Prs, V)$
– if $\Phi \in Fr(Ps, Fs, Prs, V)$ and $v \in V$ then $\exists v\, \Phi \in Fr(Ps, Fs, Prs, V)$
– if $\Phi, \Psi \in Fr(Ps, Fs, Prs, V)$ then $\Phi \lor \Psi \in Fr(Ps, Fs, Prs, V)$
– if $p \in Ps$, $t_1, \ldots, t_n \in Tr(Ps, Fs, Prs, V)$, and $v_1, \ldots, v_n \in V$ are distinct variables then $S_P^{\bar v}(p, t_1, \ldots, t_n) \in Fr(Ps, Fs, Prs, V)$
– if $f \in Pt(Ps, Fs, Prs, V)$ and $p, q \in Fr(Ps, Fs, Prs, V)$ then $\{p\} f \{q\} \in FHFr(Ps, Fs, Prs, V)$

After the syntax and semantics have been defined, we need to specify the interpretation mappings, assuming that interpretation mappings for the predicate symbols $I_{Ps} : Ps \xrightarrow{t} Pr^{V,A}$, function symbols $I_{Fs} : Fs \xrightarrow{t} Fn^{V,A}$, and program symbols $I_{Prs} : Prs \xrightarrow{t} FPrg^{V,A}$ are given. Let $J_{Fr} : Fr(Ps, Fs, Prs, V) \xrightarrow{t} Pr^{V,A}$ denote an interpretation mapping for formulas, $J_{Tr} : Tr(Ps, Fs, Prs, V) \xrightarrow{t} Fn^{V,A}$ an interpretation mapping for terms, and $J_{Pt} : Pt(Ps, Fs, Prs, V) \xrightarrow{t} FPrg^{V,A}$ an interpretation mapping for programs. They are all defined in a natural way; only the case of assertions needs special consideration:

$$J_{FHFr}(\{p\} f \{q\}) = FH(J_{Fr}(p), J_{Pt}(f), J_{Fr}(q)).$$

An assertion is said to be valid (denoted $\models \{p\} f \{q\}$) if the corresponding predicate is not refutable.

3 Monotonicity and Continuity of the Floyd-Hoare Composition

In the previous section, a function-theoretic style of composition definitions was used. To prove properties of the FH-composition, it is more convenient to use a set-theoretic style of definition. The following sets are called respectively the truth, false, and undefiniteness domains of the predicate $p$ over $D$:

$$p^T = \{d \mid p(d)\downarrow = T\}, \quad p^F = \{d \mid p(d)\downarrow = F\}, \quad p^{\bot} = \{d \mid p(d)\uparrow\}.$$

The following definitions introduce the various images and preimages involved in the Floyd-Hoare composition:

$$q^{-T,f} = f^{-1}[q^T], \quad q^{-F,f} = f^{-1}[q^F], \quad q^{-\bot,f} = f^{-1}[q^{\bot}],$$
$$p^{T,f} = f[p^T], \quad p^{F,f} = f[p^F], \quad p^{\bot,f} = f[p^{\bot}].$$

Using these notations we can define the FH-composition by describing the truth and false domains of the predicate that is the value of the composition:

$$FH(p, f, q)^T = p^F \cup q^{-T,f}, \qquad FH(p, f, q)^F = p^T \cap q^{-F,f}.$$

Validity of formulas (predicates) is considered as irrefutability, that is, $\models p \Leftrightarrow p^F = \emptyset$. From this it follows that $\models FH(p, f, q) \Leftrightarrow p^T \cap q^{-F,f} = \emptyset$.

Let us give a formal definition of a monotone composition.
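A composition $C : (FPrg^{V,A})^n \times (Pr^{V,A})^k \times (Fn^{V,A})^m \xrightarrow{t} Pr^{V,A}$ is called monotone if the following condition holds for all arguments of $C$:

$$f_1 \subseteq g_1, \ldots, f_n \subseteq g_n,\ p_1 \subseteq q_1, \ldots, p_k \subseteq q_k,\ a_1 \subseteq b_1, \ldots, a_m \subseteq b_m \ \Rightarrow\ C(f_1, \ldots, f_n, p_1, \ldots, p_k, a_1, \ldots, a_m) \subseteq C(g_1, \ldots, g_n, q_1, \ldots, q_k, b_1, \ldots, b_m).$$

Theorem 1. The Floyd-Hoare composition is monotone on every argument.

Let us prove monotonicity on every argument separately, examining the truth and false domains. For the truth domain of the precondition we have:

$$p_1 \subseteq p_2 \Rightarrow p_1^T \subseteq p_2^T \Rightarrow p_1^T \cap q^{-F,f} \subseteq p_2^T \cap q^{-F,f} \Rightarrow FH(p_1, f, q)^F \subseteq FH(p_2, f, q)^F.$$

Similarly, for the false domain of the precondition we have:

$$p_1 \subseteq p_2 \Rightarrow p_1^F \subseteq p_2^F \Rightarrow p_1^F \cup q^{-T,f} \subseteq p_2^F \cup q^{-T,f} \Rightarrow FH(p_1, f, q)^T \subseteq FH(p_2, f, q)^T.$$

Thus, $p_1 \subseteq p_2 \Rightarrow FH(p_1, f, q) \subseteq FH(p_2, f, q)$.

In the case of the truth domain of the postcondition the proof is similar:

$$q_1 \subseteq q_2 \Rightarrow q_1^T \subseteq q_2^T \Rightarrow q_1^{-T,f} \subseteq q_2^{-T,f} \Rightarrow p^F \cup q_1^{-T,f} \subseteq p^F \cup q_2^{-T,f} \Rightarrow FH(p, f, q_1)^T \subseteq FH(p, f, q_2)^T.$$

The same holds for the false domain of the postcondition:

$$q_1 \subseteq q_2 \Rightarrow q_1^F \subseteq q_2^F \Rightarrow q_1^{-F,f} \subseteq q_2^{-F,f} \Rightarrow p^T \cap q_1^{-F,f} \subseteq p^T \cap q_2^{-F,f} \Rightarrow FH(p, f, q_1)^F \subseteq FH(p, f, q_2)^F.$$

Thus, $q_1 \subseteq q_2 \Rightarrow FH(p, f, q_1) \subseteq FH(p, f, q_2)$.

Let us show the monotonicity of the truth domains of the FH-composition with respect to the program argument:

$$f_1 \subseteq f_2 \Rightarrow q^{-T,f_1} \subseteq q^{-T,f_2} \Rightarrow p^F \cup q^{-T,f_1} \subseteq p^F \cup q^{-T,f_2} \Rightarrow FH(p, f_1, q)^T \subseteq FH(p, f_2, q)^T.$$

Similarly, for the false domains:

$$f_1 \subseteq f_2 \Rightarrow q^{-F,f_1} \subseteq q^{-F,f_2} \Rightarrow p^T \cap q^{-F,f_1} \subseteq p^T \cap q^{-F,f_2} \Rightarrow FH(p, f_1, q)^F \subseteq FH(p, f_2, q)^F.$$

Hence $f_1 \subseteq f_2 \Rightarrow FH(p, f_1, q) \subseteq FH(p, f_2, q)$. Thus, it has been shown that the composition is monotone on every component, which was to be proved.

For the constructed composition an even stronger result is true: it is continuous. To show this, we introduce the following definitions and the notion of continuity (see, for example, [6]).

An infinite set of indexed functions (predicates) $\{f_0, f_1, \ldots\}$ with $f_i \subseteq f_{i+1}$, $i \in \omega$, is called a chain of functions (predicates).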
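Theorem 1 can be sanity-checked by brute force on a tiny domain. The sketch below is not a proof and is not taken from the paper: it enumerates all partial predicates and partial programs over a two-element data set (an assumption chosen to keep the search small), represents each mapping as a tuple indexed by the data, and tests the monotonicity condition for the Floyd-Hoare composition and, for contrast, for a pointwise reading of the classical definition (1).

```python
from itertools import product

DATA = (0, 1)                              # a two-element data set

def extends(small, big):
    """small ⊆ big: wherever small is defined, big has the same value."""
    return all(s is None or s == b for s, b in zip(small, big))

def fh(p, f, q):
    """Floyd-Hoare composition on tuple-encoded mappings (None = undefined)."""
    result = []
    for d in DATA:
        out = f[d]
        post = q[out] if out is not None else None
        if p[d] is False or post is True:
            result.append(True)
        elif p[d] is True and post is False:
            result.append(False)
        else:
            result.append(None)
    return tuple(result)

def classical(p, f, q):
    """Definition (1) read pointwise: F only if p(d)=T, f(d) is defined, q(f(d))=F."""
    return tuple(
        not (p[d] is True and f[d] is not None and q[f[d]] is False)
        for d in DATA
    )

PREDICATES = list(product((True, False, None), repeat=len(DATA)))
PROGRAMS = list(product((0, 1, None), repeat=len(DATA)))

def ordered_pairs(space):
    """All pairs (a, b) from the space with a ⊆ b."""
    return [(a, b) for a in space for b in space if extends(a, b)]

PRED_PAIRS = ordered_pairs(PREDICATES)
PROG_PAIRS = ordered_pairs(PROGRAMS)

def is_monotone(composition):
    return all(
        extends(composition(p1, f1, q1), composition(p2, f2, q2))
        for p1, p2 in PRED_PAIRS
        for f1, f2 in PROG_PAIRS
        for q1, q2 in PRED_PAIRS
    )

fh_monotone = is_monotone(fh)
classical_monotone = is_monotone(classical)
```

On this finite domain the check agrees with the theorem for `fh`, while `classical` fails on the pair "undefined everywhere" versus "identity" with precondition T and postcondition F, which is the 'while T do skip' example from Section 2.3.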
The supremum of the above-mentioned set of indexed functions (predicates) is called the limit of the chain of functions (predicates), denoted $\bigsqcup_{i \in \omega} f_i$.

A composition $C : (FPrg^{V,A})^n \times (Pr^{V,A})^m \times (Fn^{V,A})^l \xrightarrow{t} Pr^{V,A}$ is called continuous on the first argument if for an arbitrary chain $\{f_i \mid i \in \omega\}$ the following property holds:

$$C\Big(\bigsqcup_{i \in \omega} f_i, g_2, \ldots, g_n, p_1, \ldots, p_m, q_1, \ldots, q_l\Big) = \bigsqcup_{i \in \omega} C(f_i, g_2, \ldots, g_n, p_1, \ldots, p_m, q_1, \ldots, q_l).$$

Continuity on the other arguments is defined in a similar manner.

Theorem 2. The Floyd-Hoare composition is continuous on every argument.

Though this result follows from general considerations, we give here its direct proof. Let us show the continuity on the first argument. In the case of the other arguments the proof is similar. Consider a chain of predicates $\{p_i \mid i \in \omega\}$. Since the Floyd-Hoare composition is monotone, $\{FH(p_i, f, q) \mid i \in \omega\}$ is also a chain. We need to show that

$$FH\Big(\bigsqcup_{i \in \omega} p_i, f, q\Big) = \bigsqcup_{i \in \omega} FH(p_i, f, q).$$

For arbitrary data $d$ there are two possibilities: $(\bigsqcup_{i} p_i)(d)\uparrow$ and $(\bigsqcup_{i} p_i)(d)\downarrow$. In the first case none of the elements of the chain is defined on $d$. Thus $\forall j \in \omega$: $FH(\bigsqcup_{i} p_i, f, q)(d) = FH(p_j, f, q)(d)$, and therefore the needed equality is obvious. If the limit is defined on $d$, an element of the chain that is also defined on $d$ can be found; otherwise the limit would have been undefined on $d$, which is guaranteed by the inclusion relation on the elements of the chain. Let $p_k$ be such an element. Then

$$FH\Big(\bigsqcup_{i \in \omega} p_i, f, q\Big)(d) = FH(p_k, f, q)(d) \quad \text{and} \quad FH(p_k, f, q)(d) = \bigsqcup_{i \in \omega} FH(p_i, f, q)(d),$$

since $\forall i \geq k$: $p_i(d) = p_k(d)$ by the definition of a chain. The following equality is obtained:

$$FH\Big(\bigsqcup_{i \in \omega} p_i, f, q\Big)(d) = \bigsqcup_{i \in \omega} FH(p_i, f, q)(d).$$

Since the data was chosen arbitrarily, we get $FH(\bigsqcup_{i} p_i, f, q) = \bigsqcup_{i} FH(p_i, f, q)$, which was to be proved.

The proof for the other arguments (a program and a postcondition) is similar.
Thus, it is proven that the monotone Floyd-Hoare composition is also continuous on every argument.

4 Soundness of Inference Rules System in Floyd-Hoare Algebras

In this section we adopt the same convention as earlier: we do not distinguish between syntactic and semantic notation for formulas. We also assume that the algebra $QPA(V, A)$ and the interpretation mappings are fixed.

Since a result of the Floyd-Hoare composition can be undefined on some data, classical inference rules can be unsound. Informally, this means that from valid premises they could derive an invalid conclusion. This happens because predicates can be partial and, to be monotone, the composition is defined in a way that differs from the classical Floyd-Hoare composition. Let us examine the following system of inference rules to find out what conditions are required for the rules to be sound:

– Ax_AS: $\{S^{[x]}(p, f)\}\ AS^x(f)\ \{p\}$
– Ax_ID: $\{p\}\ id\ \{p\}$
– Ax_SEQ: $\dfrac{\{p\} f \{q\},\ \{q\} g \{r\}}{\{p\}\ f \bullet g\ \{r\}}$
– Ax_IF: $\dfrac{\{b \land p\} f \{q\},\ \{\neg b \land p\} g \{q\}}{\{p\}\ IF(b, f, g)\ \{q\}}$
– Ax_WH: $\dfrac{\{b \land p\} f \{p\}}{\{p\}\ WH(b, f)\ \{\neg b \land p\}}$
– Ax_CONS: $\dfrac{\{p'\} f \{q'\}}{\{p\} f \{q\}}$

Note that we do not include additional conditions for the consequence rule, because in different classes of algebras we will have different conditions.

An assertion $\{p\} f \{q\}$ is said to be derived if there exists a derivation tree for it with rules of the type Ax_AS, Ax_ID on its leaves. Derivability is denoted as $\vdash \{p\} f \{q\}$.

Let us show that for the rules Ax_SEQ, Ax_WH, and Ax_CONS without additional conditions we can give an example of an application of the inference rule that has valid premises and an invalid conclusion.

Consider Ax_SEQ with a violation of the condition $p^{T,f} \subseteq q^T$. If this condition fails then $\exists p, q, f, d$: $p(d) = T$, $q(f(d))\uparrow$, $\models \{p\} f \{q\}$. In this case we take $r$ and $g$ such that $\models \{q\} g \{r\}$ and $g(f(d))\downarrow$, $r(g(f(d))) = F$.
This is possible if we define them in the following way:

$$g = id, \qquad r(x) = \begin{cases} T, & x \neq f(d),\\ F, & x = f(d). \end{cases}$$

Then $\models \{p\}\ f \bullet g\ \{r\}$ does not hold, since $p(d) = T$ and $r((f \bullet g)(d)) = F$, which is equivalent to $\{d\} \subseteq p^T \cap r^{-F, f \bullet g}$.

Consider Ax_WH with a violation of the condition $(b \land p)^{T,f} \subseteq p^T$. We construct $b$, $f$, and $p$ such that the following properties hold:

$$\models \{b \land p\} f \{p\}, \qquad (b \land p)^{T,f} \not\subseteq p^T, \qquad \not\models \{p\}\ WH(b, f)\ \{\neg b \land p\}.$$

Let $d_1$, $d_2$, $d_3$ be pairwise distinct. Then $b$, $f$, and $p$ are defined in the following manner:

$$b(x) = \begin{cases} T, & x \neq d_3,\\ F, & x = d_3; \end{cases} \qquad f(x) = \begin{cases} x, & x \neq d_1, d_2,\\ d_2, & x = d_1,\\ d_3, & x = d_2; \end{cases} \qquad p(x) = \begin{cases} T, & x \neq d_2, d_3,\\ \text{undefined}, & x = d_2,\\ F, & x = d_3. \end{cases}$$

It is not hard to check that the above-mentioned properties are satisfied: $d_2 \in (b \land p)^{T,f}$, $d_2 \notin p^T$, $d_1 \in p^T$, $d_3 = WH(b, f)(d_1)$, $d_3 \in (\neg b \land p)^F \Rightarrow d_1 \in (\neg b \land p)^{-F, WH(b,f)}$, $d_1 \in (b \land p)^T$, $d_2 \in p^{\bot}$, $d_3 \in p^F$. Moreover, $\models \{b \land p\} f \{p\}$ because for the other data $p$ is true. This proves that the additional condition is necessary, because otherwise the rule is not sound when used on such examples.

The case of the rule Ax_CONS is similar to the previous one, with the conditions $p^T \subseteq p'^T$, $q^F \subseteq q'^F$. So, it has been shown that the additional conditions are not redundant. Let us show that if the additional conditions hold then the rules are sound.

Theorem 3. The inference rules are sound with the additional conditions. In other words:

$$\models \{S^{[x]}(p, f)\}\ AS^x(f)\ \{p\},$$
$$\models \{p\}\ id\ \{p\},$$
$$\models \{p\} f \{q\} \ \land\ \models \{q\} g \{r\} \ \land\ p^{T,f} \subseteq q^T \ \Rightarrow\ \models \{p\}\ f \bullet g\ \{r\},$$
$$\models \{b \land p\} f \{q\} \ \land\ \models \{\neg b \land p\} g \{q\} \ \Rightarrow\ \models \{p\}\ IF(b, f, g)\ \{q\},$$
$$\models \{b \land p\} f \{p\} \ \land\ (b \land p)^{T,f} \subseteq p^T \ \Rightarrow\ \models \{p\}\ WH(b, f)\ \{\neg b \land p\},$$
$$\models \{p'\} f \{q'\} \ \land\ p^T \subseteq p'^T \ \land\ q^F \subseteq q'^F \ \Rightarrow\ \models \{p\} f \{q\}.$$

Let us prove this for each rule.

For $\models \{S^{[x]}(p, f)\}\ AS^x(f)\ \{p\}$ to hold, the following condition is needed:

$$FH(S^{[x]}(p, f), AS^x(f), p)^F = (S^{[x]}(p, f))^T \cap p^{-F, AS^x(f)} = \emptyset.$$

Assume that it is false and the intersection is not empty. Let some data $d$ belong to the intersection. If $d \in (S^{[x]}(p, f))^T$ then $p(d \nabla [x \mapsto f(d)]) = T$. If $d \in p^{-F, AS^x(f)}$ then $p(AS^x(f)(d)) = p(d \nabla [x \mapsto f(d)]) = F$, which is impossible. Thus the assumption is incorrect and $(S^{[x]}(p, f))^T \cap p^{-F, AS^x(f)} = \emptyset$; similarly, $(S^{[x]}(p, f))^{T, AS^x(f)} \subseteq p^T$, which means $\models \{S^{[x]}(p, f)\}\ AS^x(f)\ \{p\}$.

$\models \{p\}\ id\ \{p\}$ follows from the definition.

Let us prove $\models \{p\} f \{q\} \land \models \{q\} g \{r\} \land p^{T,f} \subseteq q^T \Rightarrow \models \{p\}\ f \bullet g\ \{r\}$. We have $\models \{p\} f \{q\}$ and $\models \{q\} g \{r\}$, which means $p^T \cap q^{-F,f} = \emptyset$ and $q^T \cap r^{-F,g} = \emptyset$. We need to show that $p^T \cap r^{-F, f \bullet g} = \emptyset$. Let it be false and $\exists d$: $d \in p^T \land d \in r^{-F, f \bullet g}$. This means that $p(d) = T \land r((f \bullet g)(d)) = F$. Using the additional condition $p^{T,f} \subseteq q^T$ we have $q(f(d)) = T$. That means $f(d) \in q^T$, and then $f(d) \notin r^{-F,g}$. This contradicts the fact that $r((f \bullet g)(d)) = F \Rightarrow (f \bullet g)(d) \in r^F \Rightarrow f(d) \in r^{-F,g}$. We have a contradiction, which means that the assumption is wrong and $p^T \cap r^{-F, f \bullet g} = \emptyset$. Then $\models \{p\}\ f \bullet g\ \{r\}$.

Let us prove $\models \{b \land p\} f \{q\} \land \models \{\neg b \land p\} g \{q\} \Rightarrow \models \{p\}\ IF(b, f, g)\ \{q\}$. We have $\models \{b \land p\} f \{q\}$ and $\models \{\neg b \land p\} g \{q\}$, which means: $(b \land p)^T \cap q^{-F,f} = \emptyset$ and $(\neg b \land p)^T \cap q^{-F,g} = \emptyset$. We need to show that $p^T \cap q^{-F, IF(b,f,g)} = \emptyset$. Let $\exists d$: $d \in p^T \land d \in q^{-F, IF(b,f,g)}$. Then $p(d) = T$ and $q(IF(b, f, g)(d)) = F$. Let us examine the different cases of $b(d)$:

– $b(d)\uparrow$ is impossible, because then $IF(b, f, g)(d)\uparrow$, which contradicts the assumption about the existence of such $d$.
– $b(d) = T \Rightarrow IF(b, f, g)(d) = f(d) \land d \in (b \land p)^T \Rightarrow IF(b, f, g)(d) \in (b \land p)^{T,f}$. By the premises of the inference rule we have: $(b \land p)^T \cap q^{-F,f} = \emptyset \land IF(b, f, g)(d) = f(d) \land d \in (b \land p)^T \Rightarrow d \notin q^{-F, IF(b,f,g)}$.
– The case $b(d) = F$ is similar to the case $b(d) = T$: $(\neg b \land p)^T \cap q^{-F,g} = \emptyset \land IF(b, f, g)(d) = g(d) \land d \in (\neg b \land p)^T \Rightarrow d \notin q^{-F, IF(b,f,g)}$.

Thus $d \notin q^{-F, IF(b,f,g)}$ in any case in which $IF(b, f, g)(d)$ is defined, which is guaranteed by the assumption. This leads to a contradiction, so $p^T \cap q^{-F, IF(b,f,g)} = \emptyset$. Thus, we have $\models \{p\}\ IF(b, f, g)\ \{q\}$.

Let us prove $\models \{b \land p\} f \{p\} \land (b \land p)^{T,f} \subseteq p^T \Rightarrow \models \{p\}\ WH(b, f)\ \{\neg b \land p\}$.
We have $\models \{b \land p\} f \{p\}$, which means: $(b \land p)^T \cap p^{-F,f} = \emptyset$. We need to show that the following condition holds: $p^T \cap (\neg b \land p)^{-F, WH(b,f)} = \emptyset$. Let $\exists d$: $d \in p^T \land d \in (\neg b \land p)^{-F, WH(b,f)}$. Then $\exists d_n$: $d_n = WH(b, f)(d)$, and by the definition of the composition we have $\exists d_1, d_2, \ldots, d_n$: $d = d_1 \land d_{i+1} = f(d_i)$ for $i = 1, \ldots, n-1$ $\land\ b(d_j) = T$ for $j = 1, \ldots, n-1$ $\land\ p(d_1) = T \land b(d_n) = F$, and $(\neg b \land p)(d_n) = F$. Thus, $(b \land p)(d_1) = T$. By $(b \land p)^{T,f} \subseteq p^T$ we obtain that $p(d_2) = p(f(d_1)) = T$. Using induction over the number of loop executions we obtain that $p(d_n) = T$. That means $(\neg b \land p)(d_n) = T$. Thus, we have obtained a contradiction, so $p^T \cap (\neg b \land p)^{-F, WH(b,f)} = \emptyset$.

Let us prove $\models \{p'\} f \{q'\} \land p^T \subseteq p'^T \land q^F \subseteq q'^F \Rightarrow \models \{p\} f \{q\}$. We have $\models \{p'\} f \{q'\}$, which means: $p'^T \cap q'^{-F,f} = \emptyset$. We need to show that $p^T \cap q^{-F,f} = \emptyset$. Let this condition be false and $\exists d$: $d \in p^T \land d \in q^{-F,f}$. This means that $p(d) = T \land q(f(d)) = F$. By the condition $p^T \subseteq p'^T$ we obtain $p'(d) = T$. But $p'^T \cap q'^{-F,f} = \emptyset$, thus $d \notin q'^{-F,f}$, while $d \in q^{-F,f}$ gives $f(d)\downarrow$, and so we have $q'(f(d)) \neq F$. By the condition $q^F \subseteq q'^F$ this contradicts $q(f(d)) = F$, which means that the assumption does not hold, so $p^T \cap q^{-F,f} = \emptyset$ and hence $\models \{p\} f \{q\}$.

Thus, all rules have been inspected and the theorem is proved.

The condition for the rule Ax_SEQ can also be substituted by one of the following: $p^{T,f} \cap q^{\bot} = \emptyset$, $q^{\bot,g} \cap r^F = \emptyset$, or a condition relating $q^{F,g}$ and $r^F$. However, none of them is a necessary condition, because $(\models \{p\} f \{q\} \land \models \{q\} g \{r\} \Rightarrow \models \{p\}\ f \bullet g\ \{r\}) \Rightarrow p^{T,f} \subseteq q^T$ does not hold.

Similarly, for the rule Ax_WH the condition could be given in one of the following forms: $(b \land p)^{T,f} \cap p^{\bot} = \emptyset$, $(b \land p)^{\bot,f} \cap p^F = \emptyset$, or a condition relating $(b \land p)^{F,f}$ and $p^F$; these are also not necessary. The conditions for the rule Ax_CONS are not necessary either. To prove that, we need only show an example in which the condition does not hold but the rule does.

But in some cases we can avoid adding the conditions explicitly to the rules.

Theorem 4.
For all assertions $\{p\} f \{q\}$ that were inferred using the rules of the inference system except Ax_CONS the following properties hold: $p^{T,f} \subseteq q^T$, $p^{T,f} \cap q^{\bot} = \emptyset$, $p^{\bot,f} \cap q^F = \emptyset$, $p^{F,f} \subseteq q^F$.

Let us prove the first property by induction. For the fourth property the case is similar, and the second and third properties are consequences of the first and the fourth respectively.

Induction base: for Ax_ID and Ax_AS the proof is obvious.

Induction step. For Ax_SEQ we have: $p^{T,f} \subseteq q^T \land q^{T,g} \subseteq r^T \Rightarrow p^{T, f \bullet g} \subseteq r^T$. The proof of this fact is obvious.

For Ax_IF we need to prove: $(b \land p)^{T,f} \subseteq q^T \land (\neg b \land p)^{T,g} \subseteq q^T \Rightarrow p^{T, IF(b,f,g)} \subseteq q^T$. Consider $d \in p^{T, IF(b,f,g)}$; then $\exists x$: $(IF(b, f, g)(x) = d) \land p(x) = T$, which leads to two cases:
– $b(x) = T$, then $f(x) \in q^T$; moreover $d = IF(b, f, g)(x) = f(x)$, thus $d \in q^T$;
– $b(x) = F$, then $g(x) \in q^T$; moreover $d = IF(b, f, g)(x) = g(x)$, thus $d \in q^T$.

For Ax_WH we need to prove: $(b \land p)^{T,f} \subseteq p^T \Rightarrow p^{T, WH(b,f)} \subseteq (\neg b \land p)^T$. Let $d \in p^{T, WH(b,f)}$; then $\exists x$: $(WH(b, f)(x) = d) \land p(x) = T$, and we need to prove that $(\neg b \land p)(d) = T$. Let us examine all data obtained during the calculation of $WH(b, f)(x)$: $x = x_0$; $x_1 = f(x_0)$; …; $d = x_n$, with $b(x_0) = b(x_1) = \ldots = b(x_{n-1}) = T$ and $b(x_n) = F$. Thus from $(b \land p)^{T,f} \subseteq p^T$ we have that $p(d) = p(x_n) = T$; this together with $b(x_n) = F$ gives $(\neg b \land p)(d) = T$, which was to be proved. The theorem is proved.

Theorem 3 and Theorem 4 together imply that if we declare Ax_CONS in such a way that it retains the properties of Theorem 4, then the system of inference rules will be sound without the addition of new conditions, since these are guaranteed by Theorem 4. But in this case the system would not be complete. Let us give an example. Let $q$ be an arbitrary predicate that has nonempty truth, false, and undefiniteness domains, and let $p$ be a predicate such that $p^T = q^T \cup q^{\bot}$.
Then $\models \{p\}\ id\ \{q\}$, but $\neg(p^{T,id} \subseteq q^T)$, whereas the system of inference rules was constructed so that the following property holds: $\vdash \{p\} f \{q\} \Rightarrow p^{T,f} \subseteq q^T$.

5 Conclusions

In this paper special program algebras of partial quasiary mappings have been described. Such algebras form a semantic base for a modified Floyd-Hoare logic. In this case assertions have been represented by a special composition called the Floyd-Hoare composition. Monotonicity and continuity of this composition have been proved. The language of the modified Floyd-Hoare logic has been described. Further, the inference rules for such a logic have been studied and their soundness conditions have been specified. The logic constructed can be used for program verification.

The major directions of further investigation are the question of completeness of the system of inference rules, invariants for rules, and types for variables and functions. The authors also plan to construct a prototype of a program system in the style of [8, 9] oriented towards the constructed logics.

References

1. Floyd, R. W.: Assigning Meanings to Programs. In: Proc. American Mathematical Society Symposia on Applied Mathematics, vol. 19, pp. 19–31 (1967)
2. Hoare, C. A. R.: An Axiomatic Basis for Computer Programming. Comm. ACM, 12, 576–580, 583 (1969)
3. Nikitchenko, M. S., Shkilniak, S. S.: Mathematical Logic and Theory of Algorithms. Publishing house of Taras Shevchenko National University of Kyiv, Kyiv (2008) (in Ukrainian)
4. Nikitchenko, M., Tymofieiev, V.: Satisfiability and Validity Problems in Many-Sorted Composition-Nominative Pure Predicate Logics. In: V. Ermolayev et al. (eds.): ICTERI 2012, CCIS 347, pp. 89–110. Springer Verlag, Berlin Heidelberg (2013)
5. Nikitchenko, M. S., Tymofieiev, V. G.: Satisfiability in Composition-Nominative Logics. Central European Journal of Computer Science, 2(3), 194–213 (2012)
6. Nielson, H. R., Nielson, F.: Semantics with Applications: A Formal Introduction. John Wiley & Sons Inc. (1992)
7. Avron, A., Zamansky, A.: Non-Deterministic Semantics for Logical Systems. Handbook of Philosophical Logic, vol. 16, pp. 227–304 (2011)
8. Schreiner, W.: Computer-Assisted Program Reasoning Based on a Relational Semantics of Programs. In: P. Quaresma and R.-J. Back (eds.) Proc. 1st Workshop on CTP Components for Educational Software (THedu'11), July 31 2011, Wrocław, Poland, No 79 of Electronic Proceedings in Theoretical Computer Science (EPTCS), ISSN: 2075-2180, pp. 124–142 (2012)
9. Schreiner, W.: A Program Calculus. Technical Report. Research Institute for Symbolic Computation (RISC), Johannes Kepler University, Linz, Austria, http://www.risc.uni-linz.ac.at/people/schreine/papers/ProgramCalculus2008.pdf (2008)

A Formal Model of Resource Sharing Conflicts in Multithreaded Java *

Nadezhda Baklanova and Martin Strecker

Institut de Recherche en Informatique de Toulouse (IRIT), Université de Toulouse
{nadezhda.baklanova,martin.strecker}@irit.fr

Abstract. We present a tool for analysing resource sharing conflicts in multithreaded Java programs. We consider two models of execution: a purely parallel one and sequential execution on a single processor. A Java program is translated into a system of timed automata which is verified by the model checker Uppaal. We also present our work in progress on the formalisation of Real-Time Java semantics and the semantics of timed automata.

Keywords. resource sharing, Java, timed automata, model checking

Key terms. Model, Development, ConcurrentComputation, FormalMethod, QualityAssuranceProcess

1 Introduction

Along with the increasing usage of multithreaded programming, a strong need for sound algorithms arises. The problem is even more important in the programming of embedded and real-time systems where liveness conditions are extremely important.
To certify that no thread would starve or be deadlocked, lock-free and wait-free algorithms have been developed. Lock-free algorithms do not use critical sections or locking and make it possible to avoid threads waiting for access to a mutual exclusion object. Nevertheless, progress is guaranteed for only one thread. Wait-free algorithms prevent starvation by guaranteeing a stronger property: all threads are guaranteed to make progress, eventually. Such algorithms for linked lists, described for example in [4,11], are very complex, difficult to implement and, consequently, hard to verify.

What is worse, these algorithms seem to be incompatible with hard real-time requirements: the progress guarantees are not bounded in time. Thus, a lock-free insertion of an element into a linked list by a thread may need several (possibly infinitely many) retries because the thread can be disturbed by concurrent threads. Under these conditions, it is not possible to predict how much time is needed before the thread succeeds.

* Part of this research has been supported by the project Verisync (ANR-10-BLAN-0310)

Critical sections are used in many applications in order to coordinate concurrent access to objects, although if the scheduling order is wisely planned, locks are not necessary since threads access objects at different moments of time. This is the motivation for our work: we develop a tool for checking resource sharing conflicts in concurrent Java programs based on the statement execution time. This gives a "time-triggered" [6] flavor to our approach to concurrent system design: resource access conflicts are resolved by temporal coordination at system assembly time, rather than during runtime via locking or via retries (as in wait-free algorithms).

We assume that a program is annotated with WCET information known from external sources.
The checker translates a Java program into a timed automaton which is then model checked by a tool for timed automata (concretely, Uppaal).

In this paper, after an informal introduction (Section 2), we present a formal semantics of the components of the translation, namely Timed Automata (Section 3) and a multi-threaded, timed version of Java (Section 4). Then we describe the mechanism of the concrete translator written in OCaml (Section 5) and give some preliminary correctness arguments (Section 6); the formal verification still remains to be done. The conclusions (Section 7) discuss some restrictions of our current approach and possibilities to lift them.

Status of the present document: We give a glimpse of several aspects of our formalisation, which is still far from coherent. Therefore, this paper is a basis for discussion rather than a finished publication.

2 Informal Overview

To show the main idea, we present an example of a concurrent Java program. It is a primitive producer-consumer buffer with one producer and one consumer, both of which are invoked periodically. The program is annotated with information about statement execution time in //@ ... @// comments.

    private class Run1 implements Runnable {
        public void run() {
            int value, i;
            //@ 1 @//
            i = 0;
            while (i < 10) {
                synchronized (res) {
                    //@ 2 @//
                    value = Calendar.getInstance().get(Calendar.MILLISECOND);
                    //@ 5 @//
                    res.set(value);
                }
                Thread.sleep(10);
                //@ 2 @//
                i++;
            }
        }
    }

    private class Run2 implements Runnable {
        public void run() {
            int value, i;
            //@ 1 @//
            i = 0;
            Thread.sleep(9);
            while (i < 10) {
                synchronized (res) {
                    //@ 4 @//
                    value = res.get();
                }
                Thread.sleep(8);
                //@ 1 @//
                i++;
            }
        }
    }

One of the possible executions is shown in Figure 1.

Fig. 1: Possible execution flow.
Black areas represent execution without locks, blue and green areas execution within a critical section, grey areas sleeping, and white areas waiting for processor time.

After translating this program to a system of timed automata, we run the Uppaal model checker to determine possible resource sharing conflicts. The checked formula is

    A[] ∀(i : int[0, objNumber − 1]) ∀(j : int[0, autNumber − 1]) waitSet[i][j] < 1,    (1)

where waitSet is an array of boolean flags indicating whether a thread waits for a lock of a particular object. If all array members are false at all moments of time, no thread waits for a lock, and therefore no resource sharing conflicts are possible.

3 Timed Automata Model

Timed automata are a common tool for verifying concurrent systems; the underlying theory is described in [1]. We formalize the basic semantics of an extension of timed automata used in the Uppaal model checker. The formalized syntax and semantics are adapted from [3].

An automaton edge is composed of a starting node, a condition under which the edge may be taken (guard), an action, clocks to reset and a final node.

    type-synonym ('n, 'a) edge = 'n × cconstr × 'a list × id set × 'n

An invariant is a condition on a node which must be satisfied when an automaton is in this node.

    type-synonym 'n inv = 'n ⇒ cconstr

An automaton consists of a set of nodes, a starting node, a set of edges and an invariant function.

    type-synonym ('n, 'a) ta = 'n set × 'n × ('n, 'a) edge set × 'n inv

We use the model checker Uppaal, which offers an extension of classical timed automata with variables. A full state comprises a node and a valuation function. The valuation covers ordinary variables of two types, integer (aval) and boolean (bval), as well as clock variables, which have a special semantic status in timed automata. The available transitions can be determined given this state.
    record valuation =
      aval :: id ⇒ nat
      bval :: id ⇒ bool
      cval :: id ⇒ time

    type-synonym 'n state = 'n × valuation

A timed automaton can perform two types of transitions: timed delay and edge taking. If an automaton takes an edge, variables may be updated and clocks may be reset to 0. The Transition constructor takes a list of variable and clock updates as an argument.

    datatype 'a ta-action = Timestep time | Transition 'a list

Accordingly, the timed automata semantics has two rules: delay and transition. If an automaton is delayed, it stays in the same node, and the values of all clocks are increased by the value d. The invariant of the current node must be satisfied both before and after the delay. If an automaton takes a transition, the node is changed, and clock values remain unchanged unless they are reset to 0. The invariants of both starting and final nodes must be satisfied, as well as the guard of the taken transition. If the action of the transition involves variable updates, the valuation function is updated as well. This is done by the function application eval-stmts varv a.

    varv' = varv(|cval := add (cval varv) d|)
    varv ⊨ invs l
    varv' ⊨ invs l
    l ∈ nodes
    ----------------------------------------------------------------------
    (nodes, init, edges, invs) ⊢ (l, varv) −Timestep d→ (l, varv')

    (l, g, a, r, l') ∈ edges
    varv ⊨ g
    varv' = eval-stmts varv a(|cval := reset (cval varv) r|)
    varv' ⊨ invs l'
    l ∈ nodes
    l' ∈ nodes
    ----------------------------------------------------------------------
    (nodes, init, edges, invs) ⊢ (l, varv) −Transition a→ (l', varv')

A network of timed automata is a product automaton, with exceptions in the case of handshaking actions [2]. Handshaking allows two automata to synchronize so that both take an edge simultaneously.
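The synchronisation just described can be made concrete with a small executable sketch. The Python code below is an illustration under stated assumptions, not the authors' Isabelle formalisation: the names edge_actions, edge_nodes and edges_shaking are stand-ins for getEdgeActions, getEdgeNodes and edges-shaking, each edge carries a single action label rather than a list, guards are kept as opaque strings, and edge_nodes is assumed to collect both source and target nodes. It builds the product edges for the three cases covered by the formal rules: a shared action moves both automata, while an action known to only one automaton moves that automaton and leaves the other in place.

```python
from itertools import product

# An edge is a tuple (source, guard, action, resets, target).
# Resets are frozensets of clock names so that edges stay hashable.


def edge_actions(edges):
    """Set of action labels occurring on the given edges."""
    return {action for (_, _, action, _, _) in edges}


def edge_nodes(edges):
    """Set of nodes touched by the given edges (assumed: sources and targets)."""
    return {e[0] for e in edges} | {e[4] for e in edges}


def edges_shaking(edges1, edges2):
    """Edges of the product of two automata with handshaking on shared actions."""
    acts1, acts2 = edge_actions(edges1), edge_actions(edges2)
    nodes1, nodes2 = edge_nodes(edges1), edge_nodes(edges2)
    result = set()

    # Shared action: both automata take an edge at once; the guards are
    # conjoined and the clock resets are united.
    for (s1, g1, a, cs1, t1), (s2, g2, b, cs2, t2) in product(edges1, edges2):
        if a == b:
            result.add(((s1, s2), ("and", g1, g2), a, cs1 | cs2, (t1, t2)))

    # Action known only to the first automaton: the second one stays put.
    for (s1, g, a, cs, t1) in edges1:
        if a not in acts2:
            for s2 in nodes2:
                result.add(((s1, s2), g, a, cs, (t1, s2)))

    # Action known only to the second automaton: the first one stays put.
    for (s2, g, a, cs, t2) in edges2:
        if a not in acts1:
            for s1 in nodes1:
                result.add(((s1, s2), g, a, cs, (s1, t2)))

    return result
```

The product is built eagerly here for readability; a model checker would normally explore it on the fly.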
    (s1, g1, a, cs1, s1') ∈ edges1
    (s2, g2, a, cs2, s2') ∈ edges2
    a ∈ getEdgeActions edges1 ∩ getEdgeActions edges2
    ----------------------------------------------------------------------
    ((s1, s2), g1 ⌈∧⌉ g2, a, cs1 ∪ cs2, s1', s2') ∈ edges-shaking edges1 edges2

    (s1, g, a, cs, s1') ∈ edges1
    a ∈ getEdgeActions edges1
    a ∉ getEdgeActions edges2
    s2 ∈ getEdgeNodes edges2
    ----------------------------------------------------------------------
    ((s1, s2), g, a, cs, s1', s2) ∈ edges-shaking edges1 edges2

    (s2, g, a, cs, s2') ∈ edges2
    a ∈ getEdgeActions edges2
    a ∉ getEdgeActions edges1
    s1 ∈ getEdgeNodes edges1
    ----------------------------------------------------------------------
    ((s1, s2), g, a, cs, s1, s2') ∈ edges-shaking edges1 edges2

4 Java Model

The design of the Java semantics has been inspired by the Jinja project [5] and its multithreaded extension JinjaThreads [7]. The Java execution flow is modeled by three transition relations: evaluation, scheduler and platform. The evaluation semantics is the semantics of a single thread, the scheduler semantics is responsible for thread scheduling, and the platform semantics formalizes the notion of time advancement. Taking the passage of time into account is the essential addition with respect to the above-mentioned semantics.

Following Jinja, we do not distinguish expressions and statements; their datatype is the following:

    datatype expr = Val val
      | Var vname
      | VarAssign vname expr
      | Cond expr expr expr
      | While expr expr
      | Annot annot expr
      | Sync expr expr
      | Sleep expr

... and others.

The system state is large and complex; it stores local information of all threads, such as local variable values and the expression to be evaluated, as well as shared objects, time, actions to be carried out by the platform, locks and wait sets. Given this state and the scheduler logic, the further execution order is deterministic.
    record full-state =
      threads :: id ⇒ schedulable option            -- threads pool
      th-info :: thread-id ⇒ thread-state option    -- expression to be evaluated and state
      sc-info :: schedule-info                      -- locks and thread statuses
      pl-info :: platform-info                      -- global time
      gl-info :: heap                               -- heap state
      ws-info :: waitSet                            -- wait sets
      running-th :: thread-id                       -- currently running thread
      pending-act :: action                         -- action to be carried out by platform

The evaluation semantics depends solely on local and heap variables; therefore we use a reduced variant of the full state for the thread evaluation step.

    record eval-state =
      ev-st-heap :: heap
      ev-st-local :: locals

When a thread expression is reduced, the duration of the performed action is not taken into account in the evaluation step, so the action type is passed on to the platform step, where time advances according to the action. For now we assume that any action of a particular type takes a fixed amount of time to execute.

The evaluation rules take an expression and a local state, translate them to a new pair of expression and state, and also emit an action for the platform step. Here are some examples of evaluation rules.

    fs ⊢ (e, s) −act→ (e', s')
    ----------------------------------------------------------------------
    fs ⊢ (VarAssign vr e, s) −act→ (VarAssign vr e', s')

    fs ⊢ (VarAssign vr (Val vl), s) −EvalAct (exec-time VarAssignAct)→ (Val Unit, s(|ev-st-local := ev-st-local s(vr ↦ vl)|))

Several rules use information about locks and wait sets that is not included in the local state; they therefore pull it from the full state.

    ¬ locked a fs
    ----------------------------------------------------------------------
    fs ⊢ (Sync (Val (Addr a)) e, s) −lock-action a fs'→ (Sync (Val (Addr a)) e, s)

    locked a fs
    fst (the (sc-lk-status (sc-info fs) a)) ≠ running-th fs
    ----------------------------------------------------------------------
    fs ⊢ (Sync (Val (Addr a)) e, s) −lock-action a fs'→ (Throw [ResourceSharingConflict], s)

    locked a fs
    ----------------------------------------------------------------------
    fs ⊢ (Sync (Val (Addr a)) (Val v), s) −unlock-action a fs'→ (Val Unit, s)

5 Abstracting Java to Timed Automata

We consider two models of program execution.
The first one is purely parallel, i.e. each thread is assumed to execute on its own processor so that no thread waits for CPU time. The other model is the sequential one, in which a program executes on a single processor. The parallel model is simpler, but it does not seem to be realistic. The sequential model represents the realistic situation for real-time Java applications, since the RTSJ (Real-Time Specification for Java [8]) specifies the behavior for monoprocessor systems only.

Java programs to be translated must be annotated with timing information giving the execution time of the statement that follows each annotation. The translation uses the timing annotations to produce timed automata which model the program. The resulting system is model checked for possible resource sharing conflicts.

5.1 General principles

We suppose that the translated program has a fixed number of threads and shared fields, all of them defined statically. The initialization code for threads and shared fields must be contained in the main method. The classes implementing the Runnable interface must be nested classes in the class containing the main method. The required program structure is shown in Figure 2.

Each thread created in the program is translated into one automaton, and one additional automaton modeling the Java scheduler is added to the generated system.

Java statements are translated into building blocks for conditional statements, loops etc., which are assembled to obtain the final automaton. An annotated statement is translated into its own block. Method calls and wait/notify statements are not translated for now.

The timed automata system contains an array of object monitors representing acquired locks on shared objects. When a thread acquires the lock of an object, the monitor corresponding to this object is incremented, and when the lock is released, the monitor is decremented.
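The monitor bookkeeping can be sketched as follows. This is a minimal Python illustration under assumptions, not the generated Uppaal model: the class MonitorTable and its method names are hypothetical, monitor[j] plays the role of the monitor of shared object j, and wait_set[j][i] mirrors the waitSet flags checked by formula (1) in Section 2. In this sketch, a thread that finds the monitor taken simply records that it waits.

```python
class MonitorTable:
    """Lock bookkeeping for obj_number shared objects and aut_number threads."""

    def __init__(self, obj_number, aut_number):
        # One counter per shared object; 0 means the lock is free.
        self.monitor = [0] * obj_number
        # wait_set[j][i] is True while thread i waits for the lock of object j.
        self.wait_set = [[False] * aut_number for _ in range(obj_number)]

    def try_lock(self, obj, thread):
        """Acquire the lock if it is free; otherwise mark the thread as waiting."""
        if self.monitor[obj] == 0:
            self.monitor[obj] += 1
            self.wait_set[obj][thread] = False
            return True
        self.wait_set[obj][thread] = True
        return False

    def unlock(self, obj):
        """Release the lock by decrementing the monitor of the object."""
        if self.monitor[obj] == 0:
            raise ValueError("unlock of a free monitor")
        self.monitor[obj] -= 1

    def conflict_free(self):
        """True if no thread currently waits for any lock."""
        return not any(flag for row in self.wait_set for flag in row)
```

Here conflict_free inspects a single state only; the model checker establishes the corresponding property over all reachable states of the automata.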
A number of checks are performed on the program source code beforehand to guarantee the correctness of the generated model. One of the most critical is the requirement that the whole parse tree must be annotated, i.e. for any leaf of the AST there is a timing annotation somewhere above this leaf. With this requirement, the behavior of the generated system can be determined at each moment of time.

    public class Main {
        Res1 field1; // shared fields declaration

        public static void main(String[] args) {
            Run1 r1; // declarations of Runnable object instances
            Thread t1, t2; // thread declarations
            r1 = new Run1(); // Runnable objects initialization
            field1 = new Res1(); // shared fields initialization
            t1 = new Thread(null, r1, "t1"); // thread creation
            t1.start(); // thread start
        }

        private class Run1 implements Runnable {
            public void run() {
                // thread logic implementation
                ...
            }
        }
    }

    private class Res1 {
        // resource classes
        ...
    }

Fig. 2: Required program structure

5.2 Parallel model

In the parallel model, threads are supposed not to wait for processor time when they want to execute a statement. However, a thread can wait for a lock if it needs one that has been taken by another thread. Since this model is not very realistic, we concentrate on the sequential model in the following.

5.3 Sequential model

In the sequential model we assume that at every moment of time only one thread or the scheduler can execute. Threads which do not execute at a particular moment of time wait for processor time. Threads can also wait for a lock; waiting does not consume CPU time.

Automata communicate with the scheduler through channels: when the scheduler has selected a thread, it sends a message to it so that the thread starts executing. After finishing its execution, the thread sends a message to the scheduler, and the next scheduling cycle starts.
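The scheduling cycle can be illustrated with a small Python sketch. It is only an analogy under assumptions, not the Uppaal model: threads are modelled as generators, a call to next() stands for the scheduler's message to a thread, a yield stands for the thread's message back to the scheduler, and the round-robin selection policy as well as the names thread_automaton and run_scheduler are hypothetical. Time, locks and eligibility are deliberately left out.

```python
def thread_automaton(name, steps, trace):
    """A thread as a generator: each next() call lets it execute one building block."""
    for k in range(steps):
        trace.append((name, k))  # the thread executes one block
        yield                    # the thread hands control back to the scheduler


def run_scheduler(threads):
    """Repeat scheduling cycles until every thread has finished."""
    active = list(threads)
    while active:
        for t in list(active):    # selection policy: round robin (an assumption)
            try:
                next(t)           # the scheduler tells thread t to start executing
            except StopIteration:
                active.remove(t)  # thread t has no blocks left
```

Because exactly one generator runs between two hand-overs, the sketch reproduces the key property of the sequential model: at every moment either one thread or the scheduler executes.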
The scheduler uses channels run[i] to call the i-th automaton, and the automata use the channel schedule to give control back to the scheduler.

There is an array of clocks c[i], each of them corresponding to one thread automaton. These clocks are used to measure the execution time of annotated statements or the sleeping time. There is also one clock cGlobal used for tracking global time.

Fig. 3: Building blocks for automata: (a) Assignment, (b) Sequence, (c) Annotation, (d) Condition I, (e) Condition II. Elements added on the current step are red; blue and green elements have been generated in the previous step.

The building blocks and their translation are the following for the sequential model:

(a) Assignment (3a). Three new states and two transitions between them are added. The transition from START to MIDDLE listens to the channel run[i], and the transition from MIDDLE to FINAL calls the channel schedule. The state MIDDLE is urgent since we assume that any statement except an annotated one takes no time to execute.

(b) Sequence (3b). Given two automata with start and final states called start1, start2 and final1, final2 respectively, the states final1 and start2 are merged.

(c) Annotation (3c). Three new states and two transitions between them are added. The transition from START to MIDDLE listens to the channel run[i], sets the variable execTime[i] to the value of the current annotation and resets the clock c[i] to 0. The transition from MIDDLE to FINAL calls the channel schedule and resets the variable execTime[i] back to 0.
The state MIDDLE has an invariant forbidding the automaton to stay in this state once the value of the clock c[i] exceeds execTime[i]. The transition from MIDDLE to FINAL has a guard enabling this transition only if the value of c[i] is greater than or equal to execTime[i]. The invariant and the guard ensure that the automaton stays in the MIDDLE state exactly as long as the annotation claims.

(d) Condition (3d, 3e). Three new states and four transitions are added. If both the if and the else branch are present, the final states of the automata representing the branch bodies are merged. The states auxIf and auxElse are auxiliary states introduced to separate listening and calling transitions; therefore they are made committed. Two transitions from START to auxIf and auxElse listen to the channel run[i], and the transitions from auxIf to start1 and from auxElse to start2 (or to final1 if there is no else branch) call the channel schedule.

Fig. 4: Building blocks for automata: (a) Lock, (b) Loop, (c) Sleeping. Elements added on the current step are red; blue and green elements have been generated in the previous step. Edge labels recoverable for panel (a): _monitor[j]==0; _monitor[j]++, waitSet[j][i]=0; _monitor[j]>0; waitSet[j][i]=true; _monitor[j]--.

(e) Lock (4a). Two meaningful states and two auxiliary states are added. The transition from START listens to the channel run[i] and has a guard checking that the lock for the object given as the argument of the synchronized statement is not held by another thread.

(f) Loop (4b). Two meaningful states and two auxiliary states are added.
One transition from the START state goes to the next loop iteration, another one exits the loop. Both transitions, from START to auxLoop and from START to auxEnd, listen to the channel run[i]. The transitions from auxLoop to start1 and from auxEnd to FINAL call the channel schedule. Both auxLoop and auxEnd are made committed. The final state of the automaton corresponding to the loop body is merged with the START state.

(g) Sleeping (4c). The automaton for the sleep statement resembles the automaton for an annotated statement, with additional elements for returning control to the scheduler during sleeping. There are three meaningful and two auxiliary states, with transitions connecting them into a chain. The auxiliary states, auxSleep and auxWake, are committed. The transition from START to auxSleep listens to the channel run[i], sets the variable execTime[i] to the duration of the sleeping period and resets the clock c[i] to 0. The transition from auxSleep to MIDDLE calls the channel schedule so that the scheduler can schedule other threads. The transition from MIDDLE to auxWake listens to the channel run[i] and has a guard enabling this transition only if c[i] is greater than or equal to execTime[i]. The update on this transition resets the value of execTime[i] back to 0. Unlike the automaton for the annotated statement, there is no invariant in the MIDDLE state, because a thread is not obliged to continue its execution right after it has woken up; it may first have to wait for processor time. The transition from auxWake to FINAL calls the schedule channel.

5.4 Scheduler

The Java scheduler maintains thread statuses and grants threads permission to execute. The scheduler model has three states: scheduling, runThread and wait. The scheduler starts in the state scheduling, which has transitions for updating thread eligibility statuses. When all thread statuses are updated, the scheduler moves to the state runThread, calling the channel run[i] for some thread with index i which is eligible for execution.
While the thread is executing, the scheduler stays in the state runThread. When the thread has finished its execution, it calls the channel schedule, the scheduler returns to the scheduling state, and a new scheduling cycle starts. If no thread is eligible for execution, the scheduler goes to the wait state, where it can stay for some time before it repeats scheduling.

Each thread gets two transitions for status updates. One assumes that the deadline for an action performed by the thread has passed; the other assumes that the deadline has not been reached yet. In the first case the flag isEligible[i] is set to true, and the thread with index i can be scheduled for execution. Otherwise, isEligible[i] is set to false, and the thread with index i cannot be scheduled.

[Figure: scheduler automaton. Recoverable edge labels, shown for i = 0 and i = 1: guard c[i]>=execTime[i] && !isUpdated[i]; updates isEligible[i]=true, isUpdated[i]=true, updateAllStatuses().]

A mapping from variables to sets of possible values is a domain. Some popular domains for constraint programming are: Boolean domains, where only true/false constraints apply (the SAT problem); integer and rational domains; linear domains, where only linear functions are described and analyzed (although approaches to non-linear problems do exist); finite domains, where constraints are defined over finite sets; and mixed domains, involving two or more of the above. Finite domains are one of the most successful domains of constraint programming. In some areas (such as operations research), constraint programming is often identified with constraint programming over finite domains.

Definition 2 A domain d is a function mapping variables to sets of values, such that $d(x) \subseteq V$. The set of all domains is $\mathit{Dom} := X \to \mathcal{P}(V)$. The set of values in d for a particular variable x, d(x), is called the variable domain of x.
A domain d represents a set of assignments, a constraint, defined as

$\mathit{con}(d) := \{ a \in \mathit{Asn} \mid \forall x \in X : a(x) \in d(x) \}$    (3)

We say that an assignment $a \in \mathit{con}(d)$ is licensed by d.

In our example, domains can be realized in two ways. Each domain can be stored as the state of an agent, which a propagator may or may not reject during insertion; alternatively, all domains can be stored in the state of the environment.

Definition 3 A constraint satisfaction problem (CSP) is a pair $\langle d, C \rangle$ of a domain d and a set of constraints C. The constraints C are interpreted as a conjunction of all $c \in C$ and are thus equivalent to the constraint $\{ a \in \mathit{Asn} \mid \forall c \in C : a \in c \}$. The solutions of a CSP $\langle d, C \rangle$ are the assignments licensed by d that satisfy all constraints in C, defined as

$\mathit{sol}(\langle d, C \rangle) := \{ a \in \mathit{con}(d) \mid \forall c \in C : a \in c \}$    (4)

2 Propagators

The basis of a propagation-based constraint solver is a search procedure, which systematically enumerates the assignments licensed by the domain d of a CSP $\langle d, C \rangle$. For each assignment, the solver uses a decision procedure for each constraint to determine whether the assignment is a solution of the CSP. Enumerating all assignments would be infeasible in practice, so in addition to the decision procedure, the solver employs a pruning procedure for each constraint, which may rule out assignments that are not solutions of the constraint.

Implementation of Propagation-Based Constraint Solver in IMS 569

Both tasks, the decision procedure and the pruning procedure for a constraint, are implemented by propagators. Each propagator induces a particular constraint. A propagator decides for a given assignment whether it satisfies the induced constraint, and it can prune from the domain those assignments that do not satisfy the constraint. Interleaving propagation and search yields a sound and complete procedure for solving CSPs. It is complete because only assignments that are not solutions are pruned by propagators, and all other assignments are enumerated.
It is sound because, for each of these assignments, the propagators decide whether it is a solution according to the definition.

The formal definition of propagators given in [1] reflects the minimal properties needed to obtain a sound and complete solver. This model therefore differs from the one commonly found in the literature. Furthermore, the characterization of the unique constraint induced by each propagator is new. Propagators are defined in terms of domains.

A propagator is a function p that takes a domain as its argument and returns a stronger domain: it may only prune assignments. If the original domain is an assigned domain $\{a\}$, the propagator either accepts it ($p(\{a\}) = \{a\}$) or rejects it ($p(\{a\}) = \emptyset$), realizing the decision procedure for its constraint. In fact, each propagator induces a unique constraint, the set of assignments that it accepts. To make this setup work, we need one additional restriction. The decision procedure and the pruning procedure must be consistent: if the decision procedure accepts an assignment, the pruning procedure must never remove this assignment from any domain. This property is called soundness.

Definition 4 A propagator is a function $p \in \mathit{Dom} \to \mathit{Dom}$ that is:
Contracting: $p(d) \subseteq d$ for any domain d;
Sound: for any domain $d \in \mathit{Dom}$ and any assignment $a \in \mathit{Asn}$, if $\{a\} \subseteq d$, then $p(\{a\}) \subseteq p(d)$.
The set of all propagators is Prop. If a propagator p returns a strictly stronger domain ($p(d) \subset d$), we say that p prunes the domain d. The propagator p induces the constraint $c_p$ defined by the set of assignments accepted by p:

$c_p := \{ a \in \mathit{Asn} \mid p(\{a\}) = \{a\} \}$    (5)

Soundness expresses exactly that the decision procedure and the pruning procedure realized by a propagator are consistent. A direct consequence is that a propagator never removes assignments that satisfy its induced constraint.
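Definition 4 can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration, not part of the insertion modeling system: domains are dictionaries from variable names to frozensets of integers, the helper names assigned, assignments, less_than, accepts and induced_constraint are hypothetical, and the chosen propagator is a simple bounds propagator for the constraint x < y.

```python
from itertools import product


def assigned(a):
    """Turn an assignment {var: value} into the assigned domain {a}."""
    return {var: frozenset([val]) for var, val in a.items()}


def assignments(d):
    """Enumerate con(d): every assignment licensed by domain d, as a set of tuples."""
    names = sorted(d)
    return {tuple(zip(names, vals))
            for vals in product(*(sorted(d[n]) for n in names))}


def less_than(x, y):
    """Build a bounds propagator for the constraint x < y."""
    def propagate(d):
        dx, dy = d[x], d[y]
        if not dx or not dy:
            # Already failed: no assignment is licensed, so empty both domains.
            return {**d, x: frozenset(), y: frozenset()}
        hi, lo = max(dy), min(dx)
        # Contracting: values are only ever removed, never added.
        return {**d,
                x: frozenset(v for v in dx if v < hi),
                y: frozenset(v for v in dy if v > lo)}
    return propagate


def accepts(p, a):
    """Decision procedure: p accepts a if and only if p({a}) = {a}."""
    return p(assigned(a)) == assigned(a)


def induced_constraint(p, d):
    """Assignments licensed by d that p accepts (the induced constraint within d)."""
    return {a for a in assignments(d) if accepts(p, dict(a))}
```

For a domain in which x and y both range over {1, 2, 3}, the propagator removes 3 from x and 1 from y, and every assignment it accepts is still licensed by the pruned domain, which is the soundness property in miniature.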
Focusing on our problem, we implement the idea of propagators as additional functions that process the domains (stored as agent state or environment state) before insertion. After the insertion, other propagators are called to prune unnecessary values from their domains, so that the search space shrinks with each step.

570 I. Ol. Blinov

To abstract the solution of the problem, we define and describe the propagation problem, a higher-level model for solving problems of the given type.

2.1 Propagation Problem

Propagators were defined as a refinement of constraints: each propagator induces one particular constraint, but in addition has an operational meaning, its pruning procedure. It is possible to define the operational equivalent of a CSP, a propagation problem. Propagation problems realize all constraints of a CSP using propagators.

Definition 5 A propagation problem (PP) is a pair $\langle d, P \rangle$ of a domain d and a set of propagators P. The induced constraint satisfaction problem of a propagation problem is the CSP $\langle d, \{ c_p \mid p \in P \} \rangle$. The solutions of a PP are the solutions of the induced CSP, $\mathit{sol}(\langle d, P \rangle) := \mathit{sol}(\langle d, \{ c_p \mid p \in P \} \rangle)$.

The set of solutions of a PP $\langle d, P \rangle$ can be defined equivalently as $\mathit{sol}(\langle d, P \rangle) := \{ a \in \mathit{con}(d) \mid \forall p \in P : p(\{a\}) = \{a\} \}$, just by applying the definitions of induced constraints and solutions of CSPs.

A propagation problem is solved by applying propagators at each step of inserting an agent into the environment. For this, as mentioned earlier, a propagator inspects each domain stored in the attributes of the agent. Before each insertion, a domain stored in the attributes of the agent is checked against the parameters obtained so far. Let us look at the insertion machine architecture (Fig. 1).

Fig. 1. Insertion machine architecture

When the insertion function is computed, the propagator of the agent being inserted can be used to test whether the insertion is possible.
If the propagator exhausts the domain stored in the agent, the insertion step is rejected.

Existence of strongest and weakest propagators. Propagators combine a decision procedure with a pruning procedure. While the decision procedure determines the constraint a propagator induces, there is some liberty in the definition of the pruning, as long as it is sound. Thus, there are different propagators for the same constraint, and they can be arranged in a partial order according to their strength:

Definition 6 Let $p_1$ and $p_2$ be two propagators that induce the same constraint. Then $p_1$ is stronger than $p_2$ if and only if $p_1(d) \subseteq p_2(d)$ for all domains d.

2.2 Propagation as a Transition System

A propagation-based solver interleaves constraint propagation and search, where constraint propagation means pruning the domain as much as possible using propagators before search resorts to enumerating the assignments in the domain. Propagating as much as possible means, in the context of propagation problems, computing a mutual fixed point of all propagators.

Transitions. Let $\langle d, P \rangle$ be a propagation problem. If there is a propagator $p \in P$ that can prune the domain d, that is, if $p(d) \subset d$, then applying p yields a new, simpler propagation problem, $\langle p(d), P \rangle$. Soundness of p makes sure that the new problem has the same set of solutions as the original problem, $\mathit{sol}(\langle d, P \rangle) = \mathit{sol}(\langle p(d), P \rangle)$.

A propagation problem thus induces a transition system, where a transition is possible from a domain d to a domain $d' \subset d$ if there is a propagator $p \in P$ such that $p(d) = d'$. We write such a transition as

$d \overset{p}{\mapsto} d'$    (6)

Definition 7 Let d be a domain. A transition $d \overset{p}{\mapsto} d'$ with a propagator p to a domain d′ is possible if and only if $d' = p(d)$ and $d' \subset d$. The transition system of a propagation problem $\langle d, P \rangle$
consists of all the transitions that are possible with propagators $p \in P$, starting from d. A terminal domain, that is, a domain d such that there is no transition $d \overset{p}{\mapsto} p(d)$ for any propagator $p \in P$, is called stable. We write $d \Rightarrow d'$ if there is a sequence of transitions that transforms d into a stable domain d′. This sequence is empty, $d \Rightarrow d$, if d is stable.

The transition system of a propagation problem is non-deterministic, as there are many possible chains of propagation that result in a stable domain.

Fixed points. The important theorem that ensures that constraint propagation is useful in practice is that, given a propagation problem $\langle d, P \rangle$, its transition system is finite and terminating. No matter in what order the propagators are applied, we reach a stable propagation problem after a finite number of steps.

The naive approach to solving a propagation problem $\langle d, P \rangle$ is to generate all assignments $a \in \mathit{con}(d)$, and then use the propagators $p \in P$ to check whether a satisfies all constraints. This approach makes use of the fact that propagators realize decision procedures for their induced constraints, but does not use their pruning capabilities. A solver that proceeds naively in this fashion is said to follow the generate-and-test approach.

3 Conclusion

In conclusion, the search for solutions by the generate-and-test approach is inefficient, so we will consider other options. Nevertheless, this approach works well for prototyping because it is easy to implement. In the future, we plan to create a working prototype of a university timetabling system that is universal, that is, independent of the input parameters, the types of activities and the list of lessons. The most effective solution to this problem currently appears to be the use of multi-layer environments, in which each input domain is pruned by a few environments and a few propagators.

References
1. Tack, G.: Constraint Propagation. Models, Techniques, Implementation. Saarbrücken (2009)
2.
Letichevsky, A., Letichevskyi, O., Peschanenko, V., Blinov, I., Klionov, D.: Insertion Modeling System and Constraint Programming. In: Ermolayev, V. et al. (eds.) Proc. 7th Int. Conf. ICTERI 2011, Kherson, Ukraine, CEUR-WS vol. 716 (2011)
3. Gilbert, D.R., Letichevsky, A.A.: A Universal Interpreter for Nondeterministic Concurrent Programming Languages. Fifth Compulog Network Area Meeting on Language Design and Semantic Analysis Methods (1996)
4. Letichevsky, A., Gilbert, D.: A General Theory of Action Languages. Cybernetics and Systems Analysis, l(1), 16–36 (1998)
5. Letichevsky, A., Gilbert, D.: A Model for Interaction of Agents and Environments. In: Bert, D., Choppy, C., Moses, P. (eds.) Recent Trends in Algebraic Development Techniques. LNCS 1827, pp. 311–328, Springer Verlag, Berlin Heidelberg (1999)
6. Letichevsky, A.: Algebra of Behavior Transformations and its Applications. In: Kudryavtsev, V.B., Rosenberg, I.G. (eds.) Structural Theory of Automata, Semigroups, and Universal Algebra, NATO Science Series II. Mathematics, Physics and Chemistry, vol. 207, pp. 241–272, Springer (2005)
7. Martin, G., Selic, B. (eds.): UML for Real: Design of Embedded Real-Time Systems. Kluwer Academic Publishers, Amsterdam (2003)
8. Letichevsky, A., Kapitonova, J., Letichevsky, A. Jr., Volkov, V., Baranov, S., Kotlyarov, V., Weigert, T.: Basic Protocols, Message Sequence Charts, and the Verification of Requirements Specifications. Computer Networks, 47, 662–675 (2005)
9. Kapitonova, J., Letichevsky, A., Volkov, V., Weigert, T.: Validation of Embedded Systems. In: Zurawski, R. (ed.) The Embedded Systems Handbook, CRC Press, Miami (2005)
10. Letichevsky, A., Kapitonova, J., Volkov, V., Letichevsky, A. Jr., Baranov, S., Kotlyarov, V., Weigert, T.: System Specification with Basic Protocols. Cybernetics and Systems Analysis, 4, 479–493 (2005)

UniTESK: Component Model Based Testing

Alexander K.
Petrenko¹, Victor Kuliamin¹ and Andrey Maksimov¹

¹ Institute for System Programming of the Russian Academy of Sciences
(petrenko, kuliamin, andrew)@ispras.ru

Abstract. UniTESK is a testing technology based on formal models or formal specifications of requirements for the behavior of software and hardware components. The most significant applications of UniTESK in industrial projects are described, the experience is summarized, and prospective directions for the development of Component Model Based Testing are assessed.

Keywords. Specification, verification, model-based testing, automated test generation

Key terms. SoftwareSystem, FormalMethod, SpecificationProcess, VerificationProcess

1 Introduction

Model Based Testing (MBT) is a rapidly developing domain of software engineering. One reason for this rapid development is that MBT lies at the intersection of various other domains of software engineering. In particular, these domains include methods for defining, formalizing and modeling requirements; methods for analyzing formal specifications and formal models as well as software code; methods for abstraction level control and model transformation; and many others. This allows MBT to adopt quickly the recent achievements that have proved useful in the adjacent domains, in particular in methods of static analysis and mixed static/dynamic analysis. However, there is no market-ready, well-established MBT tool that could be recommended for use in a wide range of software development and testing projects. To develop MBT further, we should first analyze the experience gained in this domain over the last 15-20 years. This should help to identify some common problems and focus on their solutions. In this paper, we briefly describe the development stages of UniTESK (Unified TEsting & Specification toolKit), one of the first MBT tools targeted at testing a wide class of software systems.
In the course of the paper, we highlight both positive and negative lessons learned while developing and using the UniTESK tools. The paper has a subtitle: industrial paper. This means that we do not present any new solutions or pose any new scientific problems here. We analyze the experience and try to draw lessons that would be useful to researchers working in this domain.

The UniTESK technology [1, 2] was initially developed on the basis of the experience gained in the project that created the KVEST automated testing system [3], which was developed for testing a real-time operating system kernel. The work started in 1994, when the term Model Based Testing did not yet exist. The term appeared at the turn of the 21st century. Currently, MBT is developing rapidly. There are many enthusiasts of this approach and many interpretations of the term itself.

To position UniTESK properly in the wide spectrum of MBT solutions, we should first clarify the meaning of MBT within the UniTESK framework. The following definition is currently given in Wikipedia: “Model-based testing is application of Model based design for designing and optionally also executing artifacts to perform software testing. Models can be used to represent the desired behavior of a System Under Test (SUT), or to represent testing strategies and a test environment”.

This definition includes almost all known interpretations of the term. However, most researchers and practitioners understand MBT as more specific approaches and testing techniques. The first main dividing line is the choice of the modeled object: some model the behavior of the target system (SUT), while others model the environment of the target system, in particular the test itself or the testing system, which are, of course, external to the target system. In UniTESK, the model specifies the system behavior.
There are also various types of MBT within this approach, which differ in the way the behavior is described. About the first type, Jan Peleska [4] says: “the behavior of the system under test (SUT) is specified by a model elaborated in the same style as a model serving for development purposes”. Such specifications or models are called executable. The role of the executable model can be played either by a prototype implementation of the algorithm or by a model that explicitly contains the notion of calculation or execution, for example, a finite-state machine, a Petri net, or an ASM [5]. Examples of the other, “nonexecutable” types of models are algebraic specifications and software contracts in the form of pre- and post-conditions of functions. Each model type has its own advantages and drawbacks when testing different SUTs. Moreover, test generation must produce more than test data and the sequence of calls of the tested functions. It must also produce the “artifacts” mentioned in the Wikipedia definition, for example, test oracles: the components of the test suite that automatically evaluate whether the results of the SUT execution meet the requirements. Executable models often do not allow test oracles to be created, but they are very good for generating test sequences. Software contracts simplify the generation of test oracles, but they do not allow effective generation of test sequences. In other words, several types of models are required to generate effective test suites.

Returning to UniTESK, its main model type is the software contract in the form of pre- and post-conditions of functions. In addition, UniTESK uses online construction of a finite-state machine, which makes it possible to generate rather non-trivial test sequences.

UniTESK is a technology that can be implemented on various software platforms and can therefore be used for testing APIs in various programming languages.
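The division of labour described above, with contracts acting as test oracles and a finite-state machine supplying the test sequences, can be illustrated with a minimal Python sketch. It is not UniTESK code: the `BoundedStack` system under test, the `CONTRACTS` table, and the `explore` function are hypothetical names chosen for this illustration, and the exploration replays a path from a fresh object for every transition instead of using the tools' actual traversal algorithm.

```python
from collections import deque

CAPACITY = 2  # assumed small bound so that the abstract state space is finite


class BoundedStack:
    """Hypothetical system under test (SUT)."""

    def __init__(self):
        self.items = []

    def push(self, value):
        self.items.append(value)

    def pop(self):
        return self.items.pop()

    def size(self):
        return len(self.items)


# Contract per operation: (precondition on the abstract state,
#                          call on the SUT,
#                          postcondition used as the test oracle).
CONTRACTS = {
    "push": (lambda s: s < CAPACITY,
             lambda sut: sut.push(0),
             lambda before, after: after == before + 1),
    "pop": (lambda s: s > 0,
            lambda sut: sut.pop(),
            lambda before, after: after == before - 1),
}


def abstract_state(sut):
    """Abstraction of the SUT state used as the FSM state: the stack size."""
    return sut.size()


def replay(path):
    """Drive a fresh SUT along a sequence of operation names."""
    sut = BoundedStack()
    for op in path:
        CONTRACTS[op][1](sut)
    return sut


def explore():
    """Discover FSM states while testing; return transitions and oracle failures."""
    start = abstract_state(BoundedStack())
    paths = {start: []}          # shortest known operation path to each state
    queue = deque([start])
    transitions, failures = {}, []
    while queue:
        state = queue.popleft()
        for op, (pre, call, post) in CONTRACTS.items():
            if not pre(state):   # the precondition filters inadmissible calls
                continue
            sut = replay(paths[state])
            call(sut)
            new_state = abstract_state(sut)
            if not post(state, new_state):  # the postcondition is the oracle
                failures.append((state, op))
            transitions[(state, op)] = new_state
            if new_state not in paths:
                paths[new_state] = paths[state] + [op]
                queue.append(new_state)
    return transitions, failures
```

Calling `explore()` covers every admissible (state, operation) pair of the abstract machine and reports the pairs for which a postcondition was violated. The state abstraction is the main design choice here: a coarser abstraction gives fewer states and shorter test sequences, a finer one gives more thorough coverage.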
Currently, the most actively used implementations of UniTESK are for C, C++, Java, and Python. The corresponding tools are CTESK, C++TESK, JavaTESK, and PyTESK [6].

UniTESK is an academic product developed at ISPRAS. The UniTESK tools are available under a free license. The experience of industrial application of UniTESK has been incorporated into the tools. Some test suites developed with UniTESK are included in official test suites for the certification of industrial software. For example, the OLVER test suite [7] is one of the biggest MBT test suites in the world and is surpassed only by the test suite developed within the framework of the Microsoft Interoperability Initiative [8].

2 UniTESK Usage Review

Let us consider the most interesting examples of UniTESK application and the experience gained in them.

The first application of UniTESK was in a project supported by Microsoft Research on the development of an MBT test suite for an IPv6 implementation [9]. The project started in 2000. At that time UniTESK was at the beginning of its development, so a simplified (light) implementation of API testing in C, CTESK-light, was used. In spite of the instability of the tool, it allowed the creation of an effective test suite that detected defects not detected by other test suites. It was the first experience of using contract specifications for telecommunication protocol testing. It was demonstrated that contract specifications, in combination with the technique for testing systems with asynchronous interfaces developed within UniTESK [10, 11], allow effective tests to be created: they detected more defects, consumed less space, and required less effort for development and maintenance than tests developed with traditional technologies. However, the experience of protocol testing showed that, besides the post-conditions of functions in the form of predicates, it is useful to have executable models when testing protocols.
One of the first experiences of using a UniTESK implementation for testing Java APIs [12] was a project on testing a Java run-time infrastructure developed as an alternative to the popular Java platforms. The development of the models and the tests was not a problem, since the interfaces were well documented. In addition to Java interfaces, the target system also contained interfaces in C++, but these were not a big problem either, since the UniTESK architecture provides a layer of mediators-adapters. Problems emerged when the actual testing started. An MBT test suite with online test generation is a fairly complicated program that places strong requirements on the execution platform. In this case, the SUT itself was the execution platform, and it was still unstable at that time. As a result, the test suite indicated the presence of defects “everywhere”, which, in turn, was of no help to the developers.

A significant application of UniTESK on the Java platform (JavaTESK) is the project on testing the infrastructure of the distributed information system of one of the major Russian mobile telephony providers. This project is still in progress. For the customer, the main advantage of UniTESK over other functional testing tools was the possibility of formal and rigorous specification of component interfaces. Hundreds of components were formally specified and tested with UniTESK. By the end of the first year of using UniTESK, the positive effect showed up as a shorter integration time for new versions of the distributed system. However, a serious problem was revealed. In the previous UniTESK applications, the requirements for most interfaces were defined by standards and other well-developed documents. Here, by contrast, the level of component documentation often proved insufficient for the creation of consistent specifications.
Recovering the documentation or the requirements for interfaces in systems of such size becomes an almost unsolvable task, which often makes it impossible to use MBT in its entirety. A possible solution to this problem is briefly discussed in the Conclusion.

The largest example of UniTESK application is the OLVER (Open Linux VERification) project [7], carried out in 2005-2007 with the support of the Russian Ministry of Education and Science. The goal of the project was to create formal specifications of the interfaces defined in the Linux Standard Base (LSB) standard or, to be more exact, in LSB Core, the central part of this standard. The LSB Core includes the most important Linux libraries, which implement most of the POSIX standard. A rigorous description of the LSB standard, together with a test suite capable of high-quality conformance checking of any Linux library implementation against the requirements of the standard, is a very powerful tool for ensuring the portability of Linux applications from one Linux distribution to another. The portability problem is critical in the Linux ecosystem, since several hundred very different distributions are available. The project results are open [7]. Contract specifications of more than 1500 interfaces in C were created. Naturally, the CTESK tool was used for modeling and test generation. The project also revealed problems in the standards themselves: in LSB (ISO/IEC 23360) and in The Single UNIX Specification, which contains the POSIX.1 standard (aka IEEE Std 1003.1, aka ISO/IEC 9945, aka The Open Group Base Specifications Issue 6) as a significant part. The developed test suite is included in the package of certification tests of the international consortium The Linux Foundation [13].

The experience of formalizing interfaces for a large industrial standard and developing a test suite for it taught many lessons.
One such lesson is the importance of the informational and methodological organization of such a project. The amount of documentation and sources is huge, especially given the multiple versions and the variants for different hardware platforms. Besides, the development of the standard and of the interface implementations involves thousands of people around the world. This means that documentation maintenance and availability is one of the most important concerns in projects of such scale. On the organizational and methodological side, we faced the fact that training new employees and controlling the quality of specifications and tests require a lot of effort, and that the required professional level still cannot be reached quickly. In other words, the scalability of MBT projects with respect to increasing the number of specification and verification experts is one of the most complicated problems preventing the wide introduction of MBT.

One of the methodological problems is the choice of the abstraction level for the model. More abstract models, or models separated into two or three layers of different abstraction levels, simplify the reuse of the models and tests, but yield a bigger and more complex test system. In the long term, it is better to have multilayer models, while in the short term models close to the implementation in their level of detail (if the implementation already exists, of course) are more appropriate. A professional and experienced verification expert can find the balance between the abstract description of the behavior of, for example, a file system and the specifics and details of the interface of its particular implementation. UniTESK provides special support for the separation of abstraction levels. In particular, the specifics of interfaces can be encapsulated in the mediator-adapter layer.
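The role of the mediator-adapter layer can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the UniTESK architecture itself: `FileSystemModel`, `DictBackedFs`, `FsMediator`, and `check_create` are hypothetical names, and the implementation conventions (a `touch` call, path prefixes, integer return codes) are invented for the example.

```python
class FileSystemModel:
    """Abstract model: a file system is just a set of existing file names."""

    def __init__(self):
        self.files = set()

    def create(self, name):
        """Return True if the file was created, False if it already existed."""
        existed = name in self.files
        self.files.add(name)
        return not existed


class DictBackedFs:
    """Hypothetical implementation with its own interface conventions."""

    def __init__(self):
        self._store = {}

    def touch(self, path):
        """Return 0 on success and -1 if the path already exists."""
        if path in self._store:
            return -1
        self._store[path] = b""
        return 0


class FsMediator:
    """Adapter: translates model-level calls and results to the implementation."""

    def __init__(self, impl):
        self.impl = impl

    def create(self, name):
        # Implementation specifics (path prefix, integer codes) stay in this layer.
        return self.impl.touch("/" + name) == 0


def check_create(model, mediator, name):
    """Compare the model's expected result with the adapted implementation result."""
    return model.create(name) == mediator.create(name)
```

With this separation, porting the tests to another implementation means rewriting only `FsMediator`, while the model and the checking code are reused unchanged.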
The choice of the balance is determined by the long-term plans for using and improving the models and the test suite. Work of this kind therefore requires broad experience and long-term planning skills, which can hardly be expected from ordinary test engineers.

The results of the OLVER project were later used in the development of the test suite for the Russian real-time operating system OS2000/3000 [14]. This system provides two groups of interfaces. The first group meets the requirements of the POSIX standard, and the second meets the requirements of the ARINC-653 international standard for embedded and other safety critical systems. The adapter layer defined by the UniTESK architecture, which separates the model and implementation representations of the interfaces, significantly simplified the reuse of OLVER in this project.

In parallel with the start of the OLVER project, work began on applying UniTESK to the testing of microprocessor designs [15]. The systems under test in this case were hardware units of Russian microprocessors with the MIPS architecture and of microprocessors with VLIW/EPIC elements. The size of typical units in such microprocessors is several million gates. Since CTESK was used as the basis, the tools required no significant modifications for specification and test generation. Technically, binding CTESK to the corresponding API of a microprocessor model simulator is not a problem, because most simulators that work with modeling languages for microprocessor logic (HLD – High Level Design languages), for example, VHDL or Verilog, provide a suitable interface to C programs. The semantics of pre-conditions in contract specifications had to be slightly modified. They now describe not just the domain of input data, but rather the conditions of readiness for operation execution in the given time frame.
As in the case of protocol modeling, it proved necessary to use explicit models of the target device behavior (functionality) along with the post-conditions in the form of predicates.

As in the projects on verification of software systems, one of the main problems preventing the introduction of MBT (as well as many other verification methods) into practice is the lack of documentation and other descriptions of the functional requirements for components. However, the situation in microprocessor development is slightly better than in software development, because here it is customary to build system and architectural models of instruction set semantics along with the HLD models. Elements of these architectural models can be used to fill the gaps in the knowledge of the behavior of some microprocessor design units [16]. It also proved relatively simple to implement parallel test execution on clusters. The typical size of the finite-state machine generated during test execution for one microprocessor unit is millions of nodes and tens of millions of transitions. The algorithm of FSM generation and exploration on clusters with up to 200 nodes turned out to require just 10-15% overhead, i.e. the scalability coefficient is close to 1.

It is important to mention the verification tasks that, on the one hand, could not be reduced to modeling with contract specifications and, on the other hand, pushed forward the development of new MBT methods. First of all, the task of compiler testing should be mentioned, as well as the task of testing a microprocessor as a whole, the so-called “core testing”.
Both cases are tasks of system testing, where test data and test stimuli are submitted to a big “black box” (in our case, these are test programs submitted to the compiler or loaded into the memory of the microprocessor simulator), and the aim is to test not everything, but some specific behavior modes or a specific group of units. For compiler testing, the OTK tool was developed and used for testing optimizing Intel compilers and Simulink [17, 18]. It allows specific kinds of optimizations to be targeted. For microprocessor design verification, the MicroTESK tool [19, 20, 21] was developed. The main goal of this tool is to check the various situations that arise in the most complicated subsystems of memory control: the TLB, the cache, and the Memory Management Unit (MMU) as a whole.

3 Conclusions and Further Work

Let us start with the positive conclusions.

3.1 Positive Conclusions on Modern State of Using MBT

World experience [22] confirms that MBT can be used effectively in industrial projects. In comparison with traditional testing, MBT gives a unique advantage: many defects can be found in requirements, and such defects are often much more expensive than defects in implementation.

The achievable level of test coverage is significantly higher than with traditional testing (even in comparison with “white box” testing). Thus, when OTK was used for testing the GCC compiler, the achieved test coverage was 95%, and for the Intel compiler it was 75%, which was significantly higher than the level achieved by traditional tests [18].
Although the multi-level structure of specifications (several levels of abstraction) is seldom used in practice, the explicit separation of the adapter layer simplifies test porting and maintenance and, vice versa, the lack of a corresponding adaptation level makes test suite development significantly more complicated, as was demonstrated in the Microsoft Interoperability Initiative program [23].

Online generation of test sequences with the FSM exploration method can be parallelized efficiently and allows the computational resources of clusters to be used with just 10-15% overhead, at least in the case of testing microprocessor models.

The demand for MBT in the safety critical area is increasing. This tendency can be seen in standards that define requirements for the development processes of safety critical systems, for example, in DO-178C [24] and in Common Criteria [25].

3.2 Negative Aspects of the Modern State in the MBT Area

The main obstacle preventing the wide introduction of MBT into practice is the absence of specifications/models in ordinary software development. The lack of specifications is often not only the consequence of insufficient attention to specification development or of scarce resources. The main reason is often the lack of qualified specialists who are experts in the knowledge domain and at the same time can create the specification/model necessary for test generation.

If MBT is used in projects that do not involve the MDD (Model Driven Development) approach, then model development delays the appearance of the first tests, so tests cannot be obtained early in the development. If MBT is used within MDD, the problems still remain, because different models are required for development and for testing, in particular for the generation of different artifacts of the test suite.
This is often considered an unacceptable additional cost, although with proper planning many components of the development models can be reused during test generation, as demonstrated, for example, in the paper by M. M. Chupilko [16].

In bilingual test generation systems such as UniTESK and the first versions of SpecExplorer [26], the specification notations, even those close to conventional programming languages, for example, JML [27], make deployment of such systems difficult. Bilingual notations require special training of the staff and need permanent and expensive maintenance. Note, however, that modern object-oriented languages already have advanced means for writing specifications in the same language [26-31].

3.3 Directions of Further Works

A variety of modeling paradigms should be used in various project contexts, in particular, contract specifications and various types of executable models, for example, finite-state machines, Kripke structures, etc. [32]. It is not obvious that transforming models from one paradigm into another will bring real benefit. Each kind of model is suitable for the analysis of specific aspects of the system behavior, so we should not expect that, for example, a functional model will facilitate the estimation of the execution time and the memory required. However, obtaining a skeleton or a prototype of a model of one kind on the basis of a model of another kind is quite possible.

The development of various tools for modeling and specification description for MBT purposes is required. In spite of the progress in technologies for the development of Domain Specific Languages (DSL), in practice the systems based on universal languages benefit from the large number of programmers who know such languages. The same can be said about monolingual systems: they overtake multilingual ones.
Modern achievements in static and hybrid static-dynamic analysis allow these techniques to be integrated into MBT systems. Moreover, the models/specifications as well as the software implementations should be the subject of this analysis.

To overcome the extreme lack of specifications in real practice, tools for working with requirements and models (see, for example, [33]), in particular with system models [34, 35], should be developed and deployed. For multicomponent systems, MBT tools should be integrated with tools for architecture and process mining.

References

1. Bourdonov, I. B., Kossatchev, A. S., Kuliamin, V. V., Petrenko, A. K.: UniTesK Test Suite Architecture. In: FME 2002. LNCS 2391, pp. 77–88. Springer-Verlag (2002)
2. Kuliamin, V. V., Petrenko, A. K., Kossatchev, A. S., Bourdonov, I. B.: The UniTesK Approach to Designing Test Suites. Programming and Computer Software, 29(6), 310–322 (2003)
3. Bourdonov, I. B., Kossatchev, A. S., Petrenko, A. K., Galter, D.: KVEST: Automated Generation of Test Suites from Formal Specifications. In: Proceedings of Formal Method Congress, Toulouse, France, 1999. LNCS 1708, pp. 608–621 (1999)
4. Peleska, J.: Industrial-Strength Model-Based Testing – State of the Art and Current Challenges. Invited Talk. In: Petrenko, A. K., Schlingloff, H. (eds.) Proceedings Eighth Workshop on Model-Based Testing (MBT 2013), Rome, Italy, 17th March 2013. Electronic Proceedings in Theoretical Computer Science, 111, pp. 3–28. DOI: 10.4204/EPTCS.111.1 (2013)
5. Börger, E., Stärk, R.: Abstract State Machines: A Method for High-Level System Design and Analysis. Springer-Verlag (2003)
6. UniTESK technology, http://unitesk.ispras.ru
7. OLVER project, http://linuxtesting.org
8. Microsoft Interoperability Initiative, http://www.microsoft.com/openspecifications
9. Pakulin, N. V., Khoroshilov, A.
V.: Development of Formal Models and Conformance Testing for Systems with Asynchronous Interfaces and Telecommunications Protocols. Programming and Computer Software, 33(6), 316–335 (2007)
10. Khoroshilov, A. V.: Specification and Testing of Components with Asynchronous Interfaces. Candidate's thesis, Moscow (2006)
11. Kuliamin, V. V., Petrenko, A. K., Pakulin, N. V.: Extended Design-by-Contract Approach to Specification and Conformance Testing of Distributed Software. In: Proceedings of WMSCI'2005, Orlando, USA, July 10-13, 2005. Model Based Development and Testing, v. VII, pp. 65–70 (2005)
12. Bourdonov, I. B., Demakov, A. V., Jarov, A. A., Kossatchev, A. S., Kuliamin, V. V., Petrenko, A. K., Zelenov, S. V.: Java Specification Extension for Automated Test Development. In: Proceedings of PSI'01. LNCS 2244, pp. 301–307. Springer-Verlag (2001)
13. The Linux Foundation Consortium. LSB Certification Test Suite, http://ispras.linuxbase.org/index.php/LSB_Certification_System
14. Maksimov, A. V.: Requirements-Based Conformance Testing of ARINC 653 Real-Time Operating Systems. In: Proceedings of the Data Systems in Aerospace (DASIA 2010) Conference, 2010. ESA SP-682, ISBN 978-92-9221-246-9 (2010)
15. Ivannikov, V. P., Kamkin, A. S., Kossatchev, A. S., Kuliamin, V. V., Petrenko, A. K.: The Use of Contract Specifications for Representing Requirements and for Functional Testing of Hardware Models. Programming and Computer Software, 33(5), 272–282 (2007)
16. Chupilko, M. M.: Developing Test Systems of Multi-Modules Hardware Designs. ISSN 0361-7688, Programming and Computer Software, 38(1), 34–42, Pleiades Publishing, Ltd. (2012)
17. Zelenov, S. V., Zelenova, S. A.: Model-Based Testing of Optimizing Compilers. In: Proc. of the 19th IFIP TC6/WG6.1 International Conference on Testing of Software and Communicating Systems – 7th International Workshop on Formal Approaches to Testing of Software (TestCom/FATES 2007).
LNCS 4581, pp. 365–377. Springer-Verlag, Berlin Heidelberg (2007)
18. Zelenov, S. V., Silakov, D. V., Petrenko, A. K., Conrad, M., Fey, I.: Automatic Test Generation for Model-Based Code Generators. In: IEEE ISoLA 2006 Second Intern. Symposium on Leveraging Applications of Formal Methods, Verification and Validation. Paphos, Cyprus, pp. 68–75 (2006)
19. Kamkin, A. S.: A Method of Automation of Simulation Testing of Microprocessors with Conveyer Architecture Basing on Formal Specifications. Candidate's thesis, Moscow (2009)
20. Kornykhin, E. V.: A Method of Automation of Testing Program Generation for MMU Verification. Candidate's thesis, Moscow (2010)
21. Kamkin, A. S., Tatarnikov, A.: MicroTESK: An ADL-Based Reconfigurable Test Program Generator for Microprocessors. In: Proceedings of the 6th Spring/Summer Young Researchers’ Colloquium on Software Engineering (SYRCoSE 2012), May 30-31, 2012, Perm, Russia (2012)
22. MBT Survey, http://www.robertvbinder.com/docs/arts/MBT-User-Survey.pdf
23. Grieskamp, W.: Microsoft’s Protocol Documentation Program: A Success Story for Model-Based Testing. In: Testing – Practice and Research Techniques. LNCS 6303, p. 7 (2010)
24. Adams, C.: Safety-Critical Software for Mission-Critical Applications to Get Boost with Release of DO-178C. Military & Aerospace Electronics, 10 (2010)
25. Common Criteria, http://www.commoncriteriaportal.org
26. SpecExplorer, http://research.microsoft.com/en-us/projects/specexplorer
27. The Java Modelling Language (JML), http://www.eecs.ucf.edu/~leavens/JML/index.shtml
28. Pakulin, N. V.: Integrated Modular Avionics: New Challenges for MBT. In: ETSI TTCN-3 User Conference and Model Based Testing Workshop, Bangalore, India, 11-14 June 2012 (2012)
29. Code Contracts, http://research.microsoft.com/en-us/projects/contracts
30. C++TESK, http://forge.ispras.ru/projects/cpptesk-toolkit
31. Kuliamin, V. V.: Component Architecture of Model-Based Testing Environment.
Programming and Computer Software, 36(5), 289–305 (2010)
32. Kuliamin, V. V.: Multi-paradigm Models as Source for Automated Test Construction. In: Proceedings of the 1st Workshop on Model Based Testing (MBT'2004, in ETAPS'2004), Barcelona, Spain, March 27-28, 2004. Electronic Notes in Theoretical Computer Science, 111:137–160, Elsevier (2005)
33. ReQuality tool, http://requality.org/en/doc.en.html
34. Khoroshilov, A. V., Albitskiy, D., Koverninskiy, I. V., Olshanskiy, M. Yu., Petrenko, A. K., Ugnenko, A. A.: AADL-Based Toolset for IMA System Design and Integration. SAE Int. J. Aerosp. 5(2) (2012)
35. Systems Modeling Language (SysML), http://www.sysml.org

Protoautomata as Models of Systems with Data Accumulation

Irina Mikhailova¹ and Boris Novikov² and Grygoriy Zholtkevych²

¹ Luhansk Taras Shevchenko National University, Institute of Information Technology, 2, Oboronna Str., 91011, Luhansk, Ukraine
mia irina@rambler.ru
² V.N. Karazin Kharkiv National University, School of Mathematics and Mechanics, 4, Svobody Sqr., 61022, Kharkiv, Ukraine
{bvnovikov46,g.zholtkevych}@gmail.com

Abstract. The paper discusses formal models of software systems and their components based on the notion of an abstract machine. The need to model systems with data accumulation raises the problem of studying generalizations of the notion of an abstract automaton. Two such generalizations, namely preautomata and protoautomata, are considered. It is shown that the passage from automata via preautomata to protoautomata can be realized naturally using the language and methods of category theory.

Keywords. system modelling, abstract automaton, category of automata, preautomata, category of preautomata, globalization, protoautomaton, category of protoautomaton, reflector, free protoautomaton

Key terms.
MathematicalModel, SpecificationProcess, VerificationProcess

1 Introduction

The theory of abstract state machines, or abstract automata, is widely applied in different areas of Computer Science. While the early applications of automata theory were connected with the theory of compiler design (see, for example, [1]), its more recent applications focus on the problems of specifying and verifying the behaviour of software components [3, 11]. This change in the subject of the theory was noted by R. Milner in [11]: “In the classical theory, rather little attention is paid to the way in which two automata may interact, in the sense that an action by one entails a complementary action by another. This kind of interaction requires us to look at automata in new light; in particular, this interdependency of automata via their actions seems to demand a new approach to behavioural equivalence”.

However, the practice of modelling system behaviour with the automata approach has shown that the approach is inadequate when data must be accumulated before a correct response can be given. Using the concept of a partial action of a semigroup on a set [7, 10], we have defined the notion of a preautomaton and studied its properties [4, 12]. Further study has shown that preautomata can be used for modelling some aspects of the behaviour of systems with a delayed response [13, 14].

In this paper, we consider a more general class of automaton-like systems: the class of protoautomata. All necessary information from the theory of semigroups, automata theory, and category theory can be found in the monographs [5, 6, 8, 9].

We use the notation $\varphi : A \dashrightarrow B$ for a partial mapping of $A$ to $B$ (unlike a complete mapping $A \to B$). If $\varphi(a)$ is not defined for $a \in A$, we write $\varphi(a) = \emptyset$. The free monoid on the alphabet $\Sigma$ is denoted by $\Sigma^*$, and its unit by $\varepsilon$.
All actions and preactions used in the paper are right ones, as is common in automata theory.

2 Preliminaries

We will use the definition of an automaton in the following form (the finiteness condition is dropped):

Definition 1. Given a set X and the free monoid Σ* over the alphabet Σ, an automaton is a mapping X × Σ* → X : (x, a) ↦ xa such that for all x ∈ X and u, v ∈ Σ*

xε = x, (1)
x(uv) = (xu)v. (2)

A more general concept is the following.

Definition 2 (see [4]). A preautomaton is a partial mapping X × Σ* ⇢ X : (x, a) ↦ xa such that
a) condition (1) is fulfilled;
b) if xu ≠ ∅ and (xu)v ≠ ∅, then x(uv) ≠ ∅ and equality (2) is fulfilled;
c) if xu ≠ ∅ and x(uv) ≠ ∅, then (xu)v ≠ ∅ and equality (2) is fulfilled.

The preautomata over the monoid Σ* form a category PAut(Σ); its morphisms are the maps ϕ : X → Y such that

(∀ a ∈ Σ*)(∀ x ∈ X)(xa ≠ ∅ ⟹ ∅ ≠ ϕ(x)a = ϕ(xa)). (3)

The category Aut(Σ) of automata over Σ is a full subcategory of PAut(Σ).

Preautomata appear in the following situation. Let Y be an automaton and X an arbitrary nonempty subset of Y. Then the restriction of the action to X is a preautomaton. Conversely, let X × Σ* ⇢ X be a preautomaton. The construction inverse to restriction is called globalization. More precisely:

Definition 3. A globalization of a preautomaton X is an automaton Z with an injection ι : X → Z such that for all a ∈ Σ*, x ∈ X

xa ≠ ∅ ⟹ ∅ ≠ ι(x)a = ι(xa),
ι(x)a ∈ ι(X) ⟹ xa ≠ ∅ & ι(xa) = ι(x)a.

Obviously, ι is a morphism of PAut(Σ). We also call it a globalization.

Definition 4. A globalization ι : X → Z is called universal if for any globalization ι′ : X → Z′ there is a unique morphism κ : Z → Z′ such that ι′ = κι.

The following construction gives a universal globalization (obviously unique up to isomorphism) for any preautomaton X × Σ* ⇢ X. Define a relation ⊢ on the set X × Σ*:

(x, ab) ⊢ (xa, b) ⟺ xa ≠ ∅.
(4)

Let ≃ be the equivalence relation generated by ⊢, and let X^U = (X × Σ*)/≃. The equivalence class of ≃ containing a pair (x, a) is denoted by [x, a]. For [x, a] ∈ X^U and b ∈ Σ*, we set [x, a]b = [x, ab]. Thus a total action on X^U is defined.

Theorem 1. The automaton X^U with the morphism ι^U : X → X^U : x ↦ [x, ε] is the universal globalization of the preautomaton X.

Proof. See [4, Theorem 2]. □

3 Protoautomata

The main object of this paper is a generalization of the notion of a preautomaton:

Definition 5. A protoautomaton is a partial mapping X × Σ* ⇢ X : (x, a) ↦ xa such that
a) condition (1) is fulfilled;
b) if xu ≠ ∅ and (xu)v ≠ ∅, then x(uv) ≠ ∅ and equality (2) is fulfilled.

We will also denote the protoautomaton from this definition simply by X when this causes no confusion.

Example 1. Let S be a free subsemigroup of Σ* and α : X × S → X an automaton. Define a partial mapping X × Σ* ⇢ X as an extension of α by putting xu = ∅ for u ∈ Σ* \ S; so we get a protoautomaton over Σ*. Note that in general it is not a preautomaton. In addition, this example shows that an automaton over an infinite alphabet can be represented as a protoautomaton over a two-letter alphabet.

Example 2. Let X = {x, y} be a two-element set and L a subset of Σ*. Define a protoautomaton X × Σ* ⇢ X by putting, for a ≠ ε,

xa = y if a ∈ L, xa = ∅ if a ∉ L, and ya = ∅.

This example shows that protoautomata recognize all languages.

We denote the category of protoautomata with morphisms defined by condition (3) by PtAut(Σ); clearly, PAut(Σ) is its subcategory.

It follows from the theory of partial actions of semigroups [6, Theorem 5.7] that a protoautomaton which is not a preautomaton has no globalization. More precisely, for a protoautomaton X we can construct an automaton X^U as in Sec. 2, but in this case the morphism ι^U is not injective in general.
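The difference between conditions b) and c) of Definitions 2 and 5 can be checked mechanically on small cases. The following is a minimal sketch, not part of the original paper: it assumes a partial action is given as a Python function returning `None` in place of ∅, and it tests the conditions only on words up to a fixed length, since Σ* itself is infinite. The names `words`, `check` and `example2` are hypothetical.

```python
from itertools import product
from typing import Callable, Hashable, Optional

State = Hashable
# A partial action: act(x, w) is the state xw, or None when xw = ∅ (assumed encoding).
Action = Callable[[State, str], Optional[State]]


def words(alphabet: str, max_len: int) -> list[str]:
    """All words over `alphabet` of length <= max_len (a finite slice of Σ*)."""
    return ["".join(t) for n in range(max_len + 1) for t in product(alphabet, repeat=n)]


def check(act: Action, states, alphabet: str, max_len: int) -> dict[str, bool]:
    """Test conditions a), b), c) for all u, v of length <= max_len."""
    ws = words(alphabet, max_len)
    a_ok = all(act(x, "") == x for x in states)          # condition (1): xε = x
    b_ok = c_ok = True
    for x in states:
        for u in ws:
            xu = act(x, u)
            if xu is None:                               # both b) and c) assume xu ≠ ∅
                continue
            for v in ws:
                xu_v, x_uv = act(xu, v), act(x, u + v)
                if xu_v is not None and x_uv != xu_v:    # b) violated
                    b_ok = False
                if x_uv is not None and xu_v != x_uv:    # c) violated
                    c_ok = False
    return {"a": a_ok, "b": b_ok, "c": c_ok}


def example2(language: set[str]) -> Action:
    """Example 2: X = {x, y}, xa = y for a in L, everything else undefined (a ≠ ε)."""
    def act(state: State, word: str) -> Optional[State]:
        if word == "":
            return state
        if state == "x" and word in language:
            return "y"
        return None
    return act


print(check(example2({"a", "ab"}), ["x", "y"], "ab", 3))  # {'a': True, 'b': True, 'c': False}
print(check(example2({"ab"}), ["x", "y"], "ab", 3))       # {'a': True, 'b': True, 'c': True}
```

For L = {a, ab} the check reports that c) fails: xa = y and x(ab) = y are defined, but (xa)b = yb is not. Within the tested word lengths this instance therefore behaves as a protoautomaton that is not a preautomaton. A bounded check of this kind can refute a condition but cannot prove it for all of Σ*.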
In this situation, the concept of a reflector is useful. We recall its definition [9]:

Definition 6. A subcategory D of a category C is called reflective if with each object C ∈ C there are associated an object R_D(C) ∈ D (called the D-reflector of the object C) and a morphism ρ_D(C) : C → R_D(C) (the reflection morphism) such that for each D ∈ D, every diagram formed by ρ_D(C) : C → R_D(C) and a morphism C → D can be extended uniquely to a commutative diagram by some morphism from Hom_D(R_D(C), D).

It is convenient to use another description of the equivalence ≃:

Lemma 1. Define a relation ♯ on the set X × Σ*:

(x, a) ♯ (y, b) ⟺ (∃ a′, b′, p ∈ Σ*)(a = a′p & b = b′p & xa′ = yb′ ≠ ∅).

Let ≈ be the equivalence relation generated by ♯. Then ≈ coincides with ≃.

Proof. If (x, a) ♯ (y, b), then (x, a) = (x, a′p) ⊢ (xa′, p) = (yb′, p) ⊣ (y, b′p) = (y, b), whence ≈ ⊆ ≃. Conversely, if (x, a) ⊢ (y, b), then a = cb and y = xc for some c ∈ Σ*. Hence ⊢ ⊆ ♯. Consequently, ≈ ⊇ ≃. □

Remark 1. Obviously, ⊢ is reflexive and transitive, while ♯ is reflexive and symmetric.

Lemma 2. Let X be a protoautomaton, Y a preautomaton (both over Σ*), α : X → Y a morphism, x, y ∈ X, a ∈ Σ*. Then [x, ε] = [y, a] implies α(x) = α(y)a ≠ ∅.

Proof. It follows from the condition that

(x, ε) ♯ (z_1, b_1) ♯ … ♯ (z_n, b_n) ♯ (y, a)

for some z_1, …, z_n ∈ X, b_1, …, b_n ∈ Σ*.

We apply induction on n. Since x = z_1 b_1 ≠ ∅, condition (3) gives α(x) = α(z_1)b_1. Suppose that α(x) = α(z_n)b_n ≠ ∅. By the definition of the relation ♯, b_n = cp, a = dp, z_n c = yd ≠ ∅ for some c, d, p ∈ Σ*. Then α(z_n)c ≠ ∅ and, by the induction hypothesis, α(z_n)(cp) ≠ ∅. Since Y is a preautomaton, condition c) of Definition 2 applies and

α(x) = α(z_n)(cp) = α(z_n c)p = α(yd)p = (α(y)d)p = α(y)a.

The proof is complete. □

Similarly (and even more easily) one can prove

Lemma 3. Let X be a protoautomaton, Y an automaton (both over Σ*), α : X → Y a morphism, x, y ∈ X, a, b ∈ Σ*. Then [x, a] = [y, b] implies α(x)a = α(y)b.
Proof is omitted. □

We set [X, ε] = {[x, ε] ∈ X^U | x ∈ X}. Obviously, [X, ε], being a subset of X^U, is a preautomaton, and in addition ι^U(X) = [X, ε].

Theorem 2. Let X be a protoautomaton over Σ*. Then
1. [X, ε] is a reflector for X in the category PAut(Σ),
2. X^U is a reflector for X in Aut(Σ),
3. X^U is a reflector for [X, ε] in Aut(Σ).

Proof. 1) Let Y be some preautomaton and α : X → Y a morphism of protoautomata. The required morphism β : [X, ε] → Y is uniquely determined by the equality α = βι^U. Indeed, for x ∈ X we have α(x) = βι^U(x) = β([x, ε]). It follows from Lemma 2 that β is well defined.
2) Similarly, using Lemma 3.
3) Follows from 1), 2), and the following well-known fact [9]: if A ⊂ B ⊂ C are categories, A is reflective in B, and B is reflective in C, then A is reflective in C. Moreover, the reflection morphism from C to A is the composition of the corresponding reflection morphisms from C to B and from B to A. □

Corollary 1. Aut(Σ) is a reflective subcategory of PAut(Σ). Moreover, the universal globalization of a preautomaton is its reflector.

Example 3. Let X = {x, y, z, t} and let p, u, v ∈ Σ* \ {ε}. We set zu = zv = t, z(up) = x, z(vp) = y, and sw = ∅ for all other pairs s ∈ X, w ∈ Σ* \ {ε}. In this way X becomes a protoautomaton. Since (x, ε) ♯ (z, up) ♯ (z, vp) ♯ (y, ε), we have [x, ε] = [y, ε], and the reflection morphism of X is non-injective.

A large class of protoautomata is contained in the following example.

Example 4. Consider a preautomaton X × Σ* ⇢ X as a directed weighted multigraph with states as vertices and with edges of the form (x, u, y), where x, y ∈ X, u ∈ Σ*, and y = xu. Let U be an arbitrary subset of the edges of X. Build the transitive closure U^t of the set U, extending it step by step by the rule: if the edges (x, u, y) and (y, v, z) are present at some stage of the expansion, then at the next step we include the edge (x, uv, z). Then U^t is a protoautomaton.
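The step-by-step closure of Example 4 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' construction verbatim: edges are triples `(x, u, y)` with words as Python strings, the closure is cut off at a maximal word length so that it terminates even when U contains cycles, and the ε-loops required by condition (1) are added explicitly when the edge set is turned into a partial action. The function names are hypothetical.

```python
from typing import Hashable, Iterable, Optional

Edge = tuple[Hashable, str, Hashable]  # (x, u, y) meaning y = xu


def transitive_closure(edges: Iterable[Edge], max_len: int) -> set[Edge]:
    """Close `edges` under (x,u,y),(y,v,z) -> (x,uv,z), keeping words of length <= max_len."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        snapshot = list(closure)
        for (x, u, y) in snapshot:
            for (y2, v, z) in snapshot:
                if y == y2 and len(u) + len(v) <= max_len:
                    edge = (x, u + v, z)
                    if edge not in closure:
                        closure.add(edge)
                        changed = True
    return closure


def as_action(closure: set[Edge], states: Iterable[Hashable]):
    """Turn an edge set into a partial action; None stands for ∅ (assumed encoding)."""
    table = {(x, u): y for (x, u, y) in closure}
    table.update({(s, ""): s for s in states})  # xε = x, condition (1)

    def act(x: Hashable, w: str) -> Optional[Hashable]:
        return table.get((x, w))

    return act


# A subset U of the edges of the counter preautomaton on {0, 1, 2, 3}, where x·a^n = x + n.
U = {(0, "a", 1), (0, "aa", 2), (2, "a", 3)}
Ut = transitive_closure(U, max_len=4)
act = as_action(Ut, range(4))
print(sorted(Ut))                      # the closure adds the edge (0, 'aaa', 3)
print(act(0, "aaa"), act(1, "a"))      # 3 None
```

In this instance 0·a = 1 and 0·aa = 2 are defined while (0·a)·a = 1·a is not, so condition c) of Definition 2 fails although condition b) holds; the result is a protoautomaton that is not a preautomaton.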
Example 3 shows that there exists a protoautomaton that cannot be embedded into any preautomaton (and thus into any automaton).

4 Free Protoautomata

It is well known [2] that free automata play a significant role in the theory of automata (for example, in the problem of constructing a minimal realization). Therefore, it is natural to consider the question of the existence of free objects in the category of protoautomata. Recall the necessary definitions:

Definition 7. Let C and D be categories, F : D → C a functor, and C an object of C. An object D of D is called free on C with respect to the functor F if there is a morphism α : C → F D such that for any object D′ ∈ D and any morphism β : C → F D′ there exists a unique morphism γ : D → D′ such that

F(γ)α = β. (5)

We consider the category Rel(Σ) whose objects are pairs (X, ρ), where X is a set (X ∈ Set) and ρ ⊂ X × Σ* is a binary relation such that X × {ε} ⊂ ρ. A morphism φ : (X, ρ) → (Y, σ) of Rel(Σ) is a map φ : X → Y such that (φx, u) ∈ σ for all (x, u) ∈ ρ.

Next, let F be the forgetful functor F : PtAut(Σ) → Rel(Σ) mapping each protoautomaton X × Σ* ⇢ X to the pair (X, ρ) with ρ = {(x, u) | xu ≠ ∅}.

Theorem 3. For each object (X, ρ) ∈ Rel(Σ) there is a protoautomaton that is free on it with respect to the forgetful functor F.

Proof. For (X, ρ) ∈ Rel(Σ) construct a protoautomaton M = (ρ × Σ* ⇢ ρ), defining the action by the rule

(x, u)v = (x, uv) if (x, uv) ∈ ρ, and (x, u)v = ∅ if (x, uv) ∉ ρ.

Then F M = (ρ, ρ̂), where ρ̂ = {((x, u), v) | (x, u)v = (x, uv)} ⊂ ρ × Σ*. Define the morphism α : (X, ρ) → (ρ, ρ̂) by the formula α(x) = (x, ε).

Let us show that M is a free protoautomaton on (X, ρ). Let N = (Y × Σ* ⇢ Y) be some protoautomaton and F N = (Y, σ). For the required morphism γ : M → N of (5) we have:

γ(x, ε) = F(γ)(x, ε) = F(γ)α(x) = β(x).

Then for any u ∈ Σ* with (x, u) ∈ ρ one obtains γ(x, u) = γ(x, ε)u = β(x)u, i.e.
γ is uniquely determined. □

5 Conclusion

It seems that the class of protoautomata introduced in this paper gives the most abstract models for systems with discrete behaviour. This class of abstract machines includes not only machines that react to the received data immediately, as automata do, but also machines whose reactions depend on accumulated information.

The machines of this class that have greedy behaviour form a subclass whose instances are called preautomata. Machines of this subclass are used for modelling the behaviour of systems for complex event processing, as was shown earlier [13, 14]. This class of machines, in contrast to the class of automata, is closed under structural decomposition and hence is more suitable for specifying complex systems. However, condition c) in the definition of a preautomaton (see Definition 2) seems unnatural. This condition also impedes the definition of a nondeterministic preautomaton.

Therefore, by eliminating condition c) we make it possible to study nondeterministic models. In our opinion, the models derived in this way (protoautomata) are interesting objects that can be used for specification and verification of complex systems.

References

1. Aho, A.V., Ullman, J.D.: Theory of Parsing, Translation, and Compiling. Prentice-Hall, New York (1972)
2. Arbib, M.A., Manes, E.G.: Machines in a category: an expository introduction. SIAM Rev. 16, 163–192 (1974)
3. Börger, E., Stärk, R.: Abstract State Machines: A Method for High-Level System Design and Analysis. Springer-Verlag, Berlin Heidelberg (2003)
4. Dokuchaev, M., Novikov, B., Zholtkevych, G.: Partial actions and automata. Alg. and Discr. Math. Vol. 11, 2, 51–63 (2011)
5. Eilenberg, S.: Automata, Languages, and Machines, vol. B. Academic Press, New York (1976)
6. Holcombe, W.M.L.: Algebraic Automata Theory. Cambridge Univ. Press (1982)
7. Hollings, C.: Partial actions of monoids. Semigroup Forum. 75, 293–316 (2007)
8.
Lallement, G.: Semigroups and combinatorial applications. John Wiley, New York (1979)
9. MacLane, S.: Categories for the Working Mathematician. Springer, Berlin (1971)
10. Megrelishvili, M., Schröder, L.: Globalization of confluent partial actions on topological and metric spaces. Topol. and Appl. 145, 119–145 (2004)
11. Milner, R.: Communicating and Mobile Systems: The Pi Calculus. Cambridge University Press, Cambridge (1999)
12. Novikov, B., Perepelytsya, I., Zholtkevych, G.: Pre-automata as mathematical models of event flows recognisers. In: V. Ermolayev et al. (eds.) Proc. 7-th Int. Conf. ICTERI 2011, 41–50 (2011)
13. Perepelytsya, I., Zholtkevych, G.: On some class of mathematical models for static analysis of critical-mission asynchronous systems. Syst. ozbr. ta viysk. tehn. Vol. 27, 3, 60–63 (2011)
14. Perepelytsya, I., Zholtkevych, G.: Hierarchic Decomposition of Pre-machines as Models of Software System Components. Syst. upravl. navig. i zv’iazku. Vol. 20, 4, 233–238 (2011)

Models of Class Specification Intersection of Object-Oriented Programming

Dmitriy Buy¹ and Serhiy Kompan¹

¹ Taras Shevchenko National University of Kyiv, Faculty of Cybernetics, 03680 Academician Glushkov Avenue 4d, Kyiv, Ukraine
buy@unicyb.kiev.ua, skompan@mail.ru

Abstract. This paper describes the application of a heterogeneous algebraic system, instead of an object algebra, to the construction of a formal model of an object database. A complete formalization of the operation of intersection of class specifications is given.

Keywords. object-oriented programming, object database, object algebra, class specification

Key terms. MathematicalModel

1 Introduction

In applications of information technologies there is the problem of constructing so-called dependable and stable systems and infrastructures, that is, systems which behave stably under all working circumstances, and especially under critical ones.
The similarity of risks and the increasing urgency of reducing them to an acceptable level for critical applications led to the appearance of the special term “safeware”, formed by analogy with the terms “hardware”, “software”, “firmware” etc., which combines two components: safe (secure) and ware (a product, an item). This term was suggested and patented by the leading expert of NASA on the questions of infrastructure security, Professor N. Leveson, who registered the appearance of a modern field of knowledge called safeware engineering [1]. We mention a fundamental statement that is both obvious and elusive in its nature: it is impossible to talk about the stability of a working system, especially of an infrastructure, if no formal model of its operation has been constructed and verified. Moreover, for the construction of a formal model that is reasonably complex and not “toylike”, there should exist a mathematical apparatus with whose help software developers could create a formal model and verify it against the original requirements of the customer.

For full confidence that an information system will work stably (will be dependable and stable), one should single out the system components, describe them formally, and verify them. Indeed, nowadays there is no alternative to a “divide and rule” approach for coping with this difficulty. In fact, databases are among the most important components of any complex system (infrastructure). That is why there should exist an appropriate formal model. For relational databases such a formal model has already been constructed and explored considerably. This issue is exhaustively covered in the literature, beginning with the pioneering works by E.F. Codd (see, e.g., [2], the first textbooks [3, 4], and modern textbooks [5, 6]).
We mention only a collection of works by researchers at Taras Shevchenko National University of Kyiv on the natural generalization of classical results of the relational approach to databases [7-14].

Nowadays, there are many formal models of object-oriented databases (OODB) [15-20]. Each of these models elaborates OODB to a certain extent by applying a certain mathematical apparatus. The analysis of research papers dedicated to OODB has shown that the authors overlook the question arising from the need to construct a new class specification from two given specifications: for example, the construction of a superclass from two specified classes (the operation of intersection of class specifications) or the construction of a subclass from two superclasses (the operation of union of class specifications). The intersection of class specifications is important, in our opinion, because it provides the opportunity to construct the core of a new program from two programs, which allows integrating these two programs into a Framework version.

This paper is dedicated to the exploration of the operation of intersection of class specifications and to refining the conditions under which the intersection of classes is possible.

2 Practical results

The authors of this paper have conducted a number of investigations in the field under research: for example, in the article [21] it was suggested to consider an object algebraic system as a model. Formally, this system consists of a set of object classes, a set of class specifications, a set of operations over objects (indexed obj), a set of operations over class specifications (indexed spec), and a relation that is a partial order formalizing inheritance. The main objective of this article is the specification of the intersection operation and the difference of class specifications.

Let us start with the intersection operation.
Let us formalize the notion of a class: by a class we mean a pair K = ⟨s, σ⟩, where s is a functional binary relation which associates an attribute with its value (from a universal domain D), and σ is a functional binary relation which associates a method with its signature. Therefore [21], the relations s and σ determine a class specification.

The intersection operation (of class specifications) is a binary operation on the set of class specifications, defined componentwise:

⟨s1, σ1⟩ ∩ ⟨s2, σ2⟩ = ⟨s1 ∩ s2, σ1 ∩ σ2⟩,

where ∩ on the right-hand side is the standard set-theoretical intersection.

We will demonstrate some results concerning the structure of the partially ordered set (poset) ⟨F, ⊆⟩, where F is the set of all functional binary relations (on the universal domain D), and ⊆ is the ordinary set-theoretical inclusion. These results supplement the results of the paper [22]. All notions and notation not defined here are understood as in that paper.

Lemma 1. For arbitrary functional binary relations f and g the following equality holds: f ∩ g = (f ∩ g)|(dom f ∩ dom g). □

Proof. ■ Let X = dom f ∩ dom g (by definition). We use the general properties of the set-theoretical restriction operation (of a binary relation to a set), such as monotonicity and distributivity [19]. Firstly, we have the inclusion dom(f ∩ g) ⊆ dom f ∩ dom g = X. Secondly, the following chain of equalities and inclusions follows from it:

f ∩ g = (f ∩ g)|dom(f ∩ g) ⊆ (f ∩ g)|X = f|X ∩ g|X ⊆ f ∩ g.

Thus, f ∩ g = (f ∩ g)|X = (f ∩ g)|(dom f ∩ dom g). □

Next we introduce the relation of consistency: f ≈ g ⇔ f|X = g|X (by definition), where X = dom f ∩ dom g. In [7] the main property of consistency was established: f ≈ g ⇔ f ∪ g is a functional binary relation. The following corollary of the lemma gives another criterion of consistency.

Corollary (the criterion of consistency of functional binary relations). Let f, g be arbitrary functional binary relations and X = dom f ∩ dom g (by definition). Then:

f ≈ g ⇔ dom(f ∩ g) = X,
¬(f ≈ g) ⇔ dom(f ∩ g) ⊂ X. □

Proof. ■ The proof uses Lemma 1 and the inclusion dom(f ∩ g) ⊆ X.
It is important to note that the second (respectively, the first) equivalence is a formal corollary of the first (respectively, the second) one. □

As for the structure of the poset ⟨F, ⊆⟩, there are two statements.

Statement 1. The poset ⟨F, ⊆⟩ is a lower semilattice, and inf{f, g} = f ∩ g. □

The proof follows from the fact that ⟨F, ∩⟩ is a commutative idempotent semigroup and from the well-known connection between such semigroups and lower semilattices (see, e.g., [23]). More complete information about the poset ⟨F, ⊆⟩ is given by the following statement.

Statement 2 (the structure of the poset ⟨F, ⊆⟩). The following statements are true:
1. The empty function f_∅ is the smallest element (“a bottom”).
2. The largest element in the poset ⟨F, ⊆⟩ exists if and only if the universe D is a singleton.
3. The infimum exists for any nonempty subset G of F, and inf G = ⋂_{f ∈ G} f.
4. The supremum of a subset G of F exists if and only if G is bounded above, in which case sup G = ⋃_{f ∈ G} f.
5. An element f is an atom if and only if f is a singleton.
6. The poset ⟨F, ⊆⟩ is a relatively complete poset and a complete (upper) semilattice. □

Let us proceed to the substantive interpretation of the above results. The intersection operation constructs a new class which will be a base (parent) class for the argument classes. This intersection can also be empty; in this case we get a special empty class.

As the relation ≤ on specifications is componentwise (⟨s, σ⟩ ≤ ⟨s′, σ′⟩ ⇔ s ⊆ s′ ∧ σ ⊆ σ′), all properties of the relation ⊆ (Statements 1, 2) can be lifted to the relation ≤. The corresponding formulations are obvious and are therefore omitted.

3 Results and conclusions

The model of the intersection operation of class specifications has been examined. This operation has been specified as set-theoretical intersection.
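The componentwise, set-theoretical reading of the intersection operation can be illustrated with a minimal sketch. It is an assumption of this sketch, not a statement of the paper, that a functional binary relation is modelled as a Python `dict` and a class specification as a pair `(attributes, methods)` of such dicts; the example classes and all function names are hypothetical.

```python
def restrict(f: dict, xs: set) -> dict:
    """Restriction f|X of a functional relation to a set X."""
    return {k: v for k, v in f.items() if k in xs}


def intersect(f: dict, g: dict) -> dict:
    """Set-theoretical intersection of two functional relations (shared pairs only)."""
    return {k: v for k, v in f.items() if k in g and g[k] == v}


def consistent(f: dict, g: dict) -> bool:
    """f and g agree on the common part X of their domains: f|X = g|X."""
    common = f.keys() & g.keys()
    return restrict(f, common) == restrict(g, common)


def domain_criterion(f: dict, g: dict) -> bool:
    """Alternative test: dom(f ∩ g) coincides with dom f ∩ dom g."""
    return set(intersect(f, g)) == (f.keys() & g.keys())


def intersect_specs(spec1: tuple, spec2: tuple) -> tuple:
    """Componentwise intersection of two class specifications (attributes, methods)."""
    (s1, m1), (s2, m2) = spec1, spec2
    return intersect(s1, s2), intersect(m1, m2)


# Two hypothetical class specifications.
employee = ({"name": "str", "age": "int", "salary": "float"},
            {"show": "() -> None", "pay": "(float) -> None"})
student = ({"name": "str", "age": "int", "grade": "int"},
           {"show": "() -> None", "enroll": "(str) -> None"})

print(intersect_specs(employee, student))
# ({'name': 'str', 'age': 'int'}, {'show': '() -> None'})
print(consistent(employee[0], student[0]), domain_criterion(employee[0], student[0]))
# True True
print(consistent({"id": "int"}, {"id": "str"}), domain_criterion({"id": "int"}, {"id": "str"}))
# False False
```

The result of `intersect_specs` plays the role of a common parent specification: both argument specifications extend it. On these samples the two consistency tests agree, which matches the idea that consistency can be decided by comparing the domain of the intersection with the common part of the domains.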
The specification f ∩ g has been interpreted as the largest common part of f and g, that is, the specification from which the argument specifications can be obtained by inheritance (in other words, the resulting specification is the specification of a parent class). The conditions for nonempty (equivalent, empty) intersection have been examined.

As for formal results, natural criteria of function consistency have been presented (Corollary), which supplement the already known criteria; the structure of the partially ordered set of partial functions has been specified (Statements 1, 2).

References

1. Kharchenko, V. S.: Safety of Critical Infrastructures: Mathematical and Engineering Methods of Analysis and Ensuring. N.E. Zhukovsky National Aerospace University (2011) (in Russian)
2. Codd, E. F.: A Relational Model of Data for Large Shared Data Banks. Comm. ACM, 13 (1970)
3. Maier, D.: The Theory of Relational Databases. Computer Science Press (1983)
4. Ullman, J., Garcia-Molina, H., Widom, J.: Database Systems: The Complete Book. Prentice Hall Inc., Stanford (2002)
5. Kroenke, D. M.: Database Processing: Fundamentals, Design, and Implementation. Prentice Hall (2011)
6. Date, C. J.: An Introduction to Database Systems. Addison-Wesley (2000)
7. Buy, D. B., Kahuta, N. D.: Full Image, Restriction, Projection, Relationship Compatibility. Theoretical and Applied Aspects of Program Systems Development: International Conference, December 8-10, pp. 244-260 (2009) (in Ukrainian)
8. Buy, D. B., Bogatiryova, J. A.: The Theory of Multisets: Bibliography, Use the Table in Databases. Radio Electronic and Computer Systems, 7(48), 56–62 (2010) (in Ukrainian)
9. Buy, D. B., Polyakov, S. A.: Compositional Semantics of Recursive Queries in SQL-like Languages. Bulletin of Kyiv University. Series. Phys.-Math. Science, 1, 45–56 (2010) (in Ukrainian)
10. Buy, D. B., Glushko, I.
M.: Generalized Table Algebra, Generalized Tuple Calculus, Generalized Domain Calculus and Their Equivalence. Bulletin of Kyiv University. Series. Phys.-Math. Science, 1, 86–95 (2011) (in Ukrainian)
11. Buy, D. B., Puzikova, A. V.: Completeness of Armstrong Axioms. Bulletin of Kyiv University. Series. Phys.-Math. Science, 3, 103–108 (2011) (in Ukrainian)
12. Redko, V. N., Brona, J. Y., Buy, D. B., Polyakov, S. A.: Relational Databases: Tabular Algebra and SQL-like Language. AcademPeriodika (2001) (in Ukrainian)
13. Buy, D., Silveystruk, L.: Formalization of Structural Constraints of Relationships in «Entity-Relationship» Model. In: Electronic Computers and Informatics 2006: International Scientific Conference, September 20-22, pp. 96-101, Kosice, Slovakia (2006)
14. Buy, D., Glushko, I.: Equivalence of Table Algebras of Finite (Infinite) Tables and Corresponding Relational Calculi. In: Proceedings of the Eleventh International Conference on Informatics INFORMATICS’2011, November 16-18, pp. 56-60. Rožňava, Slovakia (2011)
15. Piskunov, A. G.: The Formalization of the Object-Oriented Programming Paradigm, http://www.realcoding.net/dn/docs/machine.pdf (in Russian)
16. Piskunov, A. G.: The Formalization of the OOP: Types, Sets, Classes, http://agp1.hx0.ru/articles/typeSetsClasses.pdf (in Russian)
17. Chaplanova, E. B.: Operating Specification of Object-Relational Data Model. Radioelektronika, Informatika, Upravlinnya, 12, 75–79 (2011) (in Russian)
18. Richta, K., Toth, D.: Formal Models of Object-Oriented Databases. In: Objekty 2008. Žilina: Žilinská univerzita v Žiline, Fakulta Riadenia a Informatiky, pp. 204-217, http://www.ksi.mff.cuni.cz/~richta/publications/richta-toth-Objekty2008.pdf (2008)
19. Sarkar, M., Reiss, S.: A Data Model and a Query Language for Object-Oriented Database.
Department of Computer Science, Brown University, Providence, Rhode Island, CS-92-57, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.4531&rep=rep1&type=pdf (1992)
20. Shaw, G. M., Zdonik, S. B.: A Query Algebra for Object-Oriented Databases. Department of Computer Science, Brown University, Providence, Rhode Island, CS-89-19, http://trac.common-lisp.net/elephant/raw-attachment/wiki/RelationalAlgebra/shaw89 query.2.pdf (1989)
21. Buy, D. B., Kompan, S. V.: Union and Intersection Operations of Classes Specifications in Heterogeneous Algebraic System for Object-Oriented Programming. In: Proc. SWorld. Int. Sci-Pract. Conf. Modern Problems and Solutions in Science, Transportation, Manufacturing and Education. KUPRIENKO, Odessa, vol. 4, pp. 45–49 (2012) (in Russian)
22. Buy, D. B., Kahuta, N. D.: Properties Related Confinality and Order a Set of Partial Functions. Bulletin of Kyiv University. Series. Phys.-Math. Science, 2, 125–135 (2006) (in Ukrainian)
23. Skornyakov, L. A.: Elements of the Theory of Structures. Nauka, Moscow (1982) (in Russian)

Author Index

A
Alferov, Eugene, 108
Alobaidi, Mizal, 18
Arkatov, Denis B., 178
Aronov, Andrey, 252

B
Baiev, Oleksandr, 118
Baklanova, Nadezhda, 550
Batyiv, Andriy, 18
Becker, Karsten, 424
Beletsky, Alexsander, 352
Beletsky, Anatoly, 311, 352
Beletsky, Evgeny, 311
Bilousova, Lyudmyla, 209
Blinov, Igor Ol., 565
Bodnenko, Dmitry, 281
Bonda, Darya, 360
Buy, Dmitriy, 590

C
Chaabani, Mohamed, 521
Cochez, Michael, 221

D
Davidovsky, Maxim, 99, 295
Derevianko, Andrii, 30
Didenko, Ievgen, 118
Doroshenko, Anatoliy, 38
Dzyubenko, Artem, 252

E
Echahed, Rachid, 521
Ermolayev, Vadim, II, 64, 99, 108, 295

G
Glazunova, Olena G., 411
Glukhovtsova, Kateryna, 48
Guba, Anton, 490

I
Isomöttönen, Ville, 221
Itkonen, Jonne, 221
Iurtyn, Ivan, 187
Ivanov, Ievgen, 448

K
Kandyba, Roman, 352
Keberle, Natalya G., 79
Kharchenko, Vyacheslav, 146
Klionov, Dmitriy M., 464
Kobets, Vitaliy, II, 310, 329
Kolgatin, Oleksandr, 209
Kolgatina, Larisa, 209
Kompan, Serhiy, 590
Kotkova, Vera, 236
Kravtsov, Hennadiy, II, 236, 410
Kropotov, Aleksandr, 30
Kryukov, Sergey, 310
Kryvolap, Andrii, 533
Kukharenko, Vladimir, 273, 410
Kuliamin, Victor, 573
Kushnir, Nataliya, 195
Kuzminska, Olena, 264

L
Lavrischeva, Ekaterina, 252
Lazareva, Elena, 339
Lazurik, Valentine, 118
Letichevsky, Alexander, 4

M
Maksimov, Andrey, 573
Mallet, Frédéric, 130, 289, 475
Mantula, Elena, 91
Manzhula, Anna, 195
Mashtalir, Vladimir, 91
Matthes, Ralph, 506
Matzke, Wolf-Ekkehard, 2
Mayr, Heinrich C., II
Mazol, Sergey, 360, 366
Mesropyan, Karine, 385
Mikhailova, Irina, 582
Moiseeva, Oksana, 366
Möller, Dietmar P.F., 424
Morze, Natalia, 264
Morze, Natalia V., 411

N
Nikitchenko, Mykola, II, 447, 533
Novikov, Boris, 582

O
Odarushchenko, Oleg, 146
Odarushchenko, Valentina, 146

P
Payentko, Tanya, 310
Peschanenko, Vladimir, II, 447, 490
Petrenko, Alexander K., 573
Petukhova, Lyubov, 236
Popov, Peter, 146
Pratt, Gary L., 3
Protsenko, Galina, 264

R
Ralo, Aleksandr, 30
Richter, Harald, 424
Romenska, Yuliia, 130
Rudenko, Margarita, 401

S
Schreiner, Wolfgang, 533
Selyutin, Victor, 401
Semenyuk, Andriy, 393
Shushpanov, Constantin, 490
Shyshkina, Mariya, 436
Sitzmann, Daniel, 424
Sokol, Vladyslav, 48
Spivakovska, Evgeniya, 236
Spivakovskiy, Aleksander, II, 236
Strecker, Martin, 447, 521, 550
Styervoyedov, Sergiy, 30

T
Tatarintseva, Olga, 64
Tirronen, Ville, 221
Tkachuk, Nikolay, 48
Tolok, Vyacheslav, 99

V
Valko, Nataliya, 195
Varava, Anastasiia, 163
Vasylevych, Leonid, 187
Vozniy, Oleksiy, 30

W
Weissblut, Alexander J., 374
Winckel, Mathias, 506

Y
Yatsenko, Olena, 38

Z
Zaporozhchenko, Yulia, 410
Zaretska, Iryna, 475
Zavileysky, Mikhail, II
Zhereb, Kostiantyn, 38
Zholtkevych, Galyna, 475
Zholtkevych, Grygoriy, II, 18, 163, 475, 582