Knowledge Driven Approach To Auto-Generate Digital Twins for Industrial Plants

Amar B.1, Subhrojyoti R. C.1, Barnali B.1, Dhakshinamoorthy R.2, Rajesh N.2 and Venkatesh C.3

1 TCS Research, 54-B, Tata Research Development and Design Center, Pune, Maharashtra, India, 411013
2 Tata Consultancy Services Ltd., Siruseri, Chennai, Tamil Nadu 603103
3 International Institute of Information Technology Hyderabad, Professor CR Rao Rd, Gachibowli, Hyderabad, Telangana 500032

Abstract
Control systems operate industrial plants to accomplish stakeholder objectives like achieving production targets, complying with environmental ordinances, handling faults, etc. Such stakeholder objectives are realised by identifying and executing valid control actions on the plant’s control system. E.g., to achieve fault management, a command is fired to place machines in a fault mode when the plant is in an error state. Arriving at such control actions is a non-trivial task demanding a detailed understanding of the plant’s structure and behaviour. Besides, it is also essential to verify the consequences of such control actions relative to other cross-cutting objectives and plant behaviour. E.g., to fulfil fault management objectives, the action to set machines in fault mode may affect production goals due to machine unavailability. Hence, validation of control actions is vital before executing them using the actual plant’s control system. With digital twin (DT) technologies, it is now possible to verify the implications of such control actions against a plant’s behaviour and objectives in a simulated environment without affecting the actual plant operations. In the current state of practice, DTs are developed in isolation as one-off solutions to simulate and validate plant control actions, demanding high effort and domain expertise. Our paper proposes a knowledge-driven approach enabling automation in DT development.
The result of our approach is an auto-generated digital twin that proactively mimics the plant’s control system behaviour and helps validate control actions before their execution. We use this approach to build three fault management DTs in a power plant. Applying our approach significantly reduces the manual effort and development time needed to build such DTs.

Keywords
knowledge-driven engineering, digital twins, control system, knowledge translation, state machines, domain-specific languages

4th WORKSHOP ON KNOWLEDGE-DRIVEN ANALYTICS AND SYSTEMS IMPACTING HUMAN QUALITY OF LIFE (KDAH-CIKM-2021)
amar.banerjee@tcs.com (A. B.); subhrojyoti.c@tcs.com (S. R. C.); barnali.basak@tcs.com (B. B.); dhaks.r@tcs.com (D. R.); rajesh.natesan@tcs.com (R. N.); venkatesh.choppella@iiit.ac (V. C.)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Motivation

Modern industrial plants execute multiple processes to accomplish stakeholder objectives like achieving production targets, managing faults, complying with environmental regulations [1], and so on. Process execution involves a control system[2] orchestrating[3] the plant components and sub-systems to produce the desired outcomes for each process. Some of the primary responsibilities of a control system are: 1) Commissioning and integrating the plant components and sub-systems. 2) Identifying and communicating set points to the components. 3) Receiving and processing sensor data. 4) Managing states and modes of components as well as the overall system. 5) Identifying faults and notifying operators.

Control systems enable industrial processes to achieve goals, e.g., power generation in thermal power plants, by actuating commands to plant components like pulveriser[4], boiler[5], turbine[5], etc., taking into account their working states and modes. It is a common practice to store the list of such applicable commands in a plant operation manual[6]. The plant operators and engineers refer to such manuals to identify, decide and execute commands through the control system. While executing such commands, the control system ensures that the machines are in appropriate states to accept and process these commands, or else flags them back to the operator or engineer as inappropriate actions.

Instructions in operation manuals mostly do not provide any view into the causal effects of command executions on other processes and their objectives, as the number of such scenarios can be too large to incorporate in the operation manual. Plant operators use their judgment to arrive at such actions based on their experience in operating plants over time. For plants executing multiple processes, manually understanding the implications of each command becomes difficult, insufficient, complex and time-consuming. Emerging digital twin[7] technologies prove useful to reason about such plant behaviour for various what-if actuation scenarios[8].

A digital twin (DT) helps to generate an abstract representation of an operational plant, amenable to human understanding and reasoning. A DT can mimic a plant’s behaviour from various aspects, including its control system aspect. This makes it possible for a plant operator to arrive at an appropriate control action by executing candidate actions in a DT and checking their implications against other cross-cutting concerns such as reliability, compliance, etc. DTs execute in a sand-box environment[9], making it possible to try out various actions and observe their implications within the twin without affecting the actual plant. An action found suitable on the twin could then be executed on the existing plant control system. For example, a DT for a power plant can mimic the behaviour of its individual components like a Boiler, Condenser, Turbine, Steam Outlet etc. and their inter-dependencies, e.g. steam is generated in the boiler, which is then used to rotate the turbine to produce electricity. Each of the components has its own specific individual behaviour that contributes to the overall plant behaviour. Such abstract models representing the plant structure, behaviour and processes to enable reasoning are popularly known as a Digital Twin of the plant. Digital - because it is a virtual model residing in a computer, and Twin - because it mimics the plant’s structure and behaviour based on real-time data from the actual plant.

In the current practice, DTs are developed manually for specific aspects of a plant (e.g. asset behaviour monitoring, fault management, energy flow and utilization etc.) in an ad-hoc manner, relying heavily on the knowledge of domain experts. DT development is a collaborative task carried out between domain experts, designers, and developers. The domain experts provide the necessary inputs to construct a DT for the aspect of interest and help designers & developers realise it in design and computer programs. The implicit domain knowledge residing with the experts becomes the basis for developing the DT. The implicit nature of knowledge constrains its re-use, making the twin development highly dependent on domain experts and manual efforts. As a result, a DT is typically built from scratch for each problem, adding more time and cost to the plant operations management. A typical plant may need a large number of DTs, possibly hundreds. Hence, a manual approach towards their development can be highly challenging and cost intensive.

This paper proposes a knowledge-driven approach aiming to achieve the following objectives for developing a DT. 1) Enable re-use of domain knowledge in DT development. 2) Provide mechanisms to reduce time and manual efforts in developing DTs. We implement a framework that exploits the critical idea of domain knowledge explication and its re-use towards the semi-automated realisation of control system DTs. The main features of the framework are

1) Domain meta-models to explicate the plant’s domain knowledge in terms of its structure and behaviour
2) Knowledge translation mechanisms to translate plant domain knowledge into a control system model, an abstract representation of the desired DT
3) Completeness and correctness validations on the translated control system model and finally
4) Auto realisation of the final DT from the control system model.

This paper discusses our approach and the results & experiences from applying it to build DTs for three fault management use-cases in a power plant. Our approach shows significant results and the potential to reduce manual efforts by over 50% and development time by almost 60%.

The paper is structured as follows: in section 2, we discuss related work from the literature on building DTs using knowledge. Section 3 discusses our architecture to generate a DT from knowledge automatically. Section 4 discusses the usage of our approach to semi-automate the development of a DT to support the fault management aspect of an actual plant. The discussion provides initial results from using our approach versus the manual approach previously used by our engineering teams to build similar DTs for an existing plant. Finally, in section 6, we conclude the paper and outline future work.

2. Related Work

This section presents a discussion of current practices, gaps and challenges in DT development.

The article by Wang et al. studies the application of DTs for fault detection in smart manufacturing[10] units. The main recommendations of their study are: 1) Use domain understanding in developing a digital model of the actual plant. 2) Develop data analytic strategies to analyse plant operations based on their objectives, and 3) Create knowledge bases of operation strategies, actions, faults & errors and decisions based on historical data provided by technicians and diagnosticians. Wang et al.’s approach also proposes primary elements in developing a DT like 1) results from analysis of historical data from plant operations, 2) experiences gained from operating plants, 3) knowledge of the experts working in various areas of plant operations, and 4) offline data analysis of operational data to derive new insights. This approach emphasises the importance of gathering knowledge from historical data analysis results, experiences and expert recommendations; however, there is no specific structure suggested in this paper to capture the knowledge. The approach is human-centric and requires manual efforts to develop a DT.

Marmolejo-Saucedo [11] describes a case study on designing and developing a DT for a supply chain process. The approach highlights key enabling technologies to build DTs, such as simulators, constraint solvers, and data analytic tools. Technical and business domain experts primarily configure all of these technologies and tools. Significant manual effort and practical knowledge go into the design of simulators, analytics algorithms, data models, and constraint solvers. This approach focuses on the creation of technology and tool configuration based on experience and knowledge. The implementation of the DT for the supply chain scenario is described as a manual process. The designers use their experience and domain understanding to design the twin. On the other hand, developers create suitable algorithms and data models and configure solvers based on domain understanding. In their work, the approach to build the DT rests on manual efforts and implicit knowledge.

Gabor et al. in [12] propose an architectural approach for cyber-physical systems and safety engineering. The architecture provides a three-model concept to design DTs. As per the authors, physical, cognitive, and contextual (world) models are essential building blocks for a DT. Physical models capture the physical entities and interactions in the cyber-physical system. Cognitive models capture the emergent behaviour of the cyber-physical system from a cognitive aspect. Finally, the contextual model describes the real-world elements affecting the execution of the cyber-physical system. The architecture emphasises identifying interactions between these three models to build DTs. The approach uses domain expertise to describe the three models and connect the models based on their interactions in the real world. A significant contribution of this study is to identify the interactions of models that typically work in the back-end of a cyber-physical system. The approach to populate and describe these models remains unaddressed in this study.

A framework based on manufacturing cells (DTMC) by Zhang et al. [13] describes a knowledge and data-driven approach for DT development. The DTMC framework achieves manufacturing automation by enabling intelligence during operations by 1) perceiving data by using analysis approaches, 2) simulating various what-if scenarios to arrive at suitable conclusions, 3) understanding the emergent behaviour of the twin based on data and domain knowledge, 4) predicting future behaviours based on historical data as well as domain knowledge, 5) optimising the executing processes based on domain constraints and the simulation results, and finally 6) implementing strategies to control the plant operations based on reasoning and analysis. The proposed approach relies on identifying the right experts, creating practical learning mechanisms, data analysis, simulation modules, and optimisation approaches to include intelligence in the DT. The critical building block technologies for DTMC implementation are static and dynamic knowledge bases and intellectual skills gathered from experts. The DTMC is a futuristic approach to developing DTs; however, all the intelligence is still manually gathered and assembled into the twin.

Along with the state of the art approaches, we also study the potential challenges in twin development. Boschert et al. [14] have identified challenges in building future DTs. The authors have also taken an initial step towards defining a next-generation DT. The challenges in building DTs identified in this study are:

1. Integration of multiple simulation technologies - different simulation technologies simulate different aspects of the plants like physics, chemistry, electronics etc. The authors highlight the need to integrate various simulation technologies to develop holistic simulations for the DTs.

2. Changes in context affect plant operation strategies, e.g. the plant area, the network bandwidth, the component units etc. influence the plant operations. Such changes are better understood through a DT that also accounts for context details. Hence, the study emphasises the need to capture and include the context details of the plants that could support extensive reasoning during the plant operations.

3. Addressing real-life problems - The study finally describes the need for DTs to focus on real-life challenges. The primary reason for this is to enhance the exploration of the problem and solution space. More real-life problems would lead to better domain understanding and help gather new knowledge.

The authors suggest addressing the above challenges by working with domain experts and building models, algorithms, and semantic structures for the DTs.

State of the art in developing DTs is highly dependent on manual jobs such as:

1. Gathering domain knowledge from experts.
2. Translating knowledge to simulation environments, models.
3. Developing twin models and software.
4. Relying on domain experts to validate the behaviour of the DT against the actual plant behaviour post-development.

The challenges that arise due to the manual tasks and approaches are: 1) Ensuring the domain understanding of the expert is complete and correct and in sync with the aspect simulated by the DT. E.g. to build a DT for the control systems aspect, the knowledge should be relevant and complete with respect to control systems. 2) Ensuring correct translation of domain understanding into computer software. 3) Investing additional efforts for testing and validation of the developed twin. 4) Re-working the whole approach if gaps get identified in knowledge, or the knowledge itself is updated, or the implementation technologies get replaced. Overcoming these gaps needs an approach that can assist domain experts to explicate domain knowledge and support developers to directly re-use this knowledge to create DT models and implement the DT.

From our survey of the current state of art and practice of DT development, we see the opportunity to leverage the ideas of a) semantic web [15] and b) model driven engineering (MDE)[16] from the field of computer science to mitigate the above challenges, as they have mostly been untapped for DT development so far. We see that knowledge about specific domains such as power plants, their structures, behaviour and processes can be captured using semantic web technologies. This makes it possible to deliver the necessary inputs to the designers and developers of DTs much faster, hence improving knowledge re-use towards domain specific DT development. Model driven engineering, on the other hand, enables synthesis of the knowledge so that it can be compiled and realized into DT implementations automatically. This enables harnessing the power of both knowledge-driven as well as model-driven approaches. While semantic knowledge enables querying of relevant domain knowledge at a high level without dependencies on human experts, an implementation-domain-specific model of a control system allows reasoning and knowledge-to-code translation for DT realizations using MDE. A key concept exploited in our work is the knowledge translator. A knowledge translator allows mapping and translating semantic domain knowledge into engineering models of a system, which makes automatic implementation of DTs possible.

The following section discusses our knowledge-driven approach for DT development and addresses the challenges of manual DT development.

3. Knowledge Driven Approach to Auto-generate control system DT

In response to the challenges identified in the state of practice, we propose a knowledge-driven approach ensuring the reuse of domain knowledge towards the development of DTs in a faster-better-cheaper manner. Our approach results in a framework that allows us to explicate domain knowledge and reuse the same in an automated manner to construct control system DTs for industrial plants. The framework allows capturing knowledge about different aspects of a plant as semantic ontologies. The captured knowledge gets translated into a control system DT through a multi-step translation process. Figure 1 shows a high-level architecture for our proposed framework. We demonstrate our framework on a fault management use case in a power plant and discuss the results later in the paper.

The knowledge-base is populated by capturing fault-management domain knowledge using a Controlled Natural Language (CNL)[17]. The fault-management knowledge consists of plant components and their fault detection rules, captured in terms of a hierarchy of fault types with detection rules for each type. The captured knowledge then gets translated to a control system model[18] that represents the abstract control model of the plant, serving as the key input to build the DT of interest. The translation uses domain-specific translation templates, also stored as part of domain knowledge. The translated control system model serves as a formal representation to perform checks to ensure completeness and correctness of the control system model as well as to validate the knowledge from which it is translated. Finally, the control system model is translated into an implementation using a model-to-text[19] translator. In our approach, we use executable finite state machines[20] to implement the final control system DT. In the following sub-sections, we discuss the elements of the solution architecture in detail.

Figure 1: High level solution architecture
[Figure 1 labels. Steps: Capture Domain Knowledge; Specify Translation Mappings; Store Knowledge in Repositories; Translate Knowledge to Digital Twin Model. Elements: Domain Expert / End User; Semantic Application Design Language (SADL); Domain Meta Model (Components, Faults, Relations, Data); Domain Knowledge Instance (Plant details, Fault Instances, Data conditions); Custom Domain Specific Language; Technology Knowledge; Knowledge Translator; Control System Model; Model Validator; Validated Control System Model; Code Generator; Digital Twin. Notes: 1. Stores mapping between Domain Meta-Model and view-of-interest. 2. Uses reusable templates to capture mapping using domain-specific language.]

3.1. Describing Knowledge Using Controlled Natural Language

One of the significant challenges in adopting knowledge-driven approaches is providing usable and efficient interfaces to capture knowledge [21]. Web Ontology Language (OWL)[22] is commonly used for describing knowledge in the form of ontologies captured in XML formats. Popular tools like Protege provide GUIs to describe OWL-based ontologies. However, GUI based interfaces make it challenging to capture large ontological structures and reduce human readability significantly[23]. Ideally, knowledge description tools should allow humans to describe and read knowledge with minimal effort using simple and intuitive interfaces. Controlled Natural Languages (CNL) help to mitigate such challenges and can be used for capturing knowledge descriptions[24] using semi-structured English. This makes it easy for experts to describe and read knowledge[25] captured semantically using ontologies.

In our approach, we use Semantic Application Design Language (SADL)[26], a tool that provides an English-like textual interface for knowledge descriptions, conforming to the idea of a CNL. SADL acts as a front end to populate and store knowledge as OWL-based ontologies. SADL comes with an inbuilt SADL editor that provides support for content assistance, syntax validation, error highlighting, semantic refactoring, and type checking. It uses Apache Jena[27] based reasoners for type checking and validating the described knowledge in the background. It also provides a query language, SPARQL Protocol and RDF Query Language (SPARQL)[28], to execute knowledge queries[29] and a Semantic Web Rule Language (SWRL)[30] based rule engine to execute semantic rules. SADL syntax[26] allows the description of types, sub-types, properties, relations, constraints and rules as the basic building blocks to describe knowledge. It also provides a feature to import existing OWL models and represent them using English based SADL syntax. SADL reduces the learning curve due to its natural language interface, allowing its users to focus on knowledge description. It also makes the knowledge transfer process easier for domain experts.

3.2. Describing Base Ontology Using SADL

The first step to describe knowledge is creating a base ontology. In our approach, we describe the fault-management ontology using SADL as shown in figures 2 and 3.

Figure 2: SADL description of fault knowledge

Figure 3: Fault base ontology
[Figure 3 labels. Classes: Component, Sensor, Fault, FaultTree, FaultMode, RootCause, DataRule, Equation (external). Relations: HAS_SENSOR, HAS_FAULT, HAS_FAULT_MODE, HAS_ROOTCAUSE, HAS_DATA_RULE, RULE, Subclass of. Data items: generated_data, sensor_stream_data, process_data.]

A Component description in the ontology can have multiple Faults, represented by the HAS_FAULT relation. RootCause, FaultMode and FaultTree are the sub-types of Faults. A RootCause is the lowest level fault that can occur in a Component. Multiple RootCause faults can be aggregated as a FaultMode. Multiple FaultModes can be aggregated as FaultTrees. Components have Sensors producing stream_data, which is processed by DataRules. A DataRule defines the rules or boolean equations that detect anomalies in stream_data to assign a RootCause. A RootCause fault can be assigned by violations of multiple DataRules. The base ontology acts like a vocabulary to capture the actual faults and rules in the context of a fault management problem.

3.3. Translation From Knowledge to Control System Model

The manual description of domain knowledge may have errors or incompleteness issues. As a result, it becomes necessary to verify if the captured knowledge is good enough to construct a control system DT. To verify the knowledge, we translate it into a control system model and then validate the generated control system model against completeness and consistency checks. To translate domain knowledge into a control system model, we first map concepts from the problem domain into control system concepts using a mapping template implemented using a domain-specific language (DSL). This mapping information is passed on to a translator, which generates the control system fault model from the domain knowledge. We use a discrete control system design language named M&CML[31] to represent the generated control system model. M&CML provides the vocabulary and constructs to design a discrete control system using textual syntax. We discuss the semantics of M&CML separately in the subsequent subsections.

3.3.1. Knowledge Translator

The translator consumes two inputs: 1) knowledge elements from the knowledge base and 2) a translation template. The translation template is created using a DSL that captures the mapping logic from the knowledge elements to the M&CML (control system model) syntax. A knowledge-driven translator uses the mappings to generate a description of the control system model in M&CML. An example of the mapping between knowledge elements and M&CML syntax is shown in figure 4. This approach allows the knowledge-driven translator to be reused for other domains.

Figure 4: Translation of fault knowledge to control system model using DSL

A pseudo code describing the internal logic of the translator is shown in algorithm 1.

Algorithm 1: Translation Algorithm
Input: knowledge base, mappings
Output: control model
    /* instantiate empty control model */
    set control_model ← empty description
    /* iterate through the mappings */
    for map : mappings do
        /* set ke as knowledge element */
        set ke ← map.ke
        /* set cs as M&CML syntax from map */
        set cs ← map.mnc
        /* fetch instances of ke */
        instances_ke ← fetch_instances(ke)
        /* loop over fetched instances */
        for ike : instances_ke do
            /* append instance to M&CML syntax */
            set cs_syntax ← ike.append(cs)
            /* insert cs_syntax in control model */
            control_model.insert(cs_syntax)
    /* return control model */
    return control_model

The generated control system model consists of a hierarchy of controllers where every controller is derived from a fault type in the base ontology. The controllers have operating states that are derived from the instances of the fault types. The hierarchical relation between the controllers is derived from the captured fault propagation knowledge. A structural representation of the generated control system model is shown in figure 5.

Figure 5: Control system model for fault management

3.4. Knowledge and Model Validation

Since the knowledge is described manually by the framework users, there are possibilities for errors or gaps to be present in the knowledge. This poses the risk of such errors propagating into the generated control system model and eventually into the final DT implementation. As the control system model is derived from the knowledge, any gaps in the control system model provide hints about potential gaps in the knowledge. We check the control system model for correctness and completeness, which indirectly validates the knowledge. The validation checks performed on the generated control system model are shown in table 1.

As the control system model gets derived from knowledge and translation templates, any gaps, errors or inconsistencies identified in the model during validation imply gaps in the captured knowledge and translation mappings. In our approach, we rely on the inbuilt validation mechanisms offered by M&CML to perform such validation checks. More checks can be added to the control system model by extending the M&CML interpreter. The validation checks ensure that the generated control model is good enough to be used as a DT model and to implement the DT. The validations also ensure the knowledge itself is complete and correct from the control system viewpoint.

Table 1
Sr. No | Checks | Description
1 | Correctness | It ensures that the generated model follows the semantics of the control system and correctly uses concepts like commands, events, etc. If there are any correctness issues in the control system model, like non-terminating cyclic state transitions, we can conclude that the knowledge is incorrect from the control system viewpoint. Multiple such checks can be incrementally added in our framework to support the translation process.
2 | Completeness | It ensures that the model is complete concerning the control system viewpoint. E.g. a control system design without a valid state machine description is an incomplete model. Such incompleteness checks can be incorporated into the model checking step based on the needs of the application.
3 | Consistency | It ensures that the generated control system model is consistent and does not contain any semantically conflicting terms. E.g. a state transition rule in the model should not conflict with other transitions. Such consistency checks can also be defined as per the needs of the plant.

3.5. Semantics of Control System Model

We discuss the semantics of the control system model to understand the structure of the generated model. The control system model embeds the semantics of discrete control systems. The control system model allows capturing a hierarchical structure of controllers using a parent-child relationship between them. The InterfaceDescription block captures the interface (input/output) items for a controller such as command-response, events or alarms, data, ports, address and operating states. The behaviour for a controller is captured using the Transition block. The Transition can be specified for various input-output control actions like command-response, events, alarms and data. These semantics provide a basis for the correctness checks of the populated model against the control system viewpoint. M&CML allows designers to describe the control system design using the discussed semantics.

3.6. Implementing Control System Model as a DT

The generated control system model represents the interactions, behaviour and hierarchical structure of the controllers in the model. However, the generated control system model in M&CML[31] cannot directly execute as a DT. We further translate the control system model into SCXML[20], a state machine notation with a Java-based execution framework, to realize the final DT. SCXML, being a W3C[32] standard, is a suitable implementation format for the control system model to translate into. SCXML supports concepts like states, transitions, input & output events, streaming data and rule specification using XML based syntax. As the control system model already captures such information, it becomes possible to easily derive an SCXML based specification from the model. We use model-to-text[19] translation approaches to generate SCXML specifications from the control system model. We use Xtend[33], a Java-based language, to implement translation templates for translating the control system model to SCXML.

The correspondence between the control system model and the SCXML state machine structure is shown in figure 6.

Figure 6: SCXML description of a state-machine structure performing fault management for a control system DT

The controllers in the control system model from figure 5 are represented as hierarchical states in SCXML. The controller states are represented as parallel states inside each of the hierarchical states. The sensors from the control system model are represented as datasources in the translated SCXML. The translated SCXML description is an XML file that is executed by the Apache SCXML[34] execution engine. The state machine starts executing from the initial Start state that receives and processes data from the sensors. The state machine executes and processes data from the data sources (sensors), and any violation of the rules leads to an appropriate state transition in the state machine. Violation of fault detection rules leads to a transition to the RootCause state. The state machine moves to a suitable RootCause sub-state based on the data source producing erroneous data. The transitions continue till the topmost Fault state is reached following the fault aggregation logic.

The SCXML implementation represents the final control system DT. The DT uses the real-time data from the actual data sources and mimics the behaviour of the plant components in terms of their fault hierarchy relations and propagation rules. The generated control system DT can now be used to study and analyse various what-if scenarios like 1) Setting the control system in different states. 2) Injecting erroneous data in the state machine and observing the emergent behaviour. 3) Interacting with the state machine by raising injected or dummy events and alarms.

3.6.1. Deploying DTs as a Service

The generated control system DT is deployed as a service in a Java Virtual Machine (JVM). The usage of a service-based architecture makes it easier for consumers to interact with the DT. Multiple DTs developed for multiple components are deployed as independent services. The DT services allow consumers to 1) Interact with the DT (e.g. setting states, injecting errors, events and data etc.) using simple service APIs. 2) Observe DT behaviour using API calls (e.g. getting the current state, getting the next state etc.). 3) Integrate DT services with third party analysis platforms. The behaviour analysis and DT usage are consciously kept out of scope in this paper. Our approach provides an automated mechanism to generate the DT implementation from high-level semantic knowledge.

3.7. Summary

Using the knowledge-driven approach, we performed the tasks of 1) capturing knowledge using SADL, 2) translating the knowledge to a control system model represented using M&CML, 3) enabling validations of the control system model for correctness, consistency and completeness, and finally 4) translating the M&CML based control system model to executable SCXML descriptions. Our approach does not require developers to manually program the DT using general programming languages like C++, Java etc. The translation mechanisms automate the generation of state machine descriptions. The SCXML format further reduces the need to write computer programs by providing a configuration based execution engine. The only manual tasks in our approach are describing 1) knowledge using

We use this approach for three fault management scenarios to generate DTs for a power plant. The use-case details, results and experiences are discussed in the next section.

4. Use-Case - Power Plant Fault Management

We apply our knowledge-driven framework to generate digital twins for multiple fault management scenarios in a power plant, as discussed below.

1) Main Steam Temperature Low[35] - The objective of this scenario is to maintain the main steam temperature at the desired operating range for a given load, coal quality and ambient conditions, to detect if the temperature goes below the low threshold, and to indicate it as a fault.

2) Main Steam Pressure High[35] - This scenario maintains the main steam pressure at the desired operating range for a given load, coal quality and ambient conditions and detects if the pressure goes above the high threshold to indicate a fault.

3) Mill Outlet Temperature High[36] - This scenario maintains the main mill outlet temperature within a certain range and detects if the temperature goes beyond the high threshold for a possible fault.

Each of the above scenarios has a fault management structure associated with it. The different fault instances and rules described in SADL are stored as fault knowledge. For other scenarios, the knowledge is described similarly.
SADL and 2) providing mapping information be- Next, we create the mapping template by de- tween domain knowledge and control system con- scribing mapping from the captured knowledge to cepts as templates created using our DSL. With our the control system model represented in M&CML. approach, it also becomes possible to absorb changesThis mapping specification enables the translator to at multiple levels. translate the knowledge to generate a control sys- 1) The knowledge acts as a single point of truth. tem model in M&CML. Next, we create a model-to- Any changes in the knowledge can get handled text template using the Xtend library to derive the by simply regenerating the DT. Hence there is no SCXML specification from the control system model. manual intervention required to perform refac- toring or code changes. 4.1. Evaluation & Results 2) Translation templates capture the traceability of We record the results from applying our approach knowledge to control system model. against the previous efforts of our engineering teams, who manually built similar fault management dig- 3) Any changes in the control system model can ital twins for the same power plant. We compare be handled by updating the mapping template, the two approaches using parameters such as devel- used to translate the control system model into opment time, number of programmers and lines of SCXML format and regenerating the DT. code. Table 2 shows the comparison of the manual SN. Fault Management Usecase Hours Dvlprs LoC Manual Effort for Digital Twin Construction 5) it was possible to reuse the knowledge captured 1 Main Steam Temperature Low 18 2 500 for one use-case in other subsequent use-cases 2 Main Steam Pressure High 12 2 300 too. 
This further reduced the time to capture 3 Mill Outlet Temperature High 6 2 300 Knowledge Driven DT Generation knowledge, and the experts did not have to start 1 Main Steam Temperature Low 3 1 220 knowledge descriptions from scratch for each use 2 Main Steam Pressure High 2 1 140 case. 3 Mill Outlet Temperature High 1 1 90 6) Use of off-the-shelf framework like SCXML al- Table 2 lowed the teams to execute the generated SCXML Manual vs Knowledge-driven development of DTs for file without writing any code. This resulted in Power Plant Fault-Management Usecases investing more time in verifying the model and the final solution. approach against our approach. From the compar- 7) The teams had to be trained to use the ison, it is evident that our approach significantly knowledge-driven approach and get them famil- outperforms the manual approach by reducing time iar with the techniques. This required some learn- and efforts for digital twin development. The de- ing curve that is not measured in this study. velopment time was reduced by almost half and, only half the developers were required while using our approach. We acknowledge that the results are 5. Conclusion & Future Work early, and a thorough evaluation of the approach with extensive experimentation is planned going for- In this paper, we discussed the role of a control sys- ward. Nevertheless, the early results are promising tem in operating industrial plant systems. We dis- enough to motivate and establish the applicability of cussed the significance and need for digital twins knowledge-driven approaches for developing digi- in arriving at crucial decisions to control and oper- tal twins. ate complex and mission-critical plants. The state of practice in developing digital twins indicates ad- hoc approaches to use domain knowledge for de- 4.2. Learning and Experiences veloping DTs. 
The manual approach cannot scale We gather the experiences and feedback from the de- well for large industrial plants, as approaches miss velopment teams for our approach. The experiences finer details because of informal communications of and feedback are as described below: required domain knowledge that serves as the crit- ical input for DT realisation. Hence, to reduce the 1) Knowledge description using English is much time and effort in developing digital twins, we pro- easier and human friendly. While experts used posed an approach that automates domain knowl- presentations and text documents to describe the edge reuse to realise DTs. In our approach, we focus domain knowledge, it became effortless to use on generating a control system digital twin from do- the SADL based interface. main knowledge. Our approach captures domain knowledge using a controlled natural language. The 2) The captured semantic knowledge is the single captured knowledge is translated into a control sys- point of truth captured from the experts and tem model using a knowledge-driven translation used towards the automatic realization of the DT approach. Finally, the control system model is used through multiple layers of validation and trans- to generate an implementation of the digital twin lation. using the SCXML framework. 3) The development team could easily consume the Our approach significantly reduces the time and domain knowledge with fewer efforts by reading efforts to construct digital twins, as is evident from the SADL descriptions. They could read and un- its usage for a real-life use case discussed in the pa- derstand complex elements like rules, equations per. Our approach can be easily scaled for large and in plain English better than the same written us- complex systems. Going forward, we will apply our ing programming languages. 
approach in generating digital twins for multiple as- pects of a power plant, such as emission compliance, 4) The auto-generation of the control system model component wear and tear etc. We want to extend the provided ample time for the verification team idea of knowledge-driven DT development to other to validate the correctness of the model. This plant systems and enhance it to build digital twins resulted in lesser time consumed during post- to simulate the behaviour of a system based on the development testing. underlying physics and chemistry knowledge. References [16] S. Kent, Model driven engineering, in: Inter- national conference on integrated formal meth- [1] J. J. Downs, E. F. Vogel, A plant-wide indus- ods, Springer, 2002, pp. 286–298. trial process control problem, Computers & [17] H. Safwat, M. Zarrouk, B. Davis, Rewriting sim- chemical engineering 17 (1993) 245–255. plified text into a controlled natural language [2] F. Akbarian, E. Fitzgerald, M. Kihl, Synchro- (2018). nization in digital twins for industrial con- [18] P. Patwari, A. Banerjee, S. R. Chaudhuri, trol systems, arXiv preprint arXiv:2006.03447 S. Natarajan, Learning’s from developing a (2020). domain specific engineering environment for [3] M. Loskyll, I. Heck, J. Schlick, M. Schwarz, control systems, in: Proceedings of the 9th In- Context-based orchestration for control of dia Software Engineering Conference, 2016, pp. resource-efficient manufacturing processes, Fu- 177–183. ture Internet 4 (2012) 737–761. [19] M. Albert, J. Muñoz, V. Pelechano, Ó. Pastor, [4] H. Guo, Y. Cheng, T. Ren, L. Wang, L. Yuan, Model to text transformation in practice: gen- H. Jiang, H. Liu, Pulverization characteristics erating code from rich associations specifica- of coal from a strong outburst-prone coal seam tions, in: International Conference on Concep- and their impact on gas desorption and diffu- tual Modeling, Springer, 2006, pp. 63–72. sion properties, Journal of Natural Gas Science [20] J. 
Barnett, Introduction to scxml, in: and Engineering 33 (2016) 867–878. Multimodal Interaction with W3C Standards, [5] C. Maffezzoni, Boiler-turbine dynamics in Springer, 2017, pp. 81–107. power-plant control, Control Engineering Prac- [21] M. Glawe, C. Tebbe, A. Fay, K.-H. Niemann, tice 5 (1997) 301–312. Knowledge-based engineering of automation [6] O. Why, DUST-COVERED OPERATIONS AND systems using ontologies and engineering data., MAINTENANCE MANUALS, Ph.D. thesis, in: KEOD, 2015, pp. 291–300. Doctoral dissertation, Worcester Polytechnic In- [22] D. L. McGuinness, F. Van Harmelen, et al., Owl stitute, 2017. web ontology language overview, W3C recom- [7] M. Batty, Digital twins, 2018. mendation 10 (2004) 2004. [8] W. Kuehn, Digital twins for decision making [23] N. E. Fuchs, K. Kaljurand, T. Kuhn, Attempto in complex production and logistic enterprises, controlled english for knowledge representa- International Journal of Design & Nature and tion, in: Reasoning Web, Springer, 2008, pp. Ecodynamics 13 (2018) 260–271. 104–124. [9] M. Ayani, M. Ganebäck, A. H. Ng, Digital twin: [24] R. Schwitter, K. Kaljurand, A. Cregan, C. Dol- Applying emulation for machine recondition- bear, G. Hart, A comparison of three controlled ing, Procedia Cirp 72 (2018) 243–248. natural languages for owl 1.1 (2008). [10] J. Wang, L. Ye, R. X. Gao, C. Li, L. Zhang, Digital [25] P. R. Smart, Controlled natural languages and twin for rotating machinery fault diagnosis in the semantic web (2008). smart manufacturing, IJPR 57 (2019) 3920–3934. [26] A. Crapo, A. Moitra, Toward a unified english- [11] J. A. Marmolejo-Saucedo, Design and develop- like representation of semantic models, data, ment of digital twins: a case study in supply and graph patterns for subject matter experts, chains, Mobile Networks and Applications 25 IJSC 7 (2013) 215–236. (2020) 2141–2160. [27] S. Siemer, Exploring the apache jena framework [12] T. Gabor, L. Belzner, M. Kiermeier, M. T. Beck, (2019). A. 
Neitz, A simulation-based architecture for [28] J. Pérez, M. Arenas, C. Gutierrez, Semantics smart cyber-physical systems, in: 2016 IEEE and complexity of sparql, ACM TODS 34 (2009) ICAC, IEEE, 2016, pp. 374–379. 1–45. [13] C. Zhang, G. Zhou, J. He, Z. Li, W. Cheng, A [29] L. Lim, H. Wang, M. Wang, Semantic queries data-and knowledge-driven framework for dig- by example, in: Proceedings of the 16th in- ital twin manufacturing cell, Procedia CIRP 83 ternational conference on extending database (2019) 345–350. technology, 2013, pp. 347–358. [14] S. Boschert, C. Heinrich, R. Rosen, Next gener- [30] I. Horrocks, P. F. Patel-Schneider, H. Boley, ation digital twin, in: Proc. tmce, Las Palmas S. Tabet, B. Grosof, M. Dean, et al., Swrl: A se- de Gran Canaria, Spain, 2018, pp. 209–218. mantic web rule language combining owl and [15] T. Berners-Lee, J. Hendler, O. Lassila, The se- ruleml, W3C Member submission 21 (2004) mantic web, Scientific american 284 (2001) 34– 1–31. 43. [31] P. Patwari, S. R. Chaudhuri, S. Natarajan, G. Muralikrishna, M&c ml: A modeling lan- guage for monitoring and control systems, Fu- sion Engineering and Design 112 (2016) 761– 765. [32] D. Schnelle-Walka, S. Radomski, T. Lager, J. Bar- nett, D. Dahl, M. Mühlhäuser, Engineering in- teractive systems with scxml, in: Proceedings of the 2014 ACM SIGCHI symposium on En- gineering interactive computing systems, 2014, pp. 295–296. [33] K. Birken, Building code generators for dsls using a partial evaluator for the xtend language, in: ISoLA, Springer, 2014, pp. 407–424. [34] Apache (2009) SCXML, howpublished = https://commons.apache.org/proper/ commons-scxml/, note = Accessed: 2021-07-10, ???? [35] N. Mazalan, A. Malek, M. A. Wahid, M. Mailah, Review of control strategies employing neural network for main steam temperature control in thermal power plant, Jurnal Teknologi 66 (2014). [36] H. QIAN, D.-j. MAO, C. LI, G.-l. 
HE, Alarm trigging faults diagnosis system for over-high temperature of mill outlet, Journal of Shanghai University of Electric Power (2010) 01.