
Development of a knowledge base based on context analysis of external information resources

N Yarushkina

A Filippov

V Moshkin

Ulyanovsk State Technical University, Severny Venetz str. 32, Ulyanovsk, Russia, 432027

2018

328-337

The article describes the process of developing a knowledge base (KB). The content of the KB is formed by analyzing the contexts of external information resources. Here, a context is a certain "point of view" on the problem area (PrA) and its features. The graph database (DB) Neo4j is used to store the contents of the KB in the form of an ontology. An attempt is made to implement an inference mechanism over the contents of the graph database. This mechanism is used to dynamically generate the screen forms of the user interface, which simplifies work with the KB. The article also describes a method for extending the KB based on the content of wiki-resources and relational databases.

inference module. (iii) A subsystem for interaction with users: screen forms generation module. (iv) A subsystem for importing data from wiki-resources: a module for importing data from wiki-resources. (v) A subsystem for importing data from relational databases: a module for importing data from relational databases.

2. The organization of the ontology store of KB

An ontology is a model that represents the PrA in the form of a semantic graph [9].

The graph-oriented database management system (graph DBMS) Neo4j is the basis of the ontology store for the KB. Neo4j is currently one of the most popular graph databases and has the following advantages: (i) A free community edition. (ii) A native format for data storage. (iii) A single Neo4j instance can work with graphs containing billions of nodes and relationships. (iv) A graph-oriented query language, Cypher. (v) Transaction support.

Neo4j was chosen to store the description of the PrA in the form of an applied ontology, since an ontology is essentially a graph. It is only necessary to restrict the set of node and relation types into which ontologies written in RDF and OWL are translated.

A context of the KB is a state of the KB content obtained during versioning or while building the KB content from different "points of view" [6, 8].

Figure 2 shows an example of the translation of the OWL representation of an ontology of family relations into the entities of the KB.

Formally, the content of the KB can be represented by the following equation:

$$O = \langle T, C^{T_i}, I^{T_i}, P^{T_i}, S^{T_i}, F^{T_i}, R^{T_i} \rangle, \quad i = \overline{1, t}, \qquad (1)$$

where $t$ is the number of KB contexts, $T = \{T_1, T_2, \ldots, T_t\}$ is the set of KB contexts, $C^{T_i} = \{C_1^{T_i}, C_2^{T_i}, \ldots, C_n^{T_i}\}$ is the set of KB classes within the i-th context, $I^{T_i} = \{I_1^{T_i}, I_2^{T_i}, \ldots, I_n^{T_i}\}$ is the set of KB objects within the i-th context, $P^{T_i} = \{P_1^{T_i}, P_2^{T_i}, \ldots, P_n^{T_i}\}$ is the set of properties of KB classes within the i-th context, $S^{T_i} = \{S_1^{T_i}, S_2^{T_i}, \ldots, S_n^{T_i}\}$ is the set of states of KB objects within the i-th context, $F^{T_i} = \{F_1^{T_i}, F_2^{T_i}, \ldots, F_n^{T_i}\}$ is the set of logical rules fixed in the KB within the i-th context, and $R^{T_i}$ is the set of KB relations within the i-th context, defined as:

$$R^{T_i} = \{R_C^{T_i}, R_I^{T_i}, R_P^{T_i}, R_S^{T_i}, R_F^{T_i}\},$$

where $R_C^{T_i}$ is the set of relations defining the hierarchy of KB classes within the i-th context, $R_I^{T_i}$ is the set of relations defining the "class-object" KB tie within the i-th context, $R_P^{T_i}$ is the set of relations defining the "class-class property" KB tie within the i-th context, $R_S^{T_i}$ is the set of relations defining the "object-object state" KB tie within the i-th context, and $R_F^{T_i}$ is the set of relations generated on the basis of the logical KB rules within the i-th context.

The content of the KB is organized according to principles similar to those of object-oriented programming: (i) KB classes are concepts of the PrA. (ii) Classes can have properties, and a child class inherits the properties of its parent class. (iii) KB objects describe instances of the concepts of the PrA. (iv) States determine the specific values of the properties that an object inherits from its class. (v) Logical rules are used to implement inference over the content of the KB.
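The principles above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the names `KBClass`, `KBObject`, `Context` and the sample entities `Person`, `Parent` and `Alex` are hypothetical and do not come from the described system, and a single-parent class hierarchy is assumed for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KBClass:
    """A concept of the problem area (hypothetical name, single parent assumed)."""
    name: str
    parent: Optional["KBClass"] = None
    properties: set = field(default_factory=set)

    def all_properties(self) -> set:
        """Own properties plus everything inherited along the parent chain."""
        inherited = self.parent.all_properties() if self.parent else set()
        return inherited | self.properties


@dataclass
class KBObject:
    """An instance of a class; states hold the concrete property values."""
    name: str
    cls: KBClass
    states: dict = field(default_factory=dict)

    def set_state(self, prop: str, value) -> None:
        # A state may only refer to a property the class owns or inherits.
        if prop not in self.cls.all_properties():
            raise KeyError(f"{prop!r} is not a property of {self.cls.name}")
        self.states[prop] = value


@dataclass
class Context:
    """One 'point of view' T_i with its own classes and objects."""
    name: str
    classes: dict = field(default_factory=dict)
    objects: dict = field(default_factory=dict)


# Illustrative data only.
person = KBClass("Person", properties={"name"})
parent = KBClass("Parent", parent=person, properties={"numberOfChildren"})
family = Context("family", classes={c.name: c for c in (person, parent)})

alex = KBObject("Alex", parent)
alex.set_state("name", "Alex")  # inherited from Person
family.objects[alex.name] = alex
```

Keeping classes and objects inside a `Context` mirrors the index $T_i$ in expression (1): the same concept may be described differently in another context without affecting this one.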

3. The inference on the contents of KB

Inference is the process of reasoning from premises to a conclusion. Reasoners are used to implement inference: they derive logical consequences from a set of statements, facts and axioms. The most popular reasoners at the moment are [5, 17]: Pellet; FaCT++; HermiT; Racer, etc.

These reasoners are actively used in the development of intelligent software. However, Neo4j does not provide such reasoners out of the box. Thus, there is a need to develop an inference mechanism that works on the content of the KB [3, 4].

Currently, the Semantic Web Rule Language (SWRL) is used to record logical rules [24].

These SWRL rules describe the conditions under which object a has a "nephew-uncle" relation with object c. Formally, a logical rule of the KB is:

$$F^{T_i} = \langle A^{Tree}, A^{SWRL}, A^{Cypher} \rangle,$$

where $T_i$ is the i-th context of the KB, $A^{Tree}$ is the tree-like representation of the logical rule $F^{T_i}$, $A^{SWRL}$ is the SWRL representation of the logical rule $F^{T_i}$, and $A^{Cypher}$ is the Cypher representation of the logical rule $F^{T_i}$.

The tree-like representation $A^{Tree}$ of a logical rule $F^{T_i}$ is:

$$A^{Tree} = \langle Ant, Cons \rangle,$$

where $Ant = Ant_1 \,\Theta\, Ant_2 \,\Theta \ldots \Theta\, Ant_n$ is the antecedent (condition) of the logical rule $F^{T_i}$; $\Theta \in \{AND, OR\}$ is a permissible logical operation between antecedent atoms; $Cons$ is the consequent (consequence) of the logical rule $F^{T_i}$.

Figure 3 shows an example of a tree-like representation of two logical rules for the ontology of family relations. These rules describe the father-child relationship.

The tree-like logical rules are translated into the following SWRL:

    hasFather(?a,?b) => hasChild(?b,?a)
    hasSister(?c,?a) & hasFather(?c,?b) => hasChild(?b,?a)

and the following Cypher view (one query per rule):

    // Rule 1: hasFather(?a,?b) => hasChild(?b,?a)
    MATCH (s1:Statement{name: "hasChild", lr: true})
    MATCH (r1a)<-[:Domain]-(:Statement{name:"hasFather"})-[:Range]->(r1b)
    MERGE (r1b)-[:Domain]->(s1)
    MERGE (r1a)-[:Range]->(s1)

    // Rule 2: hasSister(?c,?a) & hasFather(?c,?b) => hasChild(?b,?a)
    MATCH (s1:Statement{name: "hasChild", lr: true})
    MATCH (r2c)<-[:Domain]-(:Statement{name:"hasSister"})-[:Range]->(r2a)
    MATCH (r2c)<-[:Domain]-(:Statement{name:"hasFather"})-[:Range]->(r2b)
    MERGE (r2b)-[:Domain]->(s1)
    MERGE (r2a)-[:Range]->(s1)
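To make the effect of the two family rules concrete, the following sketch emulates the derivation in plain Python with naive forward chaining. It does not use Neo4j and is not the mechanism of the described system; the individuals `Carl`, `Bob` and `Ann` are hypothetical, only AND-joined antecedents are handled, and every consequent variable is assumed to occur in the antecedent.

```python
# (predicate, first argument, second argument); rule atoms hold variable names,
# facts hold names of individuals.
RULES = [
    ([("hasFather", "a", "b")], ("hasChild", "b", "a")),
    ([("hasSister", "c", "a"), ("hasFather", "c", "b")], ("hasChild", "b", "a")),
]
FACTS = {("hasFather", "Carl", "Bob"), ("hasSister", "Carl", "Ann")}


def match(atoms, facts, binding):
    """Yield every variable binding under which all atoms hold (AND)."""
    if not atoms:
        yield binding
        return
    (pred, v1, v2), rest = atoms[0], atoms[1:]
    for p, x, y in facts:
        if p != pred:
            continue
        new = dict(binding)
        # setdefault binds a free variable or returns its existing value,
        # so a conflicting binding is rejected.
        if new.setdefault(v1, x) != x or new.setdefault(v2, y) != y:
            continue
        yield from match(rest, facts, new)


def infer(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, (pred, v1, v2) in rules:
            # Materialise the bindings first so `known` is not mutated mid-scan.
            for b in list(match(antecedent, known, {})):
                derived = (pred, b[v1], b[v2])
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


closure = infer(FACTS, RULES)
new_facts = closure - FACTS
```

With the sample facts, the first rule derives `hasChild(Bob, Carl)` and the second derives `hasChild(Bob, Ann)`.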

Thus, when logical rules written in SWRL are imported into the KB, they are translated into their tree-like representation.

Having a tree-like representation of a logical rule makes it possible to generate both the SWRL representation and the Cypher representation of the rule from it.
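A minimal sketch of this two-way generation is given below. It reproduces the SWRL and Cypher text of the family example under several assumptions: the classes `Atom` and `Rule` and the functions `to_swrl` and `to_cypher` are hypothetical names, only AND-joined antecedents are supported, and the Cypher variable naming (`r<rule index><variable>`) and the `Statement`, `Domain` and `Range` labels are copied from the example rather than from a documented schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    predicate: str
    domain_var: str  # first argument of the atom
    range_var: str   # second argument of the atom


@dataclass(frozen=True)
class Rule:
    antecedent: tuple  # Atom instances joined by AND (assumption)
    consequent: Atom


def to_swrl(rule: Rule) -> str:
    """Render the rule in the SWRL-like notation used in the example."""
    def fmt(a: Atom) -> str:
        return f"{a.predicate}(?{a.domain_var},?{a.range_var})"
    return " & ".join(fmt(a) for a in rule.antecedent) + " => " + fmt(rule.consequent)


def to_cypher(rule: Rule, index: int) -> str:
    """Render the rule as a Cypher query following the pattern of the example."""
    def var(name: str) -> str:
        return f"r{index}{name}"
    c = rule.consequent
    lines = [f'MATCH (s1:Statement{{name: "{c.predicate}", lr: true}})']
    for a in rule.antecedent:
        lines.append(
            f"MATCH ({var(a.domain_var)})<-[:Domain]-"
            f'(:Statement{{name:"{a.predicate}"}})-[:Range]->({var(a.range_var)})'
        )
    lines.append(f"MERGE ({var(c.domain_var)})-[:Domain]->(s1)")
    lines.append(f"MERGE ({var(c.range_var)})-[:Range]->(s1)")
    return "\n".join(lines)


rule1 = Rule((Atom("hasFather", "a", "b"),), Atom("hasChild", "b", "a"))
rule2 = Rule(
    (Atom("hasSister", "c", "a"), Atom("hasFather", "c", "b")),
    Atom("hasChild", "b", "a"),
)
swrl_text = [to_swrl(rule1), to_swrl(rule2)]
cypher_text = to_cypher(rule1, 1)
```

Because both renderers read the same tree, a rule edited in one notation can be regenerated consistently in the other.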

To represent a logical rule, Cypher is used to form relations of a special type between entities of the KB. Figure 4 shows the content of the KB after executing the Cypher queries that were built for the logical rules shown in Figure 3. These relations correspond to the antecedent atoms of the logical rule. The formed relations provide inference over the contents of the KB.

4. Building a Graphical User Interface based on the contents of a KB

The dynamic graphical user interface (GUI) mechanism is used to simplify work with the KB for untrained users and to control user input [11, 13, 21].

To build a GUI based on the contents of the KB, the KB entities must be mapped to GUI elements. Formally, the GUI model can be represented as follows:

$$UI = \langle L, C, I, P, S \rangle, \qquad (2)$$

where $L = \{L_1, L_2, \ldots, L_n\}$ is a set of graphical GUI components (for example, ListBox, TextBox, ComboBox, etc.), $C = \{C_1, C_2, \ldots, C_n\}$ is a set of KB classes, $I = \{I_1, I_2, \ldots, I_n\}$ is a set of KB objects, $P = \{P_1, P_2, \ldots, P_n\}$ is a set of properties of KB classes, and $S = \{S_1, S_2, \ldots, S_n\}$ is a set of states of KB objects.

The following function is used to build a GUI based on the content of the KB:

$$\varphi(O): \{C^O, I^O, P^O, S^O, F^O, R^O\}^{T_i} \to \{L^{UI}, C^{UI}, I^{UI}, P^{UI}, S^{UI}\},$$

where $\{C^O, I^O, P^O, S^O, F^O, R^O\}^{T_i}$ is the set of KB entities represented by expression (1) within the i-th context, and $\{L^{UI}, C^{UI}, I^{UI}, P^{UI}, S^{UI}\}$ is the set of GUI entities represented by expression (2).

Thus, the contents of the KB are mapped to a set of GUI components. This makes it easier to work with the KB for a user who has no skills in ontological analysis and knowledge engineering. It also makes it possible to monitor the logical integrity of the user input, which reduces the number of potential input errors.
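One possible shape of such a mapping is sketched below. The choice rule is an assumption made for illustration, not the mapping of the described system: a property whose range is a KB class becomes a ComboBox populated with the objects of that class, and any other property becomes a TextBox. The names `Property`, `FormField`, `build_form` and `is_valid` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Property:
    name: str
    range: str  # a datatype such as "string", or the name of a KB class


@dataclass
class FormField:
    component: str  # "TextBox" or "ComboBox"
    label: str
    options: list   # selectable values; empty for free-text fields


def build_form(class_name, properties, objects_by_class):
    """Map the properties of one KB class to screen-form fields."""
    fields = []
    for prop in properties[class_name]:
        if prop.range in objects_by_class:
            # Object-valued property: restrict input to existing KB objects.
            options = sorted(objects_by_class[prop.range])
            fields.append(FormField("ComboBox", prop.name, options))
        else:
            fields.append(FormField("TextBox", prop.name, []))
    return fields


def is_valid(form_field, value):
    """Reject values that do not refer to an existing KB object."""
    if form_field.component == "ComboBox":
        return value in form_field.options
    return True


# Illustrative data only.
properties = {"Person": [Property("name", "string"), Property("hasFather", "Person")]}
objects_by_class = {"Person": {"Bob", "Ann"}}
form = build_form("Person", properties, objects_by_class)
```

Restricting object-valued fields to a fixed option list is one simple way to obtain the input control mentioned above: a user cannot enter a father who is not already an object of the KB.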

5. Extracting knowledge from wiki-resources

At present, wiki-technologies are used to organize corporate KBs, so the task of extracting knowledge from wiki-resources has to be solved [14, 15, 16, 23, 27]. Table 1 contains the result of mapping the KB entities to the wiki-resource entities [22]. Thus, it becomes possible to import the structure of external wiki-resources for the initial filling of the KB contents.

The content of the KB can also be built by analyzing the content of wiki-resource pages. In this work, the SyntaxNet [25] framework is used to construct a syntactic tree $Synt$ of the content of wiki-resource pages. Then, using a set of rules $Rule^{Synt}$, the syntactic tree $Synt$ is translated into entities of the KB.

Formally, the functions that translate a syntactic tree into entities of the KB are:

$$Struct(Synt): \{N^{Synt}, Rule^{Synt}_{Struct}\} \to \{C^O, P^O, R_P^O\}^{T_i},$$

$$Content(Synt): \{N^{Synt}, Rule^{Synt}_{Content}\} \to \{I^O, S^O, R_I^O, R_S^O\}^{T_i},$$

where $N^{Synt}$ is the set of nodes of the syntactic tree $Synt$, $Rule^{Synt}_{Struct}$ is the set of rules for translating nodes of the syntactic tree into structure entities of the KB, $Rule^{Synt}_{Content}$ is the set of rules for translating nodes of the syntactic tree into content entities of the KB, $\{C^O, P^O, R_P^O\}^{T_i}$ is the set of structure entities of the KB within the context $T_i$ (eq. 1), and $\{I^O, S^O, R_I^O, R_S^O\}^{T_i}$ is the set of content entities of the KB within the context $T_i$ (eq. 1).

Formally, the rules for translating nodes of the syntactic tree into entities of the KB are:

$$Rule^{Synt}_{Struct} = \left( N_1^{Synt}, N_2^{Synt}, \ldots, N_i^{Synt}, \ldots, N_n^{Synt} \right) \to \{C^O, P^O, R_P^O\},$$

$$Rule^{Synt}_{Content} = \left( N_1^{Synt}, N_2^{Synt}, \ldots, N_i^{Synt}, \ldots, N_m^{Synt} \right) \to \{I^O, S^O, R_I^O, R_S^O\},$$

where $N_i^{Synt}$ is the i-th node of the syntactic tree.

Thus, it becomes possible to extract knowledge from the structure of a wiki-resource and from the contents of its pages, and to present the extracted knowledge as content of the KB.
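As an illustration of how one such rule could look, the sketch below applies a single hand-written pattern ("X is a Y") to an already parsed sentence. Everything here is an assumption for illustration: the parse is typed in by hand rather than produced by SyntaxNet, Universal Dependencies style labels (`nsubj`, `cop`, `PROPN`, `NOUN`) are assumed, and `Node`, `copular_pairs`, `struct_rule` and `content_rule` are hypothetical names, not the rules of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    index: int
    word: str
    pos: str   # part-of-speech tag
    dep: str   # dependency label
    head: int  # index of the head node, -1 for the root


def copular_pairs(nodes):
    """Yield (subject, predicate noun) pairs for 'X is a Y' patterns."""
    by_index = {n.index: n for n in nodes}
    for n in nodes:
        if n.dep != "nsubj":
            continue
        head = by_index[n.head]
        has_copula = any(c.dep == "cop" and c.head == head.index for c in nodes)
        if head.pos == "NOUN" and has_copula:
            yield n, head


def struct_rule(nodes):
    """Structure entities: the predicate noun becomes a KB class."""
    return {"classes": {head.word for _, head in copular_pairs(nodes)}}


def content_rule(nodes):
    """Content entities: a proper-noun subject becomes an object of that class."""
    objects, class_object = set(), set()
    for subj, head in copular_pairs(nodes):
        if subj.pos == "PROPN":
            objects.add(subj.word)
            class_object.add((head.word, subj.word))
    return {"objects": objects, "class_object": class_object}


# Hand-written parse of "Neo4j is a database".
sentence = [
    Node(0, "Neo4j", "PROPN", "nsubj", 3),
    Node(1, "is", "AUX", "cop", 3),
    Node(2, "a", "DET", "det", 3),
    Node(3, "database", "NOUN", "root", -1),
]
structure = struct_rule(sentence)
content = content_rule(sentence)
```

The split into `struct_rule` and `content_rule` follows the two functions $Struct(Synt)$ and $Content(Synt)$: the first contributes classes, the second contributes objects and "class-object" relations.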

6. Extracting knowledge from relational databases

Relational databases are widely used for data storage and contain a description of the subject area in the form of interconnected tables. Researchers from various scientific groups are currently working on the problem of extracting knowledge from relational databases.

The relational data model can be represented as the following expression:

$$RDM = (E, R),$$

where $E = \{E_1, E_2, \ldots, E_n\}$ is a set of database tables (entities) and $R = \{R_1, R_2, \ldots, R_i, \ldots, R_n\}$ is a set of relationships between database tables:

$$R_i = E_j \, \frac{F(x)}{G(x)} \, E_k,$$

where $E_j$, $E_k$ are database entities, $F(x)$ is the relationship between entity $E_j$ and entity $E_k$, and $G(x)$ is the relationship between entity $E_k$ and entity $E_j$.

The functions $F(x)$ and $G(x)$ take values in $\{U, N\}$, where $U$ denotes a single relationship and $N$ denotes a multiple relationship. Special functions are used to map the structure of a relational database to the structure of the KB:

$$Struct(RDM): \{E^{RDM}, R^{RDM}\} \to \{C^O, P^O, R_P^O\}^{T_i},$$

$$Content(RDM): \{E^{RDM}, R^{RDM}\} \to \{I^O, S^O, R_I^O, R_S^O\}^{T_i},$$

where $\{E^{RDM}, R^{RDM}\}$ is the set of entities of the relational database and the relationships between them, $\{C^O, P^O, R_P^O\}^{T_i}$ is the set of structure entities of the KB within the context $T_i$ (eq. 1), and $\{I^O, S^O, R_I^O, R_S^O\}^{T_i}$ is the set of content entities of the KB within the context $T_i$ (eq. 1).

Data are imported from a relational database into the KB once the structure of the relational database has been mapped to the set of structure entities $\{C^O, P^O, R_P^O\}^{T_i}$ of the KB. The set of content entities $\{I^O, S^O, R_I^O, R_S^O\}^{T_i}$ of the KB is created while the data (row sets) are imported from the relational database into the context $T_i$. Table 2 contains a comparison of KB entities with relational database entities.

Thus, it becomes possible to extract knowledge from the contents of relational databases and present the extracted knowledge as a content of KB.
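The two-step import can be sketched with Python's built-in `sqlite3` module. The correspondence used here (table to class, column to class property, row to object, cell value to state) is an assumption chosen for illustration and may differ from the actual mapping given in Table 2; the `person` table and the functions `import_structure` and `import_content` are hypothetical.

```python
import sqlite3


def import_structure(conn):
    """Struct(RDM) sketch: tables become classes, columns become properties."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return {
        t: [col[1] for col in conn.execute(f'PRAGMA table_info("{t}")')]
        for t in tables
    }


def import_content(conn, structure):
    """Content(RDM) sketch: rows become objects, cell values become states."""
    objects = []
    for table, columns in structure.items():
        for row in conn.execute(f'SELECT * FROM "{table}"'):
            objects.append({"class": table, "states": dict(zip(columns, row))})
    return objects


# Illustrative in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "id INTEGER PRIMARY KEY, name TEXT, father_id INTEGER REFERENCES person(id))"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)", [(1, "Bob", None), (2, "Ann", 1)]
)

structure = import_structure(conn)            # step 1: structure entities
objects = import_content(conn, structure)     # step 2: content entities
```

The content step takes the result of the structure step as input, which reflects the ordering described above: objects and states can only be created after the classes and properties they refer to exist. Foreign keys such as `father_id` are imported here as plain values; turning them into KB relations would be a further step.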

7. Conclusion

Thus, using a KB stored in a graph DBMS in the decision support process presupposes a certain set of mechanisms: organization of inference over the content of the KB by translating SWRL rules into Cypher structures; building a graphical user interface based on the contents of the KB; automated import of knowledge from the structure and content of wiki-resources; automated import of knowledge from relational databases.

These mechanisms make it possible to automate the learning process of the KB and to simplify the work of specialists with the KB. Applying a contextual approach to knowledge storage raises the effectiveness of subject ontologies, since the KB can be adapted to the characteristics of the PrA and to the requirements of specialists. This approach provides specialists with a convenient software tool whose interface changes dynamically depending on the contents of the KB.

8. References

[1] Berant J, Chou A, Frostig R and Liang P 2013 Semantic parsing on Freebase from question-answer pairs Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 1533-1544

[2] Bianchini D, De Antonellis V, Pernici B and Plebani P 2005 Ontology-based methodology for e-service discovery Information Systems 31 361-380

[3] Bobillo F and Straccia U 2008 FuzzyDL: an expressive fuzzy description logic reasoner Proceedings of the 17th IEEE International Conference on Fuzzy Systems 923-930

[4] Bobillo F and Straccia U 2010 Representing fuzzy ontologies in OWL 2 Proceedings of the 19th IEEE International Conference on Fuzzy Systems 2695-2700

[5] Dentler K, Cornet R, ten Teije A and de Keizer N 2011 Comparison of reasoners for large ontologies in the OWL 2 EL profile Semantic Web 2 71-87

[6] Falbo R A, Quirino G K, Nardi J C, Barcellos M P, Guizzardi G and Guarino N 2016 An ontology pattern language for service modeling Proceedings of the 31st Annual ACM Symposium on Applied Computing 321-326

[7] Farid D M, Al-Mamun M A, Manderick B and Nowe A 2016 An adaptive rule-based classifier for mining big biological data Expert Systems with Applications 64 305-316

[8] Gao M and Liu C 2005 Extending OWL by fuzzy description logic Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence 562-567

[9] Guarino N and Musen M A 2015 Ten years of Applied Ontology Applied Ontology 10 169-170

[10] Guizzardi G, Guarino N and Almeida J P A 2016 Ontological Considerations About the Representation of Events and Endurants in Business Models International Conference on Business Process Management 20-36

[11] Hattori S and Takama Y 2014 Recommender System Employing Personal-Value-Based User Model J. Adv. Comput. Intell. Intell. Inform. 18 157-165

[12] Neo4j (Access mode: https://neo4j.com/product) (14.05.2018)

[13] Ltifi H, Kolski C, Ayed M B and Alimi A M 2013 A human-centred design approach for developing dynamic decision support system based on knowledge discovery in databases Journal of Decision Systems 22 69-96

[14] Mikhaylov D V, Kozlov A P and Emelyanov G M 2015 An approach based on TF-IDF metrics to extract the knowledge and relevant linguistic means on subject-oriented text sets Computer Optics 39(3) 429-438 DOI: 10.18287/0134-2452-2015-39-3-429-438

[15] Mikhaylov D V, Kozlov A P and Emelyanov G M 2016 Extraction of knowledge and relevant linguistic means with efficiency estimation for the formation of subject-oriented text sets Computer Optics 40(4) 572-582 DOI: 10.18287/2412-6179-2016-40-4-572-582

[16] Mikhaylov D V, Kozlov A P and Emelyanov G M 2017 An approach based on analysis of n-grams on links of words to extract the knowledge and relevant linguistic means on subject-oriented text sets Computer Optics 41(3) 461-471 DOI: 10.18287/2412-6179-2017-41-3-461-471

[17] Pellet Framework (Access mode: http://github.com/stardog-union/pellet) (14.05.2018)

[18] Rajpathak D, Chougule R and Bandyopadhyay P 2012 A domain-specific decision support system for knowledge discovery using association and text mining Knowledge and Information Systems 31 405-432

[19] Renu R S, Mocko G and Koneru A 2013 Use of Big Data and Knowledge Discovery to Create Data Backbones for Decision Support Systems Procedia Computer Science 20 446-453

[20] Rubiolo M, Caliusco M L, Stegmayer G, Coronel M and Fabrizi M G 2012 Knowledge discovery through ontology matching: An approach based on an Artificial Neural Network model Information Sciences 194 107-119

[21] Ruy F B, Reginato C C, Santos V A, Falbo R A and Guizzardi G 2015 Ontology Engineering by Combining Ontology Patterns 34th International Conference on Conceptual Modeling 173-186

[22] Shestakov V K 2011 Development and maintenance of information systems based on ontology and Wiki-technology Advanced Methods and Technologies, Digital Collections 299-306 (in Russian)

[23] Suchanek F M, Kasneci G and Weikum G 2007 YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia Proceedings of the 16th International Conference on World Wide Web 697-706

[24] SWRL: A Semantic Web Rule Language Combining OWL and RuleML (Access mode: https://www.w3.org/Submission/SWRL) (14.05.2018)

[25] SyntaxNet: Neural Models of Syntax (Access mode: https://github.com/tensorflow/models/tree/master/research/syntaxnet) (14.05.2018)

[26] Yarushkina N, Filippov A and Moshkin V 2017 Development of the Unified Technological Platform for Constructing the Domain Knowledge Base Through the Context Analysis Creativity in Intelligent Technologies and Data Science 62-72

[27] Zarubin, Koval, Filippov and Moshkin 2017 Application of Syntagmatic Patterns to Evaluate Answers to Open-Ended Questions Creativity in Intelligent Technologies and Data Science 150-162

Acknowledgments

This work was financially supported by the Russian Foundation for Basic Research (Grant No. 16-47-732054).