Combining Ontologies and Markov Logic Networks for Statistical Relational Mobile Network Analysis

Kasper Apajalahti¹, Eero Hyvönen¹, Juha Niiranen², Vilho Räisänen³

¹ Aalto University, Semantic Computing Research Group (SeCo), Finland
² University of Helsinki, Department of Mathematics and Statistics, Finland
³ Nokia Networks Research, Finland

Abstract. Mobile networks are managed by means of operations support systems (OSS), which facilitate performance, fault, and configuration management. Network complexity is increasing due to the heterogeneity of cell types, devices, and applications. Characterizing and configuring networks optimally in such a scenario is a challenging task. This paper introduces an experimental platform that combines statistical relational learning and semantic technologies by integrating a mobile network simulator, a Markov Logic Network (MLN) model, and an OWL 2 ontology into a runtime environment tool. Our experiments, based on a prototype implementation, indicate that the combination of an ontology and an MLN model can be utilized in network status characterization, optimization, and visualization.

1 Introduction

Mobile networks have become a crucial part of our society, and their significance will increase further, as the number of users, devices, and applications is expected to increase drastically [5]. Thus, future 5G networks need to cater for a massive increase in data volume and in the number of terminals, the latter due to the widespread adoption of the Internet of Things (IoT) [12].

Networks need to be configured optimally to provide customers with high service quality at low operational expense. In legacy systems, network configurations have been handled manually, but due to the increasing complexity of networks, more automation is needed from operations support systems (OSS) [9].
A currently researched and standardized technology in the telecommunication field is Self-Organizing Networks (SON), which is essentially a closed-loop agent system reacting to measurements, typically by means of a fixed rule base [9]. The challenge with this approach is creating and maintaining the rule bases in view of the geographic and temporal variety at the cell level of radio access networks.

Machine Learning (ML) is expected to increase the level of automation in the OSS field by, for example, analyzing traffic patterns and cell-related data to learn statistical correlations. The output of an ML system can be characterized as hypothetical, in contrast to the deterministic results of a traditional rule-based system. In the long run, it has been argued that knowledge models would benefit future telecommunication systems both in view of systems design and from the perspective of value networks [18].

In this paper we propose a new approach to automated mobile network management that uses statistical relational learning with a Markov Logic Network (MLN) model [20] for handling uncertainty in mobile network analysis. In addition to the MLN model, we propose an OWL 2 ontology on top of it to provide global meaning and description logic (DL) reasoning capabilities to the system.

The paper is organized as follows: after a short overview of our system in section 2, section 3 briefly presents the MLN and how it is applied in our implementation. Section 4 then describes our OWL 2 model, and section 5 presents dataset statistics and shows how MLN reasoning results can be examined via an RDF-based GUI. Finally, section 6 discusses related work, and section 7 concludes the paper and presents ideas for future work.

2 System Overview

The novel idea of our system is to combine modeling of uncertainty and semantics in an automated OSS system.
The system contains a mobile network (LTE) simulator, an MLN model for statistical reasoning, an OWL 2 model for semantic modeling, and a GUI for presenting and interacting with the system. We simulate a small urban area with 5000 citizens (network users) and 32 cells.

The interface between the LTE simulator and the MLN model covers OSS management activities, such as reading performance data from the simulator and sending configuration data back to it. The performance data contains key performance indicators (KPIs) for various measurement cases. The KPIs utilized in the MLN model are the channel quality indicator (CQI), which measures the signal quality of a cell, and radio link failures (RLF), which measure the number of connection failures per cell. Configuration management covers changes in the transmission power (TXP) and the angle (remote electrical tilt, RET) of a cell antenna.

The MLN model processes the performance data into evidence that is used in the MLN reasoning. The reasoner infers posterior marginal probabilities for potential network configuration changes. The model parameters of the MLN reasoner can be fitted based on historical performance data and executed configuration actions.

The OWL 2 model is constructed by transforming the MLN reasoner's model into a SHIF OWL 2 DL ontology that can be utilized with the Pellet reasoner¹. In addition to providing DL reasoning capabilities, the OWL 2 ontology is published as an RDF graph in a SPARQL endpoint. An operator interface for managing the system and the underlying mobile network is implemented in HTML5 on top of the SPARQL endpoint.

3 Markov Logic Network Model

This section briefly introduces Markov logic [20] by giving a definition of the MLN, explaining the inference of marginal probabilities, and describing how the MLN is adapted to our OSS system.
¹ https://github.com/Complexible/pellet

3.1 Definition

MLNs allow uncertain and contradictory knowledge in a first-order logic (FOL) model by introducing a weight parameter for each formula in the FOL knowledge base. The weighted set of formulas defines a template for a Markov network, in which the cliques and clique potentials are determined by the formulas and formula weights.

Definition 1. A Markov logic network $L$ is a set of pairs $(F_i, w_i)$, where $F_i$ is a first-order formula and $w_i$ is a real-valued weight parameter. Together with a set $C$ of constant terms, over which the formulas in $L$ are applied, it defines a Markov network $M_{L,C}$ with a binary variable for each possible grounding of each predicate appearing in $L$ and a feature for each possible grounding of each formula in $L$. The value of the feature corresponding to a grounding of formula $F_i$ is 1 if the ground formula is true, and 0 otherwise. The weight of the feature is $w_i$, the weight associated with $F_i$ in $L$.

Each state of the variables in a Markov network $M_{L,C}$ represents a possible world, i.e., a truth assignment for each of the ground atoms for $(L, C)$. The probability distribution over possible worlds $x \in X$ specified by $M_{L,C}$ is

$$P(X = x) = \frac{1}{Z} \exp\left( \sum_i w_i n_i(x) \right), \qquad (1)$$

where $n_i(x)$ is the number of true groundings of $F_i$ in $x$ and $Z$ is a partition function given by $Z = \sum_{x \in X} \exp\left( \sum_i w_i n_i(x) \right)$.

Intuitively, this means that the weights of the true ground formulas give the logarithmized factors of the distribution function. If two worlds differ only in the truth value of a single ground formula, then the weight of that formula gives the logarithmic odds of choosing one world over the other.

3.2 Inference

A typical inference task is to infer the most likely state or a marginal distribution of some subset of the variables using the values of all other variables as evidence. In practice, exact inference over an MLN model is infeasible. Richardson and Domingos [20] introduce an efficient stochastic algorithm for this problem.
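
For a model small enough to enumerate, Equation (1) can be evaluated directly. The following is a minimal sketch of such exact inference by enumeration, not the stochastic algorithm of [20]. The single illustrative rule HighRlf ⇒ (DecRlf ⇔ IncTxp) with weight 0.5, and the names ATOMS, FORMULAS, unnormalized, worlds, and marginal, are assumptions introduced only for this example.

```python
import itertools
import math

# Hypothetical toy model: three ground atoms and one weighted ground formula.
ATOMS = ["HighRlf", "DecRlf", "IncTxp"]


def rule(world):
    # HighRlf => (DecRlf <=> IncTxp)
    return (not world["HighRlf"]) or (world["DecRlf"] == world["IncTxp"])


# Pairs (w_i, F_i); each formula has a single grounding here, so n_i(x) is 0 or 1.
FORMULAS = [(0.5, rule)]


def unnormalized(world):
    # exp(sum_i w_i * n_i(x)), the numerator of Equation (1)
    return math.exp(sum(w for w, f in FORMULAS if f(world)))


def worlds(evidence):
    # Enumerate all truth assignments that agree with the evidence.
    free = [a for a in ATOMS if a not in evidence]
    for values in itertools.product([False, True], repeat=len(free)):
        world = dict(evidence)
        world.update(zip(free, values))
        yield world


def marginal(query, evidence):
    # P(query = True | evidence), normalizing over the worlds consistent
    # with the evidence.
    z = 0.0
    num = 0.0
    for world in worlds(evidence):
        p = unnormalized(world)
        z += p
        if world[query]:
            num += p
    return num / z


if __name__ == "__main__":
    print(marginal("IncTxp", {"HighRlf": True, "DecRlf": True}))
```

With HighRlf and DecRlf given as evidence, the two remaining worlds differ only in the truth value of the single ground formula, so the marginal of IncTxp equals e^0.5 / (1 + e^0.5), which is about 0.62. This matches the log-odds reading of the weights given in section 3.1. Enumeration grows exponentially in the number of ground atoms, which is why it is only usable for toy models.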
3.3 Application in OSS setting

We define our MLN model in terms of three types of predicates:

– Context predicates indicate the current status of the network and the environment. A context predicate can indicate, for example, that some KPI value for a cell is currently below the acceptable level or that two cells are neighbors in the network topology.
– Objective predicates indicate required changes to KPI values to achieve performance targets defined by the operator. For example, an objective predicate can indicate that a particular KPI value for some cell is too low and needs to be increased.
– Action predicates indicate changes to network configuration parameter values.

Each predicate represents an attribute of a cell in the network or a relation among the cells. The domain of a predicate can be either the set of cells X or an n-ary Cartesian product of X.

The MLN model is composed of rules with these predicates. We wish the rules to describe a correlation between a set of Objectives and a set of Actions in a certain Context. A typical inference task is to query for appropriate actions using the current context data and objective requirements as evidence. Therefore, the rule format is defined as

C(x) ⇒ (O(x) ⇔ A(x)).

Example 1. Let L be a simple MLN model consisting of the weighted rules defined in Table 1. Here variables c and d denote cells in the mobile network.

w_i   F_i
0.5   HighRlf(c) ⇒ (DecRlf(c) ⇔ IncTxp(c))
0.2   HighRlf(c) ⇒ (DecRlf(c) ⇔ DecRet(c))
0.7   (Neighbor(c, d) ∧ LowCqi(c)) ⇒ (IncCqi(c) ⇔ (DecRet(c) ∧ IncTxp(d)))

Table 1. Examples of weighted rules in the MLN model

Suppose that we have a mobile network with two neighbor cells named C1 and C2 and that we have measured a low CQI value for cell C1 and a high RLF value for cell C2. We would like to use this information to infer proper configuration actions that bring the CQI and RLF values to a normal level.
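
The grounding step of Definition 1 can be sketched for this example as follows. The code is a minimal illustration under stated assumptions: the rules of Table 1 are kept as format-string templates (a representation chosen here for brevity, not taken from the system), every combination of constants is substituted for the variables, including c = d, and the names CELLS, RULES, and ground are hypothetical.

```python
import itertools

# Constants of Example 1.
CELLS = ["C1", "C2"]

# (weight, variables, template) transcribed from Table 1.
# ASCII stand-ins: => implication, <=> equivalence, ^ conjunction.
RULES = [
    (0.5, ("c",), "HighRlf({c}) => (DecRlf({c}) <=> IncTxp({c}))"),
    (0.2, ("c",), "HighRlf({c}) => (DecRlf({c}) <=> DecRet({c}))"),
    (
        0.7,
        ("c", "d"),
        "(Neighbor({c},{d}) ^ LowCqi({c})) => "
        "(IncCqi({c}) <=> (DecRet({c}) ^ IncTxp({d})))",
    ),
]


def ground(rules, constants):
    # Substitute every combination of constants for the rule variables.
    # Each ground formula inherits the weight of its first-order rule.
    grounded = []
    for weight, variables, template in rules:
        for combo in itertools.product(constants, repeat=len(variables)):
            binding = dict(zip(variables, combo))
            grounded.append((weight, template.format(**binding)))
    return grounded


if __name__ == "__main__":
    for weight, formula in ground(RULES, CELLS):
        print(weight, formula)
```

For two cells this yields 2 + 2 + 4 = 8 weighted ground formulas, each of which becomes one feature of the Markov network.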
We use the MLN reasoner to query the MLN model L for marginal probability distributions of the action proposals IncTxp(c), DecTxp(c), IncRet(c), and DecRet(c) for each cell c, given the evidence

E = {Neighbor(C1, C2), LowCqi(C1), HighRlf(C2), IncCqi(C1), DecRlf(C2)}.

An example of the reasoning output is shown in Table 2, which lists the inferred marginal probabilities for cell configurations given the model L and the evidence E. The output indicates that decreasing RET for cell C1 and increasing TXP for cell C2 are the most likely actions to achieve the objectives according to the model.

Action       P(Action | L, E)    Action       P(Action | L, E)
IncTxp(C1)   0.32                IncRet(C1)   0.27
DecTxp(C1)   0.36                DecRet(C1)   0.50
IncTxp(C2)   0.56                IncRet(C2)   0.31
DecTxp(C2)   0.24                DecRet(C2)   0.37

Table 2. Marginal probabilities for cell configurations

4 OWL 2 Model

The OWL 2² ontology generalizes (with URIs and semantic annotation) the terms and variables used in the MLN reasoning into a semantic model. Logically, the ontology consists of two subontologies: the OSS ontology and the MLN ontology. A semantic mapping between the OSS and MLN ontologies enhances the interoperability between the heterogeneous mobile network environment and the MLN rule base extracted from it.

4.1 OSS Ontology

The OSS ontology describes the network context retrieved from the LTE simulator and used by the MLN reasoner as evidence. Figure 1 briefly depicts the TBox structure of the OSS ontology. The most fundamental class in the model is Cell, which defines 1) the network topology with a cellHasNeighbor relation to other Cell instances, 2) configuration parameters with a cellHasParameter relation to Parameter instances, and 3) performance measurements with a cellHasIndicator relation to Indicator instances. The current version of the OSS ontology models the MLN evidence rather than the network context in general. Thus, the only subclasses of Parameter are Txp and Ret, and the only subclasses of Indicator are Rlf and Cqi.
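
As a rough illustration, the class hierarchy and relations described so far can be written down as subject-predicate-object triples. This sketch uses plain Python tuples rather than an RDF library. The prefix oss:, the domain and range statements, and the helper subclasses_of are assumptions made for the example, since the actual namespace and axioms of the ontology are not listed here.

```python
# Hypothetical prefix; the ontology namespace is not given in the text.
OSS = "oss:"

# TBox fragment of the OSS ontology as (subject, predicate, object) triples.
# The domain/range statements are one plausible reading of the description.
TBOX = {
    (OSS + "Txp", "rdfs:subClassOf", OSS + "Parameter"),
    (OSS + "Ret", "rdfs:subClassOf", OSS + "Parameter"),
    (OSS + "Rlf", "rdfs:subClassOf", OSS + "Indicator"),
    (OSS + "Cqi", "rdfs:subClassOf", OSS + "Indicator"),
    (OSS + "cellHasNeighbor", "rdfs:domain", OSS + "Cell"),
    (OSS + "cellHasNeighbor", "rdfs:range", OSS + "Cell"),
    (OSS + "cellHasParameter", "rdfs:domain", OSS + "Cell"),
    (OSS + "cellHasParameter", "rdfs:range", OSS + "Parameter"),
    (OSS + "cellHasIndicator", "rdfs:domain", OSS + "Cell"),
    (OSS + "cellHasIndicator", "rdfs:range", OSS + "Indicator"),
}


def subclasses_of(cls):
    # Direct subclasses only; no transitive closure is computed.
    return sorted(
        s for s, p, o in TBOX if p == "rdfs:subClassOf" and o == cls
    )


if __name__ == "__main__":
    print(subclasses_of(OSS + "Parameter"))
    print(subclasses_of(OSS + "Indicator"))
```
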
The Parameter and Indicator instances can be related to events, such as configurations inferred by the MLN model, by the property cellHasEvent. Events are instances of the class Event, where the value of the property eventHasImpact is an instance of Impact (Decrease or Increase) with a numerical value indicating the current and previous values.

Fig. 1. The OSS ontology TBox

² https://www.w3.org/TR/owl2-overview/

4.2 MLN Ontology

The MLN ontology can be seen as an OWL 2 interpretation of the MLN evidence and action proposals that are semantically bound to mobile network concepts. Figure 2 depicts a TBox model of an MLN rule. The MLNRule class defines a rule; it has a numerical value ruleWeight defining its weight and relations to the rule parts MLNContext, MLNObjective, and MLNAction. The figure also shows that the rule parts are bound to the network classes Parameter, Indicator, and Impact. For example, an MLNAction class has a relation to a subclass of Parameter (e.g. Txp) and to a subclass of Impact (e.g. Increase). Similarly, an MLNObjective has relations to subclasses of Indicator and Impact, and an MLNContext has relations to a subclass of Indicator and to a crisp value of an indicator (High, Medium, or Low).

Fig. 2. Rules in the MLN ontology TBox

Figure 3 shows how an ActionProposal class is modeled in the MLN ontology. Cell has a relation to ActionProposal, whose content is defined with relations to an Impact and a Parameter (for example Increase and Txp). Moreover, a data property hasActionProbability defines the marginal probability of the ActionProposal.

Fig. 3. Action proposals in the MLN ontology TBox

5 Evaluation

The model presented above has been implemented and tested using the LTE simulator. This section evaluates our system by analyzing the sizes of the MLN and OWL models and then showing a visualization of the reasoning outcome.

5.1 Dataset Statistics

Table 3 shows the average sizes of the two models.
As can be seen, the rule base is the major part of both the MLN model (5450 rules) and the OWL 2 ontology (58000 triples). The large number of MLN rules is due to the combinatorial generation of rules. The rule base could be optimized by removing insignificant rules, i.e. rules which have little or no impact on the inference in view of the data used in the analysis. The MLN evidence contains 100 lines, which are processed into 1600 triples, and 15 action proposals are processed into 70 triples. Altogether, the OWL 2 model is approximately 5.7 times bigger in file size than the corresponding MLN model (2.8 MB versus 0.49 MB).

                        Evidence   Rules   Action proposals   File size in MB
MLN model (lines)       100        5450    15                 0.49
OWL 2 model (triples)   1600       58000   70                 2.8

Table 3. Statistics of datasets

5.2 Visualization of the Reasoning Outcome

We have built an HTML5 application [1] to provide an empirical evaluation of the MLN reasoning results (action proposals). The interface uses faceted browsing and interactive visualization methods to support the user's exploration of RDF-based data. Figure 4 displays the latest cell-specific action proposals and their relations to the evidence of the reasoner (the latest KPI values). Selected facet values help the user to examine patterns that might occur between the network context and the MLN reasoning. For example, the figure demonstrates that if a cell has a high RLF value and a low CQI value, this usually implies a proposal to increase the TXP of that cell.

6 Related Work

Reasoning under uncertainty has in the past been performed with a variety of methods, including Markov Networks [11], Bayesian Networks [14], and probabilistic extensions to description logic ontologies such as OWL [7]. A more recent approach is the MLN [20], which makes it possible to compactly define statistical relational models containing both certain and uncertain facts as well as potentially contradictory pieces of data using First Order Logic (FOL).
In the telecommunications field, there have been plenty of research projects which adapt these techniques to different network management tasks. For example, Bayesian Networks have been proposed for automatic network fault management [10][2] and MLNs for diagnosing anomalous cells [4]. In mobile network research, ontologies have been used to model general concepts of the telecommunication field [17] as well as to model context in mobile network management [21][24]. The Linked Open Data (LOD)³ paradigm has also been addressed in [22], where cells and terminals are modeled and combined with other data sources, for example, with event data. However, no research exists on using ontologies and statistical reasoning together to analyze and configure the mobile network, as is done in this paper.

Fig. 4. Visualization for MLN reasoning outcome and evidence

In other problem domains there have been experiments on combining an ontological approach with statistical relational learning. For example, Bayesian Networks and their relational extensions, such as multi-entity Bayesian Networks (MEBN), have been applied with OWL ontologies in order to infer probabilistic results from a certain domain [6]. BN-specific projects have use cases, for example, in medical decision support [25], financial fraud detection [3], and instance matching in a geological domain [15]. MLNs have been applied with semantic technologies mainly in ontology matching [13] and in natural language processing, in which ontological concepts are extracted from text [8][16]. Another model used for problems similar to those addressed by MLNs is the Infinite Hidden Relational Model (IHRM) [23] and its semantic extension, the Infinite Hidden Semantic Model (IHSM), which also combines certain and uncertain facts. This technique has been demonstrated in social network analysis [19].

7 Conclusion and Future Work

We have generated a mapping from our MLN model into a consistent SHIF OWL 2 DL ontology.
The OWL 2 model is currently utilized only as an RDF graph by using SPARQL queries (faceted navigation) in the HTML5 GUI. Advanced DL reasoning tasks have not yet been implemented. The next step in this project is to enhance the OWL-MLN interaction so that MLN model settings can be dynamically modified by a human or a DL reasoner (Pellet). Model settings include the selection of measurement variables, their threshold values, and the rules to be generated from the set of variables. Model settings could even include some initial rule weights with respect to prior knowledge. Moreover, our system will be enhanced by creating high-level goals which the user can use to modify the behaviour of the reasoning system. For example, high-level goals could be mapped to corresponding MLN model settings.

Altogether, the practical reason for combining semantic technologies with statistical relational reasoning was to generalize the representation of the MLN model in order to make it semantically adaptable. The current implementation gives promising results that encourage continuing this work in order to enhance the system towards autonomic computing and towards managing a more dynamic and heterogeneous environment.

³ http://linkeddata.org/

References

1. Apajalahti, K., Hyvönen, E., Niiranen, J., Räisänen, V.: StaRe: Statistical Reasoning Tool for 5G Network Management. Submitted to ESWC 2016 Demo and Poster Papers (May 2016)
2. Bennacer, L., Ciavaglia, L., Chibani, A., Amirat, Y., Mellouk, A.: Optimization of fault diagnosis based on the combination of Bayesian Networks and Case-Based Reasoning. In: Network Operations and Management Symposium (NOMS), 2012 IEEE. pp. 619–622. IEEE (2012)
3. Carvalho, R., Matsumoto, S., Laskey, K., Costa, P., Ladeira, M., Santos, L.: Probabilistic Ontology and Knowledge Fusion for Procurement Fraud Detection in Brazil. In: Bobillo, F., Costa, P., d’Amato, C., Fanizzi, N., Laskey, K., Laskey, K., Lukasiewicz, T., Nickles, M., Pool, M. (eds.)
Uncertainty Reasoning for the Semantic Web II, Lecture Notes in Computer Science, vol. 7123, pp. 19–40. Springer Berlin Heidelberg (2013), http://dx.doi.org/10.1007/978-3-642-35975-0_2
4. Ciocarlie, G., Connolly, C., Cheng, C.C., Lindqvist, U., Nováczki, S., Sanneck, H., Naseer-ul Islam, M.: Anomaly Detection and Diagnosis for Automatic Radio Network Verification. In: Agüero, R., Zinner, T., Goleva, R., Timm-Giel, A., Tran-Gia, P. (eds.) Mobile Networks and Management, Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol. 141, pp. 163–176. Springer International Publishing (2015), http://dx.doi.org/10.1007/978-3-319-16292-8_12
5. Cisco: Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update 2014–2019 White Paper. Tech. rep., Cisco (02 2015), http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white_paper_c11-520862.html
6. Costa, P.C., Laskey, K.B., Laskey, K.J.: PR-OWL: A Bayesian Ontology Language for the Semantic Web. In: Costa, P.C., D’Amato, C., Fanizzi, N., Laskey, K.B., Laskey, K.J., Lukasiewicz, T., Nickles, M., Pool, M. (eds.) Uncertainty Reasoning for the Semantic Web I. pp. 88–107. Springer-Verlag, Berlin, Heidelberg (2008), http://dx.doi.org/10.1007/978-3-540-89765-1_6
7. Ding, Z., Peng, Y.: A Probabilistic Extension to Ontology Language OWL. In: Proceedings of the 37th Annual Hawaii International Conference on System Sciences (HICSS’04) - Track 4 - Volume 4. pp. 40111.1–. HICSS ’04, IEEE Computer Society, Washington, DC, USA (2004), http://dl.acm.org/citation.cfm?id=962752.962957
8. Drumond, L., Girardi, R.: Extracting Ontology Concept Hierarchies from Text Using Markov Logic. In: Proceedings of the 2010 ACM Symposium on Applied Computing. pp. 1354–1358. SAC ’10, ACM, New York, NY, USA (2010), http://doi.acm.org/10.1145/1774088.1774379
9.
Hämäläinen, S., Sanneck, H., Sartori, C.: LTE Self-Organising Networks (SON): Network Management Automation for Operational Efficiency. Wiley Online Library, 1st edn. (2012)
10. Hounkonnou, C., Fabre, E.: Empowering Self-diagnosis with Self-modeling. In: Proceedings of the 8th International Conference on Network and Service Management. pp. 364–370. International Federation for Information Processing (2012)
11. Kindermann, R., Snell, J.L.: Markov Random Fields and Their Applications, Contemporary Mathematics, vol. 1. American Mathematical Society, Providence, R.I., USA (1980)
12. NGMN Alliance: 5G White Paper. Tech. rep. (Feb 2015), https://www.ngmn.org/uploads/media/NGMN_5G_White_Paper_V1_0.pdf
13. Niepert, M., Meilicke, C., Stuckenschmidt, H.: A Probabilistic-Logical Framework for Ontology Matching. In: AAAI. Citeseer (2010)
14. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1988)
15. Poole, D., Smyth, C., Sharma, R.: Semantic Science: Ontologies, Data and Probabilistic Theories. In: da Costa, P., d’Amato, C., Fanizzi, N., Laskey, K., Laskey, K., Lukasiewicz, T., Nickles, M., Pool, M. (eds.) Uncertainty Reasoning for the Semantic Web I, Lecture Notes in Computer Science, vol. 5327, pp. 26–40. Springer Berlin Heidelberg (2008), http://dx.doi.org/10.1007/978-3-540-89765-1_2
16. Poon, H., Domingos, P.: Unsupervised Ontology Induction from Text. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. pp. 296–305. ACL ’10, Association for Computational Linguistics, Stroudsburg, PA, USA (2010), http://dl.acm.org/citation.cfm?id=1858681.1858712
17. Qiao, X., Li, X., Chen, J.: Telecommunications service domain ontology: semantic interoperation foundation of intelligent integrated services. Telecommunications Networks-Current Status and Future Trends pp. 183–210 (2012)
18.
Räisänen, V.: Information and knowledge in telecommunications: a service-dominant view. In: SRII Global Conference (SRII), 2012 Annual. pp. 211–218. IEEE (2012)
19. Rettinger, A., Nickles, M., Tresp, V.: Statistical Relational Learning with Formal Ontologies. In: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II. pp. 286–301. ECML PKDD ’09, Springer-Verlag, Berlin, Heidelberg (2009), http://dx.doi.org/10.1007/978-3-642-04174-7_19
20. Richardson, M., Domingos, P.: Markov Logic Networks. Machine Learning 62(1-2), 107–136 (Feb 2006)
21. Stamatelatos, M., Yahia, I.G.B., Peloso, P., Fuentes, B., Tsagkaris, K., Kaloxylos, A.: Information Model for Managing Autonomic Functions in Future Networks. In: Mobile Networks and Management, pp. 259–272. Springer (2013)
22. Uzun, A., Küpper, A.: OpenMobileNetwork: extending the web of data by a dataset for mobile networks and devices. In: Proceedings of the 8th International Conference on Semantic Systems. pp. 17–24. ACM (2012)
23. Xu, Z., Tresp, V., Yu, K., Kriegel, H.P.: Infinite hidden relational models. arXiv preprint arXiv:1206.6864 (2012)
24. Yahia, B., Grida, I., Bertin, E., Crespi, N.: Ontology-Based Management Systems for the Next Generation Services: State-of-the-Art. In: Networking and Services, 2007. ICNS. Third International Conference on. pp. 40–40. IEEE (2007)
25. Zheng, H.T., Kang, B.Y., Kim, H.G.: An Ontology-Based Bayesian Network Approach for Representing Uncertainty in Clinical Practice Guidelines. In: da Costa, P., d’Amato, C., Fanizzi, N., Laskey, K., Laskey, K., Lukasiewicz, T., Nickles, M., Pool, M. (eds.) Uncertainty Reasoning for the Semantic Web I, Lecture Notes in Computer Science, vol. 5327, pp. 161–173. Springer Berlin Heidelberg (2008), http://dx.doi.org/10.1007/978-3-540-89765-1_10