Description Logics for Interoperability

Enrico Franconi
http://www.inf.unibz.it/~franconi/
Faculty of Computer Science, Free University of Bozen-Bolzano, Italy

Description Logics (DL) [5] are a very promising research area in Knowledge Representation (KR) with applications in databases (DBs). The main effort of research in DL is to provide both theories and systems for expressing structured knowledge and for accessing and reasoning with it in a principled way. Recently, basic progress has been made in establishing the theoretical foundations for the effective use of DL in information systems [8]. DL offer promising formalisms for solving several problems concerning Conceptual Data Modelling and Ontology Design (see, e.g., [7], or the DAML+OIL and OWL efforts [20]), Intelligent Information Access and Query Processing, and Information Integration and Interoperability.

I want to argue that good Conceptual Modelling and Ontology Design is required to support powerful Query Management and to allow for semantics-based Information Integration and Interoperability. Therefore, this short survey is structured into three parts. In the first part, the notions of ontology language and of methodology for conceptual and ontology design are introduced. In the second part, the query management problem in the presence of the previously devised conceptual model is considered: a global framework is introduced, together with various basic tasks involved in information access. In the last part, general issues about integration and interoperability are presented. The most relevant research work carried out in our group is cited in the paper, while for all other relevant citations we refer to [4].

This work has been partially supported by the EU projects Sewasie, KnowledgeWeb, and Interop.

Conceptual Modelling and Ontology Design

For the purpose of this short survey, an Ontology will be considered as a Conceptual Schema expressed in a suitable conceptual data model (i.e., an Ontology Language). Good conceptual data models put their emphasis on the correct and semantically rich representation of complex properties and relations that may exist between documents. They should allow for an abstract representation of data which resembles the way they are actually perceived and used in the real world, thus narrowing (with respect to the more traditional data models) the semantic gap between the domain and its representation.

Conceptual (or Ontology) modelling deals with the question of how to describe in a declarative and reusable way the domain information of an application and its relevant vocabulary, and how to constrain the use of the data by understanding what can be drawn from it. Not surprisingly, conceptual modelling tasks have always been in the mainstream of KR research (see, for example, the research on Ontology representation and design) and can now be considered one of the main applications of KR languages and reasoning techniques. In addition, given the high complexity of the modelling task when complex data is involved, the semantic web field demands more sophisticated and expressive languages than ordinary information systems do. Again, DL research is very active in providing expressive ontology languages to capture various aspects of the information (see, e.g., [1, 3, 2, 11, 15, 12, 6]). A large part of the DL community likes to see a generic ontology language as the generalisation of both the object-oriented data model based on UML class diagrams and the extended Entity-Relationship (EER) semantic data model, closely related to ontology web languages such as DAML+OIL and OWL.
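
To make the reading of a class diagram as a DL ontology concrete, the following is a small illustrative sketch and is not taken from the cited works. The entity names Employee, Manager and Project and the role worksFor are hypothetical, and the binary relationship is modelled directly as a role, which simplifies the general treatment of n-ary EER relationships.

```latex
% Requires amsmath and amssymb. All names are hypothetical.
\begin{align*}
\mathit{Manager} &\sqsubseteq \mathit{Employee}
  && \text{(ISA link)}\\
\exists \mathit{worksFor}.\top &\sqsubseteq \mathit{Employee}
  && \text{(domain of the relationship)}\\
\exists \mathit{worksFor}^{-}.\top &\sqsubseteq \mathit{Project}
  && \text{(range of the relationship)}\\
\mathit{Employee} &\sqsubseteq \exists \mathit{worksFor}.\mathit{Project}
  && \text{(mandatory participation)}\\
\mathit{Employee} &\sqsubseteq \neg \mathit{Project}
  && \text{(disjoint entities)}
\end{align*}
```

From these axioms a reasoner can derive implicit facts, for instance that every Manager works for some Project, since Manager is subsumed by Employee and every Employee participates in worksFor.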
Our work in this direction includes the i.com tool [14, 22], which fully implements an extended conceptual data model generalising UML class diagrams and EER schemas, and which is available online for the evaluation of the principles just described at http://www.inf.unibz.it/~franconi/icom/. i.com allows for the specification of multiple EER (or UML) diagrams and of inter- and intra-schema constraints. The tool employs complete logical reasoning, using an underlying DL inference engine, to verify the specification, infer implicit facts and stricter constraints, and reveal any inconsistencies during the conceptual modelling phase.

Information Access

Only recently has KR research started to take an interest in query processing and information access. Recent work has produced advanced reasoning techniques for query evaluation and rewriting using views under the constraints given by the ontology, also called view-based query processing. This means that the notion of accessing information through the navigation of an Ontology modelling the document's domain (which can be seen as a conceptual schema) has formal foundations. I will thus consider DL for formalising not only the ontology but also the query processing.

The (DL-based) conceptual schema as defined in the previous section can be seen as a set of constraints over a vocabulary which is usually richer than the logical schema of the information system it is modelling. In some sense, quite often the conceptual schema plays the role of a general ontology of the domain, very close to the user's rich vocabulary, rather than of a set of constraints over the poor logical vocabulary structuring the data. With this perspective in mind, the user would prefer to query the information system using the richer vocabulary of the ontology [9].
The vocabulary of the basic data (i.e., the logical schema) could in turn be seen either as a subset of the conceptual vocabulary (the simplistic view) or, more generally, as a set of (materialised) views over the vocabulary of the ontology. In the latter case, however, we have to solve the problem of view-based query processing. The problem is to answer a query posed to a database (the one defined by the ontology) only on the basis of the information in a set of (materialised) views, which are themselves queries over the same database. In the process, the information contained in the conceptual schema of the database should of course be taken into account. Two approaches to view-based query processing exist, namely query rewriting and query answering (see, e.g., [25]).

This framework can be used to characterise several aspects of an information system to support interoperability. In query optimisation, view-based query processing is relevant because using the views may speed up query processing. In data integration, the views represent the only information sources accessible to answer a query. A data warehouse can be seen as a set of materialised views, and, therefore, query processing reduces to view-based query answering. Finally, since the views provide partial knowledge about the database, view-based query processing can be seen as a special case of query answering with incomplete information.

Information Integration and Interoperability

In this last part I will hint at how the technologies introduced in the first two parts, namely a very expressive ontology language and view-based query processing over it, can be used in the framework of Information Integration [16, 21, 22, 24].

Suppose we have multiple databases to be integrated. Each database will have its own conceptual schema and logical schema, where, as seen in the previous part, the logical schema is just a set of views over the conceptual schema (local-as-view approach).
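
The local-as-view reading of view-based query answering described above can be sketched in a few lines of Python. This is only a minimal illustration under strong assumptions, not the algorithm of [25]: the ontology is restricted to ISA links between atomic concepts, each view is sound and defined by a single atomic concept, and the query is an atomic concept. All concept names, view names and data (Manager, v_staff, ann, and so on) are hypothetical.

```python
def superclasses(isa, concept):
    """Return `concept` together with every concept that subsumes it."""
    seen, stack = {concept}, [concept]
    while stack:
        for parent in isa.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def certain_answers(isa, view_defs, view_data, query_concept):
    """Answer an atomic-concept query using only the materialised views.

    A view contributes its tuples when the concept defining it is
    subsumed by the query concept under the ISA constraints.
    """
    answers = set()
    for view, concept in view_defs.items():
        if query_concept in superclasses(isa, concept):
            answers |= view_data.get(view, set())
    return answers


# Hypothetical ontology: Manager ISA Employee ISA Person.
isa = {"Manager": ["Employee"], "Employee": ["Person"]}

# Hypothetical logical schema: each view is defined over the ontology.
view_defs = {
    "v_managers": "Manager",
    "v_staff": "Employee",
    "v_projects": "Project",
}
view_data = {
    "v_managers": {"ann"},
    "v_staff": {"bob"},
    "v_projects": {"p1"},
}

print(sorted(certain_answers(isa, view_defs, view_data, "Employee")))
# prints ['ann', 'bob']
```

The query is posed in the vocabulary of the ontology, yet it is answered from the views alone: the tuple ann is returned for Employee only because the ISA constraint is taken into account, which is the role the text assigns to the conceptual schema.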
We assume that each symbol of each schema is identified by a unique global symbol, i.e., the various databases have disjoint signatures. Interdependencies between entities and relationships in different schemas are represented by means of integrity constraints involving symbols of the schemas. Such interdependencies are called coordination formulas, and they take the form of inclusion dependencies expressed in a suitable view language (e.g., a DL itself, or an SPJ query language). The union of the various schemas with the coordination formulas and the local views forms the global integrated schema, or the mediator. It is worth noting that the integration process is incremental, since the integrated schema can be monotonically refined as soon as there is new understanding of the different component schemas, and that the resulting unified schema is strongly dependent on (actually, it includes) the schemas of the single information sources.

This approach gives both a clear semantics to the integration process of ontologies, and a calculus for deriving inconsistencies and checking the validity of integrity constraints in the integrated schema. Most importantly, in this framework global queries can be defined as views over single ontologies, or they can be generalised to span multiple ontologies. The view-based query processing mechanism will guarantee the correct answer to the global query from the local sources.

In [23] a comparison is given between the above local-as-view approach to processing global queries and the global-as-view approach, which is more common in current information integration architectures.

More recent research work is dealing with the application of the above ideas in a peer-to-peer interoperability framework [17, 19, 18].
Here, the differences from the classical framework are the following: (a) the role of the coordination formulas between nodes is for data migration (as opposed to the role of logical constraints in classical data integration systems); (b) computation is delegated to single nodes (distributed local computation); (c) the topology of the network may dynamically change; (d) local inconsistency does not propagate; (e) computational complexity can be low.

References

1. A. Artale and E. Franconi. A survey of temporal extensions of description logics. Annals of Mathematics and Artificial Intelligence, 30(1-4), 2001.
2. A. Artale and E. Franconi. Temporal description logics. In Dov Gabbay, Michael Fisher, and Lluis Vila, editors, Handbook of Temporal Reasoning in Artificial Intelligence. Elsevier, “Foundations of Artificial Intelligence” series, 2004.
3. A. Artale, E. Franconi, F. Wolter, and M. Zakharyaschev. A temporal description logic for reasoning over conceptual schemas and queries. In Proc. of the 8th European Conference on Logics in Artificial Intelligence (JELIA-2002), 2002.
4. F. Baader, D. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, 2002.
5. F. Baader and W. Nutt. Basic description logics. In Baader et al. [4].
6. Franz Baader, Ralph Kuesters, and Frank Wolter. Extensions to description logics. In Baader et al. [4].
7. A. Borgida and R. J. Brachman. Conceptual modelling with description logics. In Baader et al. [4].
8. A. Borgida, M. Lenzerini, and R. Rosati. Description logics for databases. In Baader et al. [4].
9. Tiziana Catarci, Tania Di Mascio, Enrico Franconi, Giuseppe Santucci, and Sergio Tessaris. An ontology based visual tool for query formulation support. In Proc. of ODBASE’03, International Conference on Ontologies, Databases and Applications of SEmantics, November 2003.
10. E. Franconi, F. Baader, U. Sattler, and P. Vassiliadis. Multidimensional data models and aggregation. In Jarke et al. [21], chapter 5, pages 87–106.
11. E. Franconi, F. Grandi, and F. Mandreoli. A semantic approach for schema evolution and versioning in object-oriented databases. In Proc. of the 1st International Conf. on Computational Logic (CL’2000), DOOD stream. Springer-Verlag, July 2000.
12. E. Franconi, F. Grandi, and F. Mandreoli. Description logics for modelling dynamic information. In Jan Chomicki, Ron van der Meyden, and Gunter Saake, editors, Logics for Emerging Applications of Databases. Springer-Verlag, 2003.
13. E. Franconi and M. Kifer, editors. Proceedings of the 6th International Workshop on Knowledge Representation meets Databases (KRDB’99). Linköping University Technical Report, July 1999. Also electronically available as CEUR Publication, Vol. 21, RWTH Aachen, Germany.
14. E. Franconi and G. Ng. The ICOM tool for intelligent conceptual modelling. In Proc. of the 7th International Workshop on Knowledge Representation meets Databases (KRDB’2000), 2000.
15. E. Franconi and U. Sattler. A data warehouse conceptual data model for multidimensional aggregation. In Proceedings of the Workshop on Design and Management of Data Warehouses (DMDW’99), 1999.
16. Enrico Franconi, Ken Barker, and Diego Calvanese, editors. Proceedings of the International Workshop on Foundations of Models for Information Integration (FMII-2001). Springer-Verlag, 2001.
17. Enrico Franconi, Gabriel Kuper, Andrei Lopatenko, and Luciano Serafini. A robust logical and computational characterisation of peer-to-peer database systems. In Proc. of the VLDB International Workshop On Databases, Information Systems and Peer-to-Peer Computing (DBISP2P-2003), 2003.
18. Enrico Franconi, Gabriel Kuper, Andrei Lopatenko, and Ilya Zaihraeu. The coDB robust peer-to-peer database system. In Proc. of the 2nd Workshop on Semantics in Peer-to-Peer and Grid Computing, 2004.
19. Enrico Franconi, Gabriel Kuper, Andrei Lopatenko, and Ilya Zaihraeu. A distributed algorithm for robust data sharing and updates in p2p database networks. In Proc. of the EDBT International Workshop on Peer-to-Peer Computing and Databases, 2004.
20. Ian Horrocks, Deborah L. McGuinness, and Christopher A. Welty. Digital libraries and web based information systems. In Baader et al. [4].
21. M. Jarke, M. Lenzerini, Y. Vassiliou, and P. Vassiliadis, editors. Fundamentals of Data Warehousing. Springer-Verlag, 1999.
22. Mathias Jarke, V. Quix, D. Calvanese, Maurizio Lenzerini, Enrico Franconi, S. Ligoudistiano, P. Vassiliadis, and Yannis Vassiliou. Concept based design of data warehouses: The DWQ demonstrators. In 2000 ACM SIGMOD International Conference on Management of Data, May 2000.
23. Maurizio Lenzerini. Data integration: A theoretical perspective. In Proc. of PODS-2002, pages 233–246, 2002.
24. Martin Peim, Enrico Franconi, and Norman Paton. Applying functional languages in knowledge-based information integration systems. In Peter Gray, Larry Kerschberg, Peter King, and Alex Poulovassilis, editors, The Functional Approach to Data Management. Springer-Verlag, 2004.
25. Martin Peim, Enrico Franconi, Norman Paton, and Carole Goble. Query processing with description logic ontologies over object-wrapped databases. In Proc. of the 14th International Conference on Scientific and Statistical Database Management (SSDBM’02), July 2002.