<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Information System Analysis</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jasmin</forename><surname>Opitz</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">The University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Bijan</forename><surname>Parsia</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">The University of Manchester</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ulrike</forename><surname>Sattler</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">The University of Manchester</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">Information System Analysis</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">1C38545462FDF69A45E97A9F2FAB2351</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T00:50+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Ontology-based data access has received a lot of attention recently, yet there is no clear methodology to evaluate a "semantically enriched" information system in general or an ontology-based data access system in particular. The quality of such an information system clearly depends on how well the data fits the class-level ontology, and how well these two components fit the queries. This paper presents a generic, flexible framework for this kind of analysis: it can be used, e.g., to compare two class-level ontologies w.r.t. their fitness for a given kind of data and query set. We apply the framework to an example case and show how it helps to answer relevant modelling and representation questions.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>In this paper we present a framework for evaluating the quality of "semantically enriched" information systems (IS). By that we mean IS that distinguish between schema and data and are geared towards answering queries. The idea behind such systems is to encapsulate domain experts' background knowledge in the query answering mechanism in order to improve recall and precision. A typical example of such an IS is an ontology-based data access (OBDA) system that uses a class-level ontology (or TBox) as a schema and stores the data in a database. Queries retrieve tuples of individuals from the database that answer the query w.r.t. the schema.</p><p>The proposed evaluation framework measures the well-suitedness of the various components of an IS. It can be applied to any IS that involves a schema, a collection of data, a collection of information requests and a query language (QL). We call this a modelling approach (MA) for an IS. Thus, the framework is generic and can be applied to a variety of scenarios, e.g. for comparing different OBDA systems, for comparing different IS using database schemas, or for comparing heterogeneous systems. OBDA has received a lot of attention recently and comes in many different forms, e.g. regarding the expressive power of the ontology or the supported query language <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b0">1,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b8">9]</ref>.</p><p>When applying the framework we look at information requests as abstractions of queries (they are independent of schema and QL). For each MA the framework measures whether an information request can be answered by a query in a given QL over the given schema and data, and how good that query is. More precisely, the metrics produced by the evaluation framework are the fitness of an MA, i.e. the ability to formulate "good" queries, and the flexibility, i.e. the number of different "good" ways of expressing a query.</p><p>These measurements can help IS designers in taking important design decisions, e.g. whether to use an off-the-shelf schema or one that is specifically tailored to the application, or which OBDA technique or tool to use. The measurements also point out which queries can and cannot be answered w.r.t. a given MA and how complicated it is to formulate these queries. That allows IS designers to identify, compare and discuss weak and strong points of their MA and manage trade-offs between modelling effort, maintainability and scalability.</p><p>In the following we will explain the technical details of the evaluation framework and its measurements. Furthermore, we will outline a case study in which we applied the framework to compare different ontology-based MA for medical image annotations.</p><p>Fig. <ref type="figure" target="#fig_1">1</ref>. A modelling approach plus a query and its answers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Information System Evaluation Framework</head><p>We start by formalising the relevant components of a (semantically enriched) information system for which we are then going to evaluate and compare different modelling approaches. We will use the term "modelling approach" to describe the whole system consisting of data, schema, (an abstraction of) queries, and a query language as depicted in Figure <ref type="figure" target="#fig_1">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Modelling Approach</head><p>A modelling approach MA = (S, D, R, QL) consists of</p><p>- a schema S: a finite description of the semantics of the data, e.g. a database schema, a logic program, or the TBox of an ontology, which can be empty.</p><p>- the data D: e.g. tables and rows in a relational database, ground facts, or ontology ABox assertions.</p><p>- a set of information requests R: each r ∈ R represents the answer to a query of D, and is given as a set (of tuples) over D. Ideally, R should be representative for the queries to be answered by the information system to be built.</p><p>- a query language QL: e.g. SQL, (union of) conjunctive queries, OWL class expressions.</p><p>An information request asks for tuples of the given data that are relevant for the user. The request needs to be distinguished from the actual query, which is a specific manifestation of the information request formulated in QL, see Figure <ref type="figure" target="#fig_1">1</ref>. An information request r can correspond to 0, 1 or more queries in a given query language. It corresponds to no query if there is no query in QL whose answers are exactly the tuples in r when asked over S and D, i.e., if QL is unable to express the information request over the given schema and data. If there are one or more queries, some of them might be more easily expressible than others. In Figure <ref type="figure" target="#fig_1">1</ref>, we sketch a case where the user wants to retrieve three individuals, John, Mary and Steve, from the database that are known to be parents but not all of which are explicitly stored as parents. Still, in the presence of the given schema, the query Parents(?x) can be formulated to retrieve exactly those three individuals.</p><p>The only assumption we make is that the query language QL comes with a semantics that identifies, for a given query q of arity n in QL, data D, and schema S, the set of certain answers <ref type="bibr" target="#b3">[4]</ref>. More precisely, we assume the existence of an entailment relation ⊨, and use Ind(D) for the set of individuals or constants in D to define cert(·) as follows:</p><formula xml:id="formula_0">\mathrm{cert}(q, S, D) = \{ w \in \mathrm{Ind}(D)^n \mid S \cup D \models q(w) \}</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Applying the Framework</head><p>The basic characteristics we want to evaluate are the fitness of an MA, i.e. how well the schema and the data are suited to enable the formulation of "fit" queries for answering the given information requests, and the flexibility of an MA, i.e. the number of "fit" queries that can be formulated for answering the given information requests. The fitness and flexibility of an MA can be determined by analysing the syntactic, semantic and/or cognitive complexity of the queries that correspond to the information requests, and they depend on the fitness function.</p><p>The Fitness Function Different queries that correspond to an information request can vary in length and be more or less complex, e.g. in terms of using relations and constructors such as conjunctions, disjunctions, etc. They can also be more or less difficult to understand from a cognitive perspective. For example, a human user might find a query that uses terms that are actual words (in the sense that they exist in a domain expert's dictionary) easier to understand than one that uses anonymous identifiers. The purpose of the fitness function is to capture this complexity.</p><p>The framework is parametrized with a fitness function f that associates each query q in QL with some value f(q), which we call the query's fitness value. We only require that f maps QL into a totally ordered set (M, &lt;), e.g. ℝ or ℕ⁴. Obvious examples of fitness functions are (i) a query's length, (ii) a query's length combined with the number of constructors involved, either via some (weighted) summation or into a vector, or (iii) a query's length combined with the number of terms not to be found in Wikipedia or a suitable lexicon, or any combinations or extensions of these.</p><p>The smaller the fitness value, the "better" the query. We read f(q) &lt; f(q′) as q being "better" or "fitter" than q′. The framework evaluates the "best queries" for an information request, e.g., the shortest queries. The fitness function induces a partial order on the queries.</p><p>The Query Space Each information request r ∈ R has an associated query space: first, we define correct queries cQ(r, S, D) as those that answer exactly an information request r over S and D<ref type="foot" target="#foot_1">1</ref>:</p><formula xml:id="formula_1">\mathrm{cQ}(r, S, D) = \{ q \mid q \text{ is a QL query and } \mathrm{cert}(q, S, D) = r(D) \}</formula><p>Next, we define best queries bQ(r, S, D, f) as those correct queries that are fittest, i.e. whose fitness value is minimal. Clearly, best queries depend on how we measure fitness, and thus on the fitness function f:</p><formula xml:id="formula_2">\mathrm{bQ}(r, S, D, f) = \{ q \in \mathrm{cQ}(r, S, D) \mid \text{there is no } q' \in \mathrm{cQ}(r, S, D) \text{ with } f(q') &lt; f(q) \}</formula><p>Since the queries in bQ(·) are the "fittest" among the correct queries, any two queries in bQ(·) are equally fit, and we can abbreviate their fitness as follows: for a set {q₁, ..., qₖ} with f(qᵢ) = f(qⱼ) for all i, j, we set f({q₁, ..., qₖ}) to be f(q₁). For an empty set, e.g., if an information request cannot be expressed in QL over S and D, we set f(∅) to the maximum of M w.r.t. &lt; if such a maximum exists, i.e., to maximally unfit, or otherwise to some other very unfit value.</p><p>If we want to consider the flexibility of an MA, we simply need to consider the number of best queries, i.e., the cardinality of bQ(·). Depending on the application domain, we can adapt the framework to consider only non-redundant queries for measuring the flexibility. For example, if S is a class-level OWL ontology and QL is the set of OWL class expressions for instance retrieval, we can count all elements in bQ(·) that are not structurally equivalent <ref type="bibr" target="#b6">[7]</ref>.</p></div>
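<div xmlns="http://www.tei-c.org/ns/1.0"><p>The definitions of cQ and bQ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the certain answers of every candidate query are taken as already computed (e.g. by a reasoner), the candidate set is finite, and the names correct_queries, best_queries and token_count as well as the toy data are hypothetical and not part of the framework.</p>
<p>
```python
def correct_queries(candidates, request):
    """cQ: the candidate queries whose certain answers are exactly the request."""
    return {q for q, answers in candidates.items() if answers == request}


def best_queries(candidates, request, fitness):
    """bQ: the correct queries with the smallest (i.e. fittest) fitness value."""
    correct = correct_queries(candidates, request)
    if not correct:
        return set()  # the request cannot be expressed with these candidates
    best_value = min(fitness(q) for q in correct)
    return {q for q in correct if fitness(q) == best_value}


def token_count(query):
    """Hypothetical fitness function: number of whitespace-separated tokens."""
    return len(query.split())


# Toy data (assumed): certain answers as a reasoner might return them.
candidates = {
    "Parent": frozenset({"John", "Mary"}),
    "Father or Mother": frozenset({"John", "Mary"}),
    "Father": frozenset({"John"}),
}
request = frozenset({"John", "Mary"})

print(sorted(correct_queries(candidates, request)))
print(best_queries(candidates, request, token_count))
```
</p><p>The flexibility of a request is then simply the size of the set returned by best_queries, and an empty result signals a request that would receive the maximally unfit value.</p></div>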
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Applying the Framework to OWL</head><p>We will now specify an instantiation of the framework to evaluate OWL ontology-based data access approaches. This specification is still quite flexible: e.g., we cover both the case where the data resides in a database and the case where it is part of an ontology. An MA = (T, A, R, CL) consists of</p><p>- a TBox T, i.e., a set of OWL class-level axioms<ref type="foot">2</ref> that describes the conceptual model and the terminology of the domain, plus possibly a set of mappings in the sense of <ref type="bibr" target="#b2">[3]</ref>,</p><p>- an ABox A, i.e., a set of OWL assertions about named individuals, or, in the presence of the above mentioned mappings, tables from a relational database from which these mappings are defined,</p><p>- a set R of information requests r, i.e., sets of tuples of OWL individuals, and</p><p>- CL, the set of OWL class expressions, as a query language.</p><p>OWL class expressions are an obvious choice for a query language, but there are more expressive ones such as conjunctive queries, unions of conjunctive queries <ref type="bibr" target="#b3">[4]</ref>, SPARQL, SPARQL-DL<ref type="foot">3</ref>, or nRQL <ref type="bibr" target="#b10">[11]</ref>.</p><p>For an ontology-based modelling approach, we suggest a fitness function as follows: f(q) is a fitness vector (a, b, c, d) that contains (i) a as the length |q| of q, (ii) b as the number of distinct OWL constructors in q, (iii) c as the role nesting depth of q, and (iv) d as a flag that is set to 1 if q contains unintelligible codes, and to 0 if all terms in q are human readable. We compare the fitness of queries via the lexicographic ordering from the left of their fitness vectors.</p><p>A Simple Example We use a simple example that captures data about parents and their children. We will use this example to illustrate the components of the modelling approach as well as the query space for an information request.</p><p>Consider the modelling approach MA = (T, A, R, CL) consisting of the following TBox and ABox:</p><formula xml:id="formula_3">T = \{ \mathit{Father} \equiv \mathit{Man} \sqcap \exists \mathit{hasChild}.\top,\ \mathit{Mother} \equiv \mathit{Woman} \sqcap \exists \mathit{hasChild}.\top,\ \mathit{Father} \sqsubseteq \mathit{Parent},\ \mathit{Mother} \sqsubseteq \mathit{Parent} \} \qquad A = \{ \mathit{Father}(\mathit{John}),\ \mathit{Mother}(\mathit{Mary}),\ \mathit{hasChild}(\mathit{Mary}, \mathit{Tom}),\ \mathit{hasChild}(\mathit{John}, \mathit{Tom}) \}</formula><p>Now consider the information request r(A) = {Mary, John}, i.e., r retrieves "all parents". Using OWL class expressions as a query language, the following queries could be considered:</p><formula xml:id="formula_4">q_1 = \mathit{Parent}, \quad q_2 = \mathit{Father} \sqcup \mathit{Mother}, \quad q_3 = \exists \mathit{hasChild}.\top, \quad q_4 = \mathit{Woman} \sqcup \mathit{Man}, \quad q_5 = \mathit{Father}, \quad q_6 = \mathit{Mother}</formula><p>The correct queries for r are cQ(r, T, A) = {q₁, q₂, q₃, q₄}, and not all of them are equivalent. The queries q₅ and q₆ are not correct because they return only incomplete answers. W.r.t. the above mentioned fitness function, we have only one best query, q₁, because it is the shortest correct query.</p><p>Please note that, w.r.t. the data given here, q₄ is correct for r, and it would be interesting to see what would happen if we extended A with, say, Man(Tom): either r will change as well to include Tom, or q₄ ceases to be a correct query.</p></div>
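<div xmlns="http://www.tei-c.org/ns/1.0"><p>The fitness vector (a, b, c, d) suggested above can be illustrated with a small Python sketch over a toy class-expression syntax tree. Several choices here are assumptions rather than part of the framework: the classes Name, And, Or and Some are hypothetical, the length |q| is read as the number of class and role names occurring in q, and the top concept is modelled as an ordinary name "Thing". Python tuples compare lexicographically from the left, which matches the ordering described in the text.</p>
<p>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    label: str
    readable: bool = True  # False marks an unintelligible code


@dataclass(frozen=True)
class And:
    operands: tuple


@dataclass(frozen=True)
class Or:
    operands: tuple


@dataclass(frozen=True)
class Some:  # existential restriction: role some filler
    role: str
    filler: object


def length(q):
    """Assumed reading of |q|: number of class and role names in q."""
    if isinstance(q, Name):
        return 1
    if isinstance(q, Some):
        return 1 + length(q.filler)
    return sum(length(op) for op in q.operands)


def constructors(q):
    """Set of distinct constructor kinds used in q."""
    if isinstance(q, Name):
        return set()
    if isinstance(q, Some):
        return {"Some"} | constructors(q.filler)
    found = {type(q).__name__}
    for op in q.operands:
        found |= constructors(op)
    return found


def depth(q):
    """Role nesting depth of q."""
    if isinstance(q, Name):
        return 0
    if isinstance(q, Some):
        return 1 + depth(q.filler)
    return max(depth(op) for op in q.operands)


def has_codes(q):
    """1 if q contains a name flagged as unreadable, else 0."""
    if isinstance(q, Name):
        return 0 if q.readable else 1
    if isinstance(q, Some):
        return has_codes(q.filler)
    return max(has_codes(op) for op in q.operands)


def fitness(q):
    return (length(q), len(constructors(q)), depth(q), has_codes(q))


q1 = Name("Parent")
q2 = Or((Name("Father"), Name("Mother")))
q3 = Some("hasChild", Name("Thing"))

fittest = min([q1, q2, q3], key=fitness)
print(fitness(q1), fitness(q2), fitness(q3), fittest)
```
</p><p>Under these assumptions q₁ receives the smallest vector of the three, in line with the observation that it is the only best query of the example.</p></div>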
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Using the Evaluation Framework to Compare Modelling Approaches</head><p>Fig. <ref type="figure">2</ref>. General and comparative measurements for modelling approaches.</p><p>On the left hand side of Figure <ref type="figure">2</ref>, we have sketched an evaluation of a modelling approach MA where, for each information request rᵢ ∈ R, we have computed the best queries for rᵢ, and then their fitness and cardinality. Clearly, if we want to compare two modelling approaches MA₁ and MA₂, we can do the same and compare, for each information request rᵢ ∈ R and each of the two modelling approaches, the fitness and cardinality of the best queries. This can unveil the strengths and weaknesses of the information system to the system designer. For example, if there are information requests for which the set of correct queries is empty, then f(bQ(r, S, D, f)) is prohibitively bad. To overcome this, we can then decide whether to select a different, more powerful query language, or to change the schema or the way the data is modelled, or whether perhaps that particular information request is of too little importance for such a change. The measurements can also help to point out where the trade-offs between modelling effort and benefits in terms of easier query answering are. For example, for an ontology-based modelling approach, they can indicate whether more modelling effort for a more expressive TBox would be justified for the sake of simpler queries.</p><p>In addition, we can aggregate the fitness and flexibility of a modelling approach: this can be interesting if we want to compare two such modelling approaches en gros. In what follows, we use AGG to stand for an aggregation function such as min, max, avg, or count. This function can be fixed in the particular application of the framework.</p><p>We can aggregate both the fitness and the flexibility of a modelling approach: the overall fitness of a modelling approach f(MA) is aggregated over the fitness vectors of all best queries for all information requests, i.e.,</p><formula xml:id="formula_6">m = \mathrm{AGG}_{r \in R}\, f(\mathrm{bQ}(r, S, D, f))</formula><p>The overall flexibility of a modelling approach aggregates over the cardinality of all best queries for all information requests, i.e.,</p><formula xml:id="formula_7">\ell = \mathrm{AGG}_{r \in R}\, |\mathrm{bQ}(r, S, D, f)|</formula><p>As illustrated in Figure <ref type="figure">2</ref>, applying the framework to one modelling approach MA = (S, D, R, QL) reveals</p><p>- for each rⱼ, the fitness value of the best queries: f(bQⱼ). In particular, it will identify information requests for which it is hard to specify a query in QL and those for which this is impossible.</p><p>- for each rⱼ, the number of best queries: |bQⱼ|</p><p>- the aggregated fitness value m for the entire MA</p><p>- the aggregated flexibility ℓ of the entire modelling approach MA</p><p>When comparing different modelling approaches (as shown in Figure <ref type="figure">2</ref>) we can compare the</p><p>- point-to-point fitness for each information request</p><p>- overall (aggregated) fitness m of the modelling approaches</p><p>- point-to-point flexibility for each information request</p><p>- overall (aggregated) flexibility ℓ of the modelling approaches</p></div>
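<div xmlns="http://www.tei-c.org/ns/1.0"><p>The aggregation of per-request measurements into an overall fitness m and an overall flexibility can be sketched as follows. The function name aggregate, the choice of max for fitness (worst case under lexicographic order) and of the mean for flexibility, and all numbers below are assumptions made purely for illustration; they are not measurements from the case study.</p>
<p>
```python
from statistics import mean

# Assumed stand-in for the "maximally unfit" value of an inexpressible request.
UNFIT = (100, 100, 100)


def aggregate(per_request, agg_fitness=max, agg_flexibility=mean):
    """per_request maps a request id to (fitness of its best queries, number of best queries)."""
    m = agg_fitness([fit for fit, _ in per_request.values()])
    ell = agg_flexibility([count for _, count in per_request.values()])
    return m, ell


# Hypothetical measurements for two modelling approaches over the same requests.
ma_a = {
    "r1": ((2, 1, 0), 1),
    "r2": ((4, 2, 1), 2),
    "r3": ((4, 2, 2), 1),
    "r4": (UNFIT, 0),  # no correct query exists
}
ma_b = {
    "r1": ((2, 1, 0), 1),
    "r2": ((4, 2, 1), 1),
    "r3": ((5, 2, 2), 1),
    "r4": ((6, 2, 2), 2),
}

for name, results in (("MA_a", ma_a), ("MA_b", ma_b)):
    m, ell = aggregate(results)
    print(name, "fitness:", m, "flexibility:", ell)
```
</p><p>With max as the aggregation function, a single inexpressible request dominates the overall fitness, which may or may not be desirable; an application could pass a different function through the two keyword parameters.</p></div>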
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">A Case Study: Ontology-Based Annotations</head><p>We will now present the application of the evaluation framework in a case study about ontology-based annotations of medical images and their descriptions. The study is described in more detail in <ref type="bibr" target="#b7">[8]</ref>.</p><p>The modelling process involved a number of design decisions. First, we chose to use a module of the established medical ontology SNOMED CT<ref type="foot" target="#foot_2">4</ref> as the TBox of our annotation ontology<ref type="foot" target="#foot_3">5</ref> and translated natural language radiology reports of 50 medical images to ABox assertions of that ontology. The textual descriptions contain medical information such as image type, image modalities, clinical findings, body structures and diagnoses. Next, we had to decide whether the ABox assertions should be simple class assertions of the relevant medical terms occurring in the text or whether the ABox should contain class and object property assertions, trying to closely reflect the meaning of the text. Furthermore, the SNOMED CT TBox has a very complex structure containing role groups <ref type="bibr" target="#b4">[5]</ref> that are used e.g. to model diseases that relate findings to body structures. We had to find a way to translate the textual descriptions in accordance with this complex structure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">The Modelling Approaches</head><p>In the following, we present three different modelling approaches. MA₁ models the data with a simple ABox that contains almost only class assertions: individuals are only linked by a single object property shows in order to relate an image to the individuals shown in it. MA₂ uses class and object property assertions that capture the relational structure of the image descriptions. MA₃ uses a slightly different TBox than MA₁ and MA₂ in the sense that we created an additional set of roles and a role hierarchy in order to bypass the SNOMED CT specific role groups. An example of a disease in SNOMED CT that is defined using role groups is NeoplasmOfLung. The concept is defined as follows:<ref type="foot" target="#foot_4">6</ref></p><formula>\mathit{NeoplasmOfLung} \equiv \mathit{DisorderOfLung} \sqcap \exists \mathit{roleGroup}.(\exists \mathit{AssociatedMorphology}.\mathit{Neoplasm} \sqcap \exists \mathit{FindingSite}.\mathit{LungStructure})</formula><p>For MA₃, we introduced three additional object properties: shows, hasFinding and hasLocation and defined the following role hierarchy:</p><formula>\mathit{roleGroup} \circ \mathit{AssociatedMorphology} \sqsubseteq \mathit{hasFinding}, \quad \mathit{roleGroup} \circ \mathit{FindingSite} \sqsubseteq \mathit{hasLocation}, \quad \mathit{shows} \circ \mathit{hasFinding} \sqsubseteq \mathit{shows}, \quad \mathit{shows} \circ \mathit{hasLocation} \sqsubseteq \mathit{shows}</formula><p>If we want to find all images that show neoplasms in MA₃, we can formulate a simple OWL class expression query like Image ⊓ ∃shows.Neoplasm and would retrieve images labelled with Image ⊓ ∃roleGroup.∃AssociatedMorphology.NeoplasmOfLung without having to use the complicated role group construct in the query.</p><p>We compare the following modelling approaches:</p><formula xml:id="formula_8">\mathrm{MA}_1 = (T_1, A_1, R, \mathit{CL}), \quad \mathrm{MA}_2 = (T_1, A_2, R, \mathit{CL}), \quad \mathrm{MA}_3 = (T_2, A_3, R, \mathit{CL})</formula><p>where T₁ is the original SNOMED CT TBox and T₂ the TBox with the additional role hierarchy. A₁ is an ABox with the data formulated in terms of simple class assertions whereas A₂ and A₃ use class assertions as well as object property assertions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">The Information Requests</head><p>The set of information requests R is derived from the content of the original, natural language image descriptions: clinical findings, findings located in body parts, complex findings (involving role groups), image types and modalities and combinations of the former. We will now list some representative information requests.</p><p>- r₁: An information request that involves one clinical finding: "All images that show neoplasms."</p><p>- r₂: An information request that involves two concepts, an image type and an image projection: "All X-ray images with PosteroAnterior (PA) projection."</p><p>- r₃: An information request that involves a clinical finding combined with a qualifier value: "All images that show left-sided pleural effusions."</p><p>- r₄: An information request that involves a clinical finding combined with a body structure: "All images that show soft tissue masses in the pleural membrane."</p><p>We expect that MA₁ is good for formulating queries for simple requests (such as those that ask for just one concept, e.g. r₁ and r₂) whereas MA₂ is more appropriate for formulating queries for complex requests that involve relations between concepts, such as r₃ and r₄. However, we also expect that it is difficult to formulate queries for the more complex requests r₃ and r₄ in MA₁ because simple class assertions cannot capture the semantics of findings that are related to qualifier values or body structures. Furthermore, the measurements should highlight that MA₃ allows the formulation of simpler queries than MA₂ because the TBox contains the additional role hierarchy that allows us to bypass role groups.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Results</head><p>Tables 1-3 show the findings for the three proposed modelling approaches MAᵢ w.r.t. the information requests r₁ to r₄. For each of the information requests, the best queries, the fitness values for length, number of distinct constructors and role nesting depth, and the flexibility (ℓ) are listed for the three modelling approaches.</p><p>Table <ref type="table">3</ref>. Findings for MA₃.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Evaluation</head><p>The results show that MA₁ allows the formulation of relatively simple queries. However, it is not always possible to formulate a query that returns exactly those tuples that are the certain answers to the information request. As soon as the information request involves nesting of entities, e.g. a finding with a location or a finding with a qualifier, MA₁ does not allow the formulation of a query that is precise enough to return only the correct answers. In this case the fitness values were assigned an exemplary value max = 100, see Table <ref type="table" target="#tab_1">1</ref> for r₄. In this information request we want to find images that show soft tissue masses located in the pleural membrane. In our data set there is one image annotation that describes a neoplasm in the pleural membrane and a soft tissue mass in some other body structure. This image would have been returned by a query like Image ⊓ ∃shows.SoftTissueMass ⊓ ∃shows.PleuralMembraneStructure, although it is not an answer to the information request. The problem lies in the nature of the data modelling paradigm. The lack of relational structure in the ABox makes it impossible to capture the semantics of the image descriptions appropriately.</p><p>MA₂ models the data in the ABox using the relational structures defined in the TBox, in particular the properties shows, roleGroup, associatedMorphology, etc. This allows us to formulate queries for all information requests. However, the queries are rather long and nested due to the fact that the complicated role group construct <ref type="bibr" target="#b4">[5]</ref> has to be used. The modelling approach MA₃ can capture the semantics of the image descriptions as well as MA₂ and allows us to formulate queries for all information requests. Furthermore, the queries are significantly simpler than those in MA₂ because MA₃ uses a slightly more expressive TBox than MA₂ with which we can bypass role groups.</p><p>The three modelling approaches and their measurements expose the evolving design of the retrieval system built in the case study. We started off with a relatively simple ABox that involves little effort compared to the later versions but is not expressive enough to allow the formulation of queries for all information requests. Using a more expressive ABox with object property assertions to relate the class assertions to each other makes it possible to formulate queries for all information requests; however, the queries become significantly more complex. Furthermore, with the little additional modelling effort of introducing a small role hierarchy in the TBox, we can formulate queries that are as expressive as, but much simpler than, those that came with the original TBox.</p><p>The evaluation framework has highlighted the weaknesses of each modelling approach, e.g. the inability or difficulty of formulating queries for information requests. It can also highlight the strengths, e.g. conciseness and flexibility of a modelling approach. The measurements can guide the system engineer and support design decisions. For example, the framework can identify the benefits of changes in the modelling approach and therefore point out whether more modelling effort would be justified.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion and Future Work</head><p>We have presented a generic information system evaluation framework that can be used to analyse the fitness and flexibility of modelling approaches. It involves evaluating representative information requests and the complexity of the queries that correspond to these requests as well as the well-suitedness of the components of the modelling approach, i.e. the schema, the data and the queries.</p><p>The measurements generated by the framework can be used to highlight strengths and weaknesses of a modelling approach and to compare the fitness of similar modelling approaches. They also support engineers in making important design decisions, such as using an off-the-shelf schema or creating one that is tailored to the data or, in general, investing more modelling effort if this leads to significant benefits in the fitness of the modelling approach. The measurements can be used as a basis for discussion when building data access applications.</p><p>A next step of our work will be to apply the framework in a case study where we compare more heterogeneous modelling approaches with each other, e.g. an ontology-based modelling approach with one based on databases. Furthermore, we want to extend the framework so that it measures the fitness of queries taking into account not only exact matches to the answers of the respective information requests but also partial results. Finally, we will extend this approach so that it not only evaluates the complexity of formulating queries, but also the overall performance and scalability of query answering.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 1.</head><label>1</label><figDesc>A modelling approach plus a query and its answers.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc></figDesc><table><row><cell>T = {Father ≡ Man ⊓ ∃hasChild.⊤, Mother ≡ Woman ⊓ ∃hasChild.⊤, Father ⊑ Parent, Mother ⊑ Parent}</cell><cell>A = {Father(John), Mother(Mary), hasChild(Mary, Tom), hasChild(John, Tom)}</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_5">More precisely, OWL 2 class expression axioms, property axioms, datatype definitions, and keys, see http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_6">See http://www.w3.org/2009/sparql/wiki/Main_Page and entailment regimes.</note>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1.</head><label>1</label><figDesc>Findings for MA₁.</figDesc><table><row><cell>rⱼ</cell><cell>Best queries</cell><cell>Length</cell><cell>Constructors</cell><cell>Nesting depth</cell><cell>Flexibility</cell></row><row><cell>r₁</cell><cell>Image ⊓ ∃shows.Neoplasm</cell><cell>3</cell><cell>2</cell><cell>1</cell><cell>1</cell></row><row><cell>r₂</cell><cell>Image ⊓ ∃shows.PlainChestXray ⊓ ∃shows.PAProjection</cell><cell>5</cell><cell>2</cell><cell>1</cell><cell>1</cell></row><row><cell>r₃</cell><cell>Image ⊓ ∃shows.PleuralEffusion ⊓ ∃shows.LeftSided</cell><cell>5</cell><cell>2</cell><cell>1</cell><cell>1</cell></row><row><cell>r₄</cell><cell>none</cell><cell>100</cell><cell>100</cell><cell>100</cell><cell>0</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2.</head><label>2</label><figDesc>Findings for MA₂.</figDesc><table><row><cell>rⱼ</cell><cell>Best queries</cell><cell>Length</cell><cell>Constructors</cell><cell>Nesting depth</cell><cell>Flexibility</cell></row><row><cell>r₁</cell><cell>Image ⊓ ∃shows.(Disease ⊓ ∃roleGroup.(∃AssociatedMorphology.Neoplasm))</cell><cell>6</cell><cell>2</cell><cell>2</cell><cell>1</cell></row><row><cell>r₂</cell><cell>Image ⊓ ∃hasImageType.PlainChestXray ⊓ ∃hasImageProjection.PAProjection</cell><cell>5</cell><cell>2</cell><cell>1</cell><cell>1</cell></row><row><cell>r₃</cell><cell>Image ⊓ ∃shows.(PleuralEffusion ⊓ ∃hasQualifierValue.LeftSided)</cell><cell>5</cell><cell>2</cell><cell>2</cell><cell>1</cell></row><row><cell>r₄</cell><cell>Image ⊓ ∃shows.(Disease ⊓ ∃roleGroup.(∃AssociatedMorphology.SoftTissueMass ⊓ ∃FindingSite.PleuralMembraneStructure))</cell><cell>8</cell><cell>2</cell><cell>3</cell><cell>1</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3.</head><label>3</label><figDesc>Findings for MA₃.</figDesc><table><row><cell>rⱼ</cell><cell>Best queries</cell><cell>Length</cell><cell>Constructors</cell><cell>Nesting depth</cell><cell>Flexibility</cell></row><row><cell>r₁</cell><cell>Image ⊓ ∃shows.Neoplasm</cell><cell>3</cell><cell>2</cell><cell>1</cell><cell>1</cell></row><row><cell>r₂</cell><cell>Image ⊓ ∃hasImageType.PlainChestXray ⊓ ∃hasImageProjection.PAProjection</cell><cell>5</cell><cell>2</cell><cell>1</cell><cell>1</cell></row><row><cell>r₃</cell><cell>Image ⊓ ∃shows.(PleuralEffusion ⊓ ∃hasQualifierValue.LeftSided)</cell><cell>5</cell><cell>2</cell><cell>2</cell><cell>1</cell></row><row><cell>r₄</cell><cell>Image ⊓ ∃shows.(∃hasFinding.SoftTissueMass ⊓ ∃hasLocation.PleuralMembraneStructure)</cell><cell>6</cell><cell>2</cell><cell>2</cell><cell>2</cell></row><row><cell></cell><cell>Image ⊓ ∃shows.(∃AssociatedMorphology.SoftTissueMass ⊓ ∃FindingSite.PleuralMembraneStructure)</cell><cell></cell><cell></cell><cell></cell><cell></cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">Proceedings of the International Workshop on Evaluation of Semantic Technologies (IWEST 2010). Shanghai, China. November 8, 2010.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_1">Currently, the framework does not consider approximations of correct queries when measuring the fitness of the modelling approach; see the discussion in Section 4.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">http://www.ihtsdo.org/snomed-ct/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">http://www.cs.man.ac.uk/~opitzj/snomed/snomedLungModuleImageAnnotations.owl</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">To improve readability, we use slightly abbreviated class names and DL syntax.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">QuOnto: Querying Ontologies</title>
		<author>
			<persName><forename type="first">A</forename><surname>Acciarri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Calvanese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>De Giacomo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lembo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lenzerini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Palmieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Rosati</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI</title>
				<imprint>
			<publisher>AAAI Press / The MIT Press</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="1670" to="1671" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">DL-Lite: Tractable Description Logics for Ontologies</title>
		<author>
			<persName><forename type="first">D</forename><surname>Calvanese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>De Giacomo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lembo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lenzerini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Rosati</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI</title>
				<imprint>
			<publisher>AAAI Press / The MIT Press</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="602" to="607" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Tractable Reasoning and Efficient Query Answering in Description Logics: The DL-Lite Family</title>
		<author>
			<persName><forename type="first">D</forename><surname>Calvanese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>De Giacomo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lembo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lenzerini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Rosati</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Autom. Reasoning</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="385" to="429" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">On the decidability of query containment under constraints</title>
		<author>
			<persName><forename type="first">D</forename><surname>Calvanese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>De Giacomo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lenzerini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">PODS</title>
				<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="149" to="158" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Relationship Groups in SNOMED CT</title>
		<author>
			<persName><forename type="first">R</forename><surname>Cornet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Schulz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Medical Informatics in a United and Healthy Europe</title>
				<imprint>
			<publisher>IOS Press</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="223" to="227" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Conjunctive Query Answering in EL using a Database System</title>
		<author>
			<persName><forename type="first">C</forename><surname>Lutz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Toman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wolter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">OWLED</title>
				<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">OWL 2 Web Ontology Language: Structural Specification and Functional-Style Syntax</title>
		<author>
			<persName><forename type="first">B</forename><surname>Motik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">F</forename><surname>Patel-Schneider</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Parsia</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>W3C Recommendation</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical report</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Using Ontologies for Medical Image Retrieval - An Experiment</title>
		<author>
			<persName><forename type="first">J</forename><surname>Opitz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Parsia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Sattler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">OWLED</title>
				<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Rewriting Conjunctive Queries over Description Logic Knowledge Bases</title>
		<author>
			<persName><forename type="first">H</forename><surname>Pérez-Urbina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Motik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Horrocks</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SDKB</title>
				<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="199" to="214" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Effective Query Rewriting with Ontologies over DBoxes</title>
		<author>
			<persName><forename type="first">I</forename><surname>Seylan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Franconi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>de Bruijn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IJCAI</title>
				<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="923" to="925" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">A high performance semantic web query answering engine</title>
		<author>
			<persName><forename type="first">M</forename><surname>Wessel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Möller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Workshop on Description Logics</title>
				<imprint>
			<publisher>CEUR</publisher>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
