<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Faceted Approach To Diverse Query Processing</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alessandro</forename><surname>Agostini</surname></persName>
							<email>aagostini@pmu.edu.sa</email>
							<affiliation key="aff0">
								<orgName type="department">Department of Computer Science</orgName>
								<orgName type="institution">Prince Mohammad Bin Fahd University</orgName>
								<address>
									<settlement>Al-Khobar</settlement>
									<country key="SA">Saudi Arabia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Devika</forename><forename type="middle">P</forename><surname>Madalli</surname></persName>
							<email>devika@drtc.isibang.ac.in</email>
							<affiliation key="aff1">
								<orgName type="department">Documentation Research and Training Centre</orgName>
								<orgName type="institution">Indian Statistical Institute</orgName>
								<address>
									<settlement>Bangalore</settlement>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">A</forename><forename type="middle">R D</forename><surname>Prasad</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Documentation Research and Training Centre</orgName>
								<orgName type="institution">Indian Statistical Institute</orgName>
								<address>
									<settlement>Bangalore</settlement>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Faceted Approach To Diverse Query Processing</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">998D5DD3EA43CD18F7EEAAB364C1F2AA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T00:33+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>H.3.7 [Information Systems]: Digital Libraries</term>
					<term>I.2.4 [Computing Methodologies]: Knowledge Representation Formalisms and Methods</term>
					<term>Design</term>
					<term>Human Factors</term>
					<term>Algorithms</term>
					<term>Query refinement</term>
					<term>facet-based search</term>
					<term>text-based search</term>
					<term>context-based search</term>
					<term>user issues</term>
					<term>description logic</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents a formal framework for implementing a query refinement method. The method uses general principles of facet analysis. Two key notions are advanced and discussed: diversity and focus. Diversity refers to the information needs of a querying user; it is captured by the notion of 'facet'. A focus refers to how diversity is captured from the documents as organized by the user; it provides a kind of context to the user query. The method is situated within the formal framework of the smallest propositionally closed description logic ALC, thereby betting that ALC provides us with a suitable SAT solver to implement a facet engine, which is the main component of our method.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>Classical libraries had systems that processed subjects or domains and built representations such as subject indices. Among these systems, the Colon Classification System (CCS), first proposed by S.R. Ranganathan <ref type="bibr" target="#b20">[20]</ref>, is currently used by almost all Indian libraries. The CCS had enough contextual information in the method of facetisation and synthesis so that it formed a semantic formalisation of the domain scope of the library collections.</p><p>In order to digitize CCS and similar facet-based systems, Prasad and Guha <ref type="bibr" target="#b18">[18]</ref> demonstrate the applicability of a faceted schema in describing resources in web directories and annotating resources in digital libraries, using SKOS/RDF representation to express DEPA strings, according to the faceted theory of Ranganathan <ref type="bibr" target="#b20">[20]</ref> and DEPA facet analysis <ref type="bibr" target="#b7">[7]</ref>. On the other hand, current keyword-based querying methods do not use DEPA strings to represent web directories and annotate resources in digital libraries, so they seem inadequate for searching digital repositories organized according to CCS and similar facet-based classification systems.</p><p>For answers to be relevant, a user must ask the appropriate query in order to retrieve the desired information and fulfill the information need (IN). For keyword-based search this means that the user needs a large number of keywords to narrow down the search according to her information need. This is due to the semantic ambiguity of querying languages, which are often built upon natural language, as is the case for keyword-based querying. Unfortunately, the average query length in keyword-based search is reported to be short, with 90% of the queries having fewer than four keywords <ref type="bibr" target="#b12">[12]</ref>. 
As a consequence, the ambiguity of the query is somewhat mirrored in the relative relevance of search results <ref type="bibr" target="#b32">[32,</ref><ref type="bibr" target="#b3">3]</ref>; diversity in search results arises <ref type="bibr" target="#b15">[15]</ref> and query refinement by the user is often the only solution. To resolve such ambiguity some authors have advanced the notion of 'context' in web search, see for instance <ref type="bibr" target="#b14">[14,</ref><ref type="bibr" target="#b10">10]</ref> and references cited therein. However, in context-based solutions the user is often assumed to know how data and information are organized in the search domain. This assumption rarely holds in real-world, distributed scenarios like the Web, due to the large amounts of heterogeneous data organized in an unknown structure.</p><p>In this paper we present a formal framework wherein we define a method for the extraction of DEPA facets from a user query. The facets are then used to refine the original query for search and retrieval purposes. The method aims to suggest to the user a list of facets that the user would hardly be aware of by simply typing a keyword-based query into a search engine, without any query context. These automatically suggested new facets can be used by the user, for instance by clicking on one of the new facets, to narrow down the search space by expanding the original user query with the suggested facet. This paper is organized as follows. In Section 2 we define basic concepts related to facet analysis. In Section 3 we discuss the first step of our method. In Section 4 we build a formal faceted ontology to formalize the focused terms and contexts that we subsequently process, in Section 5, to produce new facets to be shown to the user for query refinement. 
After building the faceted ontology and defining the facet engine, in Section 6 we present the three different yet related querying methods we offer to the user; these are keyword-based, by focus, and on subject. In Section 7 we discuss related work. In Section 8 we conclude the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">FACETS ANALYSIS</head><p>Facet analysis is essentially a conceptual analysis of the subject matter, or the topical content of a concept into distinct divisions that together constitute a semantic description of the concept. In order to build the facet repository available to a user to refine a query, in this section we present some elements of facet analysis.</p><p>Our facets repository is organized around two main notions of the DEPA paradigm for facet analysis <ref type="bibr" target="#b6">[6,</ref><ref type="bibr" target="#b7">7]</ref>: subjects and facets. A subject of a concept is the topical content of the concept, that is, the concept's overall semantics, as defined by the combination of extensional and intensional semantics of the concept term. The definition can be extended to a query, which in its simplest form can be thought of as a finite sequence of concept terms; see subsections 6.1 and 6.3. A facet consists of a "group of terms derived by taking each term and defining it, per genus et differentiam, with respect for its parent class." <ref type="bibr">[31, p. 12]</ref>. According to Ranganathan <ref type="bibr" target="#b20">[20]</ref>, each domain is made of distinct divisions or facets that are groups of mutually exclusive concepts and many such facets together constitute a domain. The notion of such facetization has been extended by Bhattacharyya <ref type="bibr" target="#b7">[7]</ref> to subject indexing by representing content as a string of fundamental categories DEPA (Discipline, Entity, Property and Action) that are conceptually equivalent to 'facets'. To illustrate, we rely on the following two examples. EXAMPLE 1. Consider a document titled 'Improving EU labour market access for Rome'. DEPA facet analysis of the title leads to facets such as: Labour Market (Entity), Access (Action), Rome (Space -from commonly applicable facet schedules across domains). 
The facet 'Discipline' is extrapolated from faceted document representation, and it is 'Economics'.</p><p>Note that in case a concept would be classified within more than one discipline, as a homonymous or synonymous concept, then all such different combinations of facets are taken into account and presented to the user for further refinement. EXAMPLE 2. Consider a document titled 'Treating Apple trees for bacterial disease in Trentino'. <ref type="foot" target="#foot_0">1</ref> DEPA facet analysis provides a classification of the document into the following facets: Agriculture (D), Apple Trees (E), Treating (A), Disease (P), and Bacterial (as 'Modifier' to P, cf. <ref type="bibr" target="#b6">[6]</ref>).</p><p>We are now ready to define the facet repository for a given context. A facet repository for a context C is the set </p><formula xml:id="formula_0">F R(C) = {⟨C : d,</formula></div>
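<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the two examples above concrete, the following is a minimal Python sketch of how DEPA strings could be stored per concept. It is only an illustration under stated assumptions: the class name DepaFacets, the dictionary facet_repository and the helper disciplines_of are hypothetical names that do not come from the paper, and the sketch does not reproduce the formal definition of the facet repository.</p><p>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DepaFacets:
    """One DEPA string: Discipline, Entity, Property, Action,
    plus the optional 'Modifier' and 'Space' facets used in the examples."""
    discipline: str
    entity: str
    prop: Optional[str] = None
    action: Optional[str] = None
    modifier: Optional[str] = None
    space: Optional[str] = None


# Hypothetical repository: lower-cased concept name mapped to its DEPA strings.
facet_repository = {
    # Example 1: 'Improving EU labour market access for Rome'
    "labour market": {
        DepaFacets("Economics", "Labour Market", action="Access", space="Rome"),
    },
    # Example 2: 'Treating Apple trees for bacterial disease in Trentino'
    "apple trees": {
        DepaFacets("Agriculture", "Apple Trees", prop="Disease",
                   action="Treating", modifier="Bacterial"),
    },
}


def disciplines_of(concept):
    """Return the sorted disciplines under which a concept is classified."""
    entries = facet_repository.get(concept.lower(), set())
    return sorted({entry.discipline for entry in entries})
```
</p><p>A concept classified under more than one discipline would simply hold several DepaFacets entries, so that every combination can be offered to the user for further refinement.</p></div>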
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">FOCUSED TERMS FROM TEXT</head><p>In the present work, we apply facetization as a technique to combine extensional and intensional semantics of concepts viz. queries, or equivalently to disclose the subject of concepts and queries to the querying user, for the purpose of query refinement and search assistance. We implement facetization in two related steps: 1. we produce certain "focused terms" from documents organized in a polyhierarchy, and 2. from focused terms we produce new facets to be shown to the user for the purpose of query refinement. We present step 1 in subsections 3.1 and 3.2 of this section, and step 2 in sections 4 and 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Organization of documents</head><p>Although our method can be adopted as an integral part of digital library systems, both for describing the document collection and for faceted querying over the collection or the web, in this paper we assume the method assists a querying user in query refinement. As the method in this specific application uses a textual collection of documents stored in the user's querying machine, we stipulate the following convention. CONVENTION 1. We denote the set of documents available to a querying user by D. All available documents are textual, that is, they can be processed by text information retrieval techniques such as the variant of a standard technique discussed in Section 3.2.</p><p>Intuitively, the domain D of documents can be thought of as the set of all documents the querying user has classified and stored in the querying machine. CONVENTION 2. We assume that the querying user organizes documents in D by using a 'polyhierarchical classification', or polyhierarchy.</p><p>A polyhierarchical classification is a hierarchical classification permitting some concept terms to be listed in multiple categories of a taxonomy, or branches of a hierarchy <ref type="bibr" target="#b16">[16]</ref>. An example of a polyhierarchy can be found in Figure <ref type="figure" target="#fig_0">1</ref>. Note that what makes the hierarchical classification in Figure <ref type="figure" target="#fig_0">1</ref> polyhierarchical is the concept term 'Apple'. A subset of documents is organized in 'contexts', each context being organized into related sets of documents. A context is a polyhierarchical classification composed of sets of documents, i.e., 'nodes' of the polyhierarchy, called clusters, and a relation over the nodes as defined by the polyhierarchy. Typical relations are the binary relations of subsumption, part-of, and is-a, among other relations. 
Each cluster in a given context has a name composed of a finite sequence of words from a representation language, often a natural language, on the assumption that clusters are named by a human (the querying user), who naturally uses her native language to name clusters. A cluster's name in such a representation language is referred to as a concept term. A concept is a concept term provided with a semantics. Two kinds of semantics are provided to a concept term: an extensional semantics, defined over the documents in the cluster named by the concept term; and an intensional semantics, defined by the unique position of the concept term in a given 'focus'.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Contexts provide a way to define finite, ordered sequences of concept terms, each sequence called a focus. A focus consists of an ordered set of related concept terms, each concept term naming a cluster built upon the collection of documents in D. Intuitively, a focus is a path of concept terms corresponding to a path in a given context. Figure <ref type="figure" target="#fig_1">2</ref>  </p></div>
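<div xmlns="http://www.tei-c.org/ns/1.0"><p>The notion of focus as a path in a context can be sketched in a few lines of Python. The sketch below is illustrative only: it assumes that a context is given as a mapping from each concept term to its child concept terms, that the classification is acyclic, and that the context name is the first element of every path. The function names and the sample context are hypothetical, loosely modelled on the 'Apple' example.</p><p>
```python
def focuses(children, root):
    """Enumerate every focus, i.e. every root-to-node path, of a polyhierarchy.

    `children` maps a concept term to the list of its child concept terms.
    The classification is assumed to be acyclic.
    """
    result = []
    stack = [(root,)]
    while stack:
        path = stack.pop()
        result.append(path)
        for child in children.get(path[-1], []):
            stack.append(path + (child,))
    return sorted(result)


def focuses_with_leaf(children, root, leaf):
    """Return the focuses that end in the given concept term."""
    return [path for path in focuses(children, root) if path[-1] == leaf]


# Hypothetical context: 'Apple' is listed in two branches,
# which is what makes the classification polyhierarchical.
context = {
    "MyClassification": ["Computers", "Fruit"],
    "Computers": ["Apple"],
    "Fruit": ["Trentino"],
    "Trentino": ["Apple"],
}

apple_focuses = focuses_with_leaf(context, "MyClassification", "Apple")
```
</p><p>In this sample context, 'Apple' is the leaf of two focuses, one through 'Computers' and one through 'Fruit' and 'Trentino'.</p></div>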
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Concept terms grounded in documents</head><p>In this section, our goal is to automatically assign a 'label' to every cluster of a given context. Each cluster's label produced by Algorithm 1 below is a finite, simple concatenation of the terms with maximum 'weight', extracted by using Text(•). Formally, we proceed as follows.</p><p>Let Text(•) be a text extraction function. In this paper, we take Text(•) to be a standard keyword extraction function; for instance, see <ref type="bibr" target="#b25">[25,</ref><ref type="bibr">Sec. 4]</ref>. Applied to a document d, Text(•) produces a set Text(d) of terms (or 'keywords'), precisely, the most frequent 'tokens' in d. Let d be any document in D. As terms are defined from documents, from now on we write k ∈ Text(d) to denote a generic term retrieved by applying Text(•) to d. Given a document d, we rank a term k ∈ Text(d) by adapting the standard IR method TF/IDF ("Term Frequency / Inverse Document Frequency") <ref type="bibr" target="#b22">[22,</ref><ref type="bibr" target="#b23">23]</ref> to deal with contexts and with the unique position of concept terms, i.e., the focus, within a context. Observe that in the following, for a given context C and every cluster C, we write 'C in C' (cluster C in context C) in place of 'C in the set of clusters of context C'.</p><p>Let a querying user u who organizes a context C, a cluster C in C, and a term k ∈ Text(d) for a document d ∈ D be given. 
We define the weight of k in C as follows:</p><formula xml:id="formula_1">W_u[k, C] = \Bigl( \sum_{d \in C} \mathrm{TF}[k, d] \Bigr) \cdot \log \frac{\mathrm{Card}(F_C)}{\mathrm{doCK}_u[k]},<label>(1)</label></formula><p>where TF[k, d] is the total number of occurrences of k in d, so that ∑_{d∈C} TF[k, d] is the total number of occurrences of k in C; Card(F_C) is the number of focuses in C with leaf C; and doCK_u[k] is the total number of clusters in the set</p><formula xml:id="formula_3">C \setminus \{ C' \mid C' \neq C \text{ is a cluster in a focus in } C \text{ with leaf } C \}<label>(2)</label></formula><p>which contain k. Intuitively, (1) says that, in order to represent the extensional semantics of a focus, the importance of a retrieved term for a cluster, i.e., the value of W_u[k, C], is inversely proportional to the number of different focuses with C as leaf which contain the term.</p><p>The label of a cluster C is the most representative term or sequence of terms for the cluster. Now we want to compute the labels of all clusters of a given context. To do this, we process all documents stored in each cluster, taking into account the position of each cluster in the context. To define the process formally, we rely on the following technical definition. Let a context C organizing (a subset of) the documents in D and a cluster C in C be given. We define</p><formula xml:id="formula_4">\mathrm{IR}(D, C, C) = \{ k \in \mathrm{Text}(d) \mid d \in C,\ C \text{ in } C \}.<label>(3)</label></formula><p>We expect that the label of cluster C in (</p><formula xml:id="formula_6">k is a label of C if W_u[k', C] ≤ W_u[k, C] for all terms k' in IR(D, C, C). A sequence k_1, k_2, ..., k_n of terms in IR(D, C, C) is a label of C if (a) W_u[k_i, C] = W_u[k, C] for i = 1,</formula><p>To compute a label of every nonempty cluster C of a given context C, we exhibit an algorithm that produces the label l_C of C; see Algorithm 1. Set IR = IR(D, C, C).</p><p>Algorithm 1 Context-based cluster labeling.</p><formula xml:id="formula_7">Input: C, D ≠ ∅ foreach C in C with C ≠ ∅ do foreach k ∈ IR(D, C, C) do compute W_u[k, C] according to formula (1) od; compute M = {k ∈ IR | ∀k' ∈ IR, W_u[k', C] ≤ W_u[k, C]};</formula><p>Let n be the cardinality of M; Let {k_1, k_2, ..., k_n} be the lexicographical ordering of M; Set l_0 = ∅; /* empty sequence */</p><formula xml:id="formula_8">for i = 1 to i = n do Pick k_i ∈ M; Set l_i = l_{i−1}k_i od; /* simple concatenation */ Define l_C = l_n od; Return: set of labels {l_C | C in C, C ≠ ∅}.</formula><p>Observe: 1. If C ≠ ∅ then IR ≠ ∅. 2. The label l_C computed by Algorithm 1 is not unique. In fact, M in Algorithm 1 is assumed to be ordered lexicographically. Other orderings of the elements in M are possible and, as a consequence, a different label can be generated from each ordering. We are now ready to define "focused terms." Let a focus F with concept term C as leaf be given. A focused term for F is any term that appears in a label l_C of a cluster C in F. In symbols, the set of focused terms for F is</p><formula xml:id="formula_9">\mathrm{FT}(F) = \{ k \mid k \text{ appears in } l_C,\ C \in F \}.</formula><p>A focused term for C is any term that appears in l_C. A focused term for a concept term plays the role of a synonym, or alias name, of the concept term. As we will see in Section 6, alias names are important to improve keyword-based querying.</p></div>
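<div xmlns="http://www.tei-c.org/ns/1.0"><p>The weighting and labeling steps of this section can be sketched in Python as follows. This is a minimal sketch under several assumptions rather than the authors' implementation: formula (1) is read as the product of the summed term frequency and the logarithm of Card(F_C) divided by doCK_u[k]; documents are given as lists of tokens, which stands in for Text(•); the ancestors of the target cluster and the number of focuses with that cluster as leaf are passed in explicitly; and the 'simple concatenation' of the label is rendered with a separating space. All names and the sample data are hypothetical.</p><p>
```python
import math
from collections import Counter


def weights(clusters, target, ancestors, n_focuses):
    """Weight every term of cluster `target` following the reading of formula (1).

    clusters  : dict mapping a cluster name to its documents (lists of tokens)
    ancestors : names of the other clusters lying on a focus with leaf `target`
    n_focuses : Card(F_C), the number of focuses with leaf `target`
    """
    tf = Counter(tok for doc in clusters[target] for tok in doc)
    # Clusters of set (2): every cluster except the ancestors of `target`.
    counted = [name for name in clusters if name not in ancestors]
    result = {}
    for term, freq in tf.items():
        do_ck = sum(1 for name in counted
                    if any(term in doc for doc in clusters[name]))
        result[term] = freq * math.log(n_focuses / do_ck)
    return result


def label(clusters, target, ancestors, n_focuses):
    """Algorithm 1 for one cluster: maximum-weight terms in lexicographic order."""
    w = weights(clusters, target, ancestors, n_focuses)
    if not w:
        return ""
    best = max(w.values())
    return " ".join(sorted(t for t, v in w.items() if v == best))


# Hypothetical data: 'Apple' is the leaf of two focuses.
clusters = {
    "Fruit": [["fruit", "orchard"]],
    "Trentino": [["valley"]],
    "Computers": [["laptop", "apple"]],
    "Pear": [["orchard"]],
    "Apple": [["orchard", "orchard", "scab"], ["scab", "orchard", "laptop"]],
}
apple_ancestors = {"Fruit", "Trentino", "Computers"}
apple_weights = weights(clusters, "Apple", apple_ancestors, 2)
apple_label = label(clusters, "Apple", apple_ancestors, 2)
```
</p><p>In the sample data, 'orchard' is the most frequent term of 'Apple' but also occurs in the unrelated cluster 'Pear', so its weight drops to zero and 'scab' becomes the label.</p></div>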
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">FACETED ONTOLOGY BUILDING</head><p>The result of extracting terms from documents and using them to "facetize" the concepts of a polyhierarchical classification is a basic kind of faceted taxonomy, provided that (1) the extracted terms or, often, a proper subset of these <ref type="bibr">[9]</ref>, are matched with a predefined set of facets, and (2) the clusters in a focus are related to each other by a subsumption relation. Indeed, a faceted taxonomy consists of: (a) a set of facets, where each facet consists of a predefined set of terms; and (b) a subsumption relation among the terms. In this section we provide the formal framework we need to formalize the focused terms and labeled contexts produced by Algorithm 1, while only loosely assuming (2) (see footnote 2).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Description Logics</head><p>Description Logics (DLs) <ref type="bibr" target="#b5">[5]</ref> are a family of logic-based knowledge representation formalisms designed to represent and reason about the knowledge of an application domain in a structured and well-understood way. In this paper, we use a basic description logic, called ALC, on the expectation that ALC provides us with an efficient SAT solver to implement our facet engine (Section 5). ALC is the smallest propositionally closed DL, and provides the concept constructors</p><formula xml:id="formula_10">¬C, C ⊓ D, C ⊔ D,</formula><p>For the goal of this paper, we use a limited part of ALC's expressive power; in particular, we do not use role axioms and assertions. Moreover, we write concept descriptions in lower case, as concept descriptions from now on are terms extracted by Algorithm 1 from documents as explained.<note place="foot" n="2">That in our approach clusters in a focus are related to each other by a subsumption relation follows from Convention 2 by observing that polyhierarchical classifications are often subsumption hierarchies. However, we do not need to strictly assume (2) in this paper.</note> Due to space limitations, we do not provide a detailed introduction to DLs, but rather point the reader to <ref type="bibr" target="#b5">[5,</ref><ref type="bibr" target="#b4">4]</ref> and offer an example. EXAMPLE 5. Consider the labeled focus in Example 4. We can represent it within ALC by a set of equality axioms, which we present as labels of the labeled focus in Figure <ref type="figure">4</ref>.</p><formula xml:id="formula_11">\begin{aligned} \mathrm{Fruit} &amp;\equiv \exists \mathit{hasK}.k^3_1 \sqcap \cdots \sqcap \exists \mathit{hasK}.k^3_n \\ \mathrm{Trentino} &amp;\equiv \exists \mathit{hasK}.k^2_1 \sqcap \cdots \sqcap \exists \mathit{hasK}.k^2_m \\ \mathrm{Apple} &amp;\equiv \exists \mathit{hasK}.k^1_1 \sqcap \cdots \sqcap \exists \mathit{hasK}.k^1_p \end{aligned}</formula><p>Figure 4: A labeled focus in ALC.</p><p>The concept descriptions k^j_i that appear in the tree refer to the focused terms extracted by Algorithm 1 for each concept in the focus; hasK is a named role, which is intuitively interpreted as 'has keyword'. For example, ∃hasK.k^3_1 intuitively means that the concept term 'Fruit' in focus F:Fruit&gt;Trentino&gt;Apple is extended with the focused term (keyword) k^3_1. Each equality axiom that appears along the tree defines in ALC a concept term in F; the focus itself is formalized by the equality axiom: FocusApple ≡ Apple ⊓ ∃R.(Trentino ⊓ ∃R.Fruit). An ALC KB for this example is the set of the three equality axioms depicted along the tree plus the equality axiom that defines 'FocusApple' as the 'focus Apple', i.e., the focus F.</p></div>
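<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of how the focus axiom of Example 5 could be generated mechanically, the following Python sketch builds the axiom as a plain string from a root-to-leaf list of concept terms. The function name focus_axiom and the textual rendering are assumptions made for this sketch; no DL reasoner or ontology API is involved.</p><p>
```python
def focus_axiom(focus, role="R"):
    """Render the ALC equality axiom that formalizes a focus.

    `focus` lists the concept terms from root to leaf,
    e.g. ["Fruit", "Trentino", "Apple"].
    """
    expr = focus[0]
    for term in focus[1:]:
        # Parenthesize only composite descriptions, as in the running example.
        inner = "(" + expr + ")" if " " in expr else expr
        expr = term + " ⊓ ∃" + role + "." + inner
    return "Focus" + focus[-1] + " ≡ " + expr


axiom = focus_axiom(["Fruit", "Trentino", "Apple"])
```
</p><p>For the focus Fruit&gt;Trentino&gt;Apple the sketch yields the same axiom as in the example, with the leaf concept outermost and each ancestor nested under the role R.</p></div>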
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Formal Faceted Classifications</head><p>Now we generalize the example. Algorithm 2 below provides a way to build an ALC faceted knowledge base, or faceted ontology, for a given context. The algorithm works in two main steps.</p><p>First, it builds a knowledge base by adding ALC equality axioms that formally define the concept terms of an input context by using the focused terms computed by Algorithm 1 over the same context. For the matching purposes that we will see in Section 5, if strictly more or strictly fewer than four (but at least one) focused terms were computed for a concept term, then the algorithm adds to the knowledge base all the equality axioms defined over all possible combinations of four focused terms picked, possibly with repetitions, from the computed terms.</p><p>Second, the algorithm adds to the knowledge base so obtained all ALC equality axioms that formally define the DEPA facets of every concept as stored in the facet repository (see Section 2). These axioms have the form C ≡ ∃FacetD.d ⊓ ∃FacetE.e ⊓ ∃FacetP.p ⊓ ∃FacetA.a, (4) where C represents a concept c available in the facet repository, and FacetD, FacetE, FacetP, and FacetA are named roles representing the properties of c in terms of the DEPA facet analysis paradigm.<ref type="foot" target="#foot_1">3</ref> The intended interpretation of these named roles relates to the facet repository. For example, ∃FacetD.f means that there is a concept in the facet repository whose 'Discipline' facet is f. By extension, equality axiom (4) means that there is a concept in the facet repository with facet 'Discipline' d, 'Entity' e, 'Property' p, and 'Action' a, and that this concept has name C. Hence, in its second step, Algorithm 2 adds to the knowledge base all axioms of the form (4) if and only if there is a concept (or a subject) with DEPA facets d, e, p, a in the facet repository. 
We make the system insensitive to case and punctuation in the facets d, e, p, a by adding further axioms that use variants of d, e, p, a with the same meaning. We call the ontology produced by Algorithm 2 a formal faceted classification (FFC).</p><p>Algorithm 2 Building an ALC faceted ontology O. </p><formula xml:id="formula_12">Input: C, D ≠ ∅, FR(C) Set O = ∅; /*</formula></div>
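<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two steps of Algorithm 2 described above can be sketched in Python as follows. The sketch is hedged in several ways: axioms are rendered as plain strings rather than loaded into a DL reasoner; 'combinations of four focused terms, possibly with repetitions' is interpreted here as unordered combinations with replacement, which is one possible reading; and the function names has_k_axioms and depa_axiom are hypothetical.</p><p>
```python
from itertools import combinations_with_replacement


def has_k_axioms(concept, focused_terms, size=4):
    """Step 1: equality axioms defining a concept term by its focused terms.

    With exactly `size` distinct terms a single axiom is produced; otherwise
    one axiom per combination (with repetition) of `size` terms.
    """
    terms = sorted(set(focused_terms))
    if not terms:
        return []
    if len(terms) == size:
        groups = [tuple(terms)]
    else:
        groups = list(combinations_with_replacement(terms, size))
    return [concept + " ≡ " + " ⊓ ".join("∃hasK." + t for t in group)
            for group in groups]


def depa_axiom(concept, d, e, p, a):
    """Step 2: the axiom of form (4) for one DEPA string of the repository."""
    return (f"{concept} ≡ ∃FacetD.{d} ⊓ ∃FacetE.{e} "
            f"⊓ ∃FacetP.{p} ⊓ ∃FacetA.{a}")


ontology = has_k_axioms("Apple", ["orchard", "scab"])
ontology.append(depa_axiom("AppleTrees", "agriculture", "appletrees",
                           "disease", "treating"))
```
</p><p>With two focused terms the first step yields five axioms, one per multiset of size four over the two terms; padding every definition to four conjuncts is what lets the facet engine compare it with the four DEPA roles.</p></div>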
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">FACET ENGINE</head><p>Now we design within our framework a facet engine that computes the matching between the focused terms of an input context and the predefined set of facets stored in the facet repository for a number of concepts. Intuitively, the facet engine looks at all keywords generated for each concept name in a focus, for all focuses of the hierarchy, and browses through each focus from the root to the leaf to identify which keywords are DEPA facets stored in the facet repository. The facet engine's main component is Algorithm 3. The basic steps of the algorithm are the following:</p><p>Step 1. Input a concept description C that represents a user's query; the different possible queries that can be represented this way are presented in Section 6.</p><p>Step 2. Find and retrieve from the ontology built by Algorithm 2 all equality axioms that define C in the ontology either by focused terms or by DEPA facets. If no such axioms exist, that is, C is not defined according to the knowledge stored in the ontology, the algorithm ends with no help to the user. This state means that the search engine cannot provide the user with help for query refinement by facets.</p><p>Step 3. For all retrieved axioms and for each axiom of the form C ≡ ∃hasK.k_1 ⊓ ⋯ ⊓ ∃hasK.k_n, where l_C = k_1...k_n is the label computed by Algorithm 1, the algorithm runs the ALC SAT solver in order to match the (focused) terms k_i in the axiom to all DEPA facets for C possibly stored in the facet repository. Note that the performance of our method mainly depends on this step, namely, on the number and complexity of the matchings. Preliminary results suggest that the algorithm satisfies the requirements of a query refinement system in terms of real-time performance. A complete study of the complexity of this step is in progress.</p><p>Step 4. For all successful matchings computed in Step 3, the retrieved DEPA facets are output and shown to the user. 
</p><formula xml:id="formula_13">:= FacetSet(C)_{j−1} ∪ F_{l_i} /* all DEPA strings for C in FR(C) retrieved */ od fi; Return: FacetSet(C)_j.</formula></div>
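<div xmlns="http://www.tei-c.org/ns/1.0"><p>The matching performed in Steps 3 and 4 can be approximated by the following Python sketch. It deliberately replaces the ALC SAT solver of the paper with plain normalized string comparison, so it illustrates only the input and output of the facet engine, not its reasoning. The names normalize and facet_set, and the dictionary layout of a DEPA string with keys D, E, P, A, are assumptions of this sketch.</p><p>
```python
import string


def normalize(text):
    """Lower-case and strip punctuation, so that matching is insensitive
    to case and punctuation as required for the facets d, e, p, a."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def facet_set(label_terms, depa_strings):
    """Return the DEPA strings with at least one facet equal to a focused term.

    An empty result corresponds to the case in which the engine
    offers no help for query refinement.
    """
    wanted = {normalize(term) for term in label_terms}
    matched = []
    for depa in depa_strings:
        if wanted.intersection(normalize(value) for value in depa.values()):
            matched.append(depa)
    return matched


# Hypothetical repository content for the concept being refined.
repository = [
    {"D": "Agriculture", "E": "Apple Trees", "P": "Disease", "A": "Treating"},
    {"D": "Computer Science", "E": "Laptops", "P": "Performance",
     "A": "Benchmarking"},
]
suggested = facet_set(["disease", "orchard"], repository)
```
</p><p>Here the focused term 'disease' matches the Property facet of the first DEPA string, so that string is returned and its facets can be shown to the user.</p></div>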
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">QUERY PROCESSING</head><p>After building the faceted ontology and defining the facet engine, we are ready to use them to provide new facets to the user for query refinement. We allow the user to make three kinds of queries: keyword-based, by focus, and on subject. We discuss each querying method in turn.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Keyword-based querying</head><p>The user types one or more keywords in the search box. This method is the simplest one and it is often the only method available when the user does not know anything about the subject to search, or when the user's knowledge of the query subject is not based on documents locally stored in the user's querying machine, so that we cannot use the ontology and facet engine we have advanced. This is also the typical case of keyword-based querying with common search engines, where the keywords used in the query are listed without a specific ordering, solely on the basis of the user's information need.</p><p>We deal with this method of querying as follows. Each keyword is mapped to zero or more concept terms in the context C. We do that using an exact string match of the keyword to the concept term or one of its alias names, namely, its focused terms.</p><p>If no concept term and none of its alias names match any keyword, no concept description is available to the facet engine, and as a consequence no facets for query refinement are shown to the user.</p><p>If one concept term or its alias names match some keywords, then the concept description C of the concept term is generated and processed by Algorithm 3 for query expansion. The facets that occur in the query expansion are shown to the user. When selecting one of the new facets, the user will narrow down the search by expanding the original query with the suggested facet.</p><p>If multiple concept terms match some keywords, then the concept description of each term is generated and processed by Algorithm 3 for query expansion. The facets that occur in the query expansion of every concept description are shown to the user. Alternatively, the user is given the option to refine the query to indicate which concept term, namely, which keyword, was primarily meant.</p></div>
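<div xmlns="http://www.tei-c.org/ns/1.0"><p>The mapping from typed keywords to concept terms described in this subsection can be sketched in Python as follows. The sketch assumes that the alias names (focused terms) of each concept term are already available as a dictionary, and that the exact string match is made case-insensitive; the name match_concepts and the sample aliases are hypothetical.</p><p>
```python
def match_concepts(keywords, aliases):
    """Map typed keywords to concept terms by exact string match.

    `aliases` maps each concept term to its alias names (focused terms).
    A concept term matches when a keyword equals the term or one of its aliases.
    """
    typed = {keyword.strip().lower() for keyword in keywords}
    hits = []
    for concept, names in aliases.items():
        candidates = {concept.lower()} | {name.lower() for name in names}
        if typed.intersection(candidates):
            hits.append(concept)
    return sorted(hits)


# Hypothetical alias names computed by the labeling step.
aliases = {"Apple": ["orchard", "scab"], "Computers": ["laptop"]}
```
</p><p>An empty result corresponds to the first case above (no facets shown), a single hit to the second, and several hits to the third, where either every concept description is expanded or the user is asked which concept term was meant.</p></div>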
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Querying-By-Focus</head><p>Now suppose that the user knows at least something about the subject to search, and the user's knowledge comes from documents stored and polyhierarchically organized in the user's document collection. In this case, it is desirable for the user to gain a progressively better understanding of the hidden content of the query, as automatically generated by a suitable method, so as to discover new facets of the original query that the user was not aware of before. For example, suppose the query is 'apple' as contextualized in Figure <ref type="figure" target="#fig_4">5</ref>. The user clicks on a concept term in a context C. In doing that, the user selects a focus in C. Alternatively, the user types some keywords as in keyword-based querying, but in a specific order to mean a focus in C. For example, the user may click on (an appropriate graphical version of) 'Apple' in the context, or type the keywords 'fruit', 'trentino', 'apple' in this order, to mean Cx:Fruit&gt;Trentino&gt;Apple. In the example, by selecting the facet 'Fruit' the user would narrow down the search space by excluding all subjects about Apple Computers and related subjects as search results (see Figure <ref type="figure" target="#fig_0">1</ref>). Similarly, by selecting the facet 'Trentino' the user would be able to narrow down the search space by excluding all subjects about fruits that are not related to Trentino's production of apples. It follows that the keyword-based method and querying by focus are not equivalent for at least one reason: in keyword-based querying the order of keywords does not matter, whereas in querying by focus it does. The other main difference between these two querying methods arises when looking at query processing. The difference is that concept terms in a focus are not 'pure' keywords; a concept term is represented by a string of similar keywords as generated by Algorithm 1. 
Concept terms relate to documents in the user's repository, while keywords are usually unrelated to the user's documents.</p><p>A query-by-focus is similar to a query by example, yet it is more specific. In querying by example, a sample document (the example) is selected by the user to refine the query. On the other hand, in querying by focus the position of the sample document is also considered, that is, the place where the document is stored within the user's documentary repository. To illustrate, suppose that a user stores his documents according to two different structures (see Figure <ref type="figure" target="#fig_5">6</ref>), and suppose the user selects the document named doc1 as the sample document. In classical querying by example, a relevant answer to the user would be any document about 'apple', meant as either a fruit or a computer. In contrast, using querying by focus the only relevant answers to the user would be documents from one of the two focuses Fruit&gt;Apple and Computers&gt;Apple.</p><p>We deal with querying by focus as follows. First, a concept description C of the concept term that is the leaf of the focus is generated and processed by Algorithm 3 for query expansion. The facets that occur in the query expansion are shown to the user. When selecting one of the new facets, the user will narrow down the search by expanding the original query with the suggested facet.</p><p>Note that the case where query by focus applies in practical situations is not as uncommon as it may seem, because almost all users start a search from a device storing text and text-annotated documents, and these are often organized by the user according to a polyhierarchical classification. More importantly, the fact that a user searches the Web does not mean that documents from the Web will be used for the purpose of querying by focus. 
The documents used for querying by focus are all and only the documents stored locally on the user's querying device, whether the search objective is to retrieve documents stored on the user's device or on the Web. As a consequence, querying by focus scales to the size of the Web, since only the local collection is analyzed. To see why, recall that our method is about query refinement; it is not a search method. We use standard methods and search engines to search; the difference is that the keywords we pass to the search engines are automatically generated by our facetization technique.</p></div>
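<div xmlns="http://www.tei-c.org/ns/1.0"><p>The difference between keyword-based querying and querying by focus described in this section can be illustrated with a minimal sketch. It assumes that a polyhierarchy is flattened into root-to-leaf paths and that a focus is simply an ordered tuple of concept terms; the names as_keyword_query, as_focus, narrow and PATHS, as well as the 'washington' path, are hypothetical and introduced only for illustration. The sketch does not implement Algorithm 1 or the labeling of concept terms.</p>
<p>
```python
from typing import FrozenSet, List, Tuple

Focus = Tuple[str, ...]  # ordered path of concept terms, root first


def as_keyword_query(terms: List[str]) -> FrozenSet[str]:
    """Keyword-based querying: the order of the terms is irrelevant."""
    return frozenset(t.lower() for t in terms)


def as_focus(terms: List[str]) -> Focus:
    """Querying by focus: the order of the terms encodes a path."""
    return tuple(t.lower() for t in terms)


def narrow(paths: List[Focus], selected: Focus) -> List[Focus]:
    """Keep only the paths that begin with the selected facets."""
    return [p for p in paths if p[:len(selected)] == selected]


# Hypothetical polyhierarchy, flattened to root-to-leaf paths.
PATHS: List[Focus] = [
    ("fruit", "trentino", "apple"),
    ("fruit", "washington", "apple"),  # invented sibling, illustration only
    ("computers", "apple"),
]

if __name__ == "__main__":
    a = ["fruit", "trentino", "apple"]
    b = ["apple", "fruit", "trentino"]
    print(as_keyword_query(a) == as_keyword_query(b))  # same keyword query
    print(as_focus(a) == as_focus(b))                  # different focuses
    print(narrow(PATHS, ("fruit",)))
    print(narrow(PATHS, ("fruit", "trentino")))
```
</p>
<p>Under these assumptions, the two term lists denote the same keyword query but different focuses, and selecting 'fruit' removes the path under 'computers' from the search space, while additionally selecting 'trentino' leaves only the Trentino path.</p></div>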
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3">Querying-On-Subject</head><p>Subject-based querying is the most common approach among specialized users, where 'subject' refers to the topical intent of a query (cf. Section 2). In our faceted approach to the representation of documents in a collection D, 'subjects' are broken down into distinct divisions, the facets of the subject. A typical 'query-on-subject' is deemed to relate to a specific subject of a preexisting faceted classification. For example, a subject-based query is: 'What are the documents on the effects of nitrogen fertilizers on rice plants?' The subject of the concept subsumed by this query is one of possibly many focuses, for example the following:</p><formula xml:id="formula_14">Cx:rice plants&gt;nitrogen fertilizers&gt;effects.<label>(5)</label></formula><p>This is a partial focus, in the sense that, once the discipline subsumed by the query is supplied by the DEPA facet analysis, the focus becomes Cx:Agriculture&gt;rice plants&gt;nitrogen fertilizers&gt;effects. (6) Another possible focus for the subject of the query's concept is the following:</p><p>C′x:Agriculture&gt;effects of nitrogen&gt;fertilizers&gt;rice plants.</p><p>A number of different but equivalent focuses could exist for a given subject-based query. Note that the existence of a focus for this query, as well as the form of the focus, depends only upon the querying user's classification of documents. The take-away point is that by merging a subject with one or more focuses, that is, by automatically transforming a query-on-subject into a query-by-focus, the method provides the user with assistance in query refinement. In fact, we compute the focuses generated from the query on subject, and for each focus we consider the concept description that represents the focus in the ALC ontology computed by Algorithm 2. 
Then we proceed as in the case of querying by focus and compute the query expansion of the focus according to knowledge stored in the ontology. Finally, the retrieved facets are shown to the user. If multiple focuses are computed from the query's subject, the user is given the option to refine the original query to indicate which focus they meant for the searched subject.</p></div>
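<div xmlns="http://www.tei-c.org/ns/1.0"><p>The transformation of a query-on-subject into one or more queries-by-focus can be sketched as follows. This is only an illustration under strong assumptions: the subject is given as an ordered partial focus, the user's classification is a list of root-to-leaf paths, and a path counts as a candidate focus when it contains the subject terms in the same order. The function names and the 'botany' path are hypothetical; the sketch does not use the ALC ontology or Algorithm 2.</p>
<p>
```python
from typing import Iterable, List, Sequence, Tuple

Focus = Tuple[str, ...]


def is_subsequence(partial: Sequence[str], full: Sequence[str]) -> bool:
    """True if the terms of `partial` occur in `full` in the same order."""
    remaining = iter(full)
    # `term in remaining` advances the iterator, so order is respected.
    return all(term in remaining for term in partial)


def focuses_for_subject(subject: Sequence[str],
                        classification: Iterable[Focus]) -> List[Focus]:
    """Return every path of the classification that completes the subject."""
    return [path for path in classification if is_subsequence(subject, path)]


# Hypothetical user classification, flattened to root-to-leaf paths.
CLASSIFICATION: List[Focus] = [
    ("agriculture", "rice plants", "nitrogen fertilizers", "effects"),
    ("botany", "rice plants", "nitrogen fertilizers", "effects"),  # invented
    ("agriculture", "apple trees", "disease", "treating"),
]

# Partial focus (5) of the running example.
SUBJECT: Focus = ("rice plants", "nitrogen fertilizers", "effects")

if __name__ == "__main__":
    candidates = focuses_for_subject(SUBJECT, CLASSIFICATION)
    for number, focus in enumerate(candidates, start=1):
        print(number, " > ".join(focus))
    # With more than one candidate, the user would be asked to choose.
```
</p>
<p>With the invented data above, two candidate focuses are found, which corresponds to the case where the user is asked to indicate which focus was meant.</p></div>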
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">RELATED WORK</head><p>There has been extensive work on automated facet construction motivated by query refinement, browsing and navigation over document collections, see for instance <ref type="bibr" target="#b29">[29]</ref>, <ref type="bibr" target="#b8">[8,</ref><ref type="bibr">9]</ref>, <ref type="bibr" target="#b10">[10]</ref>, <ref type="bibr" target="#b24">[24]</ref>, <ref type="bibr" target="#b30">[30,</ref><ref type="bibr" target="#b13">13]</ref>.</p><p>The notion of context in these related works differs from the notion of focus; in <ref type="bibr" target="#b10">[10]</ref> context is a piece of text surrounding the query, taken from a document presented to the user, on which the user marks the query. The structural nature of a focus contrasts with the plain, linguistic nature of query context as meant in <ref type="bibr" target="#b10">[10]</ref>. The navigation trees discussed in <ref type="bibr" target="#b28">[28]</ref> are similar to the focuses discussed in this paper. Moreover, the formal approach of <ref type="bibr" target="#b28">[28]</ref>, as well as its use of faceted taxonomies, is close in spirit, if not in formal development, to the work presented here. As far as we know, none of the foregoing approaches uses a DEPA facet schema.</p><p>Our method is a focused retrieval method, in the sense that focused retrieval addresses ways to provide a querying user with more direct access to relevant information <ref type="bibr" target="#b26">[26]</ref>. 
Focused retrieval aims to identify not only documents relevant to a user's information need, but also where within the document the relevant information is located.</p><p>Our approach of querying-by-focus is similar to querying by focus on hierarchical classifications proposed by <ref type="bibr" target="#b1">[1,</ref><ref type="bibr" target="#b2">2]</ref>.</p><p>In the Indian context, faceted library systems, especially the Colon Classification System (CCS), have been adopted by the majority of academic libraries for organizing collections in a semantic arrangement. However, there is wide scope for applying the faceted theory behind systems such as CCS to other knowledge modeling efforts. Prasad and Guha <ref type="bibr" target="#b18">[18]</ref> introduced a facet-based method to formulate the descriptive domain metadata that could be used to annotate digital library resources. Prasad and Madalli <ref type="bibr" target="#b19">[19]</ref> propose a generic model for building semantic infrastructure for digital libraries based on facets as used in traditional library classification systems.</p><p>Faceted taxonomies are extensively studied, see for instance <ref type="bibr" target="#b21">[21,</ref><ref type="bibr" target="#b27">27,</ref><ref type="bibr" target="#b28">28]</ref> and references therein. Facet techniques include that studied by Tvarožek and Bieliková <ref type="bibr" target="#b27">[27]</ref>, who have proposed faceted navigation and its personalization in digital libraries. 
They follow a method of faceted browser adaptation based on an automatically acquired user model with support for dynamic facet generation.</p><p>J. Polowinski <ref type="bibr" target="#b17">[17]</ref> argues for the use of faceted browsing as a visual selection mechanism for browsing data collections, as it is deemed particularly suitable for structured but heterogeneous data with explicit semantics.</p><p>Normalized Formal Classifications (NFC) used in <ref type="bibr" target="#b11">[11]</ref> take into account both the label of a node and its position by means of natural language processing techniques (see <ref type="bibr">[11, sec 4]</ref>). We, on the other hand, have used an information retrieval technique to find the keywords that are subsequently represented in concept descriptions by using role names of the form hasK.k. This is an important difference with <ref type="bibr" target="#b11">[11]</ref>. A focus is called "concept at a node" in <ref type="bibr">[11, p. 70]</ref>, although we believe that the two notions are not totally equivalent (to be investigated). The notion of Formal Faceted Classification (FFC) extends the notion of "lightweight ontology" of <ref type="bibr" target="#b11">[11]</ref> to facets. A main difference with the lightweight ontologies of <ref type="bibr" target="#b11">[11]</ref> is that FFC's descriptive language is not propositional, unlike the language used in <ref type="bibr" target="#b11">[11]</ref>. Yet, it allows us to automate query refinement through DL reasoning services (SAT), as we did in this paper. Moreover, our query language allows a user to specify a query by selecting a sample document, to be interpreted as the "information need" for documents similar to the selected sample. As a consequence, we provide a user with a mechanism of "querying by example" as a special case. 
On the other hand, in <ref type="bibr" target="#b11">[11]</ref> it does not seem easy to formalize querying by example, as the propositional language used does not allow instances to be represented.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">CONCLUSION</head><p>This paper presented a formal framework for a query refinement method that enables the extraction of the diversity aspects, or facets, of a user query. The method uses the general principles of facet analysis in the DEPA paradigm of facetization and the notion of 'focus', which is used to infer new facets from the user query. The method provides a user with additional and essential contextual information, in the form of new facets. By selecting one of the new facets, the user can narrow down the search by expanding the original query with the suggested facets. The proposed method of query refinement is based on diversity in querying and on the multi-dimensionality of information. Three methods of querying were discussed: keyword-based, by focus, and on subject. For each method, textual and structural dimensions were used to assist the user in refining the query. The textual dimension allowed us to generate the top-k most relevant terms for each concept of a given polyhierarchy of text and text-annotated documents. The structural dimension of the polyhierarchy was used to match DEPA facets with the user query. We have situated our framework within the smallest propositionally closed description logic ALC, and we have used ALC's solver to implement the facet engine as the main component of the method.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: A polyhierarchy, or polyhierarchical context Cx.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: An example of context (left) and focus (right).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: A focus as labeled by Algorithm 1.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>∃R.C, ∀R.C, as well as concept inclusion (or subsumption) C ⊑ D and concept equality C ≡ D, where C, D are concept descriptions and R is a named role. A DL knowledge base (KB) consists of concept axioms (such as concept inclusion and concept equality axioms), role axioms (such as functional role axioms) and assertions of the form C(a), R(a, b) where a and b are named individuals.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: A focus for query 'Apple'.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Position of sample document doc1 matters.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>e, p, a⟩ | C has DEPA facets d, e, p, a}, where C is a concept description in description logic ALC (see subsection 4.2) of a concept or subject of interest in context C, and d, e, p, a are, respectively, a Discipline, Entity, Property and Action in DEPA classification system.</figDesc><table><row><cell>EXAMPLE 3. Consider the previous two examples. We can assume that 'Improving EU labour market access for Rome' is represented by a concept description C1, and 'Treating Apple trees for bacterial disease in Trentino' is represented by a concept description C2 in a context C. The facet repository FR(C) contains ⟨C1 : Economics, LabourMarket, p, Access⟩, where p is unspecified, and ⟨C2 : Agriculture, AppleTrees, Disease, Treating⟩.</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc>3) is the most representative term or sequence of terms in IR (D, C, C). The most representative term among terms in IR (D, C, C) is the term with the highest weight among all terms in IR</figDesc><table /><note>(D, C, C) according to weighting measure 1. Formally, a term k in IR (D, C, C) is the most representative for the cluster C in C, and we say that</note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head></head><label></label><figDesc>Algorithm 3 Query expansion with facets from focused terms. proc QueryExpansion Input: C, O, FR(C) /* C is meant to represent user query */ Define ΩK to be the set of axioms in O of the form C ≡ ∃hasK.k1 ⊓ ⋯ ⊓ ∃hasK.kn; /* k1...kn = lC */ Define ΩF to be the set of axioms in O of the form C ≡ ∃D.d ⊓ ∃E.e ⊓ ∃P.p ⊓ ∃A.a; /* ⟨C : d, e, p, a⟩ is in FR(C) */ if ΩK ∨ ΩF = ∅</figDesc><table><row><cell>then exit</cell><cell cols="2">/* no query expansion provided */</cell></row><row><cell>else</cell><cell></cell></row><row><cell>s := Card(ΩK);</cell><cell cols="2">/* ΩK cardinality is s ≥ 1 */</cell></row><row><cell>t := Card(ΩF);</cell><cell cols="2">/* ΩF cardinality is t ≥ 1 */</cell></row><row><cell cols="3">FacetSet(C) := ∅; /* set of facets retrieved for C */</cell></row><row><cell cols="2">for j = 1 to j = s do</cell></row><row><cell>F00 := ∅;</cell><cell cols="2">/* different facets strings retrieved */</cell></row><row><cell></cell><cell cols="2">/* by using a single axiom in ΩK */</cell></row><row><cell cols="2">for l = 1 to l = t do for i = 1 to i = (n choose 4)</cell><cell>do</cell></row><row><cell cols="3">if O |= ∃hasK.ki1 ⊓ ⋯ ⊓ ∃hasK.ki4 ≡</cell></row><row><cell cols="3">∃D.d ⊓ ∃E.e ⊓ ∃P.p ⊓ ∃A.a</cell></row><row><cell cols="3">/* focused terms and DEPA facets match */</cell></row><row><cell>then</cell><cell></cell></row><row><cell cols="3">F li := F li−1 ∪ {⟨C : d, e, p, a⟩} /* ⟨C : d, e, p, a⟩ retrieved */</cell></row><row><cell></cell><cell cols="2">/* depending on ki1,...,ki4 */</cell></row><row><cell>fi od</cell><cell></cell></row><row><cell>od;</cell><cell></cell></row><row><cell>FacetSet(C) j</cell><cell></cell></row></table></figure>
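<div xmlns="http://www.tei-c.org/ns/1.0"><p>The matching loop of Algorithm 3 can be approximated by the following sketch. It is not the algorithm itself: the entailment test on the ontology O is replaced by plain set equality between a group of four focused terms and a DEPA tuple, which is an assumption made only to keep the example self-contained. The names matches, query_expansion, keywords_of and facet_repository are hypothetical, and the keyword list for 'apple' is invented; the two facet tuples are taken from Example 3.</p>
<p>
```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Depa = Tuple[str, str, str, str]  # (discipline, entity, property, action)


def matches(terms: Tuple[str, ...], facets: Depa) -> bool:
    """Stand-in for the entailment test: set equality on plain strings."""
    return set(terms) == set(facets)


def query_expansion(concept: str,
                    keywords_of: Dict[str, List[str]],
                    facet_repository: Dict[str, Depa]) -> Set[Tuple[str, Depa]]:
    """Collect the DEPA tuples matched by some 4-subset of focused terms."""
    keywords = keywords_of.get(concept, [])
    if not keywords or not facet_repository:
        return set()  # no query expansion provided
    facet_set: Set[Tuple[str, Depa]] = set()
    for described, facets in facet_repository.items():
        # Try every group of four focused terms against the DEPA tuple.
        if any(matches(group, facets) for group in combinations(keywords, 4)):
            facet_set.add((described, facets))
    return facet_set


# Invented focused terms for the query concept 'apple'.
KEYWORDS_OF = {
    "apple": ["agriculture", "appletrees", "disease", "treating", "trentino"],
}

# Facet repository following Example 3 (lower-cased for comparison).
FACET_REPOSITORY: Dict[str, Depa] = {
    "C1": ("economics", "labourmarket", "p", "access"),
    "C2": ("agriculture", "appletrees", "disease", "treating"),
}

if __name__ == "__main__":
    print(query_expansion("apple", KEYWORDS_OF, FACET_REPOSITORY))
```
</p>
<p>In this simplified setting only the tuple for C2 is retrieved. A faithful implementation would delegate the test in matches to a description logic reasoner over the ontology.</p></div>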
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Trentino is a Province of the Italian North-east known for the Dolomites and for its quality production of red and yellow apples.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">To shorten notation, in algorithms we use D, E, P , A instead.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">On the discovery of the semantic context of queries by game-playing</title>
		<author>
			<persName><forename type="first">A</forename><surname>Agostini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Avesani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth International Conference On Flexible Query Answering Systems (FQAS-04)</title>
				<editor>
			<persName><forename type="first">H</forename><surname>Christiansen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M.-S</forename><surname>Hacid</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Andreasen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Larsen</surname></persName>
		</editor>
		<meeting>the Sixth International Conference On Flexible Query Answering Systems (FQAS-04)<address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag LNAI</publisher>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">3055</biblScope>
			<biblScope unit="page" from="203" to="216" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Identification of communities of peers by trust and reputation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Agostini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Moro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eleventh International Conference on Artificial Intelligence: Methodology, Systems, Applications -Semantic Web Challenges (AIMSA-04)</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Bussler</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Fensel</surname></persName>
		</editor>
		<meeting>the Eleventh International Conference on Artificial Intelligence: Methodology, Systems, Applications -Semantic Web Challenges (AIMSA-04)<address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag LNAI</publisher>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">3192</biblScope>
			<biblScope unit="page" from="85" to="95" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Diversifying search results</title>
		<author>
			<persName><forename type="first">R</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gollapudi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Halverson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ieong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM-09)</title>
				<meeting>the Second ACM International Conference on Web Search and Data Mining (WSDM-09)<address><addrLine>New York, NY</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="5" to="14" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Handbook of Description Logics</title>
		<editor>F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider</editor>
		<imprint>
			<date type="published" when="2002">2002</date>
			<publisher>Cambridge University Press</publisher>
			<pubPlace>Cambridge, UK</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Basic description logics</title>
		<author>
			<persName><forename type="first">F</forename><surname>Baader</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Nutt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Handbook of Description Logics</title>
				<editor>
			<persName><forename type="first">F</forename><surname>Baader</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Calvanese</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>McGuinness</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Nardi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Patel-Schneider</surname></persName>
		</editor>
		<meeting><address><addrLine>Cambridge, UK</addrLine></address></meeting>
		<imprint>
			<publisher>Cambridge University Press</publisher>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="47" to="100" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">POPSI: its fundamentals and procedure based on a general theory of subject indexing languages</title>
		<author>
			<persName><forename type="first">G</forename><surname>Bhattacharyya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Library Science with a Slant to Documentation</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="1" to="34" />
			<date type="published" when="1976">1976</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Subject indexing language: its theory and practice</title>
		<author>
			<persName><forename type="first">G</forename><surname>Bhattacharyya</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the DRTC Refresher Seminar-13, New Developments in LIS in India</title>
				<meeting>the DRTC Refresher Seminar-13, New Developments in LIS in India<address><addrLine>Bangalore, India</addrLine></address></meeting>
		<imprint>
			<publisher>DRTC</publisher>
			<date type="published" when="1981">1981</date>
		</imprint>
		<respStmt>
			<orgName>ISI Bangalore Centre</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Automatic discovery of useful facet terms</title>
		<author>
			<persName><forename type="first">W</forename><surname>Dakka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Dayal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ipeirotis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ACM SIGIR 2006 Workshop on Faceted Search</title>
				<meeting>the ACM SIGIR 2006 Workshop on Faceted Search<address><addrLine>New York, NY</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Automatic extraction of useful facet hierarchies from text databases</title>
		<author>
			<persName><forename type="first">W</forename><surname>Dakka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Ipeirotis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2008 IEEE 24th International Conference on Data Engineering (ICDE-08)</title>
				<meeting>the 2008 IEEE 24th International Conference on Data Engineering (ICDE-08)<address><addrLine>Washington, DC, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="466" to="475" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Placing search in context: The concept revised</title>
		<author>
			<persName><forename type="first">L</forename><surname>Finkelstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Gabrilovich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Matias</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rivlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Solan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wolfman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ruppin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Tenth International World Wide Web Conference (WWW-2001)</title>
				<meeting>the Tenth International World Wide Web Conference (WWW-2001)<address><addrLine>New York, NY</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="406" to="414" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Encoding classifications into lightweight ontologies</title>
		<author>
			<persName><forename type="first">F</forename><surname>Giunchiglia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Marchese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Journal on Data Semantics VIII</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Spaccapietra</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag LNCS</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="volume">4380</biblScope>
			<biblScope unit="page" from="57" to="81" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Real life, real users, and real needs: a study and analysis of user queries on the web</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">J</forename><surname>Jansen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Spink</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Saracevic</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="207" to="227" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">AFGF: An automatic facet generation framework for document retrieval</title>
		<author>
			<persName><forename type="first">K</forename><surname>Latha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">R</forename><surname>Veni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Rajaram</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2010 International Conference on Advances in Computer Engineering (ACE-2010)</title>
				<meeting>the 2010 International Conference on Advances in Computer Engineering (ACE-2010)<address><addrLine>Washington, DC, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="110" to="114" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Context in Web Search</title>
		<author>
			<persName><forename type="first">S</forename><surname>Lawrence</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Data Engineering Bulletin</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="25" to="32" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">On the interdisciplinary foundations of diversity</title>
		<author>
			<persName><forename type="first">V</forename><surname>Maltese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Giunchiglia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Denecke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wallner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Baldry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Madalli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First International Workshop on Living Web at ISWC-09</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Boato</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Niederee</surname></persName>
		</editor>
		<meeting>the First International Workshop on Living Web at ISWC-09<address><addrLine>Washington D.C., USA</addrLine></address></meeting>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2009-10-26">October 26, 2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Information architecture for the World Wide Web</title>
		<author>
			<persName><forename type="first">P</forename><surname>Morville</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rosenfeld</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<publisher>O&apos;Reilly Media, Inc</publisher>
			<pubPlace>Sebastopol, CA</pubPlace>
		</imprint>
	</monogr>
	<note>3rd edition</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Human interface and the management of information. Designing information environments</title>
		<author>
			<persName><forename type="first">J</forename><surname>Polowinski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Symposium on Human Interface 2009, held as Part of HCI International 2009 (HCII-09)</title>
				<editor>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Smith</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Salvendy</surname></persName>
		</editor>
		<meeting>the Symposium on Human Interface 2009, held as Part of HCI International 2009 (HCII-09)<address><addrLine>San Diego, CA, USA; Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag LNCS</publisher>
			<date type="published" when="2009">July 19-24, 2009</date>
			<biblScope unit="volume">5617</biblScope>
			<biblScope unit="page" from="601" to="610" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Expressing faceted subject indexing in SKOS/RDF</title>
		<author>
			<persName><forename type="first">A</forename><surname>Prasad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Guha</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the First International Conference of Semantic Web and Digital Libraries (ICSWDL-07)</title>
				<meeting>the First International Conference of Semantic Web and Digital Libraries (ICSWDL-07)<address><addrLine>Bangalore</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007-02">21-23 February 2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Semantic digital faceted infrastructure for semantic digital libraries</title>
		<author>
			<persName><forename type="first">A</forename><surname>Prasad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Madalli</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Library Review</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="225" to="234" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Prolegomena to Library Classification</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Ranganathan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1967">1967</date>
			<publisher>Asia Publishing House</publisher>
			<pubPlace>London</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Dynamic Taxonomies and Faceted Search</title>
		<author>
			<persName><forename type="first">G</forename><surname>Sacco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tzitzikas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Information Retrieval Series</title>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>Berlin Heidelberg</pubPlace>
			<date type="published" when="2009">2009</date>
			<biblScope unit="volume">25</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">The SMART Retrieval System: Experiments in Automatic Document Processing</title>
		<author>
			<persName><forename type="first">G</forename><surname>Salton</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1971">1971</date>
			<publisher>Prentice-Hall Inc</publisher>
			<pubPlace>Englewood Cliffs, NJ</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Introduction to Modern Information Retrieval</title>
		<author>
			<persName><forename type="first">G</forename><surname>Salton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>McGill</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1983">1983</date>
			<publisher>McGraw-Hill</publisher>
			<pubPlace>New York, NY</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Automating creation of hierarchical faceted metadata structures</title>
		<author>
			<persName><forename type="first">E</forename><surname>Stoica</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Hearst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Richardson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Human Language Technology Conference (NAACL HLT)</title>
				<meeting>the Human Language Technology Conference (NAACL HLT)<address><addrLine>Rochester, NY, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="244" to="251" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Using keyword extraction for web site clustering</title>
		<author>
			<persName><forename type="first">P</forename><surname>Tonella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ricca</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Pianta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Girardi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth International Workshop on Web Site Evolution (WSE-03)</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Wong</surname></persName>
		</editor>
		<meeting>the Fifth International Workshop on Web Site Evolution (WSE-03)<address><addrLine>Amsterdam, The Netherlands</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="41" to="48" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Current research in focused retrieval and result aggregation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Trotman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Geva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kamps</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lalmas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Murdock</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Special Issue in the Journal of Information Retrieval</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="407" to="411" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Personalized faceted browsing for digital libraries</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tvarožek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bieliková</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 11th European Conference on Digital Libraries (ECDL-07)</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Kovács</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Fuhr</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Meghini</surname></persName>
		</editor>
		<meeting>the 11th European Conference on Digital Libraries (ECDL-07)<address><addrLine>Budapest, Hungary; Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag LNCS</publisher>
			<date type="published" when="2007">September 16-21, 2007</date>
			<biblScope unit="volume">4675</biblScope>
			<biblScope unit="page" from="485" to="488" />
		</imprint>
	</monogr>
	<note>Research and Advanced Technology for Digital Libraries</note>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Extended faceted taxonomies for web catalogs</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Tzitzikas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Spyratos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Constantopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Analyti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third International Conference on Web Information Systems Engineering (WISE-02)</title>
				<meeting>the Third International Conference on Web Information Systems Engineering (WISE-02)</meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="192" to="204" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Faceted exploration of image search results</title>
		<author>
			<persName><forename type="first">R</forename><surname>Van Zwol</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Sigurbjörnsson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Nineteenth International World Wide Web Conference (WWW-10)</title>
				<meeting>the Nineteenth International World Wide Web Conference (WWW-10)<address><addrLine>New York, NY</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="961" to="970" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Efficient computation of diverse query results</title>
		<author>
			<persName><forename type="first">E</forename><surname>Vee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shanmugasundaram</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Amer-Yahia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2008 IEEE 24th International Conference on Data Engineering (ICDE-08)</title>
				<meeting>the 2008 IEEE 24th International Conference on Data Engineering (ICDE-08)<address><addrLine>Washington, DC, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="228" to="236" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">Faceted classification: A guide to construction and use of special schemes</title>
		<author>
			<persName><forename type="first">B</forename><surname>Vickery</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1960">1960</date>
			<publisher>Aslib - Asia Publishing House</publisher>
			<pubPlace>London</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Resolving tag ambiguity</title>
		<author>
			<persName><forename type="first">K</forename><surname>Weinberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Slaney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Van Zwol</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th International ACM Conference on Multimedia (MM 2008)</title>
				<meeting>the 16th International ACM Conference on Multimedia (MM 2008)<address><addrLine>New York, NY</addrLine></address></meeting>
		<imprint>
			<publisher>ACM Press</publisher>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
