<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A Semi-Automatic Semantic Annotation and Authoring Tool for a Library Help Desk Service</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Antti</forename><surname>Vehviläinen</surname></persName>
							<email>antti.vehvilainen@tkk.fi</email>
						</author>
						<author>
							<persName><forename type="first">Eero</forename><surname>Hyvönen</surname></persName>
							<email>eero.hyvonen@tkk.fi</email>
						</author>
						<author>
							<persName><forename type="first">Olli</forename><surname>Alm</surname></persName>
							<email>olli.alm@tkk.fi</email>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="laboratory">Laboratory of Media Technology Semantic Computing Research Group</orgName>
								<orgName type="institution">Helsinki University of Technology (TKK)</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Laboratory of Media Technology</orgName>
								<orgName type="institution">Helsinki University of Technology (TKK)</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="institution">University of Helsinki Semantic Computing Research Group</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff3">
								<orgName type="institution">University of Helsinki</orgName>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff4">
								<orgName type="department">Semantic Computing Research Group</orgName>
								<orgName type="institution">Helsinki University of Technology (TKK)</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">A Semi-Automatic Semantic Annotation and Authoring Tool for a Library Help Desk Service</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">38B1AA5708E87E7F8E1265086B1281A9</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T16:23+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper discusses how knowledge technologies can be utilized in creating help desk services on the semantic web. To ease the content indexer's work, we propose semi-automatic semantic annotation of natural language text for annotating question-answer (QA) pairs, and case-based reasoning techniques for finding similar questions. To provide answers matching with the indexer's and end-user's information needs, methods for combining case-based reasoning with semantic search and browsing are proposed. We integrate different data sources by using large ontologies of upper common concepts, places, and agents. Techniques to utilize these sources in authoring answers are suggested. A prototype implementation of a real life ontology-based help desk application is presented as a proof of concept. This system is based on the data set of over 20,000 QA pairs and the operational principles of an existing national library help desk service in Finland.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>Companies and public organizations widely use help desk services to solve problems for their customers. The classic example of a help desk service is a call center, where support persons answer questions by phone or by email. As help desk services are being transferred to the Web, it is increasingly common that customers can also solve their problems by themselves, using the knowledge and content accumulated at the service, without contacting a support person directly <ref type="bibr" target="#b4">[5]</ref>. A simple approach, for example, is to publish Frequently Asked Questions (FAQ) lists on the web. The option to use a simple and fast question-answer (QA) self-service is appreciated not only by the customers, but by the authors of the answers, too. Their time is saved if the QA service can automatically provide an answer to the customer. Furthermore, the author can use the accumulated QA knowledge of the service herself, which helps in authoring the answers and improves their quality. This paper discusses applications of semantic web technologies to help desk services. We focus on QA help desk services, where the database of the service is composed of previously answered questions, i.e., QA pairs. In such a service the user has a question in mind, and the service has two major tasks:</p><p>1. Finding relevant previous answers. A search method is needed to find the already answered relevant QA pairs from the repository.</p><p>2. Authoring a new answer. An existing QA pair may satisfy the customer's information need, but usually some kind of adaptation of the old answer case is needed. Usually answers are created and modified manually by a human editor.</p><p>The research problem of this paper is to investigate how to support semi-automatic answer authoring in a QA help desk service. Our methodology is to use semantic web technologies in content annotation, in utilizing the QA repository, and in integrating information available online on the web with the authoring process and the answers.</p><p>In this paper, when we use the term indexing we refer to the old, existing way of doing indexing where index terms are just strings without an ontological reference. We use the term annotation to refer to the new way of using annotation concepts that have an ontological reference.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1">The Existing Service</head><p>The research is based on a real life case study: we use the data set of the operational Ask a librarian service<ref type="foot" target="#foot_0">1</ref> offered nationally in Finland by the editors of the Libraries.fi<ref type="foot" target="#foot_1">2</ref> portal. In this service the clients can send questions to a virtual librarian via email, and a librarian of the service provides an answer within three working days. Some of the questions that the clients send are simple and the librarian can answer them straight away. These include questions about the opening times of a library, how to make an inter-library loan etc. However, most of the questions require that the [...] Where I could find helpful information? Answers to these questions typically span a few paragraphs of text and contain some links to useful web sites. The librarians report that on average they spend from half an hour to an hour composing such an answer.</p><p>Each QA pair has been indexed using the YSA thesaurus (http://vesa.lib.helsinki.fi) of some 23,000 common Finnish terms. At the moment the data set consists of over 20,000 QA pairs. A keyword-based search service is available on the web for both end-users and answering librarians to use.</p><p>Several problems in the service were identified by interviewing the librarians employed by the service:</p><p>1. Accessing accumulated knowledge. For a newly submitted question, the first thing to do is often to find out whether a similar or at least related answer already exists in the knowledge base.</p><p>2. Exploiting external resources in authoring. How to integrate different data sources and services, such as library systems on the web, and then use these sources in authoring a new answer?</p><p>3. Semantic annotation. How to help the librarian in choosing the appropriate annotation concepts for a new QA pair? This problem was considered especially crucial by the practitioner.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2">The Proposed Solution</head><p>We approach the problems described above with a prototype of a semantic annotation and authoring tool, Opas<ref type="foot" target="#foot_2">4</ref> <ref type="bibr" target="#b19">[20]</ref>. The system is intended to be used by the librarians in authoring answers in the Ask a librarian service.</p><p>In the following, we first show how semi-automatic semantic annotation can be used to help in choosing concepts for the semantic annotation of QA pairs, based on ontologies. Then the problem of finding relevant answers for a new incoming question is approached by using ideas of case-based reasoning (CBR) <ref type="bibr" target="#b0">[1]</ref>. It is also shown how a common upper ontology can be used to integrate different data sources to help in authoring answers. We then present the results of the early evaluations conducted with the prototype. In conclusion, contributions of the work are summarized, related work discussed, and directions of further research outlined.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">SEMI-AUTOMATIC SEMANTIC ANNOTATION</head><p>When interviewing the librarians, two problems related to indexing the QA pairs were brought up: 1) Choosing the appropriate indexing terms for annotating a question-answer pair is often time-consuming and difficult. 2) Different people use different conventions in indexing, which makes the content unbalanced. For example, one librarian may use a few general terms to describe an answer, whereas another uses a large number of more detailed terms.</p><p>Our approach to these problems is to combine ontology-based semi-automatic annotation <ref type="bibr" target="#b12">[13]</ref> and machine reasoning. The idea is to create a knowledge-based system that automatically provides the annotator with a suggestion of potential annotation concepts based on the textual material and other knowledge available, such as the QA database, earlier annotations, and common knowledge about indexing practices. The initial suggestion is then checked and edited by the human editor as she likes. This strategy not only helps the annotator in finding annotation terms (from tens of thousands of choices) but also steers the annotators towards the right terms based on the underlying annotation ontologies. Furthermore, content is likely to become more balanced because every annotator starts her job from a suggestion based on the same logic. By encoding indexers' knowledge and common indexing practices as rules, or by using automatic techniques such as collaborative filtering <ref type="bibr" target="#b6">[7]</ref>, it is possible to help especially novice indexers in their job even further.</p><p>As a first step towards such a knowledge-based semi-automatic annotation tool, we created an ontology-based information extraction tool Poka (http://www.seco.tkk.fi/applications/poka/) for textual data, and integrated it with Opas. The following describes briefly how Poka works.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Extracting Annotation Concepts</head><p>Poka provides the QA indexer with a list of possible annotation concepts as ontological concepts (URIs), and the indexer chooses which concepts she wants to use. The selection of the concepts is based on the words and expressions found in the question and answer.</p><p>The librarians currently choose the indexing terms manually from the General Finnish Thesaurus YSA<ref type="foot" target="#foot_3">6</ref>. The terms in YSA are (with some exceptions) common noun terms, such as dog, astronomy, or child. In addition, the indexer may use free indexing terms that are not explicitly listed in the thesaurus. Free terms can be common nouns, such as names of flowers or animals, or proper nouns, such as person names (e.g., John F. Kennedy) or geographical places (Finland, Beijing). These categories of words, and free indexing terms not explicitly listed in the thesaurus, are treated by Poka in the following way.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.1">Common Nouns</head><p>In order to map common nouns in YSA with corresponding ontology concepts, YSA was transformed into the General Finnish Upper Ontology (YSO)<ref type="foot">7</ref> <ref type="bibr" target="#b10">[11]</ref>. YSO contains over 20,000 Finnish indexing concepts organized into 10 major subsumption hierarchies. Each concept is associated with one or more term labels, which allows mapping of words and terms onto YSO concepts (URIs).</p><p>First, the input question is analysed by a morphological analyser and syntactic parser, FDG<ref type="foot">8</ref> <ref type="bibr" target="#b17">[18]</ref>. It produces tokenized output of the text in XML form. For each token, FDG produces a lemmatized form of the word(s), morphological information, syntactic information, and the type and reference of a functional dependency to another token within the sentence, if one exists.</p><p>For concept matching, the labels of YSO concepts are lemmatized as well. Lemmatized concepts are indexed in a prefix trie for efficient extraction. Lemmatization of text and concept names helps to achieve better recall in the extraction process, because the surface forms of words vary greatly in languages with heavy morphological affixation <ref type="bibr" target="#b16">[17]</ref>. The architecture can be extended to support other languages with different language-dependent syntactic parsers.</p></div>
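<div xmlns="http://www.tei-c.org/ns/1.0"><p>The lemmatize-then-look-up idea described above can be illustrated with a minimal Python sketch. It is not Poka's implementation: the small lemma table only stands in for the FDG analyser, and the labels, concept identifiers (yso:dog and so on), and function names are hypothetical.</p><p>
```python
# Sketch only: LEMMAS stands in for a real morphological analyser such as FDG,
# and the labels and concept identifiers below are invented for illustration.
LEMMAS = {"koirat": "koira", "koiran": "koira", "lapset": "lapsi"}


def lemmatize(token):
    """Lowercase a token, strip punctuation, and map known word forms to a lemma."""
    token = token.lower().strip(".,:;?!")
    return LEMMAS.get(token, token)


def build_trie(labels):
    """Build a character trie over lemmatized labels; the key "$" holds the concept."""
    root = {}
    for label, concept in labels.items():
        node = root
        for ch in lemmatize(label):
            node = node.setdefault(ch, {})
        node["$"] = concept
    return root


def lookup(trie, lemma):
    """Return the concept stored for exactly this lemma, or None."""
    node = trie
    for ch in lemma:
        node = node.get(ch)
        if node is None:
            return None
    return node.get("$")


def extract_concepts(trie, text):
    """Collect the concepts whose label matches a lemmatized token of the text."""
    found = []
    for token in text.split():
        concept = lookup(trie, lemmatize(token))
        if concept is not None and concept not in found:
            found.append(concept)
    return found


trie = build_trie({"koira": "yso:dog", "lapsi": "yso:child", "kissa": "yso:cat"})
concepts = extract_concepts(trie, "Lapset ja koirat: kirjoja koiran hoidosta?")
print(concepts)
```
</p><p>In this sketch the inflected forms "Lapset", "koirat" and "koiran" all reach a concept because both the text and the labels are reduced to lemmas before the trie lookup. Multi-word labels and compound analysis are deliberately left out.</p></div>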
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.2">Place Names</head><p>Place name recognition in Poka is based on the same method as common noun recognition. In this case, the place ontology of the MuseumFinland portal <ref type="bibr" target="#b9">[10]</ref>, extended in the CultureSampo project<ref type="foot">9</ref>, is used instead of YSO.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.3">Person Names</head><p>Poka's name recognition tool is a rule-based information extraction tool without initial gazetteers. The main idea [...] A strength of Poka's extraction process is that it also recognizes atypical names, unlike tools based on gazetteers, such as tools that use the initial named entity recognition of the Gate framework <ref type="bibr" target="#b2">[3]</ref>. The search for potential names starts from the uppercase words of the document. With morphosyntactic clues some hits can be discarded. For example, first names in Finnish rarely have certain morphological affixes such as -ssa (similar to the English preposition in) or -lla (preposition on). The FDG parser's surface-syntactic analysis is also used as a clue for revealing proper names.</p><p>Person name recognition may produce false hits. One wrong hit on a full name may cause the corresponding wrong first and last name occurrences to be mapped to that full name. The good thing is that all the occurrences of the false name can be corrected by discarding the full name.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Free Annotation Concepts</head><p>Poka doesn't always suggest all annotation concepts that the librarian wants to use, even if the corresponding word can be found in the text to be annotated, and the word is considered a legal annotation concept. This always happens with free annotation concepts, which by definition are not included in the ontology explicitly. Obviously, human intervention is necessary in such cases.</p><p>Our approach to the problem of extracting free annotation concepts is to provide a mechanism by which the end-users can define new free annotation entries in the ontology and share them with other annotators. A new annotation concept is defined by simply telling the system its class, label, and an optional comment. For example, the term "leikkiauto" (toy car) is not present in the YSO ontology because lots of things can be used as toys, and it does not make much sense to list them all in the system. On the other hand, the concept toy car is useful from the indexing and information retrieval viewpoints. In this case, the user can interactively create a new concept as a subclass of an existing ontological concept, here toy ("lelu"), label it, here "leikkiauto" (toy car), and use it in the annotation. When searching for content later on by using the concept toy ("lelu"), QA pairs annotated with toy car ("leikkiauto") can also be retrieved, with the additional information that in this case the QA pair is about toy cars in particular. The new concept of toy car can also be utilized in various ways in the user interface, e.g., as a search category in view-based semantic search <ref type="bibr" target="#b9">[10]</ref>. Free indexing terms with the same name can be distinguished with different URIs and with an additional comment.</p><p>Unknown but relevant annotation concepts without a corresponding concept in the ontologies are frequently encountered also in name recognition because new names (e.g., names of pop stars) are constantly introduced as time goes by. The same approach used with free annotation concepts can be employed here, too.</p><p>In some cases where a word does not have an exact match with an ontological concept, Poka is able to suggest related annotation concepts based on the ontology. Such reasoning can be based, for example, on the morphological structure of a compound word or the functional dependencies produced by the FDG parser.</p></div>
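<div xmlns="http://www.tei-c.org/ns/1.0"><p>The toy car example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the Opas data model: the in-memory dictionaries, the ex: identifiers, and the function names are hypothetical, and a real system would store the concepts as URIs in the ontology.</p><p>
```python
# Sketch only: a hypothetical in-memory ontology, not the YSO data model.
broader = {"ex:toy": "ex:object"}        # concept -> its direct superconcept
labels = {"ex:toy": ("lelu", "toy")}     # concept -> (label, comment)


def add_free_concept(concept, parent, label, comment=""):
    """Register a user-defined concept as a subclass of an existing concept."""
    if parent not in broader and parent not in labels:
        raise KeyError("unknown parent concept: " + parent)
    broader[concept] = parent
    labels[concept] = (label, comment)


def is_a(concept, ancestor):
    """True if concept equals ancestor or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = broader.get(concept)
    return False


def search(annotations, query):
    """Return the QA pairs annotated with the query concept or any subconcept."""
    return sorted(qa for qa, concepts in annotations.items()
                  if any(is_a(c, query) for c in concepts))


add_free_concept("ex:toy_car", "ex:toy", "leikkiauto", "toy car")
annotations = {"qa1": ["ex:toy_car"], "qa2": ["ex:object"], "qa3": ["ex:toy"]}
hits = search(annotations, "ex:toy")
print(hits)
```
</p><p>Searching with the broader concept toy returns both the pair annotated with toy and the pair annotated with the user-defined toy car, while the pair annotated only with the more general concept is left out.</p></div>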
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Ranking Annotation Concepts</head><p>Previous sections analyzed situations where a semantic annotator produces too few relevant annotation concepts. A reverse problem with automatic semantic annotation is that often too many irrelevant concepts are suggested. Especially if the input text is long, a considerable number of possible annotation concepts are usually found. In such cases it is useful to rank the concepts according to their likely relevance, and provide the end-user with a simple mechanism for evaluating and deleting the irrelevant annotations.</p><p>Opas uses the idea <ref type="bibr" target="#b15">[16]</ref> of searching for semantic cluster(s) from the term set for determining the relevance of indexing concepts: terms in semantic clusters are ranked more relevant than semantically isolated terms. For example, the terms doctor, sickness and medication form a semantic cluster. For common noun terms we use the concept relations defined in the YSO ontology to identify these clusters.</p><p>In <ref type="bibr" target="#b7">[8]</ref>, an ontological extension of the classic tf-idf (term frequency-inverse document frequency) method is developed, which enables us to identify synonyms and to utilize the concept hierarchies of the ontology. We apply this work so that more weight is given to concepts that appear frequently in the text but haven't been used often as annotation concepts in previous questions. In addition, Opas can suggest annotation concepts that are usually used together. For example, if a question has the concept aviation extracted, and there are lots of questions annotated with both aviation and airplane, the concept airplane can be suggested as an annotation concept, even though it is not explicitly present in the question text.</p><p>Our preliminary experiments with annotation concept weighting seem to suggest that relatively more weight should be given to terms that have a high term frequency, and the effect of inverse document frequency should be relatively smaller. The reasoning behind this is that if, say, the concept poetry appears in a question many times, it seems that the concept is relevant to the question even though it has been used frequently as an annotation concept in previous questions. So, in Opas the main weight is determined by the term frequency, whereas inverse document frequency and semantic clusters have a smaller impact on the weight.</p></div>
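<div xmlns="http://www.tei-c.org/ns/1.0"><p>One way to picture this weighting is the Python sketch below. It is an assumption-laden illustration, not the formula used in Opas: the additive combination, the smoothed logarithmic idf, the coefficient values (1.0, 0.3, 0.3), and all counts in the example are invented. The only property taken from the text is that term frequency dominates, while inverse document frequency and cluster membership contribute less.</p><p>
```python
import math


def rank_concepts(term_freq, annotation_counts, total_questions, related,
                  w_tf=1.0, w_idf=0.3, w_cluster=0.3):
    """Rank concepts; the weights are hypothetical and favour term frequency."""
    scores = {}
    for concept, tf in term_freq.items():
        # Smoothed idf over previous annotations: rarely used concepts score higher.
        used = annotation_counts.get(concept, 0)
        idf = math.log((1 + total_questions) / (1 + used))
        # A concept is in a cluster if a related concept was also extracted.
        in_cluster = any(other in term_freq for other in related.get(concept, ()))
        bonus = 1.0 if in_cluster else 0.0
        scores[concept] = w_tf * tf + w_idf * idf + w_cluster * bonus
    return sorted(scores, key=scores.get, reverse=True)


# Invented example data: poetry occurs often in the text but is a common annotation.
term_freq = {"poetry": 4, "doctor": 1, "medication": 1, "holiday": 1}
annotation_counts = {"poetry": 900, "doctor": 50, "medication": 50, "holiday": 50}
related = {"doctor": ["medication", "sickness"], "medication": ["doctor"]}
ranked = rank_concepts(term_freq, annotation_counts, 20000, related)
print(ranked)
```
</p><p>With these made-up numbers poetry is ranked first despite its frequent earlier use, and the clustered concepts doctor and medication are ranked above the isolated concept holiday.</p></div>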
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">An Example</head><p>Figure <ref type="figure" target="#fig_0">1</ref> depicts the first screen that the librarian sees when she has decided to answer a question. The end-user has submitted a question about the life and books of Arto Paasilinna, a Finnish author (on the left, in the box "Kysymysteksti" (Question Text)). On the right, in the box "Oppaan löytämät käsitteet" (Indexing Concepts Found), there are two common noun concepts, "teokset" (writings) and "esitelmät" (plays). Poka has also identified the person name "Arto Paasilinna". Below the question text, there is the authoring component "Vastaajan apurit" (Authoring Tools), to be discussed in detail in section 4.</p><p>Figure <ref type="figure">2</ref> depicts the case where the free annotation concept "leikkiauto" (toy car) is encountered. In this case, Poka analyses the compound term into pieces and suggests the concept "leikkikalu" (toy) because it is found in the YSO ontology as a potentially related concept based on the first part of the compound. The librarian can then define the narrower concept toy car with the label "leikkiautot" (toy cars) by clicking on the link in the middle.</p><p>Figure <ref type="figure">3</ref> depicts the case where Poka is unable to make any suggestions, and the librarian wants to add the new annotation concept writer ("kirjailijat") to the ontology. As she is typing in the word, Opas uses semantic autocompletion <ref type="bibr" target="#b8">[9]</ref> to suggest matching annotation concepts in YSO. The floating box on the bottom right displays information about a concept: its preferred and alternative labels, related concepts, subconcepts, and superconcepts. This information is displayed when the librarian points at a concept with the mouse. The purpose of the autocompletion component is to 1) ensure that the indexer uses a concept found in the ontology and 2) suggest semantically related indexing concepts that the librarian perhaps didn't consider.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">UTILIZING CASE-BASED REASONING TO FIND SIMILAR QUESTIONS</head><p>Case-based reasoning (CBR) <ref type="bibr" target="#b0">[1]</ref> is a problem solving paradigm in artificial intelligence where new problems are solved based on previously experienced similar problems, called cases. The CBR cycle consists of four phases: 1) Retrieve the most similar case or cases, 2) Reuse the retrieved case(s) to solve the problem, 3) Revise the proposed solution and 4) Retain the solution as a new case in the case base.</p><p>Since similar QA pairs recur in QA services, we decided to investigate the usefulness of CBR in QA indexing and information retrieval. CBR has been used in help desk applications previously. For example, Goker and Roth-Berghofer <ref type="bibr" target="#b5">[6]</ref> argue that CBR can successfully be used in a help desk service, and that by using CBR an organization can strengthen its common knowledge and reduce the time needed to answer a help request. Kai et al. <ref type="bibr" target="#b11">[12]</ref> have found that users of a CBR-based help desk system tend to remember solutions longer since they feel that they've solved the problem themselves, even though the solution was retrieved and possibly adapted from the case base.</p><p>What Opas brings to the traditional CBR approach is that it integrates semantic annotation into the steps of the CBR cycle. For the first step, Opas contains a CBR component that automatically searches for similar questions based on the concepts that Poka has extracted from the question text. The weighted annotation concept list discussed in section 2.3 is used as the basis for the search with the following modifications: 1) The concepts that the indexer has selected are given a substantially higher weight since their relevance has been confirmed by the indexer.</p><p>2) The extracted places, names and specified concepts are given a higher weight due to their specificity.</p></div>
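<div xmlns="http://www.tei-c.org/ns/1.0"><p>The retrieval step with the two weight modifications can be sketched as follows. The boost factors (3.0 and 1.5), the overlap-based similarity score, the concept identifiers, and the case base are assumptions made for this example; the text does not state which values or which similarity measure Opas uses.</p><p>
```python
def boost_weights(weights, selected, specific,
                  selected_boost=3.0, specific_boost=1.5):
    """Raise the weight of indexer-confirmed and of specific concepts.

    The boost factors are hypothetical; the text only says "higher".
    """
    boosted = {}
    for concept, weight in weights.items():
        if concept in selected:
            weight *= selected_boost
        if concept in specific:
            weight *= specific_boost
        boosted[concept] = weight
    return boosted


def retrieve(case_base, query_weights, top_k=3):
    """Score each stored QA pair by the summed weights of the shared concepts."""
    scored = []
    for case_id, concepts in case_base.items():
        score = sum(w for c, w in query_weights.items() if c in concepts)
        if score:
            scored.append((score, case_id))
    # Highest score first; ties are broken by case identifier.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [case_id for _, case_id in scored[:top_k]]


weights = {"writings": 1.0, "person:Arto_Paasilinna": 1.0, "plays": 0.5}
query = boost_weights(weights,
                      selected={"writings"},
                      specific={"person:Arto_Paasilinna"})
case_base = {
    "qa1": {"person:Arto_Paasilinna", "writings"},
    "qa2": {"plays"},
    "qa3": {"writings"},
    "qa4": {"libraries"},
}
similar = retrieve(case_base, query)
print(similar)
```
</p><p>Here the pair sharing both the confirmed concept and the person name is retrieved first, and the pair with no shared concept is not returned at all.</p></div>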
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">INTEGRATING DIFFERENT DATA SOURCES IN ANSWER AUTHORING</head><p>When we discussed the current service with the librarians, a few things stood out about the information sources that the librarians use when answering a question. Firstly, nearly all of the librarians said that they use the reference library with real books to find useful resources. Secondly, even though nearly all the librarians agreed that the questions tend to repeat themselves, not many of them systematically use the question archive to find old similar questions. Besides that, it is notable that when the librarians aren't able to answer a question in three working days, they nevertheless send an answer to the client. This answer usually contains pointers to different information resources, for example web sites, that might contain the answer to the question.</p><p>Based on the remarks described above, we decided to add an authoring component to Opas. The purpose of this component is to help the librarian to compose the answer using different information sources. The authoring component can be seen in figure <ref type="figure" target="#fig_0">1</ref> ("Vastaajan apurit"). What is common to the subcomponents of the authoring component is that each of them uses the annotation concept suggestions produced by Poka to query external resources. The common upper ontology YSO acts as a "glue" between different information resources. In the following the subcomponents of the authoring component are explained.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Authoring Using Existing QA Pairs</head><p>Existing QA pairs can be used as a basis for composing the new answer. In figure <ref type="figure" target="#fig_2">4</ref> the librarian has opened one of the questions in order to see whether it provides useful information for answering the question. The answer can be used as a basis for the new answer by clicking the link (the white paper sheet with a pen). Figure <ref type="figure" target="#fig_3">5</ref> depicts how the librarian has used an existing answer as a basis for the answer.</p><p>As the retrieval of similar QA pairs can be seen as the first step in the CBR cycle, using them in the authoring component can be seen as a part of the second step: Reuse the retrieved case(s) to solve the problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Authoring Using a Library Classification System</head><p>An ontology for a library classification system was created for Opas, and then the Helsinki City Library Classification System (HCLCS) <ref type="foot" target="#foot_4">10</ref> was converted into this ontologized form. The basis for the classification ontology is the Simple Knowledge Organisation System (SKOS) <ref type="foot" target="#foot_5">11</ref>, and the conversion was made following the guidelines given in <ref type="bibr" target="#b18">[19]</ref>. In addition to class hierarchies the HCLCS contains index terms, and each of these terms has a relation to a library class. For example, the term Treatment of alcoholics has a relation to the library class 371.71 Alcohol policy.</p><p>Index terms in the HCLCS also contain views, as can be seen in figure <ref type="figure" target="#fig_4">6</ref>. For example, the term pieces of art ("Teokset") embodies different viewpoints such as bibliographies and art collections. Each of these viewpoints is related to a library class. These relations between index terms and library classes are used to search for books that could be relevant to the answer. These books are searched based on the library class, as depicted in figure <ref type="figure" target="#fig_4">6</ref>. The librarian can use the results of the book search 1) for searching an answer for the question and 2) for enhancing the answer with links to interesting books.</p></div>
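<div xmlns="http://www.tei-c.org/ns/1.0"><p>The chain from index term through viewpoint to library class and then to books can be pictured with the small Python sketch below. Only the relation between Treatment of alcoholics and class 371.71 comes from the text; the class codes X.1 and X.2, the viewpoint name "general", the catalogue contents, and the function name are placeholders invented for illustration, and the real data is held in a SKOS-based ontology rather than in dictionaries.</p><p>
```python
# Sketch only. The class 371.71 (Alcohol policy) comes from the text above;
# the codes "X.1" and "X.2", the viewpoint "general", and the book titles
# are placeholders invented for this example.
index_terms = {
    "Treatment of alcoholics": {"general": "371.71"},
    "Teokset": {"bibliographies": "X.1", "art collections": "X.2"},
}
catalogue = {
    "371.71": ["Placeholder book on alcohol policy"],
    "X.1": ["Placeholder bibliography"],
}


def books_for_term(term, viewpoint=None):
    """Map an index term (optionally one viewpoint) to classes, then to books."""
    views = index_terms.get(term, {})
    classes = [cls for view, cls in views.items() if viewpoint in (None, view)]
    return {cls: catalogue.get(cls, []) for cls in classes}


print(books_for_term("Treatment of alcoholics"))
print(books_for_term("Teokset", viewpoint="bibliographies"))
```
</p><p>Restricting the lookup to one viewpoint narrows the book search to the single library class related to that viewpoint.</p></div>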
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Authoring Using a Link Library</head><p>The editors of Libraries.fi maintain a collection of links to interesting web sites. This link library is categorized using the same classification system that is used in the HCLCS. An ontology was created and then the data was converted into an ontologized form in a manner similar to that described in the previous section. Figure <ref type="figure" target="#fig_5">7</ref> depicts a screenshot of this link library. The links are categorized by the HCLCS ("Henkilöbibliografiat", "Lastenkirjastotyö", etc.), and the librarian has opened one category to see whether there are interesting links. These links can be added to the answer text as can be seen in figure <ref type="figure" target="#fig_3">5</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">EVALUATION</head><p>To evaluate the current version of the prototype and to find out librarians' initial attitudes towards the new version of the system, a few user tests were run with real users of the service. In each test the librarian was first introduced to the prototype and its features. Then, she was asked to answer a question using the prototype. The questions were real questions from the existing version of the service. Finally, the librarian was interviewed about the answering process.</p><p>The results of the evaluation were encouraging. All librarians found the features of the prototype useful and said that they would take the prototype into use, if it were possible. The most impressive and useful part for the librarians seemed to be the authoring features of the prototype, especially the component that searches for existing similar questions automatically. All librarians were also pleased with the authoring features that make it possible to add resources (old answers, links, book references) to the answer by clicking a button.</p><p>The annotation concept suggestions were welcomed, but not as eagerly as the authoring components. Some of the librarians said that the concept suggestions were entirely irrelevant. The semantic autocompletion component that searches for concepts in YSO was considered useful. Based on the tests, nothing can yet be said about how good the ranking of the concept suggestions was.</p><p>When a librarian hasn't selected and confirmed any of the suggested annotation concepts, the authoring component fetches resources based on all of the concepts in the list. However, when the librarian had selected one or more suggestions to be used, it was confusing that the authoring component still fetched resources related to unselected concepts. Although these resources were given a smaller weight and thus were lower in the result list, it seems that when the librarian has selected one or more concept suggestions or inserted a free annotation concept, the other, unselected concepts should be ignored totally in the result lists of the authoring components.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">DISCUSSION</head><p>First experiments with combining semi-automatic semantic annotation and authoring with the ideas of case-based reasoning seem promising. Even though the evaluation of the prototype wasn't extensive, it can be concluded that Opas would be a valuable tool to librarians if taken into use. However, systematic empirical evaluations of the application are yet to be done.</p><p>Currently the book search component isn't using semantically annotated content, but instead fetches web pages and then parses the results from the HTML content. In consequence, one of the major benefits of the semantic web, disambiguation of terms (for example, "Nokia" as an enterprise and as a city), is not possible. Opas would benefit more from a system with semantically annotated content.</p><p>The utilization of case-based reasoning in Opas can be seen as somewhat shallow. The ideas of CBR and the steps of the CBR process fit well with Opas, but the details of each step could be examined more carefully. For example, a framework for similarity assessment presented in <ref type="bibr" target="#b3">[4]</ref> could be utilized for the retrieval of similar QA pairs.</p><p>A result of the evaluation was that the annotation concept suggestions weren't optimal. Sophisticated methods for ranking the suggestions and finding out which concepts really are relevant for a user query should be investigated and developed further.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Related Work</head><p>Other approaches to searching for similar questions would have been possible as well. For example, Kohonen et al. <ref type="bibr" target="#b14">[15]</ref> demonstrate how Self-Organizing Maps <ref type="bibr" target="#b13">[14]</ref> (SOM) can be used to organize a vast collection of patent abstracts and then use the SOM to check whether similar patents exist for a new patent application. A standard text search using, for example, the Java search engine Lucene<ref type="foot" target="#foot_6">12</ref> would probably also yield sufficient results when searching for similar questions. However, these methods do not take into account the semantics of the text, and we want to be able to utilize the semantic relations defined in the common upper ontology YSO.</p><p>As for semantic authoring, David Aumueller <ref type="bibr" target="#b1">[2]</ref> presents a technique for semantically authoring Wiki pages. The technique is not just for adding annotations to the pages but also for editing the text. His ideas could be applied in authoring the answers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Future Work</head><p>Currently, Opas focuses on the indexers' role in QA applications, but it will include the end-users' side, too. Here we work on questions such as: how to classify the QA pairs for semantic view-based search, how to do semantic recommending in order to show other interesting answers, and how to integrate the system with semantic content and services at other locations on the web related to the end-user's information needs. The CBR component that searches for similar questions can be used with minor modifications on the end-users' side, too.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Question text, concepts found by Poka and similar questions in Opas UI.</figDesc><graphic coords="2,79.03,39.98,451.82,292.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Specifying an annotation concept</figDesc><graphic coords="3,124.18,40.27,361.14,118.65" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: An example of an existing QA pair and its index terms</figDesc><graphic coords="4,124.21,40.24,361.22,218.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: An example of using an existing QA pair and a link from the link library in authoring an answer.</figDesc><graphic coords="5,124.20,40.37,360.95,231.33" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: A book search based on the index terms and their views that were found in the Helsinki City Library Classification System.</figDesc><graphic coords="6,79.07,40.11,451.71,249.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: An example of link library links that are found based on Poka's annotation concept suggestions.</figDesc><graphic coords="7,124.20,40.16,360.90,199.02" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://www.kirjastot.fi/tietopalvelu</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">Libraries.fi provides access to Finnish Library Net Services under one user interface, see http://www.libraries.fi.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">http://www.seco.tkk.fi/applications/opas/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_3">http://www.vesa.lib.helsinki.fi</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_4">http://hklj.kirjastot.fi/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_5">http://www.w3.org/2004/02/skos/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_6">http://lucene.apache.org</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_7">http://www.seco.tkk.fi/projects/finnonto/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>Our work is part of the National Semantic Web Ontology Project in Finland (FinnONTO)<ref type="foot" target="#foot_7">13</ref>, funded by the National Funding Agency for Technology and Innovation (Tekes) and a consortium of 36 public organizations and companies.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Case-based reasoning: foundational issues, methodological variations, and system approaches</title>
		<author>
			<persName><forename type="first">A</forename><surname>Aamodt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Plaza</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AI Communications</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="39" to="59" />
			<date type="published" when="1994">1994</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Semantic authoring and retrieval within a wiki</title>
		<author>
			<persName><forename type="first">D</forename><surname>Aumueller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Demo paper, 2nd European Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2005-08">Aug 2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Evolving GATE to meet new challenges in language engineering</title>
		<author>
			<persName><forename type="first">K</forename><surname>Bontcheva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tablan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Maynard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Cunningham</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Natural Language Engineering</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">3/4</biblScope>
			<biblScope unit="page" from="349" to="373" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Issues in Structured Knowledge Representation: A Definitional Approach with Application to Case-Based Reasoning and Medical Informatics</title>
		<author>
			<persName><forename type="first">G</forename><surname>Falkman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
		<respStmt>
			<orgName>Chalmers University of Technology, Göteborg University</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">An integrated help support for customer services over the world wide web: a case study</title>
		<author>
			<persName><forename type="first">S</forename><surname>Foo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">C</forename><surname>Hui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">C</forename><surname>Leong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers in Industry</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="129" to="145" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">The development and utilization of the case-based help-desk support system HOMER</title>
		<author>
			<persName><forename type="first">M</forename><surname>Goker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Roth-Berghofer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Engineering Applications of Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="665" to="680" />
			<date type="published" when="1999-12">Dec 1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Explaining collaborative filtering recommendations</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Herlocker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Konstan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Riedl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computer Supported Cooperative Work</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2000">2000</date>
			<biblScope unit="page" from="241" to="250" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Integrating tf-idf weighting with fuzzy view-based search</title>
		<author>
			<persName><forename type="first">M</forename><surname>Holi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hyvönen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lindgren</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ECAI Workshop on Text-Based Information Retrieval (TIR-06)</title>
				<meeting>the ECAI Workshop on Text-Based Information Retrieval (TIR-06)</meeting>
		<imprint>
			<date type="published" when="2006-08">Aug 2006</date>
		</imprint>
	</monogr>
	<note>To be published</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Semantic autocompletion</title>
		<author>
			<persName><forename type="first">E</forename><surname>Hyvönen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Mäkelä</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st Asian Semantic Web Conference (ASWC-2006)</title>
				<meeting>the 1st Asian Semantic Web Conference (ASWC-2006)<address><addrLine>Beijing</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2006">Sep 4-9, 2006</date>
		</imprint>
	</monogr>
	<note>Forthcoming</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">MuseumFinland - Finnish museums on the semantic web</title>
		<author>
			<persName><forename type="first">E</forename><surname>Hyvönen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Mäkelä</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Salminen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Valo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Viljanen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Saarela</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Junnila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kettula</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Web Semantics: Science, Services and Agents on the World Wide Web</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">2-3</biblScope>
			<biblScope unit="page" from="224" to="241" />
			<date type="published" when="2005-10">Oct 2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Finnish national ontologies for the semantic web - towards a content and service infrastructure</title>
		<author>
			<persName><forename type="first">E</forename><surname>Hyvönen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Valo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Komulainen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Seppälä</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kauppinen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ruotsalo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Salminen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ylisalmi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of International Conference on Dublin Core and Metadata Applications</title>
				<meeting>International Conference on Dublin Core and Metadata Applications<address><addrLine>DC</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2005-11">Nov 2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A self-improving helpdesk service system using case-based reasoning techniques</title>
		<author>
			<persName><forename type="first">H</forename><surname>Kai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Raman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Carlisle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cross</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers in Industry</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="113" to="125" />
			<date type="published" when="1996-09">September 1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Semi-automatic semantic annotations for web documents</title>
		<author>
			<persName><forename type="first">N</forename><surname>Kiyavitskaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zeni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Cordy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mylopoulos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">SWAP 2005, Semantic Web Applications and Perspectives, Proceedings of the 2nd Italian Semantic Web Workshop University of Trento</title>
				<meeting><address><addrLine>Trento, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2005-12">December 2005</date>
			<biblScope unit="page" from="14" to="15" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">The self-organizing map</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kohonen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the IEEE</title>
		<imprint>
			<date type="published" when="1990-09">Sep 1990</date>
			<biblScope unit="volume">78</biblScope>
			<biblScope unit="page" from="1464" to="1480" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Self organization of a massive document collection</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kohonen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kaski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lagus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Salojarvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Honkela</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Paatero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Saarela</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Neural Networks</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="574" to="585" />
			<date type="published" when="2000-05">May 2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Deriving semantic annotations of an audiovisual program from contextual texts</title>
		<author>
			<persName><forename type="first">L</forename><surname>Gazendam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Malaisé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Schreiber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Brugman</surname></persName>
		</author>
		<ptr target="http://www.cs.vu.nl/guus/papers/Gazendam06a.pdf" />
	</analytic>
	<monogr>
		<title level="m">Semantic Web Annotation of Multimedia (SWAMM&apos;06) workshop</title>
				<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Porting an English semantic tagger to the Finnish language</title>
		<author>
			<persName><forename type="first">L</forename><surname>Löfberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Archer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Piao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rayson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>McEnery</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Varantola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-P</forename><surname>Juntunen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Corpus Linguistics 2003 conference</title>
				<meeting>the Corpus Linguistics 2003 conference</meeting>
		<imprint>
			<publisher>UCREL, Lancaster University</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="457" to="464" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A non-projective dependency parser</title>
		<author>
			<persName><forename type="first">P</forename><surname>Tapanainen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Järvinen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Conference on Applied Natural Language Processing</title>
				<meeting>the 5th Conference on Applied Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="1997">1997</date>
			<biblScope unit="page" from="64" to="71" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A method for converting thesauri to RDF/OWL</title>
		<author>
			<persName><forename type="first">M</forename><surname>Van Assem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Menken</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Schreiber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wielemaker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wielinga</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Third International Semantic Web Conference ISWC 2004</title>
				<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="volume">3298</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Combining case-based reasoning and semantic indexing in a question-answer service</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vehviläinen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Alm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hyvönen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">1st Asian Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2006-06-20">June 20, 2006</date>
		</imprint>
	</monogr>
	<note>Poster paper</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
