<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">The development of a schema for the annotation of terms in the BioCaster disease detecting/tracking system</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Ai</forename><surname>Kawazoe</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National Institute of Informatics</orgName>
								<address>
									<addrLine>Hitotsubashi 2-1-2 Chiyoda-ku</addrLine>
									<settlement>Tokyo</settlement>
									<country key="JP">JAPAN</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><roleName>Ph.D</roleName><forename type="first">Lihua</forename><surname>Jin</surname></persName>
							<email>lihua-jin@nii.ac.jp</email>
							<affiliation key="aff0">
								<orgName type="institution">National Institute of Informatics</orgName>
								<address>
									<addrLine>Hitotsubashi 2-1-2 Chiyoda-ku</addrLine>
									<settlement>Tokyo</settlement>
									<country key="JP">JAPAN</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><roleName>Ph.D</roleName><forename type="first">Mika</forename><surname>Shigematsu</surname></persName>
							<email>mikas@nih.go.jp</email>
							<affiliation key="aff2">
								<orgName type="institution">National Institute of Infectious Diseases</orgName>
								<address>
									<addrLine>Toyama 1-23-1 Shinjuku-ku</addrLine>
									<settlement>Tokyo</settlement>
									<country key="JP">JAPAN</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><roleName>M.D</roleName><forename type="first">Roberto</forename><surname>Barrero</surname></persName>
							<email>rbarrero@genes.nig.ac.jp</email>
							<affiliation key="aff1">
								<orgName type="institution">National Institute of Genetics</orgName>
								<address>
									<addrLine>Yata 1111 Mishima</addrLine>
									<settlement>Shizuoka</settlement>
									<country key="JP">JAPAN</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><roleName>Ph.D</roleName><forename type="first">Kiyosu</forename><surname>Taniguchi</surname></persName>
							<affiliation key="aff2">
								<orgName type="institution">National Institute of Infectious Diseases</orgName>
								<address>
									<addrLine>Toyama 1-23-1 Shinjuku-ku</addrLine>
									<settlement>Tokyo</settlement>
									<country key="JP">JAPAN</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><roleName>M.D</roleName><forename type="first">Nigel</forename><surname>Collier</surname></persName>
							<email>collier@nii.ac.jp</email>
							<affiliation key="aff0">
								<orgName type="institution">National Institute of Informatics</orgName>
								<address>
									<addrLine>Hitotsubashi 2-1-2 Chiyoda-ku</addrLine>
									<settlement>Tokyo</settlement>
									<country key="JP">JAPAN</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">The development of a schema for the annotation of terms in the BioCaster disease detecting/tracking system</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">0B5A3871B0EE3B6D1817EDD21ED0D791</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-19T15:56+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Amid growing public concern about the spread of infectious diseases such as avian influenza and SARS, there is an increasing need for collecting timely and reliable information about disease outbreaks from natural language data such as online news articles. In this paper we introduce BioCaster, a text mining-based system for infectious disease detection and tracking currently being developed, and discuss the development of a domain ontology and schema for the annotation of terms. In particular we focus on the comparison between two approaches, 1) a traditional task-oriented approach with a simple schema that does not strictly follow ontological principles, and 2) a formal approach which is ontologically well-founded but adds extra requirements to the annotation schema. We report on several critical problems that were highlighted by an entity annotation experiment, attributable to the purely task-oriented ontology design. A second experiment based on a formally constructed ontology produced improved annotation results despite the apparent complexity of the annotation schema.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>As shown by the recent outbreak of Severe Acute Respiratory Syndrome (SARS) and emerging cases of avian influenza, infectious diseases have the potential to spread rapidly through person-to-person transmission within densely populated areas and across country borders through international air travel. The first line of defense against rapidly spreading diseases is surveillance, led by the World Health Organization (WHO) and national health authorities. Catching an outbreak earlier has clear implications for both morbidity and mortality as well as the feasibility of containment <ref type="bibr" target="#b0">[1]</ref>. However, a lack of surveillance system infrastructure in Southeast Asia, which is currently the focus of an avian H5N1 epidemic, is seen as hindering control efforts. In addition to traditional surrogate methods such as reporting notifiable diseases and over-the-counter (OTC) sales monitoring, public health experts are increasingly considering news and other reports available on the World Wide Web (Web) as a cost-effective means of helping to find and track early cluster cases, enabling a timely and appropriate response. 
Such rumour-based information may be of particular value for assessing possible outbreaks in areas where formal reporting procedures are absent or not well established.</p><p>Several major challenges exist in locating Web-based information in a timely manner using traditional search methods:</p><p>(1) the massively increasing volume of dynamically changing unstructured news data available on the Web makes it extremely difficult to obtain a clear picture of an outbreak in a timely manner, (2) the large-scale republication of reports from centralized news agencies requires redundancy to be identified and removed, (3) the initial reports of an outbreak are contained in only a few news articles, which are usually overlooked by traditional search engines that use keyword indexing, (4) the first reports of an infectious disease often appear in local news media which are only available in the local language. Experience has shown that this requires computer systems to have at least a partial understanding of the domain through ontologies, term lists and databases as well as specialized multilingual resources. To address the information needs in the domain of infectious disease outbreaks, standard Information Extraction technology has been adapted for retrospective archive search <ref type="bibr" target="#b1">[2]</ref>, but only a few systems are currently actively deployed, the most prominent being the Global Public Health Intelligence Network (GPHIN) <ref type="bibr" target="#b2">[3]</ref>, a successful but semi-closed system used by the WHO. We are now developing BioCaster, a text mining system based on an openly available multilingual ontology for proactive notification about priority disease outbreaks. A key component of the BioCaster system is the use of automated learning methods to identify novel entities and events using features derived from annotated examples in a multilingual collection of news articles. 
The initial target languages are English, Japanese, Vietnamese and Thai.</p><p>In our early development of BioCaster it became clear that we needed a rigorous schema for markable entities. Since the system relies on high quality human annotated training data for constructing named entity recognizers (NERs), any inconsistency introduced into the annotation schema by ontological inconsistencies should be harmful for annotation performance, both human and machine. Surprisingly while there have been several studies on the mapping problem between terms and coding systems such as the UMLS Metathesaurus <ref type="bibr" target="#b3">[4]</ref> as well as biomedical annotation experiments <ref type="bibr" target="#b4">[5]</ref> [6] <ref type="bibr" target="#b6">[7]</ref> there have been to the best of our knowledge no studies conducted into the method by which new domain models suitable for biomedical text mining should be organized. We report here on our initial experience which showed that the task-oriented annotation schema based on a poorly-considered domain ontology can indeed be harmful to accuracy. Re-organizing this schema using well founded ontological principles produced better results, despite the added complexity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">USER NEEDS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Epidemiologists</head><p>are concerned with the circumstances in which diseases occur in a population and the factors that influence their incidence, spread, recognition and control. Our initial discussions with domain experts at the National Institute of Infectious Diseases revealed several common scenarios for gathering information from Web news, including cases involving the spread of a communicable disease across international borders and the contamination of blood products. From these initial discussions we collected examples of early outbreak news reports and compiled a list of significant entity classes which included DISEASE <ref type="foot" target="#foot_0">1</ref> , CASE, LOCATION, SYMPTOM, TIME, DRUG, etc.</p><p>Subsequent follow-up discussions and examination of the literature revealed that we can categorize these concepts according to the information needs of the scientists as shown in Table <ref type="table" target="#tab_0">1</ref>.</p><p>Genetic epidemiology adds another dimension to the information needs, as the genetic makeup of the host plays a key role in determining susceptibility or resistance to pathogens. We therefore chose to add a further level of detail about the host which includes genes and their products, identified with a §. This gave us 19 categories of concepts that we want to identify in news texts (Table <ref type="table">2</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">CONSIDERATION ON TWO APPROACHES</head><p>At this stage we were aware that some of the important concepts in Table <ref type="table">2</ref> are contextually dependent and intrinsically different from others. For example, CASE and TRANSMISSION represent roles (discussed in <ref type="bibr" target="#b7">[8]</ref>  <ref type="bibr" target="#b8">[9]</ref> [10] <ref type="bibr" target="#b10">[11]</ref> among others) which are dependent on the existence of events they participate in, while most others, such as PERSON, BACTERIA, and NON_HUMAN, represent types.</p><p>We had two options for constructing the ontology and annotation schema, depending on how we deal with concepts of a different nature. The first approach is rather task-oriented. Here we do not make any distinction between context-dependent concepts and others. This results in a somewhat simpler ontology: all categories of concepts are represented as classes which follow the disjoint entity class principle that has been the underlying premise of NERs. The corresponding annotation schema will also be simpler, since instances of context-dependent classes are annotated in the same way as those of other classes, e.g. &lt;NAME cl="PERSON"&gt;Kofi Annan&lt;/NAME&gt; &lt;NAME cl="CASE"&gt;a 12 year-old girl&lt;/NAME&gt; infected with H5N1</p><p>(The details of this schema will be given in the next section.) In this task-oriented approach, we can annotate exactly what the event frame needs to identify.</p><p>For example, we can exclude from annotation non-named, non-case mentions, which we are not interested in. A defect of this approach is that it is not ontologically well-founded.</p><p>The alternative approach is a more formal one where we make a clear distinction between context-dependent concepts and others, based on well-founded ontological principles. 
The result is likely to be a more complex ontology in which context-dependent concepts have a different status from other concepts. The corresponding annotation schema will also be more complex, since instances of roles are annotated differently from those of entity classes. In order to achieve ontological consistency we also need to annotate more mentions than in the former approach, including those that will not instantiate event frames. Of the two approaches, we chose the former for the first annotation experiment out of expediency: it seemed easier for annotators, and we could find almost no precedent in named entity annotation that dealt with the formal analysis of entities and role concepts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">ANNOTATION EXPERIMENT 1 4.1 Method</head><p>Based on the list of categories of concepts in Table <ref type="table">2</ref>, we constructed the ontology shown in Figure <ref type="figure">1</ref>  <ref type="bibr" target="#b11">[12]</ref>, GENIA ontology <ref type="bibr" target="#b12">[13]</ref>, MUC-7 <ref type="bibr" target="#b13">[14]</ref>, and HUB-4 <ref type="bibr" target="#b14">[15]</ref>, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2 List of classes of markable concepts</head><p>In the annotation schema used in the example above, the attribute cl takes the entity class label as its value. For example "&lt;NAME cl="PERSON"&gt;Kofi Annan&lt;/NAME&gt;" means that the entity mentioned by "Kofi Annan" is related to the class PERSON. The reason for using this rather vague expression is to cover two relations between mentioned entities and the ontology we want to describe. The first is "is an instance of", and the other is "is a subclass of". Some of the markable texts mention a particular and others mention a universal. For example, names of persons, locations and organizations are usually used to refer to particulars, whereas names of chemical substances, viruses and proteins are often used to refer to universals. This is one of the factors which make ontology-based annotation a complicated process. It should be noted, though, that we intend to work towards a clear distinction between the two relations in future work.</p></div>
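<div xmlns="http://www.tei-c.org/ns/1.0"><p>The inline schema described above can be read mechanically. The following is a minimal sketch, not part of BioCaster itself: it assumes that an annotated sentence becomes well-formed XML once wrapped in a dummy root element, and the function name extract_mentions is hypothetical.</p><p><code lang="python"><![CDATA[
```python
import xml.etree.ElementTree as ET


def extract_mentions(annotated_text):
    """Return (class label, mention text) pairs from inline NAME annotations."""
    # Assumption: wrapping the fragment in a dummy root makes it parseable XML.
    root = ET.fromstring("<s>" + annotated_text + "</s>")
    return [(el.get("cl"), (el.text or "").strip()) for el in root.iter("NAME")]


# The two example annotations quoted in the paper.
examples = [
    '<NAME cl="PERSON">Kofi Annan</NAME>',
    '<NAME cl="CASE">a 12 year-old girl</NAME> infected with H5N1',
]
mentions = [pair for text in examples for pair in extract_mentions(text)]
for cl, text in mentions:
    print(cl, "|", text)
```
]]></code></p><p>Each pair records only that the mention is related to the class; as noted above, the schema does not yet say whether that relation is "is an instance of" or "is a subclass of".</p></div>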
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Annotation results and problems Figure 1 Initial domain ontology (simplified)</head><p>During the first annotation experiment, we received many problem reports from annotators, and found a significant number of inconsistencies in the annotation results. Most of the problems could be traced back to poor design of the domain ontology and the annotation schema. Follow-up analysis of the corpus yielded the following symptoms of error:</p><p>roles, have the same status as other classes since we adopted the task-oriented approach as discussed in the last section. We developed annotation guidelines to annotate non-overlapping mentions related to the classes in news articles and hired two PhD informatics students as annotators. After one week of training consisting of guideline review, case study discussions and test cases, we started the annotation process with 200 news articles taken from domain sources, including WHO epidemic reports, IRIN, and Reuters news.</p><p>Gaps in the annotation schema, shown by mentions of entities that it is desirable to annotate but that the annotation schema does not cover.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>• • •</head><p>In order to restrict the markable mentions to exactly those that we aimed to identify with the text mining system, we defined CASE as the class of confirmed cases which are unnamed, and PERSON as the class of named persons who are not cases. We considered that this would reduce the number of markable mentions, since unnamed mentions of non-cases need not be annotated. We also instructed annotators to mark up only the single most appropriate class, and prohibited multiple classes. An example of annotated text is shown below:</p><p>Ambiguity between context-dependent concepts and context-independent ones.</p><p>Idiosyncratic annotations which are forced on annotators due to the disjoint entity class principle.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Gaps in the annotation schema</head><p>At the initial stage of our analysis we considered that distinguishing CASE (as confirmed cases of a disease which are unnamed humans) from PERSON (named persons who are not cases of a disease) was rather natural, since CASE entities are in general anonymous. However, in the news articles there were some examples where cases were mentioned by name as follows:</p><p>The &lt;NAME cl="ORGANIZATION"&gt;Ministry of Health&lt;/NAME&gt; in &lt;NAME cl="LOCATION"&gt;Indonesia&lt;/NAME&gt; has today confirmed &lt;NAME cl="CASE"&gt;a fatal human case&lt;/NAME&gt; of &lt;NAME cl="DISEASE"&gt;H5N1 avian influenza&lt;/NAME&gt;. &lt;NAME cl="CASE"&gt;A 27-year-old woman&lt;/NAME&gt; from &lt;NAME cl="LOCATION"&gt;Jakarta&lt;/NAME&gt; developed symptoms on &lt;NAME cl="TIME"&gt;17 September&lt;/NAME&gt;. She contracted the virus from close contact with infected &lt;NAME cl="TRANSMISSION"&gt;birds&lt;/NAME&gt;.</p><p>E1 Tests carried out in a UK laboratory confirmed that M.A and F died from the H5N1 strain<ref type="foot" target="#foot_1">2</ref> </p><p>In addition, we found that there were more frequent mentions of putative cases than we had expected.</p><p>These mentions were often annotated as CASE by annotators although we restricted the scope of this class only to confirmed cases.</p><p>E2 a Taiwanese is suspected to have died of SARS</p><p>Follow-up discussions with public health experts revealed that mentions of putative cases are important, especially in the early stages of disease outbreaks, and we concluded that they should be identified by the system. However, the existing framework made them difficult to capture.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ambiguity caused by context-dependent concepts</head><p>One of the classes which confused annotators most was TRANSMISSION (source of infection). Below are typical examples of problematic cases.</p><p>E3 Victims contract the virus from close contact with infected birds</p><p>E4 There is no known cure for Ebola, which is transmitted via infected body fluids</p><p>E5 An Irish woman infected with Hepatitis C by a contaminated blood product</p><p>E6 18 hospitalized after consuming chapattis</p><p>Annotators had difficulty annotating 'birds' in E3, since it can be classified as both TRANSMISSION and NON_HUMAN (animals). 'Body fluids' in E4 is likewise ambiguous between TRANSMISSION and ANATOMY (body parts), and 'blood product' in E5 is ambiguous between TRANSMISSION and PRODUCT (biological product). Most of the TRANSMISSION instances found in the text could also be categorized as NON_HUMAN, and the cases which belonged only to TRANSMISSION, such as 'chapattis' in E6, were very few.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Idiosyncratic annotations due to the disjoint entity class principle</head><p>E7 &lt;NAME cl="PERSON"&gt;Hudd&lt;/NAME&gt; has written several books on music hall and Variety...</p><p>E8 Doctors later diagnosed &lt;NAME cl="CASE"&gt;Hudd&lt;/NAME&gt; with a chest infection...</p><p>In the examples above, it is clearly undesirable that the same entity is related to PERSON in E7 and to CASE in E8. Although the annotator was aware of both options, the principle of disjoint classes forced a single choice.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Empirical results from training an NER</head><p>We trained a support vector machine <ref type="bibr" target="#b12">[13]</ref> (for details, see Takeuchi and Collier <ref type="bibr" target="#b13">[14]</ref>) for named entity recognition based on the annotated corpus of 200 news articles. 10-fold cross validation experiments were performed using TinySVM<ref type="foot" target="#foot_2">3</ref> . A -2/+1 feature window was used that included surface word, orthography, biomedical prefixes/suffixes, lemma, head noun and previous class predictions. The F-score for all classes in Table <ref type="table">2</ref> was 76.96. The problematic classes were PERSON, CASE and NON_HUMAN (many instances of which were ambiguous with TRANSMISSION), whose F-scores were below our expectations: PERSON (54.95), CASE (53.17), NON_HUMAN (68.0).</p></div>
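<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the -2/+1 feature window concrete, the sketch below builds features for one token from the two preceding tokens, the token itself and the one following token. It is an illustration under stated assumptions, not the feature extractor used in the experiment: only surface word, a coarse orthography class and previous class predictions are shown, and the names orthography, window_features and PAD, as well as the orthography categories, are hypothetical.</p><p><code lang="python"><![CDATA[
```python
PAD = "PAD"  # hypothetical placeholder for positions outside the sentence


def orthography(token):
    """Coarse orthographic class of a token (illustrative categories only)."""
    if token.isdigit():
        return "digits"
    if token.isupper():
        return "all_caps"
    if token[:1].isupper():
        return "init_cap"
    if any(ch.isdigit() for ch in token):
        return "alnum"
    return "lower"


def window_features(tokens, i, prev_labels, before=2, after=1):
    """Features for tokens[i] over offsets -before..+after.

    prev_labels holds the classes already predicted for tokens[0:i].
    """
    feats = {}
    for off in range(-before, after + 1):
        j = i + off
        inside = 0 <= j < len(tokens)
        feats[f"w[{off}]"] = tokens[j].lower() if inside else PAD
        feats[f"o[{off}]"] = orthography(tokens[j]) if inside else PAD
        if off < 0:
            # Previous class predictions exist only to the left of the token.
            feats[f"c[{off}]"] = prev_labels[j] if inside else PAD
    return feats


tokens = ["infected", "with", "H5N1", "avian", "influenza"]
feats = window_features(tokens, 2, ["O", "O"])
```
]]></code></p><p>In a full system, features such as biomedical prefixes/suffixes, lemma and head noun would be added to the same dictionary before it is passed to the classifier.</p></div>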
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">ANNOTATION EXPERIMENT 2</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Re-examination of the approach</head><p>Although we chose the task-oriented approach for its simplicity and ease of implementation, the results from automatic NER and subsequent corpus analysis revealed that problems arose because we made no clear distinction between context-dependent and context-independent classes. We decided to take the alternative, formal and linguistically sound approach, and to distinguish context-dependent concepts from others in both the ontology and the annotation schema.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Classification of concepts</head><p>The first step was to use the classification method proposed by <ref type="bibr">Guarino and Welty ([9]</ref> and <ref type="bibr" target="#b9">[10]</ref>) which is based on meta-properties (rigidity, identity, dependency), in order to classify categories of concepts in Table <ref type="table">2</ref> We consider that this class is anti-rigid, since it is possible that an action which is an instance of CONTROL in the current world is not an instance of CONTROL in some other accessible world. The same action may be conducted for different purposes in different worlds. *2 This class includes events. In DOLCE top level categories (Gangemi et al. <ref type="bibr" target="#b18">[19]</ref>), Events are under the class of Perdurant/Occurrence. It seems to be controversial what the identity condition for events should be. <ref type="bibr">Davidson [20]</ref> proposes a condition such that "events are identical if and only if they have exactly the same causes and effects". In any case it should be reasonable to assume that this class itself does not supply ICs but inherits them from the upper level classes. *3 What we consider ICs for this class is as follows: Two instances of diseases are identical iff the two are experienced by the same host at the same time, are caused by the same agent (e.g. H5N1 virus for "H5N1 avian influenza") and have the same set of characteristic alterations/symptoms (e.g. inflammation of the lung for "pneumonia").</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3: Classification of concepts</head><p>necessary IC: E(x, t) ∧ φ(x, t) ∧ E(y, t') ∧ φ(y, t') ∧ x=y → Γ(x, y, t, t')</p><p>sufficient IC: E(x, t) ∧ φ(x, t) ∧ E(y, t') ∧ φ(y, t') ∧ Γ(x, y, t, t') → x=y</p><p>(E: "actually exists at time t")</p><p>Any property φ carries an IC (+I) iff it is subsumed by a property supplying that IC.</p><p>A property φ supplies an IC (+O) iff i) it is rigid; ii) there is a necessary or sufficient IC for it; and iii) the same IC is not carried by all the properties subsuming φ.</p><p>&lt;Dependency&gt; ([10], p.7) externally dependent property φ (+D): ∀x□(φ(x) → ∃y ω(y) ∧ ￢P(y, x) ∧ ￢C(y, x)) (P: "is a part of") (C: "is a constituent of")</p><p>Classification results are shown in Table <ref type="table">3</ref>. Most concepts such as ANATOMY, NON_HUMAN, and PERSON are classified as Type, whereas the concepts which were problematic in the first experiment were classified as Role: TRANSMISSION (Formal Role) and CASE (Material Role).</p><p>According to the further classification of non-rigid concepts by Kaneiwa and Mizoguchi <ref type="bibr" target="#b17">[18]</ref>, these cases are classified as time-dependent concepts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">Modification of the schema</head><p>For some of the roles in Table <ref type="table">3</ref>, we modified their status in the annotation schema.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CASE</head><p>CASE and PERSON were problematic since we distinguished them according to the form of expression (unnamed/named), in addition to the case/non-case distinction. In order to cover the mentions which could not be annotated in the first experiment, we extended the scope of the PERSON class to include person instances in general, and eliminated the unnamed/named and case/non-case distinctions. We modified the annotation schema so that CASE is no longer a value of the cl attribute, but is instead expressed by a case attribute which applies to the referred instance of PERSON. This attribute takes the value true when the mentioned instance is a confirmed case of disease, false when the instance is not a case, and putative when the instance is a suspected case. Named case mentions and suspected case mentions are annotated as follows:</p><p>E9 Tests carried out in a UK laboratory confirmed that &lt;NAME cl="PERSON" case="true"&gt;M.A&lt;/NAME&gt;...</p><p>E10 &lt;NAME cl="PERSON" case="putative"&gt;a Taiwanese&lt;/NAME&gt; is suspected to have died of SARS</p><p>The meaning of the case attribute-value pairs can be described in logical notation and natural language as follows:</p><p>&lt;...cl="PERSON" case="true"&gt;John&lt;/...&gt;: case(j) "It is true that the person j mentioned by "John" is an instance of the CASE class"</p><p>&lt;...cl="PERSON" case="false"&gt;John&lt;/...&gt;: ￢case(j) "It is false that the person j mentioned by "John" is an instance of the CASE class"</p><p>&lt;...cl="PERSON" case="putative"&gt;John&lt;/...&gt;: ◇case(j) "It is possible that the person j mentioned by "John" is an instance of the CASE class"</p><p>As shown above, the values of the case attribute correspond to logical operators such as ￢ and ◇.</p><p>The values of the case attribute specify the modes of linkage between the referred concept and the CASE class. 
The formal basis we had in mind when formulating the case attribute is as follows: 1) every instance of a non-rigid class must be an instance of some rigid class, 2) the relations between a non-rigid class and its instances are often modified by modal/temporal operators. The first point drove us to create a case attribute which applies to instances of some rigid class, here PERSON. The second point motivated us to let its values include negative and modal operators. This schema can be extended if we allow a wider value range for the case attribute to include other modal/temporal operators, although currently we restrict the values to the three above.</p><p>It is worth noting that the revised schema involves a trade-off with the former schema: we have increased the number of markable entities, since we need to annotate unnamed, non-case mentions which are not directly related to the purpose of the system.</p></div>
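<div xmlns="http://www.tei-c.org/ns/1.0"><p>The correspondence between case values and logical operators can be sketched as a small lookup. This is only an illustration under stated assumptions: the annotated sentence is assumed to parse as XML once wrapped in a dummy root, the names CASE_OPERATOR and case_reading are hypothetical, and the standard Unicode negation and diamond characters are used for the operators.</p><p><code lang="python"><![CDATA[
```python
import xml.etree.ElementTree as ET

# Operator prefixed to case(j) for each attribute value described in the text.
CASE_OPERATOR = {
    "true": "",             # case(j)
    "false": "\u00ac",      # negation: it is false that j is a case
    "putative": "\u25c7",   # possibility: j is a suspected case
}


def case_reading(annotated_text, var="j"):
    """Return (mention text, logical reading) for NAME elements carrying case."""
    # Assumption: a dummy root element makes the fragment well-formed XML.
    root = ET.fromstring("<s>" + annotated_text + "</s>")
    readings = []
    for el in root.iter("NAME"):
        value = el.get("case")
        if value is None:
            continue
        if el.get("cl") != "PERSON":
            raise ValueError("the case attribute applies only to PERSON mentions")
        if value not in CASE_OPERATOR:
            raise ValueError("unsupported case value: " + value)
        readings.append((el.text, CASE_OPERATOR[value] + "case(" + var + ")"))
    return readings


e10 = '<NAME cl="PERSON" case="putative">a Taiwanese</NAME> is suspected to have died of SARS'
readings = case_reading(e10)
```
]]></code></p><p>Extending the schema with further modal or temporal operators, as suggested above, would amount to adding entries to the lookup table in this sketch.</p></div>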
<div xmlns="http://www.tei-c.org/ns/1.0"><head>TRANSMISSION</head><p>We defined the transmission attribute, which applies to mentions of the ANATOMY, PRODUCT, PERSON and NON_HUMAN classes. As shown in the following example, 'birds' is always related to NON_HUMAN, and the attribute takes the value 'true' only when the birds are mentioned as a source of infection. It can also take the value 'putative' to cover mentions of possible sources of infection.</p><p>E11 Victims contract the virus from close contact with infected &lt;NAME cl="NON_HUMAN" transmission="true"&gt;birds&lt;/NAME&gt;</p><p>T_CHEMICAL / NT_CHEMICAL</p><p>Concept classification revealed that T_CHEMICAL and NT_CHEMICAL have "the situation dependency obtained from extending types" discussed in <ref type="bibr" target="#b17">[18]</ref> and have the same status as 'weapon' and 'table'. T_CHEMICAL includes chemicals mentioned as drugs in any context and those regarded as drugs in some context. Here we removed the two classes and made the parent node CHEMICAL a class for annotation.</p><p>We then defined the therapeutic attribute, which applies to mentions of CHEMICAL and takes the value true when the entity is intended for therapeutic use and false otherwise.</p><p>The revised ontology resulting from the modifications above is shown in Figure <ref type="figure">2</ref>. We also added the new classes CONDITION (status of patients: 'hospitalized', 'died', 'in critical condition', etc.) and OUTBREAK (collective disease incident: 'outbreak', 'pandemic', etc.). Information about CONDITION is important for experts to know the rate of hospitalization and death and to determine the alert level. Mentions of OUTBREAK include expressions which are specific to disease outbreak news, increasing the specificity of our detection system. 
We located PERSON and NON_HUMAN under metazoa, and added a number attribute (which takes one or many as its value) to be applied to PERSON instances.</p><p>With insights from the revised ontology we also changed the annotation method by dividing the process into two distinct stages as shown in Figure <ref type="figure">3:</ref> 1) annotation of mentions to non-role (rigid) concepts and 2) annotation of role (non-rigid) concepts.  </p></div>
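<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two annotation stages and the role attributes described above can be sketched as a simple data model in which stage 1 assigns a rigid class and stage 2 attaches role attributes, checked against the classes each role applies to. This is a hedged illustration: the names Mention, add_role, ROLE_DOMAINS and ROLE_VALUES are hypothetical, and allowing the value false for transmission is an assumption, since the text names only true and putative for that attribute.</p><p><code lang="python"><![CDATA[
```python
from dataclasses import dataclass, field

# Classes each role attribute applies to, as stated in the text.
ROLE_DOMAINS = {
    "case": {"PERSON"},
    "transmission": {"ANATOMY", "PRODUCT", "PERSON", "NON_HUMAN"},
    "therapeutic": {"CHEMICAL"},
}

# Permitted values; "false" for transmission is an assumption of this sketch.
ROLE_VALUES = {
    "case": {"true", "false", "putative"},
    "transmission": {"true", "false", "putative"},
    "therapeutic": {"true", "false"},
}


@dataclass
class Mention:
    text: str
    cl: str                                     # stage 1: rigid class
    roles: dict = field(default_factory=dict)   # stage 2: role attributes


def add_role(mention, role, value):
    """Stage 2: attach a role attribute after checking class and value."""
    if mention.cl not in ROLE_DOMAINS[role]:
        raise ValueError(role + " does not apply to class " + mention.cl)
    if value not in ROLE_VALUES[role]:
        raise ValueError("invalid value for " + role + ": " + value)
    mention.roles[role] = value
    return mention


birds = Mention("birds", "NON_HUMAN")       # stage 1
add_role(birds, "transmission", "true")     # stage 2, as in example E11
```
]]></code></p><p>Keeping the class fixed and recording roles separately mirrors the design choice in the revised schema: the same mention of 'birds' stays NON_HUMAN whether or not it is a source of infection.</p></div>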
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.4">Results of annotation and NE recognizer training</head><p>We asked three PhD students to annotate a further 300 news articles. This time we used the revised annotation method, stages 1 and 2 shown in Figure <ref type="figure">3</ref>.</p><p>As a result of distinguishing Role concepts (case, transmission, therapeutic) from others in the annotation schema, problem reports on these classes were reduced, and the annotation results also improved.</p><p>Contrary to our expectations, the complexity of the new annotation schema and the increased number of markable mentions seemed to have no negative influence on the annotators' speed.</p><p>The improvement can be seen empirically in the NER results. We re-annotated the corpus used in the first experiment using the revised annotation schema. This time the F-score for all classes rose to 79.96 (+3 compared to the previous result).</p><p>In particular, significant increases in F-score were observed for PERSON (66.28; +11.33 compared to the previous result), case mentions among PERSON (65.63; +12.46), and NON_HUMAN (73.21; +5.21).</p></div>
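<div xmlns="http://www.tei-c.org/ns/1.0"><p>The reported gains follow directly from the F-scores of the two experiments, as the short check below shows. The dictionary keys are labels chosen for this sketch; in particular, CASE in the second experiment refers to case mentions among PERSON rather than to a separate class.</p><p><code lang="python"><![CDATA[
```python
# F-scores reported for experiment 1 and for the re-annotated corpus.
before = {"all": 76.96, "PERSON": 54.95, "CASE": 53.17, "NON_HUMAN": 68.0}
after = {"all": 79.96, "PERSON": 66.28, "CASE": 65.63, "NON_HUMAN": 73.21}

# Absolute gain in F-score points, rounded to two decimals.
gains = {name: round(after[name] - before[name], 2) for name in before}

for name, gain in gains.items():
    print(f"{name}: {before[name]:.2f} to {after[name]:.2f} (+{gain:.2f})")
```
]]></code></p></div>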
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.5">Remaining issues</head><p>Some of the problems reported in this second experiment were related to context dependency (anti-rigidity, situation dependency) discussed in Section 6.2.</p><p>The most difficult class seemed to be CONTROL (control measures to lower the risk of diseases). As shown in Table <ref type="table">3</ref>, we consider this class to be non-rigid as well. It includes mentions which refer to subclasses of the CONTROL class regardless of situation ("quarantine", "vaccination"), and others which can be a control measure depending on the situation ("warning", "blockade"). This characteristic seems to cause the difficulty.</p><p>So far we have resolved the complexity of non-rigid concepts by defining attributes which apply to instances of rigid classes (e.g. the case attribute for the class PERSON). This strategy, however, does not seem to be effective for CONTROL, since it is not easy to identify a rigid superclass for CONTROL which can be realistically annotated in the text. For example, EVENT can be considered a rigid class subsuming CONTROL, but it is currently not realistic to manually annotate every mention of an event. We are currently seeking a way to deal with this problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">CONCLUSION</head><p>The study in this paper was motivated by our need for a high-quality annotation schema to support detection of novel entities in the infectious disease outbreak domain. We discussed two experiments based on alternative approaches for constructing an ontology-based annotation schema. The amount of data in our study is relatively small, but the empirical results support our view that adopting well-founded ontological principles has a positive effect compared with an ad hoc task-based approach. Although this study is not a formal evaluation of ontologies, it is still an evaluation from the viewpoint of ontology application to the task of natural language annotation. The classification method of Guarino and Welty ([9], <ref type="bibr" target="#b9">[10]</ref>), which was originally proposed to achieve consistency in the configurational structure of ontologies, was adapted and found to be useful for improving annotation performance.</p><p>An alternative possibility, which we have not addressed in this paper, is to reformulate the traditional NER task to allow for overlapping (nested) and multi-class entities. This, however, introduces significant additional complications in both the recognizer models and the annotation schema, so we have adopted a less radical formulation in this work.</p><p>As the next step in this study, we are now extending our simple taxonomy to a multilingual ontology, enriching the current taxonomic structure with domain-sensitive relations. The resulting ontology will be freely available for re-use. At the initial stage we are focusing on English, Japanese, Vietnamese, Thai, Chinese (standard) and Korean. 
We hope to add other Asia-Pacific languages in the future.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 Figure 3</head><label>23</label><figDesc>Figure 2 Current ontology (simplified)</figDesc><graphic coords="8,72.30,77.94,215.04,232.68" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Categorization of concepts</figDesc><table><row><cell>Focus</cell><cell>Description</cell><cell>Example properties</cell><cell>Concept types</cell></row><row><cell>Agent</cell><cell>Pathogens</cell><cell>Infectivity, pathogenicity, virulence, incubation period, communicability</cell><cell>VIRUS, BACTERIA, PARASITE*, FUNGI*</cell></row><row><cell>Transmission</cell><cell>The delivery or dispersal method</cell><cell>Dermal, oral, respiratory</cell><cell>TRANSMISSION</cell></row><row><cell>Host</cell><cell>Persons carrying a disease</cell><cell>Age, gender, occupation,</cell><cell>CASE, SYMPTOM, DISEASE, ANATOMY, DNA§, RNA§, PROTEIN§</cell></row><row><cell>Environment</cell><cell>Location and climate</cell><cell>Large population centre, enclosed building, mass transport system, rural village</cell><cell>LOCATION, TIME</cell></row><row><cell cols="2">* Not included in the current schema</cell><cell></cell><cell></cell></row><row><cell cols="2">§ Genetic level entities</cell><cell></cell><cell></cell></row></table><note><table><row><cell>Classes</cell><cell>Examples</cell><cell>Description</cell></row><row><cell>ANATOMY</cell><cell>liver, pancreas, nervous system, HeLa cell</cell><cell>Body parts including tissues and cells</cell></row><row><cell>BACTERIA</cell><cell>Escherichia coli O157, tubercle bacillus</cell><cell>Eubacteria</cell></row><row><cell>CASE</cell><cell>a 35-year-old woman, the third case</cell><cell>Confirmed cases of diseases</cell></row><row><cell>NT_CHEMICAL</cell><cell>beryllium, organophosphate pesticide</cell><cell>Chemicals intended for non-therapeutic purposes *1</cell></row><row><cell>T_CHEMICAL</cell><cell>Relenza, immunosuppressive drug, oseltamivir</cell><cell>Chemicals intended for the treatment of diseases *1</cell></row><row><cell>CONTROL</cell><cell>stamping out, screening, vaccination</cell><cell>Control measures to lower the risk of transmission of a disease</cell></row><row><cell>DISEASE</cell><cell>H5N1 avian influenza, SARS, cholera</cell><cell>A deviation in the normal functioning of the host caused by a persistent agent (pathogen) or some environmental factor</cell></row><row><cell>DNA</cell><cell>Sp1 site, triple-A, c-jun gene</cell><cell>Includes the names of DNAs, groups, families, molecules, domains and regions *2</cell></row><row><cell>LOCATION</cell><cell>Viet Nam, Jakarta, Sumatra Island, Asia</cell><cell>A politically or geographically defined location *3</cell></row><row><cell>NON_HUMAN</cell><cell>civet cats, poultry, flies</cell><cell>Multi-cell organism other than humans, i.e. "animals"</cell></row><row><cell>ORGANIZATION</cell><cell>the Ministry of Health, WHO, Pasteur Institute</cell><cell>Corporate, governmental, or other organizational entity *3</cell></row><row><cell>PERSON</cell><cell>Jean Chretien, Murray McQuigge</cell><cell>A named person or family</cell></row><row><cell>PRODUCT</cell><cell>botulism antitoxin, Influenza vaccine</cell><cell>Biological product (e.g. vaccines, immune sera)</cell></row><row><cell>PROTEIN</cell><cell>STAT, RNA polymerase II alpha subunit</cell><cell>Includes the names of proteins, groups, families, molecules, complexes and substructures *2</cell></row><row><cell>RNA</cell><cell>IL-2R alpha transcripts, TNF mRNA</cell><cell>Includes the names of RNAs, groups, families,</cell></row></table></note></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head></head><label></label><figDesc>Definitions of the metaproperties we used are as follows:</figDesc><table><row><cell></cell><cell>rigidity</cell><cell>identity (supplying)</cell><cell>identity (carrying)</cell><cell>dependency</cell><cell>classification</cell></row><row><cell>ANATOMY</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>BACTERIA</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>CASE</cell><cell>~R</cell><cell>-O</cell><cell>+I</cell><cell>+D</cell><cell>Material Role</cell></row><row><cell>NT_CHEMICAL</cell><cell>~R</cell><cell>-O</cell><cell>+I</cell><cell>+D</cell><cell>Material Role</cell></row><row><cell>T_CHEMICAL</cell><cell>~R</cell><cell>-O</cell><cell>+I</cell><cell>+D</cell><cell>Material Role</cell></row><row><cell>CONTROL</cell><cell>~R *1</cell><cell>-O *2</cell><cell>+I</cell><cell>+D</cell><cell>Material Role</cell></row><row><cell>DISEASE</cell><cell>+R</cell><cell>+O *3</cell><cell>+I</cell><cell>+D</cell><cell>Type</cell></row><row><cell>DNA</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>LOCATION</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>NON_HUMAN</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>ORGANIZATION</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>PERSON</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>PRODUCT</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>+D</cell><cell>Type</cell></row><row><cell>PROTEIN</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>RNA</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>SYMPTOM</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>+D</cell><cell>Type</cell></row><row><cell>TIME</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>VIRUS</cell><cell>+R</cell><cell>+O</cell><cell>+I</cell><cell>-D</cell><cell>Type</cell></row><row><cell>TRANSMISSION</cell><cell>~R</cell><cell>-O</cell><cell>-I</cell><cell>+D</cell><cell>Formal Role</cell></row><row><cell>*1</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">&lt;Rigidity&gt; ([10], p.4)</cell><cell></cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">rigid property φ(+R): ∀x φ(x) → □φ(x)</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">anti-rigid property φ(~R): ∀x φ(x) → ¬□φ(x)</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="2">&lt;Identity&gt; ([10], p.5)</cell><cell></cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">Identity Condition (IC): An identity condition is a</cell></row><row><cell></cell><cell></cell><cell></cell><cell cols="3">formula Γ that satisfies either of the following 4:</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">We will adopt here the notation of using all upper case for domain entity classes.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">In this example we only show initials of the victims' names.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">Available from http://cl.aist-nara.ac.jp/~takuku/software/TinySVM</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">In <ref type="bibr" target="#b8">[9]</ref>, further restrictions are added in order to avoid 1) the case where the necessary IC definition becomes trivially true regardless of the truth value of the formula x=y and 2) the case where Γ(x, y, t, t') is false and that makes the sufficient IC definition trivially true.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgements</head><p>We gratefully acknowledge partial funding support from the Japan Society for the Promotion of Science (grant no. 18049071). We also thank the anonymous reviewers for helpful comments.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Strategies for containing an emerging influenza pandemic in Southeast Asia</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">M</forename><surname>Ferguson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Cummings</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cauchemez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Fraser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Riley</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nature</title>
		<imprint>
			<biblScope unit="volume">437</biblScope>
			<biblScope unit="page" from="209" to="214" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Information extraction for enhanced access to disease outbreak reports</title>
		<author>
			<persName><forename type="first">R</forename><surname>Grishman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Huttunen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Yangarber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Biomedical Informatics</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="236" to="246" />
			<date type="published" when="2002">2002</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<ptr target="http://www.phac-aspc.gc.ca/media/nr-rp/2004/2004_gphin-rmispbk_e.html" />
		<title level="m">GPHIN system</title>
				<imprint>
			<publisher>Public Health Agency of Canada</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">R</forename><surname>Aronson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of AMIA Symposium</title>
				<meeting>AMIA Symposium</meeting>
		<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="17" to="21" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">EDGAR: extraction of drugs, genes and relations from the biomedical literature</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">C</forename><surname>Rindflesch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Tanabe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">N</forename><surname>Weinstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hunter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of Pacific Symposium on Biocomputing</title>
				<meeting>Pacific Symposium on Biocomputing</meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="514" to="525" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Introduction to the Bio-entity Recognition Task of the JNLPBA workshop</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ohta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tsuruoka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tateishi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Collier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the JNLPBA</title>
				<meeting>the JNLPBA</meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="70" to="76" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BioCreAtIvE task 1A: gene mention finding evaluation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Yeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Morgan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Colosimo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hirschman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BMC Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">S2</biblScope>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
	<note>Suppl</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Conceptual structures: Information processing in mind and machine</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">F</forename><surname>Sowa</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1984">1984</date>
			<publisher>Addison-Wesley</publisher>
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A formal ontology of properties</title>
		<author>
			<persName><forename type="first">N</forename><surname>Guarino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Welty</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of EKAW-2000: The 12th International Conference on Knowledge Engineering and Knowledge Management</title>
				<editor>
			<persName><forename type="first">R</forename><surname>Dieng</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><surname>Corby</surname></persName>
		</editor>
		<meeting>EKAW-2000: The 12th International Conference on Knowledge Engineering and Knowledge Management</meeting>
		<imprint>
			<biblScope unit="volume">1937</biblScope>
			<biblScope unit="page" from="97" to="112" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Ontological analysis of taxonomic relations</title>
		<author>
			<persName><forename type="first">N</forename><surname>Guarino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Welty</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ER-2000: The International Conference on Conceptual Modeling</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Laender</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Storey</surname></persName>
		</editor>
		<meeting>ER-2000: The International Conference on Conceptual Modeling<address><addrLine>Berlin, Germany</addrLine></address></meeting>
		<imprint>
			<publisher>Springer Verlag LNCS</publisher>
			<biblScope unit="volume">1920</biblScope>
			<biblScope unit="page" from="210" to="224" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">On the representation of roles in object-oriented and conceptual modelling</title>
		<author>
			<persName><forename type="first">F</forename><surname>Steimann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data and Knowledge Engineering</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="83" to="106" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<orgName>U.S. National Library of Medicine</orgName>
		</author>
		<title level="m">Medical Subject Headings (MeSH)</title>
				<imprint>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">GENIA corpus - a semantically annotated corpus for biotextmining</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ohta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tateishi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tsujii</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="180" to="182" />
			<date type="published" when="2003">2003</date>
			<publisher>Oxford University Press</publisher>
		</imprint>
	</monogr>
	<note>suppl. 1</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">MUC-7 named entity task definition</title>
		<author>
			<persName><forename type="first">L</forename><surname>Hirschman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Chinchor</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th Message Understanding Conference (MUC-7)</title>
				<meeting>the 7th Message Understanding Conference (MUC-7)</meeting>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Hirschman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Chinchor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Grishman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Sundheim</surname></persName>
		</author>
		<ptr target="http://www-nlpir.nist.gov/related_projects/muc/proceedings/hub4/guidelines.html" />
		<title level="m">Hub-4 Event Guidelines Version 2</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">The Nature of Statistical Learning Theory</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">N</forename><surname>Vapnik</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1995">1995</date>
			<publisher>Springer-Verlag</publisher>
			<pubPlace>New York</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Bio-medical entity extraction using support vector machines</title>
		<author>
			<persName><forename type="first">K</forename><surname>Takeuchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Collier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence in Medicine</title>
				<imprint>
			<publisher>Elsevier</publisher>
			<date type="published" when="2005">2005</date>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="125" to="137" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">An order-sorted quantified modal logic for meta-ontology</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kaneiwa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mizoguchi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX 2005)</title>
				<meeting>the International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX 2005)<address><addrLine>Koblenz, Germany</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="169" to="184" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Sweetening ontologies with DOLCE</title>
		<author>
			<persName><forename type="first">A</forename><surname>Gangemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Guarino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Masolo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Oltramari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Schneider</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th European Conference on Knowledge Engineering and Knowledge Management (EKAW2002)</title>
				<editor>
			<persName><surname>Benjamins</surname></persName>
		</editor>
		<meeting>the 13th European Conference on Knowledge Engineering and Knowledge Management (EKAW2002)<address><addrLine>Sigüenza, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="166" to="181" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">The Individuation of events</title>
		<author>
			<persName><forename type="first">D</forename><surname>Davidson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Essays in Honor of Carl G. Hempel</title>
		<imprint>
			<publisher>D. Reidel</publisher>
			<date type="published" when="1969">1969</date>
			<biblScope unit="page" from="216" to="234" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
