<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Towards Automatic Knowledge Acquisition from Text Based on Ontology-centric Knowledge Representation and Acquisition</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Yu-Sheng</forename><surname>Lai</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Industrial Technology Research Institute</orgName>
								<address>
									<settlement>Tainan</settlement>
									<country>Taiwan, R.O.C.</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Ren-Jr</forename><surname>Wang</surname></persName>
							<email>rjwang@itri.org.tw</email>
							<affiliation key="aff1">
								<orgName type="institution">Industrial Technology Research Institute</orgName>
								<address>
									<settlement>Hsinchu</settlement>
									<country>Taiwan, R.O.C.</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Towards Automatic Knowledge Acquisition from Text Based on Ontology-centric Knowledge Representation and Acquisition</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">813D3193A99D11B351E936E98A5344D5</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T12:06+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>I.2.4 Knowledge Representation Formalisms and Methods</term>
					<term>I.2.7 Natural Language Processing</term>
					<term>Natural language processing</term>
					<term>knowledge representation</term>
					<term>knowledge acquisition</term>
					<term>ontology</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>With the development of the Semantic Web and ontology technologies, many ontologies have been built or will be built before long. Building on these ontologies, we investigate automatic knowledge acquisition from text. This paper presents an ontology-centric framework for knowledge representation and acquisition, called iOkra. By combining NLP technologies with replaceable ontologies, the framework is able to acquire knowledge of different domains from natural language input. The acquired knowledge is represented in the form of instances and statements associated with the ontologies.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>INTRODUCTION</head><p>Knowledge acquisition traditionally requires specialists in logic, linguistics, philosophy, and other fields. Although many tools have been developed to help these specialists collaborate, a large amount of manual work remains unavoidable. Automatic knowledge acquisition from text is attractive because textual documents and data are abundant. However, no fully satisfactory approach to automatic knowledge acquisition from text has been proposed.</p><p>In <ref type="bibr" target="#b2">[3]</ref>, Berners-Lee and his co-authors claimed that "the Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation." This indicates that data on the Semantic Web is not only human-readable but also machine-understandable.</p><p>Technically, researchers use ontologies to describe the semantics of websites. The W3C defines the Semantic Web as "the abstract representation of data on the World Wide Web, based on the RDF standards and other standards to be defined" and has been developing it in collaboration with many researchers and organizations. A document that specifies usage scenarios, goals, and requirements for a web ontology language (OWL) has been proposed <ref type="bibr" target="#b5">[6]</ref>. The OntoWeb Network, which involves most European Union (EU) members, has been integrating academic and industrial resources to promote interdisciplinary work and strengthen the European influence on Semantic Web standardization efforts such as those based on RDF and XML.</p><p>The Semantic Web relies heavily on formal ontologies to structure data for comprehensive and transportable machine understanding <ref type="bibr" target="#b7">[8]</ref>. With the development of the Semantic Web and ontology technologies, many ontologies have been built or will be built before long. 
This paper proposes an ontology-centric framework (see Fig. <ref type="figure" target="#fig_0">1</ref>) that integrates natural language processing (NLP) technologies with these ontologies to automatically acquire knowledge from natural language input.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>[Figure 1 module labels: Natural language input; Morphological Analysis; Syntactic Analysis; Semantic Analysis; Discourse Analysis]</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>THE FRAMEWORK</head><p>As illustrated in Fig. <ref type="figure" target="#fig_0">1</ref>, the framework, called iOkra, is intended to automatically acquire knowledge from natural language input, to represent the knowledge in the form of instances and statements associated with the ontologies, and to store the acquired knowledge in a knowledge base.</p><p>The central ontologies comprise two kinds of ontologies: linguistic ontologies and domain ontologies. The main characteristic of linguistic ontologies is that they are bound to the semantics of grammatical units, such as words, nominal groups, etc. <ref type="bibr" target="#b4">[5]</ref>. The domain ontologies provide varied ontological information, which might be domain-specific, task-oriented, or user-specific.</p><p>In the framework, the natural language input is processed by several modules: morphological, syntactic, semantic, and discourse analyses, followed by an arbitration module.</p><p>The morphological analysis splits the input text into words and links each word to the ontologies. These links provide syntactic and semantic information for the subsequent analyses.</p><p>The syntactic analysis performs semantic case frame parsing. The information-based case grammar <ref type="bibr" target="#b3">[4]</ref> is adopted to suggest some of the thematic roles, such as agent, patient, theme, goal, etc., in each sentence.</p><p>The semantic analysis finds the remaining roles and identifies the statements (cf. 
RDF statements), namely the concept for each word and the relations between the word concepts, according to the ontologies.</p><p>The discourse analysis addresses contextual issues such as ellipsis and anaphora resolution. It is still at an early, ongoing stage and is not presented in the rest of this paper.</p><p>The arbitration module quantifies all possible statements to reconcile conflicts, produces the final statements, and stores them in a knowledge base in the form of statements associated with the ontologies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ONTOLOGY-BASED KNOWLEDGE REPRESENTATION</head><p>What is knowledge representation (KR)? Allen observed that "knowledge representation means different things to different researchers" <ref type="bibr" target="#b1">[2]</ref>. For some, it concerns the structure of the language used to express the knowledge. For others, it concerns the content of sentences. Here we are interested in the meaning representation of sentences.</p><p>Stevens et al. presented an ontology-based knowledge representation system for bioinformatics, since they believed that "the combination of an ontology with associated instances is what is known as a knowledge base" <ref type="bibr" target="#b8">[9]</ref>, in which the instances indicate the things represented by concepts. Following this notion, we represent knowledge in an ontology-based representation system.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>The Ontologies</head><p>An ontology basically consists of a set of concepts that represent classes of objects, and a set of binary relations defined on concepts. A special transitive relation, subClassOf, represents a subsumption relationship between concepts. The subsumption relations structure a taxonomy for the ontology. In addition to the taxonomy, an ontology typically contains a set of axioms, stated explicitly or implicitly. The axioms support reasoning over the ontology.</p><p>Maedche and Staab proposed an ontology-learning framework <ref type="bibr" target="#b7">[8]</ref> for the Semantic Web. They formally defined an ontology as an 8-tuple &lt;L, C, H_C, R, H_R, F, G, A&gt;, in which the first primitive L denotes a set of strings that describe lexical entries for concepts and relations, the middle six primitives structure the taxonomy of the ontology, and the last primitive A is a set of axioms that describe additional constraints on the ontology. The axioms make implicit facts more explicit. Based on the same definition, iOkra currently contains two ontologies: a linguistic ontology and a domain ontology.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Linguistic Ontology</head><p>Following the DAML+OIL specification, Lai et al. constructed a Chinese lexical ontology called CLO <ref type="bibr" target="#b6">[7]</ref>. To improve Traditional Chinese language processing, we define an amended version that differs substantially from the original. The major amendments are as follows:</p><p>1. To suit real-world applications such as information extraction and knowledge acquisition, we adjust the taxonomy. "人 (person)," "事 (affair)," "時 (time)," "地 (place)," and "物 (thing)" are five basic entities in documents <ref type="bibr">(Chen et al., 1998)</ref>. Therefore we define these five entities plus two additional concepts, "屬性 (attribute)" and "數量 (quantity)," as the topmost concepts.</p><p>2. To increase compatibility with other ontology editors, such as OilEdit, the concept Lexicon in CLO is eliminated from the amended version. Some of the lexical entries are changed into instances. Others are moved to new, more suitable positions.</p><p>3. To enhance linguistic expressive power, some thematic roles, such as theme, goal, range, etc., are interpreted as relations between concepts and added to the ontology.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Domain Ontology</head><p>A term may have different meanings in different domains. For example, "大陸 (mainland)" refers to a country (China) in a hard news article but to a corporation (CEC) in a stock news article. Different ontologies are therefore required for different domains, and even for different tasks.</p><p>To address knowledge representation and acquisition from news articles on the Taiwan stock market, we create an ontology that covers the terminology of the Taiwan stock market, such as industrial categories, corporation names, product names, people names, proper nouns, etc. Most of these terms are collected from the WWW and organized into the domain ontology automatically. A small number are reorganized or modified manually.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Instance and Statement</head><p>iOkra represents ontology-based knowledge with two components: instances and statements. An instance is a specific description of a concept. For example, "台積電 (TSMC)" is an instance of the concept "公司 (corporation)." A statement specifies a relationship between instances. For example, the concept "公司 (corporation)" has a "董事長 (board chairman)" relation to the concept "自然人 (natural person)," and "張忠謀 (Morris C.M. Chang)" is the board chairman of the corporation "台積電 (TSMC)." Fig. <ref type="figure" target="#fig_1">2</ref> is a conceptual graph describing these relationships.</p></div>
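<div xmlns="http://www.tei-c.org/ns/1.0"><p>The instance and statement notions can be illustrated with a minimal Python sketch. It is not the iOkra implementation: the class names Instance and Statement, the RELATIONS table, and the is_valid check are hypothetical names introduced here, and the only data used is the TSMC example from the text.</p>
<p>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str      # surface form of the instance
    concept: str   # ontology concept the instance belongs to


@dataclass(frozen=True)
class Statement:
    subject: Instance
    relation: str
    obj: Instance


# Hypothetical ontology fragment: relation name -> (domain concept, range concept).
RELATIONS = {
    "董事長 (board chairman)": ("公司 (corporation)", "自然人 (natural person)"),
}


def is_valid(stmt: Statement) -> bool:
    """Accept a statement only if the ontology defines its relation between the two concepts."""
    expected = RELATIONS.get(stmt.relation)
    return expected == (stmt.subject.concept, stmt.obj.concept)


tsmc = Instance("台積電 (TSMC)", "公司 (corporation)")
chang = Instance("張忠謀 (Morris C.M. Chang)", "自然人 (natural person)")
stmt = Statement(tsmc, "董事長 (board chairman)", chang)
print(is_valid(stmt))
```
</p><p>In this sketch a statement whose subject and object are swapped is rejected, because the relation is defined only from a corporation to a natural person.</p></div>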
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ONTOLOGY-CENTRIC KNOWLEDGE ACQUISITION FROM NATURAL LANGUAGE INPUT</head><p>In the following, we describe the three NLP modules (morphological, syntactic, and semantic analyses) and their cooperation with the ontologies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Morphological Analysis</head><p>A word segmentation algorithm is used for morphological analysis. It splits a sentence into a sequence of words. The words may be words in the general ontology, proper nouns in the domain ontologies, or compound words produced by a grammatical word-formation process. For example, the sentence "聯電1月29日至2月27日處分聯發科股票150張 (UMC sold 150 kilo-shares of MediaTek stocks during 1/29 to 2/27.)" can be split into the words "聯電 (UMC)," "1月29日 (1/29)," "至 (to)," "2月27日 (2/27)," "處分 (sold)," "聯發科 (MediaTek)," "股票 (stock)," "150," and "張 (kilo-shares)."</p><p>The corporation names "聯電 (UMC)" and "聯發科 (MediaTek)" come from the domain ontology, whereas the dates "1月29日 (1/29)" and "2月27日 (2/27)" and numeral determinatives (ND) such as "150" and "9678萬 (96.78 million)" come from a word-formation process.</p></div>
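<div xmlns="http://www.tei-c.org/ns/1.0"><p>The segmentation step can be sketched as follows. The paper does not name its segmentation algorithm, so this sketch assumes forward maximum matching against a lexicon, with a regular expression standing in for the word-formation process that builds dates and numerals. The names LEXICON, FORMATION, and segment are hypothetical, and the lexicon holds only the words of the example sentence.</p>
<p>
```python
import re

# Toy lexicon (assumption): entries that would come from the general and domain ontologies.
LEXICON = {"聯電", "聯發科", "至", "處分", "股票", "張"}
# Stand-in for the word-formation process: dates such as "1月29日", then bare numerals.
FORMATION = re.compile(r"\d+月\d+日|\d+")
MAX_LEN = max(len(word) for word in LEXICON)


def segment(sentence):
    """Split a sentence into words by forward maximum matching."""
    words = []
    i = 0
    while i != len(sentence):
        formed = FORMATION.match(sentence, i)
        if formed:
            words.append(formed.group())
            i = formed.end()
            continue
        # Try the longest lexicon entry first.
        for n in range(MAX_LEN, 0, -1):
            candidate = sentence[i:i + n]
            if candidate in LEXICON:
                words.append(candidate)
                i += len(candidate)
                break
        else:
            # Unknown character: emit it as a single-character word.
            words.append(sentence[i])
            i += 1
    return words


print(segment("聯電1月29日至2月27日處分聯發科股票150張"))
```
</p><p>On the example sentence this sketch yields the nine words listed in the text. Greedy matching is only one possible strategy, and it does not address the segmentation errors discussed in the experiments.</p></div>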
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Syntactic Analysis</head><p>A shallow syntactic analysis is performed in this module because a full Chinese grammar is lacking. The analysis is divided into two phases. In the first phase, a phrase-formation process is performed. A parser based on the CYK algorithm <ref type="bibr" target="#b0">[1]</ref> is used to concatenate words into phrases. For example, the three words "1月29日 (1/29)," "至 (to)," and "2月27日 (2/27)" in Table <ref type="table" target="#tab_0">1</ref> can be combined to form the phrase "1月29日至2月27日 (1/29 to 2/27)."</p><p>In the second phase, we use the Information-based Case Grammar (ICG) to recognize some of the thematic roles of the words in a sentence. The thematic roles are defined in the general ontology and are represented as relations. For example, a basic pattern in the ICG, AGENT[{NP, PP[由]}]&lt;VC2&lt;GOAL[NP], denotes that a verbal head with the syntactic category "VC2" has two thematic roles: agent and goal. The agent can be an NP (noun phrase) or a PP[由] (preposition phrase led by "由 (by or through)"), and should occur on the left-hand side of the head. The goal can be an NP (noun phrase) and should occur on the right-hand side of the head.</p><p>A head-driven approach is used to recognize the thematic roles using the basic patterns. We design an automaton, called the ICG-machine, to perform the recognition process. It is somewhat different from the Mealy machine. An enhanced scanning algorithm enables the ICG-machine to scan an input and output all acceptance paths. In addition, it is able to scan a fragmentary input and output partially matched paths when no fully matched path exists. For the basic patterns of each syntactic category, we create an ICG-machine to perform the recognition process. For example, there are five basic patterns for the syntactic category "VC2." These basic patterns can be used to create the ICG-machine illustrated in Fig. 
<ref type="figure" target="#fig_2">3</ref>. For example, when the machine scans the input "NP1 NP2 VC2 NP3 NP4 DM," it outputs the four matched paths shown in Table <ref type="table" target="#tab_1">2</ref>.</p></div>
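<div xmlns="http://www.tei-c.org/ns/1.0"><p>The effect of the single basic pattern AGENT[{NP, PP[由]}]&lt;VC2&lt;GOAL[NP] can be sketched by brute-force enumeration. This is not the ICG-machine itself, whose automaton and scanning algorithm are not specified here; the function name match_paths and the constants AGENT_CATS and GOAL_CATS are hypothetical. The sketch only lists every way of choosing one agent to the left of the head and one goal to its right.</p>
<p>
```python
from itertools import product

# Categories allowed for each role by the basic pattern quoted in the text.
AGENT_CATS = {"NP", "PP[由]"}
GOAL_CATS = {"NP"}


def match_paths(categories, head_cat="VC2"):
    """Enumerate role assignments with one agent left of the head and one goal right of it."""
    head = categories.index(head_cat)
    agents = [i for i in range(head) if categories[i] in AGENT_CATS]
    goals = [j for j in range(head + 1, len(categories)) if categories[j] in GOAL_CATS]
    paths = []
    for agent, goal in product(agents, goals):
        roles = ["-"] * len(categories)
        roles[head] = "head"
        roles[agent] = "agent"
        roles[goal] = "goal"
        paths.append(roles)
    return paths


for path in match_paths(["NP", "NP", "VC2", "NP", "NP", "DM"]):
    print(path)
```
</p><p>For the input sequence used in the text, the enumeration produces four role assignments, corresponding to the four paths of Table 2. Unlike the ICG-machine, this sketch handles neither multiple patterns nor partially matched paths.</p></div>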
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Semantic Analysis</head><p>If the word "處分 (sold)" with the syntactic category "VC2" is the head of the sentence, the agent is probably "聯電 (UMC)" or "1月29日至2月27日 (1/29 to 2/27)" and the goal is probably "聯發科 (MediaTek)" or "股票 (stock)," since all of them are noun phrases (see Table <ref type="table" target="#tab_1">2</ref>). In other words, there are two candidates each for the agent and the goal, yet each role can be filled only once for this head. These are syntactic ambiguities.</p><p>The ontologies are used to resolve the ambiguities. In the general ontology, the concept "sell" has an "agent" relation to the concept "corporation." The phrase "1月29日至2月27日 (1/29 to 2/27)" cannot be an instance of the concept "corporation." Therefore the word "聯電 (UMC)" is the agent. For the same reason, the goal is "股票 (stock)." Some syntactic ambiguities can thus be resolved by specific constraints in the ontology. The domain ontology provides the same functionality. After recognition of the thematic roles, the sentence can be interpreted as the conceptual graph shown in Fig. <ref type="figure" target="#fig_3">4</ref>.</p><p>At this point, the roles of three phrases, "1月29日至2月27日 (1/29 to 2/27)," "聯發科 (MediaTek)," and "150張 (150 kilo-shares)," have not yet been identified. Local dependency, a common characteristic of languages, is pervasive in text. Using this characteristic, we find the nearest relations between unrecognized and recognized words. Thus, three additional relationships can be found. The head "處分 (sold)" has a "time" relation to the duration "1月29日至2月27日 (1/29 to 2/27)." The word "股票 (stock)" has a "corporation-of-issue" relation to the word "聯發科 (MediaTek)" and a "quantity" relation to the phrase "150張 (150 kilo-shares)." Fig. 
<ref type="figure" target="#fig_4">5</ref> shows the full relationships among the members of the sentence.</p></div>
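<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two steps described above, ontology-based filtering of role candidates and nearest-relation attachment, can be sketched as follows. The semantic categories come from Table 1. Everything else is an assumption made for illustration: the tables ROLE_RANGE and NEAR_RELATIONS are hypothetical ontology fragments (in particular, the range "stock" for the goal of "sell" is assumed), and the function names are not from the paper.</p>
<p>
```python
# Phrases of the example sentence with the semantic categories listed in Table 1.
PHRASES = [
    ("聯電", "corporation"),
    ("1月29日至2月27日", "duration"),
    ("處分", "sell"),
    ("聯發科", "corporation"),
    ("股票", "stock"),
    ("150張", "quantity"),
]
HEAD = 2  # index of the verbal head

# Hypothetical ontology fragments (assumptions for this sketch).
ROLE_RANGE = {("sell", "agent"): "corporation", ("sell", "goal"): "stock"}
NEAR_RELATIONS = {
    ("sell", "duration"): "time",
    ("stock", "corporation"): "corporation-of-issue",
    ("stock", "quantity"): "quantity",
}


def resolve_roles(candidates):
    """Keep, for each thematic role, the candidates whose concept the ontology allows."""
    head_concept = PHRASES[HEAD][1]
    return {
        role: [i for i in indexes if PHRASES[i][1] == ROLE_RANGE[(head_concept, role)]]
        for role, indexes in candidates.items()
    }


def attach_nearest(recognized):
    """Link each unrecognized phrase to the nearest recognized phrase offering a relation."""
    statements = []
    for i, (word, concept) in enumerate(PHRASES):
        if i in recognized:
            continue
        for j in sorted(recognized, key=lambda k: (abs(k - i), k)):
            relation = NEAR_RELATIONS.get((PHRASES[j][1], concept))
            if relation:
                statements.append((PHRASES[j][0], relation, word))
                break
    return statements


roles = resolve_roles({"agent": [0, 1], "goal": [3, 4]})
recognized = {HEAD, roles["agent"][0], roles["goal"][0]}
for statement in attach_nearest(recognized):
    print(statement)
```
</p><p>Under these assumed tables, the filtering step keeps "聯電" as the only agent and "股票" as the only goal, and the attachment step recovers the time, corporation-of-issue, and quantity relations named in the text.</p></div>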
<div xmlns="http://www.tei-c.org/ns/1.0"><head>EXPERIMENTS AND DISCUSSION</head><p>For an initial evaluation of iOkra in automatic knowledge acquisition, we conduct an experiment on a collection of 501 news titles randomly selected from Yahoo!股市 (tw.stock.yahoo.com). Each title may consist of one or more clauses and is manually annotated as a set of instances and statements. The evaluation metrics are recall, precision, and F-measure. The experiment is evaluated at the title, statement, and concept levels; a title counts as correct only if all of its statements are fully recognized. The experimental result is shown in Table <ref type="table" target="#tab_2">3</ref>. The test data contains some titles that cannot be split into words correctly by the automatic word segmentation process. Therefore we conduct an additional experiment on the titles that can be split correctly. There are 391 titles in this set. The experimental result is shown in Table <ref type="table" target="#tab_3">4</ref>. From an error analysis, we group the errors into two aspects: NLP technologies and ontology engineering. In NLP, there are three major problems:</p><p>1. Ellipsis and anaphora problem. Many titles consist of several clauses. Some of the clauses share a common word.</p><p>2. Unknown word problem. Many newly created words, translated names, loanwords, etc. occur in the titles.</p><p>3. Word segmentation problem. As shown in Tables <ref type="table" target="#tab_2">3</ref> and <ref type="table" target="#tab_3">4</ref>, many errors result from word segmentation.</p><p>For iOkra, several derived research topics in the ontology field are as follows:</p><p>1. Consistency between different ontologies. In a system supported by multiple ontologies, maintaining consistency between the ontologies is a well-known and important issue.</p><p>2. Integration between ontology and knowledge base. In an ontology-based knowledge system, when either the ontology or the knowledge base changes, the other must be updated accordingly.</p><p>3. Cross-domain knowledge. Knowledge in text may span two or more domains. How to acquire and represent such knowledge is still an open problem.</p></div>
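<div xmlns="http://www.tei-c.org/ns/1.0"><p>The F-measure values reported in Tables 3 and 4 can be checked against the recall and precision rates. The sketch below assumes the balanced F-measure, that is, the harmonic mean of recall and precision; the paper does not state the formula explicitly, and the names f_measure and TABLE_3 are introduced here for illustration.</p>
<p>
```python
def f_measure(recall, precision):
    """Balanced F-measure: the harmonic mean of recall and precision."""
    return 2 * recall * precision / (recall + precision)


# (recall, precision) pairs in percent, copied from Table 3.
TABLE_3 = {
    "title": (65.86, 66.00),
    "statement": (78.21, 83.73),
    "concept": (86.80, 91.85),
}

for level, (recall, precision) in TABLE_3.items():
    print(level, round(f_measure(recall, precision), 2))
```
</p><p>Under this assumption the computed values agree with the F-measure row of Table 3 to two decimal places, and the same check applies to Table 4.</p></div>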
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CONCLUSION AND FUTURE WORK</head><p>This paper presents an ontology-centric knowledge representation and acquisition framework, called iOkra.</p><p>By combining NLP technologies with replaceable ontologies, the framework is able to automatically acquire knowledge from natural language input. Based on iOkra, a prototype document annotation system is constructed. By using different domain ontologies, the system is able to automatically annotate text documents of different domains.</p><p>A preliminary experiment shows that the system achieves an F-measure of 65.93% at the title level, 80.88% at the statement level, and 89.25% at the concept level. When errors from word segmentation are excluded, the F-measure is 70.93% at the title level, 83.88% at the statement level, and 91.90% at the concept level. In the future, we will work on the research topics mentioned above.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 .</head><label>1</label><figDesc>Figure 1. The proposed framework for knowledge acquisition.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 .</head><label>2</label><figDesc>Figure 2. A conceptual graph describing instances and statements</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 .</head><label>3</label><figDesc>Figure 3. The ICG-machine created from the basic patterns of the syntactic category "VC2."</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 .</head><label>4</label><figDesc>Figure 4. A conceptual graph describing the relationships among the three words "聯電(UMC)," "處分(sold)," and "股票(stock)."</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 .</head><label>5</label><figDesc>Figure 5. The relations within the sentence "聯電1月29日至2月27日處分聯發科股票150張 (UMC sold 150 kilo-shares of MediaTek stocks during 1/29 to 2/27.)."</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>The words and phrases in the sentence "聯電1月29日至2月27日處分聯發科股票150張(UMC sold 150 kilo-shares of MediaTek stocks during 1/29 to 2/27.)" associated with their possible syntactic and semantic categories</figDesc><table><row><cell>Word</cell><cell>Syntactic category</cell><cell>Phrase</cell><cell>Syntactic category</cell><cell>Semantic category</cell></row><row><cell>聯電(UMC)</cell><cell>Nba</cell><cell>聯電(UMC)</cell><cell>NP (Nba)</cell><cell>corporation</cell></row><row><cell>1月29日(1/29)</cell><cell>Ndabd</cell><cell></cell><cell></cell><cell></cell></row><row><cell>至 (to)</cell><cell>P61</cell><cell>1月29日至2月27日(1/29 to 2/27)</cell><cell>NP (Ndabf)</cell><cell>duration</cell></row><row><cell>2月27日(2/27)</cell><cell>Ndabd</cell><cell></cell><cell></cell><cell></cell></row><row><cell>處分(sold)</cell><cell>VC2, Nac</cell><cell>處分(sold)</cell><cell>VC2, NP (Nac)</cell><cell>sell</cell></row><row><cell>聯發科(MediaTek)</cell><cell>Nba</cell><cell>聯發科(MediaTek)</cell><cell>NP (Nba)</cell><cell>corporation</cell></row><row><cell>股票(stock)</cell><cell>Nab</cell><cell>股票(stock)</cell><cell>NP (Nab)</cell><cell>stock</cell></row><row><cell>150 張(kilo-shares)</cell><cell>ND Nbc, Nfa, VC2</cell><cell>150張(150 kilo-shares)</cell><cell>DM</cell><cell>quantity</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>The matched paths by inputting the syntactic sequence "NP1 NP2 VC2 NP3 NP4 DM" into the ICG-machine shown in Fig. 3</figDesc><table><row><cell>Path</cell><cell>NP1</cell><cell>NP2</cell><cell>VC2</cell><cell>NP3</cell><cell>NP4</cell><cell>DM</cell><cell>Terminal</cell></row><row><cell>1</cell><cell>agent</cell><cell>-</cell><cell>head</cell><cell>goal</cell><cell>-</cell><cell>-</cell><cell>T1</cell></row><row><cell>2</cell><cell>agent</cell><cell>-</cell><cell>head</cell><cell>-</cell><cell>goal</cell><cell>-</cell><cell>T1</cell></row><row><cell>3</cell><cell>-</cell><cell>agent</cell><cell>head</cell><cell>goal</cell><cell>-</cell><cell>-</cell><cell>T1</cell></row><row><cell>4</cell><cell>-</cell><cell>agent</cell><cell>head</cell><cell>-</cell><cell>goal</cell><cell>-</cell><cell>T1</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3 .</head><label>3</label><figDesc>Experimental result for overall test data</figDesc><table><row><cell></cell><cell>Title</cell><cell>Statement</cell><cell>Concept</cell></row><row><cell>Recall rate</cell><cell>65.86%</cell><cell>78.21%</cell><cell>86.80%</cell></row><row><cell>Precision rate</cell><cell>66.00%</cell><cell>83.73%</cell><cell>91.85%</cell></row><row><cell>F-measure</cell><cell>65.93%</cell><cell>80.88%</cell><cell>89.25%</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4 .</head><label>4</label><figDesc>Experimental result for the titles that can be split correctly by the automatic word segmentation process</figDesc><table><row><cell></cell><cell>Title</cell><cell>Statement</cell><cell>Concept</cell></row><row><cell>Recall rate</cell><cell>70.84%</cell><cell>81.46%</cell><cell>89.35%</cell></row><row><cell>Precision rate</cell><cell>71.02%</cell><cell>86.45%</cell><cell>94.60%</cell></row><row><cell>F-measure</cell><cell>70.93%</cell><cell>83.88%</cell><cell>91.90%</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGEMENTS</head><p>This paper is a partial result of Project A321XS1A10 conducted by ITRI under sponsorship of the Ministry of Economic Affairs, R.O.C. The authors would like to thank the CKIP Group of Sinica, R.O.C. for providing the ICG.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">The Theory of Parsing, Translation, and Compiling</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">V</forename><surname>Aho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Ullman</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1972">1972</date>
			<publisher>Prentice Hall</publisher>
			<pubPlace>Englewood Cliffs, N.J.</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Natural Language Understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Allen</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1994">1994</date>
			<publisher>The Benjamin/Cummings Publishing Company, Inc</publisher>
			<pubPlace>Redwood City, CA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">The Semantic Web</title>
		<author>
			<persName><forename type="first">T</forename><surname>Berners-Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hendler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Lassila</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2001">2001</date>
			<publisher>Scientific American</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Information-based Case Grammar</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">R</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 13th International Conference on Computational Linguistics (COLING &apos;90)</title>
				<meeting>the 13th International Conference on Computational Linguistics (COLING &apos;90)<address><addrLine>University of Helsinki, Finland</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1990">1990</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="54" to="59" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title/>
		<author>
			<persName><forename type="first">A</forename><surname>Gomez-Perez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fernandez-Lopez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Corcho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Technical Roadmap D</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">1</biblScope>
			<date type="published" when="2002">2002</date>
			<publisher>OntoWeb</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Web Ontology Language (OWL) Use Cases and Requirements-working draft 3</title>
		<author>
			<persName><forename type="first">J</forename><surname>Heflin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2003">2003</date>
			<publisher>W3C</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A DAML+OIL-Compliant Chinese Lexical Ontology</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">S</forename><surname>Lai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">K</forename><surname>Hsu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 19th International Conference on Computational Linguistics</title>
				<meeting>the 19th International Conference on Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="1238" to="1242" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Ontology Learning for the Semantic Web</title>
		<author>
			<persName><forename type="first">A</forename><surname>Maedche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Staab</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Intelligent Systems</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="72" to="79" />
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Ontology-based Knowledge Representation for Bioinformatics</title>
		<author>
			<persName><forename type="first">R</forename><surname>Stevens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Goble</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bechhofer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Briefings in Bioinformatics</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="398" to="416" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
