<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A New Corpus Resource for Studies in the Syntactic Characteristics of Terminologies in Contemporary English</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alex</forename><forename type="middle">C</forename><surname>Fang</surname></persName>
							<email>acfang@cityu.edu.hk</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Department of Chinese</orgName>
								<orgName type="department" key="dep2">Translation and Linguistics</orgName>
								<orgName type="laboratory">Dialogue Systems Group</orgName>
								<orgName type="institution">City University of Hong Kong</orgName>
								<address>
									<addrLine>Tat Chee Avenue</addrLine>
									<settlement>Kowloon, Hong Kong SAR</settlement>
									<country key="CN">PR China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jing</forename><surname>Cao</surname></persName>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Department of Chinese</orgName>
								<orgName type="department" key="dep2">Translation and Linguistics</orgName>
								<orgName type="laboratory">Dialogue Systems Group</orgName>
								<orgName type="institution">City University of Hong Kong</orgName>
								<address>
									<addrLine>Tat Chee Avenue</addrLine>
									<settlement>Kowloon, Hong Kong SAR</settlement>
									<country key="CN">PR China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yang</forename><surname>Song</surname></persName>
							<email>songyang2@student.cityu.edu.hk</email>
							<affiliation key="aff0">
								<orgName type="department" key="dep1">Department of Chinese</orgName>
								<orgName type="department" key="dep2">Translation and Linguistics</orgName>
								<orgName type="laboratory">Dialogue Systems Group</orgName>
								<orgName type="institution">City University of Hong Kong</orgName>
								<address>
									<addrLine>Tat Chee Avenue</addrLine>
									<settlement>Kowloon, Hong Kong SAR</settlement>
									<country key="CN">PR China</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">A New Corpus Resource for Studies in the Syntactic Characteristics of Terminologies in Contemporary English</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">11CE872877B96670065CE8BF34EE5620</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T19:18+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>syntactic tree</term>
					<term>treebank</term>
					<term>syntactic function</term>
					<term>terminology</term>
					<term>ICE-GB</term>
					<term>noun phrase</term>
					<term>term annotation</term>
					<term>corpus</term>
					<term>syntax</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper, we present a new corpus resource that has been constructed specially for the study of the syntactic characteristics of terminologies. The corpus is based on the British component of the International Corpus of English (ICE-GB), comprising four parallel subject domains from two text categories (i.e. academic vs. popular prose) with a total of about 200,000 running word tokens. The resource is richly annotated at lexical, grammatical, syntactic, and terminological levels. It is also parameterized according to both text categories and subject domains. The corpus resource is expected to contribute towards a linguistically motivated description of terms and their internal structures. It is also expected to provide an analytical framework for the study of relations between terminological use and text categories as well as subject domains.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Automatic term recognition (ATR) and extraction remain challenging tasks that have attracted rigorous efforts from researchers across a wide range of backgrounds and disciplines. Nevertheless, past work on terminological extraction tends to focus on specific subject domains, mainly in the fields of biochemistry and medicine, such as <ref type="bibr" target="#b0">Ananiadou et al. 2000</ref><ref type="bibr" target="#b15">, Nenadic et al. 2005</ref><ref type="bibr" target="#b1">, Aubin and Hamon 2006</ref><ref type="bibr" target="#b18">, and Ville-Ometz et al. 2007</ref>, to name just a few. Some work addresses other domains such as computing (e.g. <ref type="bibr" target="#b4">Eumeridou et al. 2004;</ref> <ref type="bibr" target="#b12">L'Homme 2002;</ref> <ref type="bibr" target="#b14">Nakagawa and Mori 2003)</ref>, economics (e.g. <ref type="bibr">Rodriguez et al. 2007)</ref>, and legislation (e.g. <ref type="bibr" target="#b10">Ha et al. 2008;</ref> <ref type="bibr" target="#b11">Kit and Liu 2008)</ref>. These studies are domain-specific in the positive sense that they concentrate on domain-specific issues such as domain knowledge and associated knowledge expressions at the lexical level. Yet they are domain-limited in an undesirable sense, which makes it difficult to evaluate the performance and interoperability of existing term recognition systems across a set of different domains. Additionally, it remains an open question how such systems will adapt to new domains.</p><p>Another noticeable issue is that, among the linguistic features employed in ATR systems, syntactic features have been observed mainly at the phrasal level, and seldom from the perspective of syntactic structures at the clausal level. Grammatical patterns such as 'noun', 'noun + noun', 'adjective + noun', and 'noun + preposition + noun' have been combined with statistical measures to determine termhood (e.g. 
<ref type="bibr" target="#b8">Frantzi et al. 2000;</ref> <ref type="bibr">Pazienza et al. 2005)</ref>. <ref type="bibr" target="#b4">Eumeridou et al. (2004)</ref> go beyond the grammatical patterns and examine how term occurrences correlate with the argument structure of verbs across three domains chosen from the British National Corpus. Their findings show an uneven distribution of terms in different argument structures<ref type="foot" target="#foot_0">1</ref>, and they also note the influence that different domains have upon term occurrences. Although the study focuses on verbal syntax only, it does indicate that the syntactic features of terminological entities constitute a worthwhile research topic and that text categories such as registerial types and subject domains should also be considered as parameters. It is reasonable to believe that further improvement of ATR systems can be achieved through deeper, linguistically motivated analysis of the relation between terminologies and linguistic parameters.</p><p>The main focus of this paper is to present a new corpus resource that has been constructed specially for the study of the syntactic characteristics of terminologies. Existing term-annotated corpora are typically domain-specific, such as GENIA <ref type="bibr" target="#b16">(Ohta et al. 2002)</ref>, and typically used as a resource for statistical training. The new corpus resource is different in that it is built on general domains and is richly annotated for syntactic information, especially for detailed annotation of the syntactic categories and their functions within the clause complex that is often dependent on verb subcategorization. The corpus is based on the British component of the International Corpus of English (ICE-GB), comprising four parallel subject domains from two text categories (i.e. academic vs. popular prose) with a total of about 200,000 running word tokens. 
The resource is richly annotated at lexical, grammatical, syntactic, and terminological levels. It is also parameterized according to both text categories and subject domains. The treebank is expected to contribute towards a linguistically motivated description of terms and their associated syntactic structures. It will also provide an analytical framework for the study of relations between terminological use and text types as well as subject domains. The richly annotated trees will facilitate studies in the linguistic relations of terms for the purpose of ontology construction.</p><p>In the rest of this paper, we will first describe the construction of the corpus, including the selection of the corpus material, the annotation schemes for grammar and syntax, and an inter-annotator analysis of the manual annotation of terms. We shall then report some of our initial empirical observations of the syntactic characteristics of noun phrases (NPs) that are terminological entities as opposed to generic NPs across different text types and domains. For this purpose, we will describe the distribution of general NPs in terms of text categories and subject domains. We will then describe the distribution of terminological NPs according to the same parameters, focusing on their syntactic functions in the tree structure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Corpus Construction</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Corpus resource for term annotation</head><p>Our on-going research attempts to extend the previous studies by exploring the syntactic characteristics of terminological entities across different text types and subject domains in contemporary English. To achieve our objectives, the British component of the International Corpus of English (ICE-GB; Greenbaum 1996) was chosen as a basis for the following reasons: First, it is encoded for a variety of text categories and subject domains. Secondly, it is already grammatically tagged, syntactically parsed and manually validated. Finally and most importantly, it is annotated with a rich set of linguistically motivated syntactic relations that will maximally enhance our intended study. The following sections will first describe the resource created from the ICE-GB and introduce its part-of-speech (POS) and syntactic annotations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.1">Creation of a sub-corpus</head><p>The British component of the International Corpus of English (ICE-GB) is a one-million-word corpus comprising both spoken and written British English from the 1990s <ref type="bibr" target="#b9">(Greenbaum 1996;</ref> <ref type="bibr" target="#b7">Fang 2007)</ref>. The spoken section represents 60% of the total size of the corpus with 300 sample texts. The written section accounts for 40% of the corpus with 200 texts. Each component text has about 2,000 word tokens. Table <ref type="table" target="#tab_0">1</ref> summarizes the text categories in the ICE-GB together with the number of component texts. Given the purpose of our study, texts from the category of informational writing constitute a suitable source. This category is further divided into three sub-categories: academic writing, popular writing and press news reports. Two contrastive text types, i.e., academic writing and popular writing, were chosen. The two text types cover four parallel subject domains comprising ten texts each. Table <ref type="table" target="#tab_1">2</ref> presents the composition of the sub-corpus created from ICE-GB. As can be seen from Table <ref type="table" target="#tab_1">2</ref>, the sub-corpus comprises 80 texts similar in size with a total number of 193,206 word tokens.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>TIA'09</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.2">Tree annotations in the ICE-GB</head><p>All the texts in ICE-GB are richly annotated grammatically and syntactically <ref type="bibr" target="#b5">(Fang 1996</ref><ref type="bibr" target="#b6">, 2000</ref><ref type="bibr" target="#b6">, 2006</ref><ref type="bibr" target="#b7">, 2007)</ref>. When the 80 texts from ICE-GB were selected to create the sub-corpus, a treebank was effectively created that comprises 8,306 syntactic trees. As shown in Figure <ref type="figure" target="#fig_0">1</ref>, the tree structure is richly annotated with fine-grained grammatical and syntactic information. At the grammatical level, words are coded with part-of-speech (POS) tags that include a head tag (such as noun, verb, and adjective) with a set of attributes indicating the subcategorizations of the head tag.</p><p>For instance, the verb found, enclosed within a pair of curly brackets, is tagged as V(montr,edp), namely, a mono-transitive verb in past participial form. As another example, {The} is assigned a label ART(def), meaning it is a definite article, and {fibres} is a common noun in its plural form. Syntactically, each node comprises two labels: one representing its syntactic category (such as noun phrase and adjective phrase) and the other the syntactic function. Take the node SU NP() as an example: it is a noun phrase (NP) functioning as the subject (SU) of the clause. The same NP comprises a determiner (DT), the head (NPHD) and a postmodifier (NPPO). The definite article The constitutes the central determiner (DTCE), a daughter node of DT. See Appendix for a complete list of all the parsing symbols. With such a system of syntactic categories and their associated syntactic functions, the corpus forms a valuable testbed on which grammatical relations of various kinds can be investigated. 
The syntactic framework will also form an informative context within which terms and term relations can be usefully examined.</p></div>
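<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate how such labels decompose, the following minimal Python sketch splits a label string into function, category and attributes. The string layout it assumes is inferred only from the examples quoted above (V(montr,edp), ART(def), SU NP()); the function name parse_label and the dictionary keys are hypothetical and are not part of the ICE-GB file format or its tooling.</p>
<p>
```python
def parse_label(label):
    """Split a label such as 'V(montr,edp)' or 'SU NP()' into its parts.

    Assumed layout (hypothetical): an optional function label, then a
    category label, then a comma-separated attribute list in parentheses.
    """
    head, _, rest = label.partition("(")
    # Drop the closing bracket and ignore empty attribute slots.
    attributes = [a.strip() for a in rest.rstrip(")").split(",") if a.strip()]
    parts = head.split()
    # Two tokens before the bracket are read as function + category.
    function = parts[0] if len(parts) == 2 else None
    category = parts[-1]
    return {"function": function, "category": category, "attributes": attributes}


for example in ("V(montr,edp)", "ART(def)", "SU NP()"):
    print(example, parse_label(example))
```
</p>
<p>Under these assumptions, a word-level tag yields a category with its attributes and no function, whereas a node label such as SU NP() yields both a function (SU) and a category (NP) with an empty attribute list.</p></div>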
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Term annotation</head><p>Term annotation was carried out manually over a period of four months and went through the following procedures:</p><p>• Training of the annotators: The training session helped the annotators become familiar with the special format of the target texts, which are parsed and represented in the form exemplified in Fig. <ref type="figure" target="#fig_0">1</ref>.</p><p>• Analysis of inter-annotator agreement: This step was taken to establish the consistency, and therefore the quality, of the annotations produced by the three annotators given the same text; higher statistical agreement indicates greater confidence in the manual annotation.</p><p>• Actual annotation: Following an annotation guideline, the annotators marked up the terms with the help of dictionaries, online dictionaries and term banks.</p><p>• Manual examination of terminological annotations.</p><p>In the remainder of this section, we shall first describe the annotation guideline and then report the results from the inter-annotator agreement test. The basic statistics of the terminologically annotated corpus resource will be presented in Section 3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1">Annotation guideline</head><p>Before describing the guideline, we first introduce the operational definition of terminological entities. To our understanding, terms by definition primarily correspond to noun-phrase (NP) groups and thus consist of single nouns or complex noun phrases <ref type="bibr">(Kageura et al. 2004;</ref> <ref type="bibr">Nakagawa 2001;</ref> <ref type="bibr" target="#b14">Nakagawa and Mori 2003)</ref>. Following <ref type="bibr" target="#b4">Eumeridou et al. (2004)</ref>, we also consider terms in a pragmatic sense. Take text w2a-031 for example. The text is about "blind shaft drilling" under the domain of technology. In addition to terms in technology and engineering, we may also mark up terminological entities from related domains such as environment. Given such a definition, a working guideline for annotation was drawn up:</p><p>• Among the NPs, proper names of places, countries, organizations or institutes are excluded from the current study and will therefore not be annotated.</p><p>• Variant terms will be annotated.</p><p>o Singular and plural forms of a term will both be regarded as terms, since some termbanks record only the singular form of a term.</p><p>o When an N1+N2 compound is a term, the sequence N2 + of + N1 will also be treated as a term.</p><p>o Variant spellings of the same term will be accepted.</p><p>• With nested terms, we only mark up the longest part as a multi-word term.</p><p>• Terms are marked with '&lt;' at the beginning and '&gt;' at the end in the tree diagram, and the resulting NP is described by an additional attribute 'term'. See Figure <ref type="figure" target="#fig_1">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2">Inter-annotator agreement</head><p>Three annotators were trained to mark up terms. All three annotators are university students majoring in linguistics. Among them, two are undergraduates who have been admitted to postgraduate study and one is a PhD candidate. To measure the inter-annotator agreement, two texts were taken from the pre-selected sub-corpus from ICE-GB, with a total number of about 4,000 words. During the annotation stage, the annotators were allowed to refer to the guideline or other sources such as online termbanks and dictionaries, in addition to their linguistic knowledge. They were not allowed to confer with each other over the annotation.</p><p>We then compared the annotations among the three annotators using the F score, which is considered to be a standard measure of inter-annotator agreement <ref type="bibr" target="#b2">(Corbett et al. 2007</ref>) and has been commonly used in previous studies (see, for example, Demetriou and Gaizauskas 2003, <ref type="bibr" target="#b13">Morgan et al. 2004</ref><ref type="bibr">, Vlachos and Gasperin 2006</ref><ref type="bibr">, and Kolarik 2008)</ref>. Accordingly, the inter-annotator agreement was computed pair-wise using the measure defined in (1):</p><formula notation="TeX">F = \frac{2C}{M_1 + M_2} <label>(1)</label></formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>where M1 and M2 are the numbers of markable terms in a given text marked up by Annotators 1 and 2 respectively, and C is the total number of times both annotators agree on a markable term in that same text. To calculate the F score, the total numbers of terms marked by Annotators A, B and C were counted separately. Next, all of the exact matches were found and counted. For an exact match, the left and right boundaries had to match entirely. Table <ref type="table" target="#tab_4">3</ref> summarizes the inter-annotator agreement. Annotators A, B and C independently identified 604, 594 and 594 terms respectively. The total number of commonly identified terms is given for each pair of annotators. The F scores for all annotator pairs are above 95%, suggesting a high level of inter-annotator agreement. The results suggest that a high level of agreement is possible through training and by referring to the annotation guideline. Such a finding shows that trained annotators can achieve a high level of consistency even without expert domain knowledge, a finding that is contrary to the past experience that extensive training is needed for consistent annotation of terms in specialized domains such as biochemistry and medicine.</p><p>After the inter-annotator agreement test, the three annotators carried out the actual annotation and met to discuss uncertain cases when necessary. Finally, the annotated corpus was manually validated by one annotator with the help of online resources and specialized dictionaries.</p></div>
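<div xmlns="http://www.tei-c.org/ns/1.0"><p>The pairwise figures in Table <ref type="table" target="#tab_4">3</ref> can be checked with a short script. The sketch below is illustrative only: it assumes that the F score takes the form 2C / (M1 + M2), an assumption that reproduces the reported percentages, and the names marked, common and pairwise_f are hypothetical. The counts are those given in Table <ref type="table" target="#tab_4">3</ref>.</p>
<p>
```python
from itertools import combinations

# Terms marked by each annotator on the two test texts (Table 3).
marked = {"A": 604, "B": 594, "C": 594}
# Exact-match agreements per annotator pair (Table 3).
common = {("A", "B"): 575, ("A", "C"): 576, ("B", "C"): 584}


def pairwise_f(m1: int, m2: int, c: int) -> float:
    """Pairwise F score, assumed here to be 2C / (M1 + M2)."""
    return 2 * c / (m1 + m2)


scores = {
    pair: pairwise_f(marked[pair[0]], marked[pair[1]], common[pair])
    for pair in combinations(sorted(marked), 2)
}

for (first, second), score in scores.items():
    print(f"{first}-{second}: {score:.2%}")
```
</p>
<p>With these inputs the script prints 95.99% for A-B, 96.16% for A-C and 98.32% for B-C, matching the table.</p></div>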
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Syntactic Features of NP Constructions</head><p>In this section, we present some initial descriptive statistics and chart the distribution of NP constructions across different text categories and domains. We will first explain how we retrieve the syntactic functions of NPs according to the tree structure, followed by a description of the basic statistics of NP constructions in the corpus. We shall then present the preliminary observations of the syntactic features of NPs that are marked as terms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">A general description of NP constructions by category and domain</head><p>As explained in Section 2.1.2, every NP is assigned a function label and additional attributes if necessary. Counting the frequency of NP constructions in the trees is straightforward in most cases, with two exceptions: NPs whose functions are labeled CJ (conjoin; see Fig. <ref type="figure">3</ref>) and DEFUNC (appositive NP that does not perform any syntactic function; see Fig. <ref type="figure">4</ref>). In Fig. <ref type="figure">3</ref>, the direct object NP is described by the attribute coordn, indicating the presence of a coordinated construction whose conjoins are marked as CJ. In such a scenario, a CJ will inherit the function of its mother node and be counted as a separate OD NP. Therefore, instead of counting one OD and two CJ functions, we count two OD functions for the NPs in Fig. <ref type="figure">3</ref>. Similarly, NPs with DEFUNC labels are also relocated and assigned the function label of the governing NP. See Fig. <ref type="figure">4</ref> for an example, where DEFUNC NP is treated as SU NP. In this particular case, instead of one DEFUNC and one SU, two SU functions are counted. With this treatment of conjoin and appositive NPs, NP constructions in all eight subject domains were retrieved and are summarized in Table <ref type="table" target="#tab_5">4</ref>. As can be observed in Table <ref type="table" target="#tab_5">4</ref>, there is an uneven distribution of 17 different functions of NPs across domains. In general, NPs seem to occur most frequently at the position of PC in all the domains, followed by SU and OD. Nevertheless, when we examine the functions by category and domain, we notice more interesting patterns. First, NPs in domains of academic writing tend to occur less frequently at the position of SU than those in their counterparts of popular writing. 
Second, domains in academic writing are more likely to have a comparatively higher occurrence of PC as a syntactic function than their counterparts in popular writing. They also tend to have fewer occurrences of OD.</p></div>
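<div xmlns="http://www.tei-c.org/ns/1.0"><p>The counting convention for CJ and DEFUNC described in this section can be sketched as a small tree traversal. This is a minimal illustration under stated assumptions, not the procedure actually used to build Table <ref type="table" target="#tab_5">4</ref>: the Node class, the coordinated flag (standing in for the coordn attribute), the placement of the DEFUNC NP as a daughter of its governing NP, and the PU, CL, VB and VP placeholder labels are all hypothetical.</p>
<p>
```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional

# Function labels that take over the function of the governing NP.
INHERITING = {"CJ", "DEFUNC"}


@dataclass
class Node:
    """Toy tree node (hypothetical; not the ICE-GB storage format)."""
    function: str                # e.g. "SU", "OD", "CJ", "DEFUNC"
    category: str                # e.g. "NP", "VP", "CL"
    coordinated: bool = False    # stands in for the coordn attribute
    children: List["Node"] = field(default_factory=list)


def count_np_functions(node: Node, governing: Optional[str] = None,
                       totals: Optional[Counter] = None) -> Counter:
    """Count NP functions, relabelling CJ and DEFUNC NPs."""
    totals = Counter() if totals is None else totals
    effective = None
    if node.category == "NP":
        inherits = node.function in INHERITING and governing is not None
        effective = governing if inherits else node.function
        # A coordinated mother NP is represented by its conjoins only.
        if not node.coordinated:
            totals[effective] += 1
    for child in node.children:
        count_np_functions(child, effective, totals)
    return totals


# One clause: a subject NP with an appositive, and a coordinated object NP.
clause = Node("PU", "CL", children=[
    Node("SU", "NP", children=[Node("DEFUNC", "NP")]),
    Node("VB", "VP"),
    Node("OD", "NP", coordinated=True,
         children=[Node("CJ", "NP"), Node("CJ", "NP")]),
])
totals = count_np_functions(clause)
print(dict(totals))
```
</p>
<p>For this toy clause the traversal counts two SU and two OD functions, rather than one SU, one DEFUNC, one OD and two CJ, which mirrors the treatment described for Figs. 3 and 4.</p></div>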
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">A statistical description of term-NP constructions</head><p>When examining the distribution of term-NPs, we also related the CJ and DEFUNC functions to their mother nodes. Accordingly, the actual distribution of term-NPs across different categories and domains was calculated and is presented in Table <ref type="table" target="#tab_6">5</ref>. Interesting features emerge from the initial frequency count. First, academic writing tends to have more terms than popular writing on both parameters (i.e. category and domain). In a broad sense, the total number of terms in academic writing is higher than that in popular writing. From the perspective of subject domains, individual domains belonging to academic writing tend to have more terms than their counterparts in popular writing. Such a result suggests that formal writing tends to contain more term candidates than informal writing. Second, science domains (i.e. NAT and TEC) tend to contain more terms than arts domains (i.e. HUM and SOC). It can also be noticed that the number of terms in AHUM is higher than that in ATEC, which is understandable since AHUM has the highest number of NPs among the domains in academic writing. Third, across the eight domains term-NPs seem to appear most frequently at the position of PC, followed by SU and OD. Fourth, the data make it easy to carry out contrastive studies of particular syntactic functions across the eight domains. For example, terms are more likely to occur at the position of A in ATEC when compared with the other seven domains, and they are more likely to appear at the position CS in AHUM when examined across domains. Such information can be used as an adjustable parameter when assigning weights to syntactic functions for particular domains in ATR.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>It is worth mentioning that syntactic labels at the phrasal level can be further classified at the clausal level. For example, a considerable number of NPs occur at the position of PC, which should be related to its mother node, namely PP, whose functions could be analyzed differently as A PP and NPPO PP, revealing further variations of use across the eight categories.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Conclusion</head><p>In this paper, we presented a new corpus resource that has been constructed specially for the study of the syntactic characteristics of terminologies, with a view to a linguistically motivated description of terms and their internal structures. The corpus is based on the British component of the International Corpus of English, comprising four parallel subject domains from two text categories (i.e. academic vs. popular prose) with a total of about 200,000 running word tokens. It is richly annotated at lexical, grammatical, syntactic, and terminological levels. It is parameterized according to both text categories and subject domains. We first described the construction of the corpus, including the selection of the corpus material, the annotation schemes for grammar and syntax, and an inter-annotator analysis of the annotation of terms. We then described the corpus resource by reporting some of our initial empirical observations of NP constructions and term-NP constructions. Interesting patterns were observed in the syntactic distribution of NPs and term-NPs across different categories and domains. In particular, term-NPs show observable differences across different categories and domains. In other words, the corpus resource can provide an analytical framework for the study of relations between terminological use and text types as well as subject domains.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1 -</head><label>1</label><figDesc>Fig. 1 -An example of syntactic annotations in the ICE-GB</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 -</head><label>2</label><figDesc>Fig. 2 -Examples of term annotations in the tree structure</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 3 -</head><label>3</label><figDesc>Fig. 3 -An example of CJ NP</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>The structure of ICE-GB</figDesc><table><row><cell></cell><cell>Spoken</cell><cell></cell><cell></cell><cell>Written</cell><cell></cell></row><row><cell>Dialogue</cell><cell>Private Public</cell><cell>100 80</cell><cell>Non-printed</cell><cell>Student writing Correspondence</cell><cell>20 30</cell></row><row><cell></cell><cell>Unscripted</cell><cell>70</cell><cell></cell><cell>Informational</cell><cell>100</cell></row><row><cell>Monologue</cell><cell>Mixed Scripted</cell><cell>20 30</cell><cell>Printed</cell><cell>Instructional Persuasive</cell><cell>20 10</cell></row><row><cell></cell><cell></cell><cell></cell><cell></cell><cell>Creative</cell><cell>20</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>The structure of the sub-corpus</figDesc><table><row><cell>Text Type</cell><cell>Subject Domain</cell><cell>Domain Code</cell><cell># of Texts</cell><cell># of Words</cell></row><row><cell></cell><cell>Humanities</cell><cell>AHUM</cell><cell>10</cell><cell>24,363</cell></row><row><cell>Academic</cell><cell>Social sciences</cell><cell>ASOC</cell><cell>10</cell><cell>24,280</cell></row><row><cell>writing</cell><cell>Natural sciences</cell><cell>ANAT</cell><cell>10</cell><cell>24,165</cell></row><row><cell></cell><cell>Technology</cell><cell>ATEC</cell><cell>10</cell><cell>23,386</cell></row><row><cell></cell><cell>Humanities</cell><cell>PHUM</cell><cell>10</cell><cell>27,168</cell></row><row><cell>Popular</cell><cell>Social sciences</cell><cell>PSOC</cell><cell>10</cell><cell>23,110</cell></row><row><cell>writing</cell><cell>Natural sciences</cell><cell>PNAT</cell><cell>10</cell><cell>23,150</cell></row><row><cell></cell><cell>Technology</cell><cell>PTEC</cell><cell>10</cell><cell>23,584</cell></row><row><cell>Total</cell><cell></cell><cell></cell><cell>80</cell><cell>193,206</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 3 .</head><label>3</label><figDesc>A summary of the inter-annotator agreement</figDesc><table><row><cell>Annotator</cell><cell># of Terms</cell><cell>Paired Annotators</cell><cell># of Terms in Common</cell><cell>F Score</cell></row><row><cell>A</cell><cell>604</cell><cell>A-B</cell><cell>575</cell><cell>95.99%</cell></row><row><cell>B</cell><cell>594</cell><cell>A-C</cell><cell>576</cell><cell>96.16%</cell></row><row><cell>C</cell><cell>594</cell><cell>B-C</cell><cell>584</cell><cell>98.32%</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 4 .</head><label>4</label><figDesc>Summary of NP constructions</figDesc><table><row><cell></cell><cell>AHUM</cell><cell>ASOC</cell><cell>ANAT</cell><cell>ATEC</cell><cell>PHUM</cell><cell>PSOC</cell><cell>PNAT</cell><cell>PTEC</cell></row><row><cell>Function</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell></row><row><cell>A</cell><cell>26</cell><cell>40</cell><cell>14</cell><cell>66</cell><cell>54</cell><cell>61</cell><cell>60</cell><cell>40</cell></row><row><cell>AJPR</cell><cell>0</cell><cell>1</cell><cell>13</cell><cell>10</cell><cell>7</cell><cell>1</cell><cell>3</cell><cell>4</cell></row><row><cell>AVPR</cell><cell>3</cell><cell>6</cell><cell>9</cell><cell>3</cell><cell>16</cell><cell>11</cell><cell>11</cell><cell>8</cell></row><row><cell>CO</cell><cell>15</cell><cell>3</cell><cell>11</cell><cell>8</cell><cell>21</cell><cell>9</cell><cell>8</cell><cell>6</cell></row><row><cell>CS</cell><cell>215</cell><cell>147</cell><cell>144</cell><cell>184</cell><cell>260</cell><cell>180</cell><cell>190</cell><cell>172</cell></row><row><cell>DT</cell><cell>210</cell><cell>64</cell><cell>13</cell><cell>32</cell><cell>174</cell><cell>98</cell><cell>69</cell><cell>58</cell></row><row><cell>ELE</cell><cell>176</cell><cell>77</cell><cell>97</cell><cell>193</cell><cell>242</cell><cell>52</cell><cell>60</cell><cell>200</cell></row><row><cell>FOC</cell><cell>17</cell><cell>4</cell><cell>4</cell><cell>6</cell><cell>4</cell><cell>8</cell><cell>9</cell><cell>3</cell></row><row><cell>NPPO</cell><cell>250</cell><cell>150</cell><cell>419</cell><cell>237</cell><cell>246</cell><cell>59</cell><cell>32</cell><cell>99</cell></row><row><cell>NPPR</cell><cell>1</cell><cell>10</cell><cell>23</cell><cell>21</cell><cell>15</cell><cell>13</cell><cell>14</cell><cell>12</cell></row><row><cell>OD</cell><cell>850</cell><cell>806</cell><cell>634</cell><cell>778</cell><cell>924</cell><cell>951</cell><cell>812</cell><cell>947</cell></row><row><cell>OI</cell><cell>15</cell><cell>13</cell><cell>1</cell><cell>7</cell><cell>31</cell><cell>27</cell><cell>12</cell><cell>9</cell></row><row><cell>PC</cell><cell>3138</cell><cell>2982</cell><cell>3301</cell><cell>2834</cell><cell>3060</cell><cell>2585</cell><cell>2807</cell><cell>2356</cell></row><row><cell>PMOD</cell><cell>0</cell><cell>0</cell><cell>4</cell><cell>2</cell><cell>4</cell><cell>2</cell><cell>4</cell><cell>1</cell></row><row><cell>PROD</cell><cell>1</cell><cell>4</cell><cell>0</cell><cell>1</cell><cell>1</cell><cell>1</cell><cell>1</cell><cell>4</cell></row><row><cell>PRSU</cell><cell>33</cell><cell>53</cell><cell>39</cell><cell>37</cell><cell>29</cell><cell>55</cell><cell>32</cell><cell>42</cell></row><row><cell>SU</cell><cell>1685</cell><cell>1640</cell><cell>1626</cell><cell>1597</cell><cell>1986</cell><cell>1957</cell><cell>1859</cell><cell>1850</cell></row><row><cell>Total</cell><cell>6635</cell><cell>6000</cell><cell>6352</cell><cell>6016</cell><cell>7074</cell><cell>6070</cell><cell>5983</cell><cell>5811</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 5.</head><label>5</label><figDesc>Summary of term-NP constructions</figDesc><table><row><cell>Function</cell><cell>AHUM</cell><cell>ASOC</cell><cell>ANAT</cell><cell>ATEC</cell><cell>PHUM</cell><cell>PSOC</cell><cell>PNAT</cell><cell>PTEC</cell></row><row><cell></cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell><cell>Freq</cell></row><row><cell>A</cell><cell>4</cell><cell>3</cell><cell>1</cell><cell>16</cell><cell>2</cell><cell>1</cell><cell>1</cell><cell>5</cell></row><row><cell>AJPR</cell><cell>0</cell><cell>0</cell><cell>2</cell><cell>4</cell><cell>0</cell><cell>1</cell><cell>0</cell><cell>1</cell></row><row><cell>AVPR</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>1</cell><cell>0</cell><cell>1</cell><cell>1</cell><cell>0</cell></row><row><cell>CO</cell><cell>12</cell><cell>1</cell><cell>10</cell><cell>5</cell><cell>7</cell><cell>4</cell><cell>4</cell><cell>4</cell></row><row><cell>CS</cell><cell>106</cell><cell>48</cell><cell>63</cell><cell>76</cell><cell>85</cell><cell>55</cell><cell>68</cell><cell>36</cell></row><row><cell>DT</cell><cell>140</cell><cell>29</cell><cell>8</cell><cell>20</cell><cell>73</cell><cell>54</cell><cell>40</cell><cell>35</cell></row><row><cell>ELE</cell><cell>16</cell><cell>35</cell><cell>40</cell><cell>47</cell><cell>56</cell><cell>14</cell><cell>42</cell><cell>92</cell></row><row><cell>FOC</cell><cell>12</cell><cell>1</cell><cell>3</cell><cell>4</cell><cell>2</cell><cell>0</cell><cell>6</cell><cell>2</cell></row><row><cell>NPPO</cell><cell>10</cell><cell>14</cell><cell>7</cell><cell>5</cell><cell>10</cell><cell>1</cell><cell>1</cell><cell>3</cell></row><row><cell>NPPR</cell><cell>0</cell><cell>7</cell><cell>14</cell><cell>9</cell><cell>2</cell><cell>5</cell><cell>11</cell><cell>6</cell></row><row><cell>OD</cell><cell>456</cell><cell>341</cell><cell>316</cell><cell>408</cell><cell>316</cell><cell>331</cell><cell>379</cell><cell>480</cell></row><row><cell>OI</cell><cell>5</cell><cell>5</cell><cell>0</cell><cell>45</cell><cell>6</cell><cell>4</cell><cell>4</cell><cell>2</cell></row><row><cell>PC</cell><cell>1637</cell><cell>1435</cell><cell>1886</cell><cell>1496</cell><cell>1043</cell><cell>982</cell><cell>1199</cell><cell>1082</cell></row><row><cell>SU</cell><cell>510</cell><cell>536</cell><cell>753</cell><cell>654</cell><cell>422</cell><cell>442</cell><cell>673</cell><cell>621</cell></row><row><cell>Total</cell><cell>2908</cell><cell>2455</cell><cell>3103</cell><cell>2790</cell><cell>2024</cell><cell>1895</cell><cell>2429</cell><cell>2369</cell></row></table></figure>
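<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two summary tables can be read together. The sketch below is illustrative only and is not part of the authors' method: it divides the "Total" row of Table 5 by the "Total" row of Table 4 to estimate, for each text category, what share of NPs are term NPs. It assumes that the term NPs counted in Table 5 are a subset of the NPs counted in Table 4, which the tables themselves do not state. The names CATEGORIES, NP_TOTALS, TERM_NP_TOTALS and term_np_share are hypothetical.</p>

```python
# Column order follows the header row shared by Tables 4 and 5.
CATEGORIES = ["AHUM", "ASOC", "ANAT", "ATEC", "PHUM", "PSOC", "PNAT", "PTEC"]

# "Total" rows copied from Table 4 (all NPs) and Table 5 (term NPs).
NP_TOTALS = [6635, 6000, 6352, 6016, 7074, 6070, 5983, 5811]
TERM_NP_TOTALS = [2908, 2455, 3103, 2790, 2024, 1895, 2429, 2369]


def term_np_share(term_totals, np_totals):
    """Return the percentage of NPs that are term NPs, per category.

    Assumption (not stated in the source): every term NP in Table 5
    is also counted as an NP in Table 4.
    """
    return {
        cat: round(100.0 * term / total, 1)
        for cat, term, total in zip(CATEGORIES, term_totals, np_totals)
    }


shares = term_np_share(TERM_NP_TOTALS, NP_TOTALS)
for cat, pct in shares.items():
    print(f"{cat}: {pct}% of NPs are term NPs")
```

<p>The same function can be applied to any single row, such as SU or PC, by passing that row's eight values from each table in place of the totals.</p></div>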
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">In lexical semantic terms, argument structure refers to the semantic type of the verb and its related elements such as agent and theme. The same term is also used loosely in syntax to refer to the subcategorisation, valency structure, or complementation type of verbs.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>The work described in this paper was partially supported by research grants (Nos 7002190, 7200120 and 7002387) from City University of Hong Kong. The authors would like to thank the reviewers for their valuable comments and suggestions.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>TIA'09</head></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Evaluation of Automatic Term Recognition of Nuclear Receptors from MEDLINE</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ananiadou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Albert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schuhmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Genome Informatics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="450" to="451" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Improving Term Extraction with Terminological Resources</title>
		<author>
			<persName><forename type="first">S</forename><surname>Aubin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hamon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">FinTAL 2006</title>
				<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="volume">4139</biblScope>
			<biblScope unit="page" from="380" to="387" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Annotation of Chemical Named Entities</title>
		<author>
			<persName><forename type="first">P</forename><surname>Corbett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Batchelor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Teufel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of BioNLP 2007: Biological, translational, and clinical language processing</title>
				<meeting>BioNLP 2007: Biological, translational, and clinical language processing</meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="57" to="64" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Corpus resources for development and evaluation of a biological text mining system</title>
		<author>
			<persName><forename type="first">G</forename><surname>Demetriou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Gaizauskas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Intelligent Systems for Molecular Biology (ISMB) Workshop on Text Mining (BioLINK2003)</title>
				<meeting>the Intelligent Systems for Molecular Biology (ISMB) Workshop on Text Mining (BioLINK2003)</meeting>
		<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">An Analysis of Verb Subcategorization Frames in Three Special Language Corpora with a View towards Automatic Term Recognition</title>
		<author>
			<persName><forename type="first">E</forename><surname>Eumeridou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Nkwenti-Azeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>McNaught</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers and the Humanities</title>
		<imprint>
			<biblScope unit="volume">38</biblScope>
			<biblScope unit="page" from="37" to="60" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">AUTASYS: Automatic Tagging and Cross-Tagset Mapping</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Fang</surname></persName>
		</author>
		<editor>
			<persName><forename type="first">S</forename><surname>Greenbaum</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="1996">1996</date>
			<biblScope unit="page" from="110" to="124" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">From Cases to Rules and Vice Versa: Robust Practical Parsing with Analogy</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Fang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sixth International Workshop on Parsing Technologies</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Gelbukh</surname></persName>
		</editor>
		<meeting>the Sixth International Workshop on Parsing Technologies<address><addrLine>Berlin Heidelberg</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag</publisher>
			<date type="published" when="2000">2000. 2006</date>
			<biblScope unit="volume">3878</biblScope>
			<biblScope unit="page" from="168" to="179" />
		</imprint>
	</monogr>
	<note>Computational Linguistics and Intelligent Text Processing</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">English Corpora and Automated Grammatical Analysis</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Fang</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<publisher>The Commercial Press</publisher>
			<pubPlace>Beijing</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Automatic recognition of multi-word terms: the C-value/NC-value method</title>
		<author>
			<persName><forename type="first">K</forename><surname>Frantzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ananiadou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mima</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Digital Libraries</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="117" to="132" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">Comparing English Worldwide: The International Corpus of English</title>
		<author>
			<persName><forename type="first">S</forename><surname>Greenbaum</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1996">1996</date>
			<publisher>Oxford University Press</publisher>
			<pubPlace>Oxford</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Mutual Terminology Extraction Using a Statistical Framework</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Ha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mitkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Corpas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procesamiento del Lenguaje Natural</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="page" from="107" to="112" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Measuring Mono-word Termhood by Rank Difference via Corpus Comparison</title>
		<author>
			<persName><forename type="first">C</forename><surname>Kit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Terminology</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="204" to="229" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Chemical Names: Terminological Resources and Corpora Annotation</title>
		<author>
			<persName><forename type="first">C</forename><surname>Kolářik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Klinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Homme</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">LREC 2008 Workshop: Building and Evaluating Resources for Biomedical Text</title>
				<meeting><address><addrLine>France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="65" to="70" />
		</imprint>
	</monogr>
	<note>Proceedings of Terminology and Knowledge Engineering (TKE</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Gene Name Identification and Normalization using a Model Organism Database</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Morgan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hirschman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Colosimo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Yeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Colombe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Biomedical Informatics</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="398" to="410" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Automatic term recognition based on statistics of compound nouns and their components</title>
		<author>
			<persName><forename type="first">H</forename><surname>Nakagawa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mori</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Terminology</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="201" to="219" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Mining Biomedical Abstracts: What&apos;s in a Term?</title>
		<author>
			<persName><forename type="first">G</forename><surname>Nenadic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Spasic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ananiadou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Natural Language Processing - IJCNLP 2004: First International Joint Conference</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">K.-Y</forename><surname>Su</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Tsujii</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J.-H</forename><surname>Lee</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><surname>Kwong</surname></persName>
		</editor>
		<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">GENIA Corpus: An Annotated Research Abstract Corpus in Molecular Biology Domain</title>
		<author>
			<persName><forename type="first">T</forename><surname>Ohta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tateisi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tsujii</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Terminology Extraction: An Analysis of Linguistic and Statistical Approaches</title>
				<meeting><address><addrLine>San Diego, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2002">2002. 2005</date>
			<biblScope unit="volume">185</biblScope>
			<biblScope unit="page" from="255" to="279" />
		</imprint>
	</monogr>
	<note>Proceedings of the Human Language Technology Conference</note>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A Corpus and Lexical Resources for Multiword Terminology Extraction in the Field of Economy in a Minority Language</title>
		<author>
			<persName><forename type="first">B</forename><surname>Rodríguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Mario</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">D</forename><surname>Noya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">G</forename><surname>Otero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Martínez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M M</forename><surname>Mato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rojo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Del Río</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Docío</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 3rd Language and Technology Conference</title>
				<meeting>the 3rd Language and Technology Conference</meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="359" to="363" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Enhancing automatic recognition and extraction of term variants with linguistic features</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ville-Ometz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Royauté</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zasadzinski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Terminology</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="35" to="59" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
