Learning to Generate Semantic Annotation for Domain Specific Sentences

Jianming Li, Lei Zhang, Yong Yu
Department of Computer Science and Engineering, Shanghai JiaoTong University, Shanghai, 200030, P.R.China
Jianming119@sina.com, tozhanglei@hotmail.com, yyu@mail.sjtu.edu.cn

ABSTRACT

Seas of web pages on the Internet contain free text in natural language that can only be read by human beings. To be understandable for machines, these pages should be annotated with semantic markup. Manually annotating large numbers of pages is arduous work, and this has made automatic semantic annotation an urgent challenge. In this paper we propose a machine-learning based automatic annotation approach. The approach can be trained for different domains and requires nearly no manual rules. The annotation is at the sentence level and is in RDF format. We adopt a dependency grammar, Link Grammar [2], for this purpose. The ALPHA system, a prototype of this approach, has been developed with IBM China Research Lab. We expect that many improvements to the approach are possible and that our work may be selectively adopted or enhanced.

1 Introduction

There are seas of web pages on the Internet, and nearly all of them contain free text in natural language that can only be read by human beings. Annotating these pages with semantic markup is one promising way to make them understandable for machines. Unfortunately, automatic semantic annotation of the natural language sentences in these pages is a daunting task, and we are often forced to do it manually or semi-automatically using handwritten rules. In this paper we propose a machine-learning (ML) based automatic semantic annotation approach that can be trained for different domains and requires almost no manual rules. The annotation resulting from this approach lies at the sentence level, i.e., we annotate each sentence, or the prime sentences, in a web page. The approach stems from our previous research on semantic analysis of natural language sentences using Conceptual Graphs (CG).

Free texts on the Internet carry various information from diverse domains. The method proposed in this paper is for domain specific sentences, that is, sentences that occur in a specific application domain. Though the sentences are limited to one domain, the method itself is domain independent and the system can be trained for various domains. Domain specific sentences are usually very stylized in the words, phrases, grammar and semantics they employ, which leads to strongly patterned text for which a machine-learning based approach is effective. Our approach is independent of any particular ML algorithm; in the prototype ALPHA system we employed instance-based learning. Link Grammar is first used to obtain the syntactic structures of sentences, and the learning process then learns to map the syntactic structures to semantic structures, namely RDF graphs. WordNet [7] and the domain relation hierarchy are used as the domain ontology in the whole semantic analysis and representation process. Preliminary results gained from the ALPHA system demonstrate the feasibility of the approach.

The paper is organized as follows. Section 1.1 explains the concept of "domain specific sentences" used in this paper. Section 1.2 briefly shows what the resulting RDF looks like. Section 1.3 explains the reasons for adopting Link Grammar. Section 2 gives an overview of the whole approach. Section 3 presents the detailed process that generates an RDF graph from domain specific sentences. Section 4 discusses the results of the ALPHA system. Section 5 concludes our work by comparing it with related work.

1.1 Domain Specific Sentences

Domain specific sentences are sentences that occur frequently in the text of one certain application domain but scarcely in others. They are assumed to have the following characteristics:

I. the vocabulary set is limited
II. word usage has patterns
III. semantic ambiguities are rare
IV. terms and jargon of the domain appear frequently

The notion of sublanguage [3,4] has been well discussed in the last decade, and domain specific sentences can be seen as sentences in a domain sublanguage.
As previous studies have shown, a common vocabulary set and specific patterns of word usage can be identified in a domain sublanguage. These results provide the grounds for assuming the above characteristics of domain specific sentences. In the rest of this paper we show how characteristics I to III are exploited in our work; terms and jargon (characteristic IV) are dealt with in the following section by adding them to the Link Grammar dictionary.

1.2 RDF Graph

After the annotation, sentences from web pages are marked up with RDF statements. We illustrate the representation with the example sentence "I go to Shanghai". The corresponding RDF statement will be like the following (the namespace declarations are elided and the serialization is schematic):

    <ont:Concept rdf:ID="I">
      <ont:WordNetSenseIndex>WN16-2-012345</ont:WordNetSenseIndex>
    </ont:Concept>
    <ont:Concept rdf:ID="go">
      <ont:WordNetSenseIndex>WN16-2-012345</ont:WordNetSenseIndex>
      <ont:AGNT rdf:resource="#I"/>
      <ont:DEST rdf:resource="#shanghai"/>
    </ont:Concept>
    <ont:Concept rdf:ID="shanghai">
      ...
    </ont:Concept>

Class "Concept" represents a concept in the sentence. In the current implementation we use WordNet [7] as the experimental concept ontology, and the property "WordNetSenseIndex" uniquely identifies a word sense (concept) in the WordNet database. Properties such as "AGNT" (agent) and "DEST" (destination) are sub-properties derived from a general property "Relation"; all the sub-properties of "Relation" are organized as a hierarchy and thus form the relation ontology [18].

The RDF statement can also be diagrammed as a directed labeled graph with nodes and arcs, as depicted in Fig.1.

    [I] <-AGNT- [GO] -DEST-> [SHANGHAI]

    Fig. 1. RDF graph for the example sentence "I go to Shanghai"

Since the diagram is simpler and easier to understand, we will use this diagram form, which we call an RDF graph, to represent RDF statements in the rest of the paper instead of writing out long RDF statements.
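Since the rest of the paper manipulates RDF graphs as sets of concept nodes and labeled relation arcs, it may help to fix this structure concretely. The following Python sketch is only illustrative: the ALPHA system itself is written in C, and all class and method names here are ours, not ALPHA's.

    from dataclasses import dataclass, field

    @dataclass
    class ConceptNode:
        """A node of the RDF graph: a concept from the concept ontology."""
        label: str               # e.g. "GO"
        sense_index: str = ""    # a WordNetSenseIndex key, e.g. "WN16-2-012345"

    @dataclass
    class RDFGraph:
        """A directed labeled graph: concept nodes plus 'Relation' arcs."""
        concepts: list = field(default_factory=list)
        relations: list = field(default_factory=list)   # (src, label, dst) triples

        def add_concept(self, node):
            self.concepts.append(node)
            return node

        def relate(self, src, label, dst):
            """Add an arc labeled with a sub-property of 'Relation'."""
            self.relations.append((src, label, dst))

    # The RDF graph of Fig. 1:
    g = RDFGraph()
    i = g.add_concept(ConceptNode("I", "WN16-2-012345"))
    go = g.add_concept(ConceptNode("GO", "WN16-2-012345"))
    shanghai = g.add_concept(ConceptNode("SHANGHAI"))
    g.relate(go, "AGNT", i)         # the agent of the going
    g.relate(go, "DEST", shanghai)  # the destination of the going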
1.3 Link Grammar

Link Grammar is the dependency grammar system we employ in our work. For the same sentence "I go to Shanghai", the Link Grammar parse result is shown at the top of Fig.2. The labeled arcs between words are called links, and the labels are the types of the links. For example, the "Sp*I" between "I" and "go" represents the type of link between "I" and a plural verb form, "MVp" connects a verb to its modifying prepositional phrase, and "Js" connects a preposition to its object. In Link Grammar there is a finite set of such link types.

Another important feature of Link Grammar is that the grammar is distributed among the words [2]: there can be a separate grammar definition (linking requirement) for each word, stating what types of links the word can attach and how they are attached. The CMU link parser [5] has a dictionary of about 60,000 words together with their linking requirements. Thus, expressing the grammar of new words, or of words that have irregular usage, is relatively easy: we simply define the grammar for these words and add them to the dictionary. This is how we deal with the terms and jargon of a domain in our approach. Because the vocabulary set of a domain is limited (see section 1.1), we can add all unknown words (including terms and jargon) to the current dictionary of Link Grammar with an affordable amount of work.

Although the CMU link parser still has difficulties in parsing complex syntactic structures in real commercial environments, it is now ready for use in relatively large prototypes. Applying Link Grammar to languages other than English (e.g., Chinese [19]) is also possible.

The most important reason for adopting Link Grammar in our work is the structural similarity between Link Grammar parse results and RDF graphs. Fig.2 shows this similarity by comparing the Link Grammar parse result, the typical parse tree of a constituent grammar, and the RDF graph for the same example sentence:

    The link structure:  I --Sp*I-- go --MVp-- to --Js-- Shanghai
    The grammar tree:    S -> NP VP; NP -> PRO "I"; VP -> V "go" PP; PP -> PREP "to" N "Shanghai"
    The RDF graph:       [I] <-AGNT- [GO] -DEST-> [SHANGHAI]

    Fig. 2. A link structure is more like an RDF graph

In fact, this similarity comes from the common foundation of RDF graphs and Link Grammar. An RDF graph consists of concepts and relations, where the relations denote the semantic associations between the concepts. Similarly, a link structure consists of words and links, where the links directly connect syntactically and semantically related words [2]. Open words [17] (such as nouns, adjectives and verbs) access concepts from the catalog of conceptual types, while closed words [17] (such as prepositions) and links help clarify the semantic relationships between the concepts. Based on this similarity, and restricted to a specific domain, we propose to automatically generate annotation by learning the mapping from link structures to RDF graphs. Since semantic ambiguities are rare in domain specific sentences (see section 1.1), it is relatively easy to perform these mapping operations (the process of semantic analysis).

2 Overview of the approach

Our approach to automatic page annotation is a process consisting of two phases, the training phase and the generating phase, as shown in Fig.3.

    Fig. 3. Overview of the approach. In the training phase, sentences from the domain specific sentence corpora are parsed by the link parser into link structures, which a domain knowledge expert transfers into RDF graphs through mapping operations in the training interface, consulting the domain ontology; the vector generator encodes these operations as training vectors for the machine learning engine. In the generating phase, the link parser produces the link structure of a new sentence and the RDF generator maps it to RDF, sending context vectors to the ML engine for classification.

The first step of both phases is to invoke Link Grammar and parse the sentence into its link structure, which is then mapped to RDF by different means in the two phases.

In the training phase, domain experts go through a three-operation process to transfer the link structure into an RDF graph manually, based on their domain knowledge. Each operation maps a certain part of the syntactic structure to its corresponding semantic representation according to the syntactic and semantic context. The concepts, schemata (a schema is a set of RDF graphs describing background information in a domain) and relations contained in the semantic representation are selected from the domain ontology.

What the training phase does is preparation: before the system can learn to do the mapping in the generating phase, we cast the mapping as a machine learning problem. Most studied tasks in machine learning are to infer a function that classifies a feature vector into one of a finite set of categories [6]. We therefore translate each mapping operation into a classification operation by encoding the operation as a category and encoding the context in which the operation is performed as a feature vector. We call this feature vector a context vector, since it encodes the context information of an operation. The vector generator in the lower left corner of Fig.3 is the component that executes this task.
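To make the encoding concrete, the sketch below shows how one manual mapping operation could become a (context vector, category) training pair. The function and the exact vector layout are ours, chosen for illustration; the layout follows the example given in section 3.1.

    # Illustrative sketch of the vector generator's task: each mapping
    # operation performed by the expert becomes a feature ("context")
    # vector describing the syntactic context of the operation, paired
    # with a category encoding its result (a concept, schema or relation ID).

    def encode_operation(context_features, result_id):
        """Encode one expert operation as a (context vector, category) pair."""
        return tuple(context_features), result_id

    # Word-conceptualization of the open word "polo" (see section 3.1):
    vector, category = encode_operation(
        ["polo", "NN", "Dmu", "Mp"],   # word, POS tag, innermost left/right links
        "WN16-2-330153",               # ID of the WordNet sense chosen by the expert
    )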
After sufficient training vectors and categories have been obtained in the training phase, the system can enter the generating phase. The RDF generator, the main part of the generating phase, implements the following algorithm with the help of the ML engine and Link Grammar, given a sentence from the object domain:

 1   get the link structure for the sentence from the link parser
 2   generate an empty RDF graph
 3   for (i = 1 to 3) {   // perform the three kinds of operations
 4       generate all possible context vectors from the link
         structure for the i-th kind of operation
 5       for (every context vector) {
 6           if (an operation is needed for this vector) {
 7               classify the vector using the ML engine
 8               decode the classified category as an operation
 9               perform the operation on the link structure and
10               modify the RDF graph according to the operation
11               result (using concepts, schemata and relations
12               from the domain ontology)
13           }
14       }
15   }
16   do integration on the RDF graph
17   output the final RDF annotation for the sentence

Algorithm 1. The algorithm of the RDF generator

Our approach is independent of the specific ML algorithm used. In the ALPHA system we adopt IBL (instance-based learning) for the ML engine, because IBL makes it easy to determine whether an operation is needed for an arbitrary vector in the above algorithm (line 6). IBL can return a distance value along with the classification result; if the distance is too large, it can be concluded that no operation is needed for the vector, because the vector is far from similar to the existing training vectors and may be deemed noise. For other learning methods this determination may not be so easily achieved.
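Restated in executable form, Algorithm 1 might look like the following sketch. A 1-nearest-neighbour classifier with a distance cutoff stands in for the IBL engine, the graph type reuses the RDFGraph sketch from section 1.2, and all names are illustrative rather than taken from the ALPHA code.

    def distance(u, v):
        """Toy distance between two equal-length context vectors."""
        return sum(a != b for a, b in zip(u, v))

    def classify(vector, instances, cutoff=2):
        """1-nearest-neighbour: return the nearest category, or None when
        the nearest training instance is too far away (line 6: the vector
        is deemed noise and no operation is needed)."""
        best, category = min((distance(vector, v), c) for v, c in instances)
        return category if best <= cutoff else None

    def generate_rdf(sentence, link_parser, vectors_for, perform, training, integrate):
        link_structure = link_parser(sentence)        # line 1
        graph = RDFGraph()                            # line 2
        for kind in (1, 2, 3):                        # lines 3-15
            for vector in vectors_for(kind, link_structure, graph):   # line 4
                category = classify(vector, training[kind])           # lines 6-7
                if category is not None:
                    # lines 8-12: decode the category and modify the graph,
                    # drawing concepts, schemata and relations from the ontology
                    perform(kind, graph, link_structure, vector, category)
        integrate(graph)                              # line 16
        return graph                                  # line 17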
In the following section we explain Algorithm 1 and the three operations using the example sentence "The polo with an edge is refined enough for work", which is excerpted from a corpus of clothes descriptions collected from many clothes shops on the Web. In the sentence, "polo" is a brand and represents a certain kind of shirt, and "edge" actually means collar. The link structure for this sentence is shown in Fig.4.

    Fig. 4. The link structure for the example sentence (its links carry the types Ss, Js, MVp, Dmu, Mp, Ds, Pv, MVa and Jp)

3 Learning to generate RDF

In this section we introduce the three operations that map a link structure to RDF: word-conceptualization, link-folding and relationalization. These three kinds of operations must be performed in exactly this order in both the training phase and the generating phase, because a later operation may use information generated by previous operations. Section 3.4 explains the integration of the RDF graph (line 16 of Algorithm 1).

3.1 Word-Conceptualization

I. Function

Word-conceptualization is the first operation to be performed. Its function is to annotate open words as concepts in the sentence, forming the skeleton of the initially empty RDF graph, and to mark closed words for further operations. This operation can be seen as a word sense disambiguation operation.

II. Training

In the training phase, domain experts select all the open words in the link structure one by one. Once an open word is selected, the training interface provides the expert with a list of possible concepts or schemata retrieved from the domain ontology, and the expert chooses the appropriate one from the list.

This operation is then encoded by the vector generator into a context vector and its category. For example, the context vector for the open word "polo" in the example sentence may be <polo, NN, Dmu, Mp>, in which "NN" is the POS (part-of-speech) tag and "Dmu" and "Mp" are the innermost left and right link types of "polo" (see Fig.4). All the context information is obtained from the link structure. (This vector is just an example; for brevity, we are not trying to make the vector encoding perfect in this paper. What context information is encoded into the vector is a separate problem, isolated in the vector generator component; in the current implementation a configuration file for the vector generator addresses the issue. We can also augment the link parser with a POS tagger so that accurate POS tag information is added to the link structure and obtained from it later.)

The category for the context vector is encoded as the result of the operation, i.e., the ID of the selected concept or schema in the domain ontology. The encoding is something like "WN16-2-330153", which can later be used as a key to retrieve the concept (in WordNet terminology, the word sense) from the WordNet database. Since WordNet is not specific to any domain, some words in a certain domain may not exactly match any sense in the list. For those words the experts are asked to choose the most similar sense, rather than add a new sense to WordNet, so as to preserve the WordNet hierarchy for further research.

III. Generating

In the generating phase, generating all possible context vectors for this operation (line 4 of Algorithm 1) means generating one context vector for each open word in the link structure of the sentence. Each generated context vector is sent to the ML engine for classification, and the returned category is an encoding of a concept or schema ID. In line 9 of Algorithm 1, the RDF generator retrieves the concept or schema from the domain ontology according to the decoded ID and creates a concept node in the RDF graph.

Because word usage has patterns in domain specific sentences, we expect similar context vectors to appear for a given open word on a specific word sense. Based on these similar context vectors, we expect the ML engine to return the correct classification with high probability, since semantic ambiguity is also rare in domain specific sentences.

After this step, all the concept nodes of the RDF graph should have been created. The RDF graph for the example sentence is shown in Fig.5; for convenience we use simple concept names, and "S-WORK" is the "SUITABLE-FOR-WORK" schema in the domain ontology.

    [POLO]   [EDGE]   [REFINE]   [ENOUGH]   [S-WORK: *x]

    Fig. 5. RDF graph after word-conceptualization
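A sketch of the generating side of word-conceptualization, reusing the illustrative helpers above; the open-word test and the ontology lookup are simplifying assumptions.

    OPEN_TAGS = {"NN", "NNS", "JJ", "VB", "VBZ", "VBN", "RB"}   # illustrative tag set

    def conceptualize(link_structure, graph, instances, ontology):
        """One context vector per open word (line 4 for i = 1); each
        classified vector becomes a concept node in the RDF graph."""
        for word, pos, left_link, right_link in link_structure:
            if pos not in OPEN_TAGS:      # closed words wait for link-folding
                continue
            vector = (word, pos, left_link, right_link)
            sense_id = classify(vector, instances)      # e.g. "WN16-2-330153"
            if sense_id is not None:
                graph.add_concept(ConceptNode(ontology[sense_id], sense_id))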
3.2 Link-Folding

I. Function

This and the following operation focus on creating the semantic relations between the concepts. Closed words (especially prepositions), together with their links, imply relationships between the concepts of the words they connect. In the example sentence, the fragment "… polo --- with --- edge …" implies a PART relation between [POLO:#] and [EDGE]. We therefore "fold" the word 'with' and its left and right links and replace them with a PART relation. This is just what the link-folding operation does.

Closed words with their links representing semantic relations can be seen as word usage patterns. In domain specific sentences such patterns are expected to occur frequently, which enables the machine to learn the patterns from the training corpora. In addition, since semantic ambiguities are rare in domain specific sentences, the result of the learning can be expected to converge on the correct relation. A similar analysis also applies to the next operation, relationalization, described in section 3.3.

II. Training

In the training phase, the domain expert can select any closed word that connects two concepts and implies a semantic relation, and map it to the corresponding semantic relation from the relation ontology (for brevity, we omit the direction of relations here). The context vector for this operation may encode context information such as the POS tag of the closed word, the left and right link types, and the two concepts. For the "… polo --- with --- edge …" case, the context vector may be <IN, Mp, Js, POLO, EDGE>, where the POLO and EDGE in the vector are actually concept IDs in the domain ontology (we use the same convention in the following vector examples). The category is the encoding of the ID of the PART relation in the domain ontology.

III. Generating

In the generating phase, generating all possible context vectors for this operation (line 4 of Algorithm 1) means generating one context vector for every possible case in which a closed word connects two concepts; this needs to consult the concept information generated by the word-conceptualization operations. If an operation is needed for a vector, the vector is sent to the ML engine for classification, and the returned category is an encoding of a relation ID in the domain ontology. In line 9 of Algorithm 1, the RDF generator retrieves the relation from the domain ontology according to the ID and creates the relation between the two concepts.

For the example sentence there are three closed words that need the link-folding operation: 'with', 'is' and 'for', as shown in Fig.4. Among them, the word 'is' is an auxiliary verb, while 'with' and 'for' are prepositions. The relation implied by the auxiliary verb 'is' is THEME, and the 'for' between 'refined' and 'work' implies a RESULT relation. The RDF graph after this step has relations added between the concepts; for our example sentence the graph has grown to Fig.6.

    [POLO:#] -PART-> [EDGE]
    [REFINE] -THME-> [POLO:#]
    [REFINE] -RSLT-> [S-WORK: *x]
    [ENOUGH]

    Fig. 6. RDF graph after link-folding
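Once the relation has been classified, the fold itself is mechanical. A sketch in the same illustrative style as before:

    def fold(graph, pos, left_link, right_link, c1, c2, instances, ontology):
        """Fold a closed word and its two links, c1 --left-- w --right-- c2,
        into one relation arc between the concepts of c1 and c2."""
        vector = (pos, left_link, right_link, c1.sense_index, c2.sense_index)
        relation_id = classify(vector, instances)
        if relation_id is not None:
            graph.relate(c1, ontology[relation_id], c2)

    # "... polo --Mp-- with --Js-- edge ..." should classify to the PART
    # relation, yielding the arc [POLO:#] -PART-> [EDGE] of Fig. 6.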
3.3 Relationalization

I. Function

A semantic relation can also be implied by a link that directly connects two concepts in the link structure. For example, the 'MVa' link between 'refined' and 'enough' in the link structure of the example sentence implies a MANNER relation. The relationalization operation translates this kind of link into the corresponding semantic relation.

II. Training

In the training phase, the domain knowledge expert can select any link that implies a semantic relation between the concepts it connects. The expert then selects the semantic relation for the two connected concepts from the domain ontology. The context vector for this operation can include information such as the link type and the concepts; for "… refined --- MVa --- enough …", the context vector may be <MVa, REFINE, ENOUGH>. The category for the context vector is again encoded as a relation ID in the domain ontology; for the above vector it is the ID of the MANNER relation.

III. Generating

In the generating phase, generating all possible context vectors for this operation (line 4 of Algorithm 1) means generating one context vector for every link that connects two concepts. If an operation is needed for a vector, it is sent to the ML engine for classification, and the returned category is an encoding of a relation ID in the domain ontology. In line 9 of Algorithm 1, the RDF generator retrieves the relation from the domain ontology according to the ID and creates the relation between the two concepts.

After this step, more relations may have been created in the RDF graph. For the example sentence, the MANNER relation is created to connect the [REFINE] concept and the [ENOUGH] concept, and the whole graph grows to Fig.7.

    [POLO:#] -PART-> [EDGE]
    [REFINE] -THME-> [POLO:#]
    [REFINE] -RSLT-> [S-WORK: *x]
    [REFINE] -MANR-> [ENOUGH]

    Fig. 7. RDF graph after relationalization

3.4 Integration

Integration is the last step (line 16) of Algorithm 1. This step is not part of the training phase; it appears only in the generating phase, and it is the only step that uses manually constructed heuristics. What it does includes simple co-reference detection and nested graph creation.

In the discussion of the previous three operations we did not involve lambda expressions, for brevity. In fact, they may appear when the words for concepts are missing from the sentence, and they may also be introduced when a schema is selected in the word-conceptualization phase. To complete the RDF graph, we need to draw co-reference lines between the variables in these lambda expressions. Although machine-learning based approaches to co-reference detection exist [9], in this work we focus mainly on the generation of the RDF graph for a single sentence; discourse analysis and co-reference detection are left for separate research. Different heuristics may be constructed for different domains; in our current work we simply make all undetermined references point to the topic currently under discussion.

Nested graphs (contexts) may be introduced by expanding a schema definition or by removing the modal/tense modifiers of a concept. Although the RDF specification lacks a clear semantics for RDF reification, we currently use the RDF reification mechanism to represent nested graphs (contexts).

In our example, we mentioned in section 3.1 that the concept type S-WORK is actually the "SUITABLE-FOR-WORK" schema from the domain ontology, and we can expand it. Fig.8 gives the definition of the "SUITABLE-FOR-WORK" schema, in which SUTB represents the relation SUITABLE.

    type SUITABLE-FOR-WORK(x) is [CLOTHES: x] -SUTB-> [WORK-SITUATION]

    Fig. 8. The definition of SUITABLE-FOR-WORK

After the expansion, we can do a simple co-reference detection that draws a co-reference line between the undetermined variable x and the current topic [POLO:#]. After this step the final graph is generated; Fig.9 shows the result for our example sentence "The polo with an edge is refined enough for work".

    [POLO:#] -PART-> [EDGE]
    [REFINE] -THME-> [POLO:#]
    [REFINE] -MANR-> [ENOUGH]
    [REFINE] -RSLT-> ([CLOTHES: x] -SUTB-> [WORK-SITUATION]), with x co-referenced to [POLO:#]

    Fig. 9. The final RDF graph of the example sentence
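Our current heuristic is deliberately simple; a sketch, where the way the topic and the unresolved variables are recognized is an assumption made for illustration:

    def integrate(graph):
        """Point every undetermined variable (e.g. the x left by expanding
        SUITABLE-FOR-WORK(x)) at the topic currently under discussion."""
        topic = graph.concepts[0]                   # assume the topic, e.g. [POLO:#]
        for node in graph.concepts:
            if "*" in node.label:                   # an unresolved variable
                graph.relate(node, "COREF", topic)  # draw the co-reference line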
3.5 Summary

Through sections 3.1 to 3.4 we have explained how we map a link structure to an RDF graph and cast the mapping as a machine learning problem. Word-conceptualization builds the concepts of the RDF graph; link-folding and relationalization connect the concepts with semantic relations; and in the last step we use manually constructed heuristics to do simple co-reference detection and nested graph creation.

4 Results

We have developed a prototype called the ALPHA system, written in C and currently running on Solaris. It can be trained for different domains; in our current work, a clothes domain is chosen as the sample domain. Nearly 300 clothes descriptions (about 500 sentences) have been collected from clothes shops on the Web (these online shops include www.brooksbrothers.com and www.gap.com, among others) and used to train the ALPHA system. Among them, 34 descriptions (93 sentences) are reserved for testing. The test results are shown in Fig.10: using different IBL algorithms, the accuracy of concepts varies from 60% to 80%, and the accuracy of relations from 45% to 60%. Here the accuracy of concepts is the number of concepts annotated correctly divided by the total number of concepts, and the accuracy of relations is the number of links annotated correctly divided by the number of links that should be annotated. The results demonstrate the feasibility of our approach.

    Fig. 10. The accuracy of concepts and relations for the different algorithms
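Stated as code, the two accuracy measures are simply:

    def concept_accuracy(concepts_correct, concepts_total):
        return concepts_correct / concepts_total

    def relation_accuracy(links_correct, links_expected):
        """links_expected: the number of links that should be annotated."""
        return links_correct / links_expected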
Link Grammar has a strong impact on the accuracy of the ALPHA system. Although its characteristics make it relatively easy to add domain grammar, it has some trouble disambiguating the syntactic structure of over-abridged sentences in the clothes domain, such as "Back vent.", which causes serious failures in the ALPHA system. Though we are aware of the problem, we let it be for the present because we want to pay more attention to semantic disambiguation.

To improve the accuracy of the ALPHA system, we are considering developing new algorithms that can compute the distances between vectors more accurately. We are also considering changes to the architecture to support the analysis of clauses and idioms. Furthermore, other application domains will be selected to test our approach.

5 Related works

Ontology-based annotation is the most studied approach, e.g. [15,16]. [15] extends HTML with semantic extensions and builds an interactive, graphical tool to help annotate web pages manually; what it does is associate an object in HTML with a concept from an ontology. After gaining experience from manual annotation, its authors also conceive an information-extraction based approach to the semi-automatic annotation of natural language texts that maps terms to ontological concepts. Different from this, our approach is fully automatic after the training phase, and it generates the semantic markup in standard RDF format.

In natural language annotation, grammar-based approaches are often used. They can roughly be divided into slot-filling and structure-mapping categories according to their generating techniques. Slot-filling techniques such as [12] fill template semantic graphs with thematic roles identified in the parse tree. Often the graph of one node in the syntactic parse tree is constructed from the graphs of its child nodes according to construction rules on how to fill the slots. Although this approach has been successfully applied in many applications, it depends heavily on construction rules created manually over the parse tree.

Another kind of technique advanced in previous work is to map directly between syntactic structure and semantic structure, such as [13]; we call these techniques structure-mapping. In this respect they are more similar to our work. To map to the flatter structures of conceptual graphs, [13] uses syntactic predicates to represent the grammatical relations in the parse tree; in our work, Link Grammar is instead employed to obtain a flatter structure directly. Also different from the approach of [13], our work does not use manual rules. Moreover, we separate the semantic mapping into several steps, which greatly reduces the total number of possibilities. In another work [14], the parse tree is first mapped to a "syntactic conceptual graph", which is then mapped to a real conceptual graph. This approach again makes heavy use of manually constructed mapping rules.

Up to now, most annotation methods are manual or depend heavily on rules created manually. Such methods will have difficulty applying to the Web because of its tremendously large number of pages. Our approach provides an automatic way to annotate pages faster and more robustly. Research on machine learning in natural language processing using corpus data [6] has increased significantly, and there is a growing number of successful applications of symbolic machine learning techniques [10,11]. Our work presents a preliminary inquiry into the use of traditional machine learning techniques to automatically generate semantic markup for domain specific sentences. We expect that many improvements are possible and that our work may be selectively adopted or enhanced.

References

1. Walter Daelemans, Jakub Zavrel, Ko van der Sloot, and Antal van den Bosch, TiMBL: Tilburg Memory Based Learner version 3.0 Reference Guide, March 8, 2000.
2. Daniel D. Sleator and Davy Temperley, Parsing English with a Link Grammar, in the Third International Workshop on Parsing Technologies, August 1993.
3. Naomi Sager, "Sublanguage: Linguistic Phenomenon, Computational Tool," in R. Grishman and R. Kittredge (eds.), Analyzing Language in Restricted Domains: Sublanguage Description and Processing, Lawrence Erlbaum, Hillsdale, NJ, 1986.
4. R. Kittredge and J. Lehrberger, Sublanguage: Studies of Language in Restricted Semantic Domains, Walter de Gruyter, Berlin and New York, 1982.
5. Information about the link parser from Carnegie Mellon University is available at http://link.cs.cmu.edu/link/index.html
6. Raymond J. Mooney and Claire Cardie, Symbolic Machine Learning for Natural Language Processing, tutorial at ACL'99, 1999. Available at http://www.cs.cornell.edu/Info/People/cardie/tutorial/tutorial.html
7. George A. Miller, WordNet: An On-line Lexical Database, International Journal of Lexicography, Vol.3, No.4, 1990.
8. Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz, Building a large annotated corpus of English: the Penn Treebank, Computational Linguistics, 19:313-330, 1993.
9. J. McCarthy and W. Lehnert, Using Decision Trees for Coreference Resolution, in C. Mellish (ed.), Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp. 1050-1055, 1995.
10. Claire Cardie and Raymond J. Mooney, Machine learning and natural language (introduction to special issue on natural language learning), Machine Learning, 34, 5-9, 1999.
11. E. Brill and R.J. Mooney, An overview of empirical natural language processing, AI Magazine, 18(4), 13-24, 1997.
12. W.R. Cyre, J.R. Armstrong, and A.J. Honcharik, Generating Simulation Models from Natural Language Specifications, Simulation, 65:239-251, 1995.
13. Paola Velardi et al., Conceptual Graphs for the analysis and generation of sentences, IBM Journal of Research and Development, 32(2), pp.251-267, 1988.
14. Caroline Barrière, From a Children's First Dictionary to a Lexical Knowledge Base of Conceptual Graphs, Ph.D. thesis, School of Computing Science, Simon Fraser University, 1997. Available at ftp://www.cs.sfu.ca/pub/cs/nl/BarrierePhD.ps.gz
15. M. Erdmann, A. Maedche, H.-P. Schnurr, and Steffen Staab, From manual to semi-automatic semantic annotation: about ontology-based text annotation tools, in P. Buitelaar and K. Hasida (eds.), Proceedings of the COLING 2000 Workshop on Semantic Annotation and Intelligent Content, August 2000.
16. Michael Schenk, Ontology-based semantical annotation of XML, Master's thesis, Universität (TH) Karlsruhe, 1999.
17. James Allen, Natural Language Understanding, 2nd edition, pp.24-25, Benjamin/Cummings Publishing, 1995.
18. John F. Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks Cole Publishing Co., Pacific Grove, CA, 2000.
19. Carol Liu, Towards A Link Grammar for Chinese, submitted for publication in Computer Processing of Chinese and Oriental Languages - the Journal of the Chinese Language Computer Society.