
A Framework for Resource Annotation and Classification in Bioinformatics

Nadia Yacoubi Ayadi

nadia.yacoubi@asu.edu

Malika Charrad

malika.charrad@riadi.rnu.tn

Soumaya Amdouni

Mohamed Ben Ahmed

mohamed.benahmed@riadi.rnu.tn

Semantic annotation is commonly recognized as one of the cornerstones of the Semantic Web. In the context of Web services, semantic annotations can support effective and efficient discovery of services and guide their composition into workflows. Because semantic annotation is a time-consuming and expensive task, (semi-)automatic approaches to semantic annotation extraction are required. In this paper, we propose a semi-automatic approach for extracting lightweight semantic annotations from textual descriptions of Web services. In contrast with most existing semi-automatic approaches to the semantic annotation of Web services, which rely on a predefined domain ontology, we investigate the use of NLP techniques to derive service properties from a corpus of textual descriptions of bioinformatics services. We evaluate the performance of the annotation extraction method and the usefulness of lightweight annotations for classifying bioinformatics Web services in order to bootstrap the service discovery process. Our framework relies on an unsupervised approach based on a simultaneous clustering algorithm that determines biclusters of highly correlated Web services and semantic annotations.

Keywords: Semantic Annotation, Semantic Web Service, Block Clustering, Bioinformatics

During the last decade, semantic Web services (SWS) [20] technology has been proposed and investigated to support effective and efficient service discovery, composition, and invocation by machines. Despite the appealing characteristics of semantic Web service principles, their uptake on a Web scale has been significantly less prominent than initially anticipated [21]. In fact, research on semantic Web services has mostly focused on devising domain-independent Web service description ontologies such as OWL-S [19] and WSMO [22]. Semantic Annotations for WSDL (SAWSDL) [15] adopts a bottom-up approach, adding semantics to existing Web service standards by mapping syntactic definitions to a set of ontological concepts. All of these approaches rely on a predetermined domain ontology to make service semantics explicit. The reasoning tasks performed over semantic Web service descriptions are mainly conditioned by the quality of this domain ontology [4]. The existence of a domain ontology that captures domain knowledge in an explicit and formal way is therefore crucial. In several fields, many domain ontologies have been developed for various purposes. The complexity of reasoning tasks increases when semantic service descriptions are generated from several domain ontologies. In the bioinformatics field, the OBO Foundry¹ lists around 60 ontologies for the life sciences, covering molecular biology, anatomy, biochemistry, environment, neuroscience, etc. (for a survey, see [24]). None of these ontologies is suitable for annotating bioinformatics Web services: although they are rich in semantics, they are not generic enough to capture high-level concepts and their semantic relationships.

In this paper, we propose a bottom-up approach to extract domain-dependent, lightweight semantic annotations from textual descriptions of Web services. Such annotations aim to capture the static knowledge (i.e., domain concepts) and the procedural knowledge (i.e., tasks) of a domain. Despite their importance, few domain ontologies exist for the purpose of Web service annotation, and building such ontologies is a challenging task. Natural language documentation of Web services consists of short textual descriptions intended to close the "semantic gap" between the low-level technical features of Web services (e.g., data types, port types, or data formats) and the high-level, meaning-bearing features that a user is interested in and refers to when discovering a Web service. Hence, our semi-automatic approach combines different extraction patterns to generate lightweight annotations describing service properties such as inputs, outputs, or functionalities. We note that our extraction method also provides a good starting point for ontology building.

We then rely on a simultaneous clustering algorithm, namely CROKI2 [13], to identify clusters (groups) of services that are described by a specific subset of highly correlated annotations. The simultaneous clustering step has two benefits. Firstly, clustering Web services based on semantic annotations would greatly boost the ability of Web service search engines to select suitable services for a given discovery query. Secondly, it detects implicit associations (relationships) between highly correlated annotations, which is crucial in an ontology building process. In fact, the co-occurrence of a subset of annotations within a subset of Web services reflects implicit relationships, taxonomic or non-taxonomic, between these annotations. To the best of our knowledge, no existing approach uses block clustering; most approaches perform either annotation clustering [16, 1] or service clustering [17, 12].

The paper is organized as follows. Section 2 reviews related work in the fields of automatic annotation of Web services and block clustering. Section 3 presents our framework for semantic annotation and clustering of Web services. In Section 4, we present and discuss the results of our experiments. Section 5 concludes the paper and outlines our future work.

¹ http://www.obofoundry.org/

2 Related Work

2.1 Semantic annotation learning for Semantic Web services

Converting an existing Web service into a semantic Web service requires significant effort that must be repeated for each new Web service. In this section, we review research work that focuses on learning semantic annotations by exploiting textual descriptions, WSDL files, or even Web forms. Hess et al. propose ASSAM (Automated Semantic Annotation with Machine Learning), a semi-automatic WSDL annotator application. ASSAM [14] relies on a predetermined domain ontology and uses a machine learning algorithm to suggest to users how to describe the elements in the WSDL file. However, although such solutions tend to provide high-quality annotations, the intensive intervention required from expert users could make them impractical for large-scale annotation of Web services. Sabou et al. [23] propose an automatic extraction method based on Natural Language Processing (NLP). Experiments were conducted in the bioinformatics field by learning an ontology from the documentation of Web services in the context of the myGrid project. The evaluation of the extracted ontology shows that the approach is a helpful tool to support the process of building domain ontologies for Web services. Our approach builds on that of [23] by also using NLP techniques to generate semantic annotations of Web services.

Also within the bioinformatics space, Afzal et al. [2] developed a literature-based text mining approach to learn the semantic profiles of bioinformatics resources. The approach identifies a set of semantic classes of descriptors that can be attached to a bioinformatics resource: data, data resource, task, and algorithm. The instances of these classes were collected by harvesting a corpus of scientific papers along with related sentences containing the resource name. However, the case study conducted in [2] shows that the coverage of the myGrid ontology used as annotation support is limited, especially for capturing functional service descriptions. Moreover, the quality of the extracted descriptors was measured only from the curator's perspective, which is not sufficient in the Semantic Web context, where Web services are supposed to be discovered and composed by agents.

Ambite et al. [3] present an approach to automatically discover and create semantic Web services. The idea behind their approach is to start with a set of known sources and their corresponding semantic descriptions, then discover similar sources, extract the source data, build semantic descriptions of the sources, and finally turn them into semantic Web services. The authors implemented the Deimos system and evaluated it across five domains. In contrast to our work, the goal of Deimos is to build a semantic description that is sufficiently detailed to support automatic retrieval and composition. Our work aims to generate lightweight annotations that are useful for classifying Web services and bootstrapping the service discovery process in the bioinformatics field.

2.2 Web service Clustering

With the expected growth in the number of available Web services and service repositories, mechanisms that enable the automatic organization and discovery of services become increasingly important. In this context, most existing research relies on one-way clustering, either annotation clustering [16, 1] or service clustering [12, 17]. When such clustering algorithms are used, each service in a given service cluster is described using all annotations. Similarly, each annotation in an annotation cluster characterizes all services. For instance, based on their approach presented in [2], Afzal et al. propose in [1] to use lexical kernel metrics to identify semantically related networks of resources by computing the similarity between annotations. In contrast, the goal of our work is to identify groups of services that are best described by a specific subset of annotations, which amounts to finding biclusters of highly correlated services and annotations, in order to bootstrap the service discovery process. We rely on simultaneous clustering, an approach that finds local patterns in which a subset of subjects are similar to each other based on only a subset of attributes. Simultaneous clustering, also known as biclustering, co-clustering, or block clustering, aims to find sub-matrices, that is, subgroups of rows and subgroups of columns that exhibit a high correlation. A number of algorithms that perform simultaneous clustering on the rows and columns of a matrix have been proposed to date, and they have been used in many fields, such as bioinformatics [18], Web mining [8], and text mining [6]. Table 1 outlines a comparison between one-way clustering and simultaneous clustering.

Table 1: Comparison between one-way clustering and simultaneous clustering.

Clustering | Simultaneous Clustering
- Applied to either the rows or the columns of the data matrix separately ⇒ global model. | - Performs clustering in the two dimensions simultaneously ⇒ local model.
- Produces clusters of rows or clusters of columns. | - Seeks blocks of rows and columns that are interrelated.
- Each subject in a given subject cluster is defined using all the variables. Each variable in a variable cluster characterizes all subjects. | - Each subject in a bicluster is selected using only a subset of the variables, and each variable in a bicluster is selected using only a subset of the subjects.
- Clusters are exhaustive. | - The clusters on rows and columns should not be exclusive and/or exhaustive.

3 General Framework

The proposed framework comprises two main steps. The first step performs a semi-automatic extraction of semantic annotations from the textual documentation of Web services. Semantic annotations describe service properties such as functionalities, inputs, outputs, and other domain-dependent features. One particularity of textual Web service descriptions is that they employ natural language in a specific way. In fact, such texts belong to what has been defined as sublanguages [23]. A sublanguage is a specialized form of natural language that is used within a particular domain and is characterized by a specialized vocabulary, semantic relations, and syntax (e.g., medical test reports). The semantic annotation extraction step exploits the linguistic regularities of a sublanguage to identify semantic service properties. The second step of our approach consists of clustering Web services in terms of their semantic annotations. By applying the CROKI2 algorithm [13], this step discovers subgroups (biclusters) of Web services and subgroups of semantic annotations that exhibit a high correlation. In the following, we present the two steps in further detail.

3.1 Semantic Annotation Extraction of Web services

The semantic annotation extraction phase identifies two types of knowledge: domain concepts and procedural knowledge describing service tasks. First, a morphosyntactic analysis of the textual descriptions of Web services is performed. In this step, a sentence splitter and a tokeniser are used to extract sentences and basic linguistic entities. Then, a POS (Part-Of-Speech) tagger is applied to associate each word (token) with a grammatical category and thus distinguish the morphology of the various entities. For example, in the sentence below, the tagger identifies a verb (i.e., compute), three nouns (i.e., structure, RNA, sequence), an adjective (i.e., secondary), and a preposition (i.e., for).

compute (VB) Secondary (JJ) Structure (NN) for (Prep) RNA (NN) sequence (NN).
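To make the tagging step concrete, the following minimal Python sketch reproduces the tagged example sentence with a hand-written lookup table. It does not use GATE or ANNIE: the `TOY_LEXICON` dictionary, the `tokenize` and `tag` functions, and the fallback to `NN` for unknown words are all illustrative assumptions.

```python
import re

# Hypothetical toy lexicon: a real POS tagger is trained on a corpus;
# this table only covers the words of the example sentence.
TOY_LEXICON = {
    "compute": "VB",
    "secondary": "JJ",
    "structure": "NN",
    "for": "Prep",
    "rna": "NN",
    "sequence": "NN",
}


def tokenize(sentence):
    """Split a sentence into word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def tag(sentence):
    """Return (token, tag) pairs; unknown words default to NN (an assumption)."""
    return [(tok, TOY_LEXICON.get(tok.lower(), "NN")) for tok in tokenize(sentence)]


tagged = tag("Compute Secondary Structure for RNA sequence.")
print(" ".join(f"{tok} ({pos})" for tok, pos in tagged))
```

The tags produced here are the input on which the syntactic patterns operate: a pattern can then refer to "a verb followed by a noun phrase" rather than to specific words.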

We distinguish different types of syntactic patterns depending on the type of semantic annotation. Syntactic patterns describe selectional constraints that exploit the particularities of the sublanguage. We distinguish syntactic patterns that extract the inputs and outputs of services, service tasks, and domain-dependent features that are strongly related to the bioinformatics domain:

1. Identifying service tasks is crucial for service discovery and composition. We observed that, in the majority of textual descriptions of Web services, verbs identify the functionality performed by a Web service. In our work, we consider different classes of verbs that indicate the service task. For example, VBRetrieval is the class of verbs that indicate a retrieval process (e.g., get, retrieve, fetch, search, find, return, query). A frequently occurring pattern that involves this verb class and the preposition from can be used to easily determine the output and the source from which it is retrieved, as described by the following selectional pattern:

VBRetrieval <Output> from <Source>.

Other verb classes were recognized, such as VBExtraction, a class of verbs denoting an extraction process: VBExtraction = {extract, scan, identify, locate, analyse}.

2. Identifying inputs and outputs of Web services. Inputs and outputs of Web services denote domain concepts, which are generally expressed by nouns in the corpus. To obtain high-quality annotations, we create a list of biological terms made up of single-word terms. When two or more biological concepts are used together, we interpret them as a single biological concept and add it to the list, e.g., gene expression, transcription factors, protein structure, tertiary protein structure, amino acid sequence, chromosome segment, etc. We define different heuristics that identify the role of a concept (input or output) depending on the structure of the sentence. Some extraction patterns are presented in Table 2. In addition, our extraction patterns identify cases where several concepts are related via logical operators such as "and" or "or". In this case, the same role is assigned to each concept.

3. Identifying domain-dependent features. We define a set of extraction patterns that focus on bioinformatics-dependent features. For example, we propose patterns to identify the data formats (e.g., FASTA, GFF, GIF, etc.) of inputs and outputs. An example of such a pattern is:

% computes <OutputService> for % <InputService> described with <dataFormat> %.

We propose to use a simultaneous clustering approach to classify Web services in terms of semantic annotations. Our approach aims to find biclusters of Web services and annotations by applying the CROKI2 algorithm [13]. We propose an accelerated version of this algorithm in [7]. The general purpose of a block clustering algorithm is described as follows.
Given the data matrix $A$, with set of rows $X = (X_1, \ldots, X_n)$ and set of columns $Y = (Y_1, \ldots, Y_m)$, $a_{ij}$, with $1 \le i \le n$ and $1 \le j \le m$, is the value in the data matrix $A$ corresponding to row $i$ and column $j$. Simultaneous clustering algorithms aim to identify a set of biclusters $B_k(I_k, J_k)$, where $I_k$ is a subset of the rows $X$ and $J_k$ is a subset of the columns $Y$. The rows in $I_k$ exhibit similar behavior across the columns in $J_k$, or vice versa, and every bicluster $B_k$ satisfies some criterion of homogeneity.
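As an illustration of this definition, the short Python sketch below builds a toy service-by-annotation matrix and extracts the sub-matrix corresponding to one bicluster. The service names, annotation names, matrix values, and the `density` measure (used here as a simple stand-in for a homogeneity criterion) are made up for the example and do not come from the experimental corpus.

```python
# Toy binary data matrix A: rows are services, columns are annotations.
# All names and values are hypothetical.
services = ["s1", "s2", "s3", "s4"]
annotations = ["alignment", "sequence", "pathway", "protein"]
A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]


def bicluster(matrix, rows, cols):
    """Return the sub-matrix restricted to row indices I_k and column indices J_k."""
    return [[matrix[i][j] for j in cols] for i in rows]


def density(block):
    """Share of non-zero cells in a block (a simple stand-in for homogeneity)."""
    cells = [value for row in block for value in row]
    return sum(1 for value in cells if value) / len(cells)


# Bicluster made of services s1, s2 and annotations alignment, sequence.
I_k, J_k = [0, 1], [0, 1]
B_k = bicluster(A, I_k, J_k)
print(B_k, density(B_k))
```

The point of the sketch is that a bicluster is identified by a pair of index subsets, so a service can be grouped with others on the basis of only some of its annotations.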

Croki2 algorithm. The Croki2 algorithm is applied to the contingency table of services and annotations to identify a row partition $P = (P_1, \ldots, P_K)$ composed of $K$ clusters and a column partition $Q = (Q_1, \ldots, Q_L)$ composed of $L$ clusters that maximize the $\chi^2$ value of the new contingency table $(P, Q)$ obtained by regrouping rows and columns into $K$ and $L$ clusters, respectively. Croki2 applies the K-means algorithm alternately to rows and to columns to construct a sequence of pairs of partitions $(P^n, Q^n)$ that optimizes the $\chi^2$ value of the new contingency table $T_1(P, Q)$ defined by this expression:

\[ T_1(k, l) = \sum_{i \in P_k} \sum_{j \in Q_l} a_{ij}, \qquad k \in [1, \ldots, K] \text{ and } l \in [1, \ldots, L]. \]

Marginal frequencies in table $T_1$ are:

\[ f_{kl} = \sum_{i \in P_k} \sum_{j \in Q_l} f_{ij}, \qquad f_{k\cdot} = \sum_{i \in P_k} f_{i\cdot}, \qquad f_{\cdot l} = \sum_{j \in Q_l} f_{\cdot j}. \]

Biclusters validity. The application of the Croki2 algorithm leads to an exhaustive enumeration of biclusters. It is possible to select only the biclusters that satisfy certain criteria, such as a user-specified bicluster size, bicluster homogeneity, and bicluster relevancy [13].

- Homogeneity H is the inertia conserved by the bicluster divided by the initial inertia:

\[ H = \frac{B_{kl}}{T_{kl}} \]

with

\[ T_{kl} = \sum_{i \in P_k} \sum_{j \in Q_l} f_{i\cdot} f_{\cdot j} \left( \frac{f_{ij}}{f_{i\cdot} f_{\cdot j}} - 1 \right)^2 \quad \text{and} \quad B_{kl} = g_{k\cdot} g_{\cdot l} \left( \frac{g_{kl}}{g_{k\cdot} g_{\cdot l}} - 1 \right)^2. \]

The value of this ratio is between 0 and 1. A high value indicates that the bicluster is homogeneous.

- Relevancy R is the inertia conserved by the bicluster divided by the global inertia.

\[ R = \frac{B_{kl}}{B} \]

Our experimental corpus consists of 100 bioinformatics service descriptions from BioCatalogue², a new curated repository of life science Web services. The growth of BioCatalogue reflects the dramatic increase in bioinformatics Web services and tools, with 2053 services and 148 providers³. BioCatalogue allows users to discover Web services through keyword-based retrieval or category browsing. The annotations manually attached to Web services are either textual descriptions or lists of tags. Tagging Web services with a set of user-defined lexical tokens is not an ideal way to enable efficient service discovery, and manual resource tagging is an error-prone and time-consuming task. Figure 1 shows the top 20 tags used on BioCatalogue. In total, 951 tags were created by users to describe services. The use of tags to describe Web services raises several issues, such as the ambiguity of their meaning (e.g., BioMoby or soaplab in Figure 1) and the variability in spelling among tags that may refer to the same concept. Finally, folksonomies (sets of tags) lack an explicit knowledge representation to express whether a tag describes, for example, a service task, a service input, or a service output, which prevents their use for meaningful resource discovery. In our work, Web services are semantically annotated based on their textual descriptions. The extracted semantic annotations make it possible to automatically construct a semantic service profile. In the following, we evaluate the annotation extraction module and the block clustering algorithm in turn.

² http://www.biocatalogue.org
³ Last accessed on 22 April 2011

We designed an annotation extraction module using the GATE [10] framework. We used the ANNIE plugin (A Nearly-New IE system), which contains a tokeniser, a gazetteer (a system of lexicons), a POS tagger, a sentence splitter, and a Named Entity (NE) transducer. The various extraction patterns described in Section 3.1 were implemented using JAPE [11], a rich and flexible rule mechanism that is part of the GATE framework. The NE transducer applies JAPE rules to the input service descriptions in order to generate semantic annotations. Indeed, JAPE (Java Annotation Patterns Engine) provides finite state transduction over annotations based on regular expressions. A JAPE grammar consists of a set of pattern/action rules. A JAPE rule has a Left-Hand Side (LHS) and a Right-Hand Side (RHS). The LHS specifies an annotation pattern that may contain regular expression operators (e.g., *, ?, +). The RHS consists of annotation manipulation statements. Annotations matched on the LHS of a rule are referred to on the RHS by means of labels that are attached to pattern elements. The gazetteer lookup modules identify domain concepts in the textual descriptions based on a set of lists of tokens. We have created different lexicon lists containing bioconcepts, service tasks, data formats, and identifiers (e.g., EntrezGene ID, KEGG ID). Figure 2 illustrates an example of a JAPE rule for input service annotation.
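The pattern/action structure of such rules can be mimicked, very roughly, in plain Python. The sketch below is not JAPE and does not reproduce the rules used in our module: it encodes the selectional pattern "VBRetrieval <Output> from <Source>" from Section 3.1 as a regular expression (playing the role of the LHS) and a function that builds annotations from the labelled groups (playing the role of the RHS). The regular expression, the function name `annotate_retrieval`, and the example sentence are assumptions made for illustration.

```python
import re

# Verb class listed in Section 3.1; everything else here is hypothetical.
VB_RETRIEVAL = ["get", "retrieve", "fetch", "search", "find", "return", "query"]

# Pattern part (LHS analogue): VBRetrieval <Output> from <Source>
RETRIEVAL_PATTERN = re.compile(
    r"\b(?P<task>" + "|".join(VB_RETRIEVAL) + r")s?\s+"
    r"(?P<output>.+?)\s+from\s+(?P<source>[^.,;]+)",
    re.IGNORECASE,
)


def annotate_retrieval(description):
    """Action part (RHS analogue): turn the labelled groups into annotations."""
    match = RETRIEVAL_PATTERN.search(description)
    if match is None:
        return None
    return {
        "task": match.group("task").lower(),
        "output": match.group("output").strip(),
        "source": match.group("source").strip(),
    }


# Made-up service description used only to exercise the pattern.
print(annotate_retrieval("Retrieves protein sequences from the UniProt database."))
```

Unlike this string-level sketch, JAPE rules match over annotations (tokens, POS tags, gazetteer lookups) rather than raw characters, which is what lets them combine verb classes with the lexicon lists described above.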

We evaluate the results of our experiments in terms of three metrics: precision, recall, and F-measure, as reported in Table 3. The three metrics are calculated as follows.

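\[ \text{Precision} = \frac{\text{Correct} + \frac{1}{2}\,\text{Partial}}{\text{Correct} + \text{Spurious} + \frac{1}{2}\,\text{Partial}} \]

\[ \text{Recall} = \frac{\text{Correct} + \frac{1}{2}\,\text{Partial}}{\text{Correct} + \text{Missing} + \frac{1}{2}\,\text{Partial}} \]

\[ \text{F-measure} = \frac{(\beta^2 + 1)\, P \cdot R}{\beta^2 R + P} \]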

GATE provides a tool for automatic evaluation, named AnnotationDiff, that compares a set of manually created annotations with the set of annotations generated by our extraction method. To measure the performance of the extraction method, we manually identified the semantic annotations in the corpus of service descriptions. Then, using the AnnotationDiff tool, we compared this set of annotations with those extracted by the extraction patterns.
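For readers who want to check the arithmetic, the three metrics can be computed directly from the four counts. The Python sketch below follows the formulas as stated in this section, with partial matches weighted by one half; it is not the GATE implementation, and the counts in the example are invented rather than taken from Table 3.

```python
def precision(correct, partial, spurious):
    """(Correct + 1/2 Partial) / (Correct + Spurious + 1/2 Partial)."""
    return (correct + 0.5 * partial) / (correct + spurious + 0.5 * partial)


def recall(correct, partial, missing):
    """(Correct + 1/2 Partial) / (Correct + Missing + 1/2 Partial)."""
    return (correct + 0.5 * partial) / (correct + missing + 0.5 * partial)


def f_measure(p, r, beta=1.0):
    """((beta^2 + 1) * P * R) / (beta^2 * R + P); beta = 1 weights P and R equally."""
    return (beta ** 2 + 1) * p * r / (beta ** 2 * r + p)


# Hypothetical counts for one annotation type.
p = precision(correct=8, partial=2, spurious=1)
r = recall(correct=8, partial=2, missing=3)
print(round(p, 3), round(r, 3), round(f_measure(p, r), 3))
```

With these invented counts, precision is 9/10 and recall is 9/12, so the F-measure with beta = 1 is their harmonic mean.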

Block Clustering Evaluation

The application of the Croki2 algorithm leads to an exhaustive enumeration of biclusters. The data used to evaluate the Croki2 algorithm consist of only 98 services and 78 annotations. The choice of meaningful biclusters is based on homogeneity and relevancy, as described in the previous section. Given that CROKI2 uses k-means to cluster rows and columns, the number of clusters needs to be specified by the user. Therefore, we extend the use of validity indices initially proposed for one-way clustering, namely BH [5], to the CROKI2 biclustering algorithm [9, 7]. The accelerated CROKI2 algorithm has been implemented in the R environment.
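The selection criteria can be illustrated on a small example. The Python sketch below is not the R implementation mentioned above: it only evaluates, for a given pair of partitions, the collapsed table T1, the homogeneity H = B_kl / T_kl and the relevancy R = B_kl / B of each block, and the chi-square value of the collapsed table. It assumes that g_kl, g_k. and g_.l denote the frequencies and margins of the collapsed table, and that the global inertia B is the sum of all B_kl; both readings are assumptions, as are the function names and the toy matrix.

```python
def collapse(A, row_labels, col_labels, K, L):
    """Block sums: T1[k][l] = sum of a_ij for rows i in P_k and columns j in Q_l."""
    T1 = [[0.0] * L for _ in range(K)]
    for i, row in enumerate(A):
        for j, a_ij in enumerate(row):
            T1[row_labels[i]][col_labels[j]] += a_ij
    return T1


def frequencies(table):
    """Relative frequencies of a table plus their row and column margins."""
    total = sum(sum(row) for row in table)
    f = [[value / total for value in row] for row in table]
    return f, [sum(row) for row in f], [sum(col) for col in zip(*f)]


def inertia_term(f_xy, f_x, f_y):
    """One chi-square inertia term: f_x * f_y * (f_xy / (f_x * f_y) - 1)^2."""
    return f_x * f_y * (f_xy / (f_x * f_y) - 1.0) ** 2


def bicluster_scores(A, row_labels, col_labels, K, L):
    """Return (H, R, chi2) for the blocks defined by the two partitions.

    Assumes A has no empty row or column and that every cluster is non-empty.
    """
    n_total = sum(sum(row) for row in A)
    f, f_row, f_col = frequencies(A)
    g, g_row, g_col = frequencies(collapse(A, row_labels, col_labels, K, L))

    # B_kl: inertia conserved by block (k, l) in the collapsed table.
    B = [[inertia_term(g[k][l], g_row[k], g_col[l]) for l in range(L)] for k in range(K)]

    # T_kl: initial inertia carried by the cells that fall into block (k, l).
    T = [[0.0] * L for _ in range(K)]
    for i, row in enumerate(f):
        for j, f_ij in enumerate(row):
            T[row_labels[i]][col_labels[j]] += inertia_term(f_ij, f_row[i], f_col[j])

    total_B = sum(sum(row) for row in B)  # assumed definition of the global inertia B
    H = [[B[k][l] / T[k][l] if T[k][l] > 0 else 0.0 for l in range(L)] for k in range(K)]
    R = [[B[k][l] / total_B if total_B > 0 else 0.0 for l in range(L)] for k in range(K)]
    chi2 = n_total * total_B  # chi-square statistic of the collapsed table
    return H, R, chi2


# Made-up count matrix (4 services x 4 annotations) with two obvious blocks.
A = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
]
H, R, chi2 = bicluster_scores(A, [0, 0, 1, 1], [0, 0, 1, 1], K=2, L=2)
print(H, R, chi2)
```

In this toy case the off-diagonal blocks contain only zeros, so collapsing them loses no information and their homogeneity is 1, whereas the diagonal blocks mix cells of different intensity and obtain a homogeneity below 1.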

The best biclusters have high values of homogeneity and relevancy (Fig. 3 and Table 4). For example, biclusters 2, 3, 4, and 6 are the most homogeneous (H=100%) and bicluster 5 is the most relevant (R=10%). The services and annotations that compose each selected bicluster are highly correlated. Each service in a bicluster is described by a subset of annotations, and each annotation in a bicluster describes only services belonging to the same bicluster. All biclusters are significant from a bioinformatics point of view. For example, bicluster 1 comprises services related to pathways and protein interactions, and bicluster 2 is composed of services related only to pairwise sequence alignment, in contrast with bicluster 5, which comprises services related to pairwise and multiple sequence alignment.

5 Conclusion

This work is part of our ongoing research. We propose a semi-automatic approach to learn lightweight semantic annotations from a corpus of textual descriptions of Web services. The experiments conducted show that the approach generates high-quality annotations, mostly because of its fine-grained extraction rules and the regularity of the sublanguage used to describe Web services in the bioinformatics domain. Our approach constitutes a good starting point for building domain ontologies. As future work, we aim to develop a methodology for building domain ontologies devoted to the semantic annotation of Web services by harvesting textual descriptions, WSDL files, and even existing domain ontologies. The main goal of the methodology would be the automatic construction of semantic Web services. Another motivation of this work is to facilitate resource discovery within the bioinformatics domain. To this end, we rely on a block clustering algorithm to determine a set of biclusters of services coupled with a set of highly correlated semantic annotations. The results demonstrate the potential of block clustering to model the relatedness between resources and annotations, which is particularly important in the context of service discovery.

References

1. Hammad Afzal, James Eales, Robert Stevens, and Goran Nenadic. Mining semantic networks of bioinformatics e-resources from the literature. In Semantic Web Applications and Tools for Life Sciences (SWAT4LS) Workshop, 2009.

2. Hammad Afzal, Robert Stevens, and Goran Nenadic. Mining Semantic Descriptions of Bioinformatics Web Resources from the Literature. In Proceedings of European Semantic Web Conference, pages 535-549, 2009.

3. Jose Luis Ambite, Sirish Darbha, Aman Goel, Craig A. Knoblock, Kristina Lerman, Rahul Parundekar, and Thomas Russ. Automatically constructing semantic web services from online sources. In International Semantic Web Conference, volume 5823 of Lecture Notes in Computer Science, pages 17-32. Springer, 2009.

4. Nadia Yacoubi Ayadi, Zoe Lacroix, and Maria-Esther Vidal. Bionmap: a deductive approach for resource discovery. In Proceedings of International Conference on Information Integration and Web-based Applications Services (iiWAS'08), pages 477-482. ACM, 2008.

5. Frank Baker and Lawrence J. Hubert. Measuring the power of hierarchical cluster analysis. Journal of the American Statistical Association, pages 31-38, 1975.

6. Charles-Edmond Bichot. Co-clustering Documents and Words by Minimizing the Normalized Cut Objective Function. Journal of Mathematical Modelling and Algorithms (JMMA), 9(2):131-147, June 2010.

7. Malika Charrad. Analyse croisée des sites Web par des méthodes de bipartitionnement. Éditions Universitaires Européennes, 2011.

8. Malika Charrad, Yves Lechevallier, Mohamed Ben Ahmed, and Gilbert Saporta. Block clustering for web pages categorization. In Proceedings of Intelligent Data Engineering and Automated Learning (IDEAL'2009), number 5788 in Lecture Notes in Computer Science, pages 260-267. Springer, 2009.

9. Malika Charrad, Yves Lechevallier, Mohamed Ben Ahmed, and Gilbert Saporta. On the number of clusters in block clustering algorithms. In Proceedings of FLAIRS Conference. AAAI Eds, 2010.

10. H. Cunningham, D. Maynard, K. Bontcheva, and V. Tablan. GATE: A framework and graphical development environment for robust NLP tools and applications. In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics, 2002.

11. H. Cunningham, D. Maynard, and V. Tablan. JAPE: a Java annotation patterns engine (second edition). Department of Computer Science, University of Sheffield, 2000.

12. Khalid Elgazzar, Ahmed E. Hassan, and Patrick Martin. Clustering WSDL documents to bootstrap the discovery of web services. In Proceedings of IEEE International Conference on Web Services (ICWS'10), pages 147-154, 2010.

13. G. Govaert. Classification croisée. PhD thesis, Paris 6, 1983.

14. Andreas Hess, Eddie Johnston, and Nicholas Kushmerick. ASSAM: A tool for semi-automatically annotating semantic web services. In Proceedings of International Semantic Web Conference (ISWC'04), volume 3298 of LNCS, pages 320-334, 2004.

15. Jacek Kopecky, Tomas Vitvar, Carine Bournez, and Joel Farrell. SAWSDL: Semantic annotations for WSDL and XML schemas. IEEE Internet Computing, 11(6):60-67, 2007.

16. Victor Kunin and Christos A. Ouzounis. Clustering the annotation space of proteins. BMC Bioinformatics, 6:24, 2005.

17. Jiangang, Yanchun Zhang, and Jing He. Efficiently finding web services using a clustering semantic approach. In Proceedings of Context enabled source and service selection, integration and adaptation Workshop, pages 51-58. ACM, 2008.

18. S.C. Madeira and A.L. Oliveira. Biclustering algorithms for biological data analysis: A survey. IEEE Transactions on Computational Biology and Bioinformatics, pages 24-45, 2004.

19. David Martin, Mark Burstein, Drew McDermott, Sheila McIlraith, Massimo Paolucci, Katia Sycara, Deborah L. McGuinness, Evren Sirin, and Naveen Srinivasan. Bringing semantics to web services with OWL-S. World Wide Web, 10(3):243-277, 2007.

20. Sheila McIlraith, Tran Cao Son, and Honglei Zeng. Semantic web services. IEEE Intelligent Systems, 16:46-53, 2001.

21. Pedrinaci and Domingue. Toward the next wave of services: Linked services for the web of data. Journal of Universal Computer Science, 16(13):1694-1719, 2010.

22. Dumitru Roman, Uwe Keller, Holger Lausen, Jos de Bruijn, Ruben Lara, Michael Stollberg, Polleres, Cristina Feier, Cristoph Bussler, and Dieter Fensel. Web Service Modeling Ontology. Applied Ontology, 1(1):77-106, 2005.

23. Marta Sabou, Chris Wroe, Carole Goble, and Gilad Mishne. Learning domain ontologies for web service descriptions: an experiment in bioinformatics. In Proceedings of the 14th international conference on World Wide Web, pages 190-198. ACM, 2005.

24. Barry Smith, Michael Ashburner, Cornelius Rosse, Jonathan Bard, William Bug, Werner Ceusters, Louis J. Goldberg, Karen Eilbeck, Amelia Ireland, Christopher J. Mungall, Neocles Leontis, Philippe Rocca-Serra, Alan Ruttenberg, Susanna-Assunta Sansone, Richard H. Scheuermann, Nigam Shah, Patricia L. Whetzel, and Suzanna Lewis. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25(11):1251-1255, November 2007.