<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Framework for Resource Annotation and Classification in Bioinformatics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nadia Yacoubi Ayadi</string-name>
          <email>nadia.yacoubi@asu.edu</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Malika Charrad</string-name>
          <email>malika.charrad@riadi.rnu.tn</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soumaya Amdouni</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohamed Ben Ahmed</string-name>
          <email>mohamed.benahmed@riadi.rnu.tn</email>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Semantic annotation is commonly recognized as one of the cornerstones of the semantic Web. In the context of Web services, semantic annotations can support effective and efficient discovery of services and guide their composition into workflows. Because semantic annotation is a time-consuming and expensive task, (semi-)automatic approaches for extracting semantic annotations are required. In this paper, we propose a semi-automatic approach for extracting lightweight semantic annotations from textual descriptions of Web services. In contrast with most existing semi-automatic approaches to the semantic annotation of Web services, which rely on a predefined domain ontology, we investigate the use of NLP techniques to derive service properties from a corpus of textual descriptions of bioinformatics services. We evaluate the performance of the annotation extraction method and the importance of lightweight annotations for classifying bioinformatics Web services in order to bootstrap the service discovery process. Our framework relies on an unsupervised clustering approach based on a simultaneous clustering algorithm that determines biclusters of highly correlated Web services and semantic annotations.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Annotation</kwd>
        <kwd>Semantic Web Service</kwd>
        <kwd>Block Clustering</kwd>
        <kwd>Bioinformatics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        During the last decade, semantic Web services (SWS) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] technology has been
proposed and investigated to support effective and efficient service discovery,
composition and invocation by machines. Despite the appealing characteristics
of semantic Web service principles, their uptake on a Web scale has been
significantly less prominent than initially anticipated [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. In fact, research on semantic
Web services has mostly focused on devising domain-independent Web service
description ontologies such as OWL-S [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and WSMO [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. Semantic
Annotations for WSDL (SAWSDL) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] adopts a bottom-up approach by adding
semantics to existing Web service standards through mapping syntactic definitions to
a set of ontological concepts. All of these approaches rely on a pre-determined
domain ontology to make service semantics explicit. The reasoning tasks performed over
semantic Web service descriptions depend mainly on the quality of this
domain ontology [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The existence of a domain ontology that captures domain
knowledge in an explicit and formal way is therefore crucial. In several fields, many domain
ontologies have been developed for various purposes. The complexity of
reasoning tasks increases when semantic service descriptions are generated by means
of several domain ontologies. In the bioinformatics field, the OBO Foundry<sup>1</sup>
lists around 60 ontologies for the life sciences, including molecular biology, anatomy,
biochemistry, environment, neuroscience, etc. (for a survey, see [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]). None of
these ontologies is suitable for annotating bioinformatics Web services: although
they are rich in semantics, they are not generic enough to capture high-level concepts
and their semantic relationships.
      </p>
      <p>In this paper, we propose a bottom-up approach to extract domain-dependent
lightweight semantic annotations from textual descriptions of Web services. Such
annotations aim to capture the static knowledge (i.e., domain concepts) and
the procedural knowledge (i.e., tasks) of a domain. Despite their importance, few
domain ontologies exist for the purpose of Web service annotation, and
building such ontologies is a challenging task. Natural language documentation of
Web services consists of short textual descriptions intended to close the "semantic gap"
between low-level technical features of Web services (e.g., data types, port types,
or data formats) and the high-level, meaning-bearing features a user is interested
in and refers to when discovering a Web service. Hence, our semi-automatic
approach combines different extraction patterns to generate lightweight annotations
describing service properties such as inputs, outputs, or functionalities. We
note that our extraction method provides a good starting point for ontology
building.</p>
      <p>
        We then rely on a simultaneous clustering algorithm, namely CROKI2
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], to identify clusters (groups) of services that are described by a specific
subset of highly correlated annotations. The simultaneous clustering step has two
benefits. Firstly, clustering Web services based on semantic annotations would
greatly boost the ability of Web service search engines to select suitable services
given a discovery query. Secondly, it makes it possible to detect implicit associations
(relationships) between highly correlated annotations, which is crucial in an ontology
building process. In fact, the co-occurrence of a subset of annotations within a
subset of Web services reflects implicit relationships, taxonomic
or non-taxonomic, between these annotations. To the best of our knowledge, no
existing approach uses block clustering; most approaches
perform either annotation clustering [
        <xref ref-type="bibr" rid="ref1 ref16">16, 1</xref>
        ] or service clustering [
        <xref ref-type="bibr" rid="ref12 ref17">17, 12</xref>
        ].
      </p>
      <p>The paper is organized as follows. Section 2 reviews related work
conducted in the fields of automatic annotation of Web services and block clustering.
Section 3 presents our framework for semantic annotation and clustering of Web
services. In Section 4, we present and discuss the results of our
experiments. Section 5 concludes the paper and outlines our future work.</p>
    </sec>
    <sec id="sec-2">
      <title>1 http://www.obofoundry.org/</title>
      <sec id="sec-2-1">
        <title>Related Work</title>
        <sec id="sec-2-1-1">
          <title>Semantic annotation learning for Semantic Web services</title>
          <p>
            Converting an existing Web service into a semantic Web service requires
significant effort and must be repeated for each new Web service. In this
section, we review research work that focuses on learning semantic annotations by exploiting
textual descriptions, WSDL files or even Web forms. Hess et al. propose ASSAM
(Automated Semantic Annotation with Machine Learning), a semi-automatic
WSDL annotator application. ASSAM [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ] relies on a pre-determined domain
ontology and uses a machine learning algorithm to provide users with
suggestions on how to describe the elements in the WSDL file. However, because of the
intensive expert user intervention it requires, such a solution could be impractical for large-scale
annotation of Web services, despite the fact that it
tends to provide high-quality annotations. Sabou et al. [
            <xref ref-type="bibr" rid="ref23">23</xref>
            ] propose
an automatic extraction method based on Natural Language Processing (NLP).
Experiments were conducted in the bioinformatics field by learning an
ontology from the documentation of Web services in the context of the myGrid
project. The evaluation of the extracted ontology shows that the approach is a
helpful tool to support the process of building domain ontologies for Web services.
Our approach builds on that of [
            <xref ref-type="bibr" rid="ref23">23</xref>
            ] by also using NLP techniques
to generate semantic annotations of Web services.
          </p>
          <p>
            Also within the bioinformatics space, Afzal et al. [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] developed a text mining
approach based on the literature to learn semantic profiles of bioinformatics resources.
The approach identifies a set of semantic classes of descriptors that could be
attached to a bioinformatics resource: data, data resource, task, and algorithm.
The instances of these classes were collected by harvesting a corpus of scientific
papers along with related sentences containing the resource name. However, the
case study conducted in [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] shows that the coverage of the myGrid ontology
used as annotation support is limited, especially for capturing functional
service descriptions. The quality of the extracted descriptors was measured only from
the curator's perspective, which is not sufficient in the semantic Web context,
where Web services are supposed to be discovered and composed by agents.
          </p>
          <p>
            Ambite et al. [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ] present an approach to automatically discover and
create semantic Web services. The idea behind their approach is to start with a
set of known sources and the corresponding semantic descriptions and then
discover similar sources, extract the source data, build semantic descriptions of the
sources, and then turn them into semantic Web services. The authors implemented
the Deimos system and evaluated it across five domains. In contrast to our
work, the goal of Deimos is to build a semantic description that is sufficiently
detailed to support automatic retrieval and composition. Our work aims to
generate lightweight annotations useful to classify Web services and bootstrap the
service discovery process in the bioinformatics field.
          </p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Web service Clustering</title>
          <p>
            With the expected growth in the number of available Web services and service
repositories, mechanisms that enable the automatic organization
and discovery of services become increasingly important. In this context, most
existing research relies on one-way clustering, either annotation clustering
[
            <xref ref-type="bibr" rid="ref1 ref16">16, 1</xref>
            ] or service clustering [
            <xref ref-type="bibr" rid="ref12 ref17">12, 17</xref>
            ]. When such clustering algorithms are used, each
service in a given service cluster is described using all annotations. Similarly,
each annotation in an annotation cluster characterizes all services. For instance,
based on their approach presented in [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ], Afzal et al. propose in [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ] to use
lexical kernel metrics to identify semantically related networks of resources by
computing similarity between annotations. In contrast, the goal of our work is to
identify groups of services that are best described by a specific subset of
annotations, that is, to find biclusters of highly correlated services and annotations
in order to bootstrap the service discovery process. We rely on simultaneous
clustering, an approach that finds local patterns in which a subset of
subjects are similar to each other based on only a subset of attributes.
Simultaneous clustering, also known as biclustering, co-clustering or block
clustering, aims to find sub-matrices, i.e., subgroups of rows and subgroups
of columns that exhibit a high correlation. A number of algorithms that perform
simultaneous clustering on the rows and columns of a matrix have been proposed to
date. Algorithms of this type have been used in many fields, such
as bioinformatics [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ], Web mining [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] and text mining [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ]. Table 1 outlines a
comparison between one-way clustering and simultaneous clustering.
            <table-wrap>
              <label>Table 1</label>
              <table>
                <thead>
                  <tr><th>Clustering</th><th>Simultaneous clustering</th></tr>
                </thead>
                <tbody>
                  <tr><td>Applied to either the rows or the columns of the data matrix separately (global model).</td><td>Performs clustering in the two dimensions simultaneously (local model).</td></tr>
                  <tr><td>Produces clusters of rows or clusters of columns.</td><td>Seeks blocks of rows and columns that are interrelated.</td></tr>
                  <tr><td>Each subject in a given subject cluster is defined using all the variables. Each variable in a variable cluster characterizes all subjects.</td><td>Each subject in a bicluster is selected using only a subset of the variables, and each variable in a bicluster is selected using only a subset of the subjects.</td></tr>
                  <tr><td>Clusters are exhaustive.</td><td>The clusters on rows and columns should not be exclusive and/or exhaustive.</td></tr>
                </tbody>
              </table>
            </table-wrap>
          </p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>General Framework</title>
        <p>
          The proposed framework comprises two main steps. The first one
performs a semi-automatic extraction of semantic annotations from the textual
documentation of Web services. Semantic annotations describe service properties
such as functionalities, inputs, outputs, and other domain-dependent features.
One particularity of textual Web service descriptions is that they employ natural
language in a specific way. In fact, such texts belong to what has been defined as
sublanguages [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. A sublanguage is a specialized form of natural language which
is used within a particular domain and characterized by a specialized
vocabulary, semantic relations, and syntax (e.g., medical test reports). The semantic
annotation extraction step exploits the linguistic regularities of a sublanguage
to identify semantic service properties. The second step of our approach consists
in clustering Web services in terms of semantic annotations. This step
discovers subgroups (biclusters) of Web services and subgroups of semantic
annotations that exhibit a high correlation by applying the CROKI2 algorithm
[
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. In the following, we present the two steps in further detail.
        </p>
        <sec id="sec-2-2-1">
          <title>Semantic Annotation Extraction of Web services</title>
          <p>The semantic annotation extraction phase identifies two types of
knowledge: domain concepts and procedural knowledge describing service tasks. First,
a morphosyntactic analysis of the textual descriptions of Web services is performed.
In this step, a sentence splitter and a tokeniser are used to extract
sentences and basic linguistic entities. Then, a POS (Part-Of-Speech) tagger is
applied to associate a grammatical category with each word (token) and thus
distinguish the morphology of the various entities. For example, in the sentence
below, the tagger identifies a verb (i.e., compute), three nouns (i.e., structure, RNA,
sequence), an adjective (i.e., secondary), and a preposition (i.e., for).</p>
          <p>compute (VB) Secondary (JJ) Structure (NN) for (Prep) RNA (NN) sequence (NN).</p>
          <p>We distinguish different types of syntactic patterns depending on the
semantic annotation type. Syntactic patterns describe selectional constraints that
exploit the particularities of sublanguages. We distinguish syntactic patterns that
extract the inputs and outputs of services, service tasks, and domain-dependent
features that are strongly related to the bioinformatics domain:</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>1. Identifying service tasks</title>
          <p>Identifying service tasks is crucial for service discovery and
composition. We observed that, in the majority of textual descriptions
of Web services, verbs identify the functionality performed by a Web service.
In our work, we consider different classes of verbs that indicate the service
task. For example, VBRetrieval is the class of verbs that indicate a retrieval
process (e.g., get, retrieve, fetch, search, find, return, query). A frequently
occurring pattern that involves this verb class and the preposition "from"
can be used to easily determine the output and the resource it is retrieved from, as
described by the following selectional pattern:</p>
          <p>VBRetrieval &lt;Output&gt; from &lt;Source&gt;.</p>
          <p>
            Other verb classes were recognized, such as VBExtraction, a class of
verbs denoting an extraction process: VBExtraction = {extract, scan, identify,
locate, analyse}.
          </p>
          <p>2. Identifying inputs and outputs of Web services. Inputs and outputs
of Web services denote domain concepts, which are generally expressed by
nouns in the corpus. To obtain high-quality annotations, we created a
list of biological terms consisting of single-word terms. When two
or more biological concepts are used together, we interpret them as a
single biological concept and add it to the list, e.g., gene expression,
transcription factors, protein structure, tertiary protein structure, amino acid
sequence, chromosome segment, etc. We define different heuristics that
identify the roles of concepts (input or output) depending on the structure of
the sentence. Some extraction patterns are presented in Table 2. In addition,
our extraction patterns identify cases where several concepts are related via
logical operators such as "and" or "or". In this case, the same role is assigned
to each concept.</p>
          <p>3. Identifying domain-dependent features. We define a set of extraction
patterns that focus on bioinformatics-dependent features. For example, we
propose patterns to identify data formats (e.g., FASTA, GFF, GIF, etc.)
related to input/output formats. An example of such a pattern is:
% computes &lt;OutputService&gt; for % &lt;InputService&gt; described
with &lt;dataFormat&gt; %.</p>
          <p>
We propose to use a simultaneous clustering approach to classify Web services
in terms of semantic annotations. Our approach aims to find biclusters of Web
services and annotations by applying the CROKI2 algorithm [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ]. We proposed an
accelerated version of this algorithm in [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ]. The general purpose of a block
clustering algorithm is as follows. Given the data matrix A, with set of
rows <inline-formula><tex-math>X = (X_1, \ldots, X_n)</tex-math></inline-formula> and set of columns <inline-formula><tex-math>Y = (Y_1, \ldots, Y_n)</tex-math></inline-formula>, <inline-formula><tex-math>a_{ij}</tex-math></inline-formula>, with <inline-formula><tex-math>1 \le i \le n</tex-math></inline-formula> and
<inline-formula><tex-math>1 \le j \le n</tex-math></inline-formula>, is the value in the data matrix A corresponding to row i and column j.
Simultaneous clustering algorithms aim to identify a set of biclusters <inline-formula><tex-math>B_k(I_k, J_k)</tex-math></inline-formula>,
where <inline-formula><tex-math>I_k</tex-math></inline-formula> is a subset of the rows X and <inline-formula><tex-math>J_k</tex-math></inline-formula> is a subset of the columns Y. The rows <inline-formula><tex-math>I_k</tex-math></inline-formula>
exhibit similar behavior across the columns <inline-formula><tex-math>J_k</tex-math></inline-formula>, or vice versa, and every bicluster <inline-formula><tex-math>B_k</tex-math></inline-formula>
satisfies some homogeneity criteria.
          </p>
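          <p>Croki2 algorithm. The Croki2 algorithm is applied to the contingency table
composed of services and annotations to identify a row partition <inline-formula><tex-math>P = (P_1, \ldots, P_K)</tex-math></inline-formula>
composed of K clusters and a column partition <inline-formula><tex-math>Q = (Q_1, \ldots, Q_L)</tex-math></inline-formula> composed of L
clusters that maximize the <inline-formula><tex-math>\chi^2</tex-math></inline-formula> value of the new contingency table (P, Q) obtained by
regrouping rows and columns into K and L clusters, respectively. Croki2 consists in
applying the K-means algorithm to rows and columns alternately to construct
a series of pairs of partitions <inline-formula><tex-math>(P^n, Q^n)</tex-math></inline-formula> that optimizes the <inline-formula><tex-math>\chi^2</tex-math></inline-formula> value of the new
contingency table <inline-formula><tex-math>T_1(P, Q)</tex-math></inline-formula>, defined by this expression:</p>
          <p><disp-formula><tex-math>T_1(k, l) = \sum_{i \in P_k} \sum_{j \in Q_l} a_{ij}, \qquad k \in [1, \ldots, K] \text{ and } l \in [1, \ldots, L].</tex-math></disp-formula></p>
          <p>Marginal frequencies in table <inline-formula><tex-math>T_1</tex-math></inline-formula> are:</p>
          <p><disp-formula><tex-math>f_{kl} = \sum_{i \in P_k} \sum_{j \in Q_l} f_{ij}</tex-math></disp-formula></p>
          <p><disp-formula><tex-math>f_{k.} = \sum_{i \in P_k} f_{i.}</tex-math></disp-formula></p>
          <p><disp-formula><tex-math>f_{.l} = \sum_{j \in Q_l} f_{.j}</tex-math></disp-formula></p>
          <p>
Biclusters validity. The application of the Croki2 algorithm leads to an exhaustive
enumeration of biclusters. It is possible to select only biclusters satisfying certain
criteria such as a user-specified bicluster size, bicluster homogeneity and bicluster
relevancy [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ].
          </p>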
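          <p>- Homogeneity H is the inertia conserved by the bicluster divided by the initial
inertia:</p>
          <p><disp-formula><tex-math>H = B_{kl} / T_{kl}</tex-math></disp-formula></p>
          <p>with</p>
          <p><disp-formula><tex-math>T_{kl} = \sum_{i \in P_k} \sum_{j \in Q_l} f_{i.} f_{.j} \left( \frac{f_{ij}}{f_{i.} f_{.j}} - 1 \right)^2</tex-math></disp-formula></p>
          <p>and</p>
          <p><disp-formula><tex-math>B_{kl} = g_{k.} g_{.l} \left( \frac{g_{kl}}{g_{k.} g_{.l}} - 1 \right)^2</tex-math></disp-formula></p>
          <p>The value of this ratio is between 0 and 1. A high value of this ratio indicates
that the bicluster is homogeneous.</p>
          <p>- Relevancy R is the inertia conserved by the bicluster divided by the global
inertia:</p>
          <p><disp-formula><tex-math>R = B_{kl} / B</tex-math></disp-formula></p>
          <p>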
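The two validity measures can be illustrated with a minimal sketch in plain Python. It is not the Croki2 implementation: it assumes that f denotes the relative frequencies of the original table, that g denotes the relative frequencies of the table aggregated by the row and column partitions, and that the global inertia B is the sum of the block inertias. The function name and its arguments are hypothetical.
<preformat><![CDATA[
```python
def bicluster_validity(table, row_parts, col_parts):
    """Return {(k, l): (H, R)} for every block of a partitioned count table.

    table     -- list of rows of non-negative counts (services x annotations)
    row_parts -- list of lists of row indices, one list per row cluster
    col_parts -- list of lists of column indices, one list per column cluster
    """
    total = float(sum(sum(row) for row in table))
    n_cols = len(table[0])
    f = [[value / total for value in row] for row in table]    # f_ij
    f_row = [sum(row) for row in f]                            # f_i.
    f_col = [sum(row[j] for row in f) for j in range(n_cols)]  # f_.j

    inertia = {}
    for k, rows in enumerate(row_parts):
        for l, cols in enumerate(col_parts):
            # Assumption: g_kl, g_k., g_.l are sums of f over the block.
            g_kl = sum(f[i][j] for i in rows for j in cols)
            g_k = sum(f_row[i] for i in rows)
            g_l = sum(f_col[j] for j in cols)
            # B_kl: inertia kept by the aggregated block
            b_kl = 0.0
            if g_k * g_l > 0:
                b_kl = g_k * g_l * (g_kl / (g_k * g_l) - 1) ** 2
            # T_kl: inertia of the original cells inside the block
            t_kl = sum(
                f_row[i] * f_col[j] * (f[i][j] / (f_row[i] * f_col[j]) - 1) ** 2
                for i in rows for j in cols
                if f_row[i] * f_col[j] > 0
            )
            inertia[(k, l)] = (b_kl, t_kl)

    # Assumption: the global inertia B is the sum of B_kl over all blocks.
    global_b = sum(b for b, _ in inertia.values())
    validity = {}
    for key, (b_kl, t_kl) in inertia.items():
        h = b_kl / t_kl if t_kl > 0 else 0.0
        r = b_kl / global_b if global_b > 0 else 0.0
        validity[key] = (h, r)
    return validity
```
]]></preformat>
When all cells of a block share the same ratio between f_ij and the product of its margins, aggregation loses no inertia and H equals 1 for that block.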
Our experimental corpus consists of 100 bioinformatics service descriptions from
the BioCatalogue<sup>2</sup>, a new curated life science Web services repository. The
development of BioCatalogue reflects the dramatic increase in bioinformatics Web
services and tools, with 2053 services and 148 providers<sup>3</sup>. BioCatalogue allows users
to discover Web services through keyword-based retrieval or category browsing.
Annotations manually attached to Web services are either textual descriptions or
lists of tags. Tagging Web services with a set of lexical tokens defined by users
is not a perfect way to enable efficient service discovery: manual resource
tagging is an error-prone and time-consuming task. Figure 1 shows the top 20
tags used on BioCatalogue. In total, 951 tags were created by users to describe
services. The use of tags to describe Web services raises several issues, such as
the ambiguity of their meaning (e.g., BioMoby or soaplab in Figure 1) and the
variability of spelling among tags that may refer to the same concept.
Finally, folksonomies (sets of tags) lack an explicit knowledge representation
to express whether a tag describes, for example, a service task or a service
input or output, which prevents their use for effective resource
discovery. In our work, Web services are semantically annotated based on their textual
descriptions. The extracted semantic annotations make it possible to automatically construct
a semantic service profile. In the following, we evaluate the annotation
extraction module and the block clustering algorithm in turn.
          </p>
          <p>
We designed an annotation extraction module using the GATE [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] framework.
We used the ANNIE plugin (A Nearly-New IE system), which contains a
tokeniser, a gazetteer (system of lexicons), a POS tagger, a sentence splitter, and
a Named Entity (NE) transducer. The various extraction patterns described in
Section 3.1 were implemented using JAPE [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ], a rich and flexible rule
mechanism which is part of the GATE framework. The NE transducer applies JAPE
rules to input service descriptions in order to generate semantic annotations.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2 http://www.biocatalogue.org 3 Last accessed on 22 April 2011</title>
      <p>The JAPE (Java Annotation Patterns Engine) engine provides finite-state
transduction over annotations based on regular expressions. A JAPE grammar consists
of a set of pattern/action rules. A JAPE rule has a Left-Hand Side (LHS) and
a Right-Hand Side (RHS). The LHS specifies the annotation pattern, which may
contain regular expression operators (e.g., *, ?, +). The RHS consists of
annotation manipulation statements. Annotations matched on the LHS of a rule are
referred to on the RHS by means of labels that are attached to pattern elements. The
gazetteer lookup modules, part of ANNIE, identify domain
concepts in the textual descriptions based on a set of lists of tokens. We have
created different lexicon lists containing bioconcepts, service tasks, data formats
and identifiers (e.g., EntrezGene ID, KEGG ID). Figure 2 illustrates an example
of a JAPE rule for input service annotation.</p>
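      <p>The pattern/action idea can be illustrated outside GATE with a plain regular expression. The sketch below is written in Python, not in JAPE syntax, and only approximates the VBRetrieval selectional pattern of Section 3.1; the function name, the optional third-person "s", and the keys of the returned annotation are assumptions made for illustration.</p>
      <preformat><![CDATA[
```python
import re

# Verb class taken from the VBRetrieval examples given in the text.
VB_RETRIEVAL = ("get", "retrieve", "fetch", "search", "find", "return", "query")

# "Left-hand side": VBRetrieval <Output> from <Source>
PATTERN = re.compile(
    r"\b(?P<verb>" + "|".join(VB_RETRIEVAL) + r")s?\s+"
    r"(?P<output>.+?)\s+from\s+(?P<source>[^.]+)",
    re.IGNORECASE,
)


def annotate_retrieval(description):
    """Return a lightweight annotation for a retrieval sentence, or None."""
    match = PATTERN.search(description)
    if match is None:
        return None
    # "Right-hand side": turn the labelled groups into an annotation.
    return {
        "task": "retrieval",
        "output": match.group("output").strip(),
        "source": match.group("source").strip(),
    }
```
]]></preformat>
      <p>For instance, the description "Retrieve protein sequences from UniProt." yields the output "protein sequences" and the source "UniProt", while a sentence without a retrieval verb yields no annotation.</p>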
      <p>We evaluate the results of our experiments in terms of three metrics:
precision, recall and F-measure, as reported in Table 3. The three metrics are
calculated as follows.</p>
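      <p><disp-formula><tex-math>\mathrm{Precision} = \frac{\mathrm{Correct} + \frac{1}{2}\,\mathrm{Partial}}{\mathrm{Correct} + \mathrm{Spurious} + \frac{1}{2}\,\mathrm{Partial}}</tex-math></disp-formula></p>
      <p><disp-formula><tex-math>\mathrm{Recall} = \frac{\mathrm{Correct} + \frac{1}{2}\,\mathrm{Partial}}{\mathrm{Correct} + \mathrm{Missing} + \frac{1}{2}\,\mathrm{Partial}}</tex-math></disp-formula></p>
      <p><disp-formula><tex-math>F\text{-measure} = \frac{(\beta^2 + 1)\, P \cdot R}{\beta^2 R + P}</tex-math></disp-formula></p>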
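      <p>As a worked illustration, the three metrics can be computed from the four match counts as follows. This is a sketch that follows the formulas as written above; the function name, the default value of beta, and the zero-denominator guards are assumptions.</p>
      <preformat><![CDATA[
```python
def extraction_scores(correct, partial, spurious, missing, beta=1.0):
    """Precision, recall and F-measure with partial matches counted as half."""
    half_partial = 0.5 * partial
    matched = correct + half_partial

    precision_den = correct + spurious + half_partial
    recall_den = correct + missing + half_partial
    precision = matched / precision_den if precision_den else 0.0
    recall = matched / recall_den if recall_den else 0.0

    # F-measure in the form given in the text: (b^2 + 1) P R / (b^2 R + P)
    f_den = beta ** 2 * recall + precision
    f_measure = (beta ** 2 + 1) * precision * recall / f_den if f_den else 0.0
    return precision, recall, f_measure
```
]]></preformat>
      <p>With 8 correct, 2 partial, 1 spurious and 3 missing annotations, precision is 9/10 and recall is 9/12.</p>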
      <p>GATE provides a tool for automatic evaluation, named
AnnotationDiff, which compares a set of manually generated annotations with the set of
annotations generated by our extraction method. To measure the performance
of the extraction method, we manually identified semantic annotations in the
service description corpus. Then, using the AnnotationDiff tool, we compared
this set of annotations with the ones extracted through the extraction
patterns.</p>
      <sec id="sec-3-1">
        <title>Block Clustering Evaluation</title>
        <p>
          The application of the Croki2 algorithm leads to an exhaustive enumeration of
biclusters. The data used to evaluate the Croki2 algorithm consist of only 98 services
and 78 annotations. The choice of meaningful biclusters is based on homogeneity
and relevancy, as described in the previous section. Given that the CROKI2
algorithm uses k-means to cluster rows and columns, the number of clusters needs
to be specified by the user. Therefore, we extend the use of some validity indices,
namely BH [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], initially proposed for one-way clustering, to the CROKI2
biclustering algorithm [
          <xref ref-type="bibr" rid="ref7 ref9">9, 7</xref>
          ]. The accelerated CROKI2 algorithm has been implemented in
the R environment.
        </p>
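        <p>The input of the block clustering step is a table crossing services and annotations. A minimal sketch of how such a table could be assembled from extracted annotations is shown below; it assumes a binary presence/absence encoding, which the text does not specify, and the function name and the example service names are hypothetical.</p>
        <preformat><![CDATA[
```python
def build_table(service_annotations):
    """Build a binary service x annotation table from a dict of annotation sets."""
    services = sorted(service_annotations)
    annotations = sorted(
        {label for labels in service_annotations.values() for label in labels}
    )
    table = [
        [1 if label in service_annotations[service] else 0 for label in annotations]
        for service in services
    ]
    return services, annotations, table


# Hypothetical example with two services and three annotations.
services, annotations, table = build_table(
    {"blast": {"sequence", "alignment"}, "kegg": {"pathway"}}
)
```
]]></preformat>
        <p>Rows and columns are sorted so that the same input always produces the same table layout.</p>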
        <p>The best biclusters have high values of homogeneity and relevancy (Fig. 3 and
Table 4). For example, biclusters 2, 3, 4 and 6 are the most homogeneous (H=100%)
and bicluster 5 is the most relevant (R=10%). The services and annotations that
compose each selected bicluster are highly correlated. Each service in a bicluster
is described by a subset of annotations, and each annotation in a bicluster
describes only services belonging to the same bicluster. All biclusters are significant
from the bioinformatics point of view. For example, bicluster 1 comprises services
related to pathways and protein interactions, and bicluster 2 is composed of services
related only to pairwise sequence alignment, in contrast with bicluster 5, which
comprises services related to pairwise and multiple sequence alignment.</p>
        <sec id="sec-3-1-1">
          <title>Conclusion</title>
          <p>This work is part of our ongoing research. We propose a semi-automatic
approach to learn lightweight semantic annotations from a corpus of textual
descriptions of Web services. The experiments conducted show that the
approach generates high-quality annotations, mostly because of its
fine-grained extraction rules and the regularity of the
sublanguage used to describe Web services in the bioinformatics domain. Our approach
constitutes a good starting point towards building domain ontologies. As future
work, we aim to develop a methodology for building domain ontologies devoted to
the semantic annotation of Web services by harvesting textual descriptions, WSDL
files, and even existing domain ontologies. The main goal of the methodology
would be the automatic construction of semantic Web services. Another
motivation of this work is to facilitate resource discovery within the
bioinformatics domain. To this end, we rely on a block clustering algorithm to determine
a set of biclusters of services coupled with a set of highly correlated semantic
annotations. The results demonstrate the potential of block clustering to model
the relatedness between resources and annotations, which is very important
in the context of service discovery.</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Hammad</given-names>
            <surname>Afzal</surname>
          </string-name>
          , James Eales, Robert Stevens, and
          <string-name>
            <given-names>Goran</given-names>
            <surname>Nenadic</surname>
          </string-name>
          .
          <article-title>Mining semantic networks of bioinformatics e-resources from the literature</article-title>
          .
          <source>In Semantic Web Applications and Tools for Life Sciences (SWAT4LS) Workshop</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Hammad</given-names>
            <surname>Afzal</surname>
          </string-name>
          , Robert Stevens, and
          <string-name>
            <given-names>Goran</given-names>
            <surname>Nenadic</surname>
          </string-name>
          .
          <article-title>Mining Semantic Descriptions of Bioinformatics Web Resources from the Literature</article-title>
          .
          <source>In Proceedings of European Semantic Web Conference</source>
          , pages
          <volume>535</volume>
          -
          <fpage>549</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Jose Luis Ambite, Sirish Darbha, Aman Goel, Craig A.
          <string-name>
            <surname>Knoblock</surname>
          </string-name>
          , Kristina Lerman, Rahul Parundekar, and
          <string-name>
            <surname>Thomas</surname>
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Russ</surname>
          </string-name>
          .
          <article-title>Automatically constructing semantic web services from online sources</article-title>
          .
          <source>In International Semantic Web Conference</source>
          , volume
          <volume>5823</volume>
          of Lecture Notes in Computer Science, pages
          <volume>17</volume>
          -
          <fpage>32</fpage>
          . Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Nadia</given-names>
            <surname>Yacoubi</surname>
          </string-name>
          <string-name>
            <given-names>Ayadi</given-names>
            , Zoe Lacroix, and
            <surname>Maria-Esther Vidal</surname>
          </string-name>
          .
          <article-title>Bionmap: a deductive approach for resource discovery</article-title>
          .
          <source>In Proceedings of International Conference on Information Integration and Web-based Applications Services (iiWAS'08)</source>
          , pages
          <fpage>477</fpage>
          –
          <lpage>482</lpage>
          . ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Frank B.</given-names>
            <surname>Baker</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lawrence J.</given-names>
            <surname>Hubert</surname>
          </string-name>
          .
          <article-title>Measuring the power of hierarchical cluster analysis</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          , pages
          <fpage>31</fpage>
          –
          <lpage>38</lpage>
          ,
          <year>1975</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Charles-Edmond</given-names>
            <surname>Bichot</surname>
          </string-name>
          .
          <article-title>Co-clustering Documents and Words by Minimizing the Normalized Cut Objective Function</article-title>
          .
          <source>Journal of Mathematical Modelling and Algorithms (JMMA)</source>
          ,
          <volume>9</volume>
          (
          <issue>2</issue>
          ):
          <fpage>131</fpage>
          –
          <lpage>147</lpage>
          ,
          <month>June</month>
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Malika</given-names>
            <surname>Charrad</surname>
          </string-name>
          .
          <article-title>Analyse croisée des sites Web par des méthodes de bipartitionnement</article-title>
          .
          <source>Éditions Universitaires Européennes</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Malika</given-names>
            <surname>Charrad</surname>
          </string-name>
          , Yves Lechevallier, Mohamed Ben Ahmed, and
          <string-name>
            <given-names>Gilbert</given-names>
            <surname>Saporta</surname>
          </string-name>
          .
          <article-title>Block clustering for web pages categorization</article-title>
          .
          <source>In Proceedings of Intelligent Data Engineering and Automated Learning (IDEAL'2009), number 5788 in Lecture Notes in Computer Science</source>
          , pages
          <fpage>260</fpage>
          –
          <lpage>267</lpage>
          . Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Malika</given-names>
            <surname>Charrad</surname>
          </string-name>
          , Yves Lechevallier, Mohamed Ben Ahmed, and
          <string-name>
            <given-names>Gilbert</given-names>
            <surname>Saporta</surname>
          </string-name>
          .
          <article-title>On the number of clusters in block clustering algorithms</article-title>
          .
          <source>In Proceedings of FLAIRS Conference. AAAI Eds</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>H.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maynard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Tablan</surname>
          </string-name>
          .
          <article-title>GATE: A framework and graphical development environment for robust NLP tools and applications</article-title>
          .
          <source>In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>H.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maynard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Tablan</surname>
          </string-name>
          .
          <article-title>JAPE: a Java Annotation Patterns Engine (second edition)</article-title>
          . Department of Computer Science, University of Sheffield,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Khalid</given-names>
            <surname>Elgazzar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ahmed E.</given-names>
            <surname>Hassan</surname>
          </string-name>
          , and Patrick Martin.
          <article-title>Clustering WSDL documents to bootstrap the discovery of web services</article-title>
          .
          <source>In Proceedings of IEEE International Conference on Web Services (ICWS'10)</source>
          , pages
          <fpage>147</fpage>
          –
          <lpage>154</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. G. Govaert.
          <article-title>Classification croisée</article-title>
          .
          <source>PhD thesis</source>
          , Paris 6,
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Heß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Eddie</given-names>
            <surname>Johnston</surname>
          </string-name>
          , and Nicholas Kushmerick.
          <article-title>ASSAM: A tool for semiautomatically annotating semantic web services</article-title>
          .
          <source>In Proceedings of International Semantic Web Conference (ISWC'04)</source>
          , volume
          <volume>3298</volume>
          of LNCS, pages
          <fpage>320</fpage>
          –
          <lpage>334</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Jacek</given-names>
            <surname>Kopecky</surname>
          </string-name>
          , Tomas Vitvar, Carine Bournez, and
          <string-name>
            <given-names>Joel</given-names>
            <surname>Farrell</surname>
          </string-name>
          .
          <article-title>SAWSDL: Semantic annotations for WSDL and XML schemas</article-title>
          .
          <source>IEEE Internet Computing</source>
          ,
          <volume>11</volume>
          (
          <issue>6</issue>
          ):
          <fpage>60</fpage>
          –
          <lpage>67</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          Victor Kunin and Christos A. Ouzounis.
          <article-title>Clustering the annotation space of proteins</article-title>
          .
          <source>BMC Bioinformatics</source>
          ,
          <volume>6</volume>
          :
          <fpage>24</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Jiangang</given-names>
            <surname>Ma</surname>
          </string-name>
          , Yanchun Zhang, and
          <string-name>
            <given-names>Jing</given-names>
            <surname>He</surname>
          </string-name>
          .
          <article-title>Efficiently finding web services using a clustering semantic approach</article-title>
          .
          <source>In Proceedings of Context enabled source and service selection, integration and adaptation Workshop</source>
          , pages
          <fpage>51</fpage>
          –
          <lpage>58</lpage>
          . ACM,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Madeira</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          .
          <article-title>Biclustering algorithms for biological data analysis: A survey</article-title>
          .
          <source>IEEE Transactions on Computational Biology and Bioinformatics</source>
          , pages
          <fpage>24</fpage>
          –
          <lpage>45</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. David Martin,
          <string-name>
            <given-names>Mark</given-names>
            <surname>Burstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Drew</given-names>
            <surname>McDermott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sheila</given-names>
            <surname>McIlraith</surname>
          </string-name>
          , Massimo Paolucci, Katia Sycara,
          <string-name>
            <given-names>Deborah L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Evren</given-names>
            <surname>Sirin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Naveen</given-names>
            <surname>Srinivasan</surname>
          </string-name>
          .
          <article-title>Bringing semantics to web services with OWL-S</article-title>
          .
          <source>World Wide Web</source>
          ,
          <volume>10</volume>
          (
          <issue>3</issue>
          ):
          <fpage>243</fpage>
          –
          <lpage>277</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>Sheila A.</given-names>
            <surname>McIlraith</surname>
          </string-name>
          , Tran Cao Son, and
          <string-name>
            <given-names>Honglei</given-names>
            <surname>Zeng</surname>
          </string-name>
          .
          <article-title>Semantic web services</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          ,
          <volume>16</volume>
          :
          <fpage>46</fpage>
          –
          <lpage>53</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>C.</given-names>
            <surname>Pedrinaci</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingue</surname>
          </string-name>
          .
          <article-title>Toward the next wave of services: Linked services for the web of data</article-title>
          .
          <source>Journal of Universal Computer Science</source>
          ,
          <volume>16</volume>
          (
          <issue>13</issue>
          ):
          <fpage>1694</fpage>
          –
          <lpage>1719</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>Dumitru</given-names>
            <surname>Roman</surname>
          </string-name>
          , Uwe Keller, Holger Lausen, Jos de Bruijn, Ruben Lara, Michael Stollberg, Axel Polleres, Cristina Feier, Christoph Bussler, and
          <string-name>
            <given-names>Dieter</given-names>
            <surname>Fensel</surname>
          </string-name>
          .
          <article-title>Web Service Modeling Ontology</article-title>
          .
          <source>Applied Ontology</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>77</fpage>
          –
          <lpage>106</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>Marta</given-names>
            <surname>Sabou</surname>
          </string-name>
          , Chris Wroe, Carole Goble, and
          <string-name>
            <given-names>Gilad</given-names>
            <surname>Mishne</surname>
          </string-name>
          .
          <article-title>Learning domain ontologies for web service descriptions: an experiment in bioinformatics</article-title>
          .
          <source>In Proceedings of the 14th international conference on World Wide Web</source>
          , pages
          <fpage>190</fpage>
          –
          <lpage>198</lpage>
          . ACM,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>Barry</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Ashburner</surname>
          </string-name>
          , Cornelius Rosse, Jonathan Bard,
          <string-name>
            <given-names>William</given-names>
            <surname>Bug</surname>
          </string-name>
          , Werner Ceusters,
          <string-name>
            <given-names>Louis J.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          , Karen Eilbeck, Amelia Ireland,
          <string-name>
            <given-names>Christopher J.</given-names>
            <surname>Mungall</surname>
          </string-name>
          , Neocles Leontis, Philippe Rocca-Serra, Alan Ruttenberg, Susanna-Assunta Sansone, Richard H. Scheuermann, Nigam Shah,
          <string-name>
            <given-names>Patricia L.</given-names>
            <surname>Whetzel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Suzanna</given-names>
            <surname>Lewis</surname>
          </string-name>
          .
          <article-title>The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration</article-title>
          .
          <source>Nature Biotechnology</source>
          ,
          <volume>25</volume>
          (
          <issue>11</issue>
          ):
          <fpage>1251</fpage>
          –
          <lpage>1255</lpage>
          ,
          <month>November</month>
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>