<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Evaluating the Impact of Semantic Support for Curating the Fungus Scientific Literature</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marie-Jean Meurs</string-name>
          <email>mjmeurs@encs.concordia.ca</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Caitlin Murphy</string-name>
          <email>cmurphy@gene.concordia.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nona Naderi</string-name>
          <email>nad@encs.concordia.ca</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ingo Morgenstern</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carolina Cantu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shary Semarjit</string-name>
          <email>ssharyg@gene.concordia.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Greg Butler</string-name>
          <email>gregb@encs.concordia.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Justin Powlowski</string-name>
          <email>powlow@alcor.concordia.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adrian Tsang</string-name>
          <email>tsang@gene.concordia.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>René Witte</string-name>
          <email>rwitte@cse.concordia.ca</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre for Structural and Functional Genomics</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Biology</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Chemistry and Biochemistry Concordia University</institution>
          ,
          <addr-line>Montreal, QC</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Computer Science and Software Engineering</institution>
        </aff>
      </contrib-group>
      <fpage>34</fpage>
      <lpage>39</lpage>
      <abstract>
        <p>We present our ongoing development of a semantic infrastructure supporting biofuel research. Part of this effort is the automatic curation of knowledge from the massive amount of information on fungal enzymes that is available in genomics. Working closely with biologists who manually curate the existing literature, we developed ontological NLP pipelines, integrated through Web-based interfaces, to help them in two main ways: spending less time mining the literature for facts, and obtaining richer, semantically linked information. An ongoing challenge is to measure precisely how much the developed semantic technologies benefit the end users and what their overall impact on the quality of the curated data is. We present preliminary evaluation results that show a significant reduction in manual curation time.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Producing sustainable liquid fuels with low environmental impact is one of
the major technological challenges the world is facing today. Industrialized
and developing countries consider biofuels, fuels produced from biomass, as a
promising alternative to fossil-based fuels. Extracting sugars from cellulose to
produce biofuels requires breaking down the cellulose with specific molecules
called enzymes. Therefore, in the current race to replace petroleum-based fuels
with renewable biofuels, discovering the most efficient enzymes for cellulose
degradation is a key challenge.</p>
      <p>
        The largest knowledge source available to biofuel researchers is the PubMed
bibliographic database, containing more than 19 million citations from over 21,000
life science journals. PubMed is linked to other databases, such as Entrez Genome,
which provides access to genomic sequences, and BRENDA, The Comprehensive
Enzyme Information System [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which is the main collection of enzyme functional
data available to the scientific community. A biology researcher querying PubMed
with keywords often collects a long list of relevant papers. Analyzing
this collection means reading all the abstracts and sometimes the full-text papers:
this task is time-consuming and difficult to manage, and significant knowledge can
easily be missed.
      </p>
      <p>
        To address this problem, Natural Language Processing (NLP) and Semantic
Web approaches are increasingly adopted in biomedical research [
        <xref ref-type="bibr" rid="ref10 ref2">2, 10</xref>
        ]. The
work-in-progress we present in this paper focuses on the automatic extraction of
knowledge from the massive amount of information on enzymes in fungi available
from genome research. Text mining systems, like the one we developed here, are
typically evaluated with intrinsic metrics, such as precision and recall. However,
while these metrics can give insight into the accuracy of a system, they do not
necessarily correspond to its extrinsic performance [
        <xref ref-type="bibr" rid="ref1 ref4">1, 4</xref>
        ]: How much does the
system actually improve the tasks performed by users? Thus, in this work we
are also interested in evaluating the impact of our semantic systems on the work
performed by our biologists and on the quality of the curated data.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Project Context and System Architecture</title>
      <p>
        Before we describe our overall architecture and the text mining pipelines, we
briefly introduce the user groups involved and the semantic entities we analyse.
      </p>
      <p>
        User Groups. The identification and the development of effective fungal enzyme
cocktails are key elements of the biorefinery industry. In this context, the manual
curation of fungal genes provides the thorough knowledge required for guiding
research and experiments. The biology researchers involved in this curation are
filling the mycoCLAP database [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which is a searchable database of fungal
genes encoding lignocellulose-active proteins that have been biochemically
characterized. The curators are therefore the first user group of our system. The
biology researchers who make decisions about the experiments to conduct and
the experimenters executing them represent two further user groups. They are
mainly interested in the ability to combine multiple semantic queries over the
curated data, thereby integrating the various knowledge resources.
      </p>
      <p>[Figure: system architecture diagram. Recoverable labels: experimenters, biology researchers, curators, browser, articles, NLP methods, semantic representation, curated database, external databases, Linked Data.]</p>
      <p>
        Semantic Entities. The system we are developing has to support the manual
curation process; therefore, the semantic annotation types have been defined by
the curators according to the information they need to store in the mycoCLAP
database. Entities include information such as organisms, enzymes, assays, genes,
kinetic properties, reactions, substrates, and environmental conditions. To
facilitate semantic discovery, linking, and querying of these concepts across literature
and databases, these entities are modeled in OWL ontologies, which are
automatically populated from documents. As an example, Fig. 1 shows two main entities
encoded in our ontology, organisms [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and enzymes. The ontology is used both
during the text mining process and for querying the extracted information.
      </p>
      <p>
        Semantic Resources. In terms of knowledge sources, the system relies on external
and internal processing resources and ontologies. The Taxonomy database [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
from NCBI is used to initialize the NLP resources that support
organism recognition. BRENDA [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] provides the enzyme knowledge, along with
UniProtKB/Swiss-Prot [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. References to the original sources are integrated
into the curated data. This facilitates semantic connections through standard
Linked Data techniques, e.g., from an organism mention in a research paper to
its corresponding entry in the NCBI Taxonomy database.
      </p>
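      <p>As a minimal sketch of the linking step described above, the following Python fragment turns a recognized organism mention into two subject-predicate-object statements, one of which points to the corresponding NCBI Taxonomy entry. The record layout, the fragment-style subject identifier, the choice of rdfs:label and rdfs:seeAlso, and the browser URL pattern are illustrative assumptions; the paper does not specify the vocabulary or URI scheme that the system emits.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass

# Assumption: a browser-style URL for NCBI Taxonomy entries. The paper does
# not state which URI scheme the system actually uses.
NCBI_TAXONOMY_URL = "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id={taxid}"


@dataclass(frozen=True)
class OrganismMention:
    """An organism mention found in a paper (hypothetical record layout)."""
    document_id: str  # identifier of the source paper
    start: int        # character offset where the mention begins
    end: int          # character offset where the mention ends
    text: str         # surface form as written by the authors
    taxid: int        # NCBI Taxonomy identifier resolved for the mention


def to_triples(mention: OrganismMention) -> list[tuple[str, str, str]]:
    """Describe the mention as (subject, predicate, object) statements."""
    subject = f"{mention.document_id}#char={mention.start},{mention.end}"
    return [
        (subject, "rdfs:label", mention.text),
        (subject, "rdfs:seeAlso", NCBI_TAXONOMY_URL.format(taxid=mention.taxid)),
    ]


if __name__ == "__main__":
    # Illustrative mention; document identifier and offsets are made up.
    example = OrganismMention("paper-001", 120, 137, "Aspergillus niger", 5061)
    for triple in to_triples(example):
        print(triple)
```
      </preformat>
      <p>Keeping the taxonomy identifier next to the text offsets is what allows a query to start from a sentence in a paper and end at an external database entry.</p>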
      <p>System Architecture. With the large number of different user groups and their
diverging requirements, as well as the existing and continuously updated project
infrastructure, we needed to find solutions for incrementally adding semantic
support without disrupting day-to-day work. Our solution deploys a
loosely coupled, service-oriented architecture that provides semantic services through
existing and new clients. To connect these individual services and their results,
we rely on standard semantic data formats, like OWL and RDF, which provide
both loose coupling and semantic integration, as new data can be browsed and
queried as soon as it is added to the framework (Fig. 2).</p>
      <p>
        NLP services are provided by the Semantic Assistants architecture [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], which
facilitates the publication of NLP pipelines through standard Web services with
WSDL descriptions. Users can access these Semantic Assistants services from
their desktop through client plug-ins for common tools, such as the Firefox Web
browser or the OpenOffice word processor.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Text Mining Pipelines</title>
      <p>
        Our text mining pipelines are based on the General Architecture for Text
Engineering (GATE) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. All documents first undergo basic preprocessing steps using
off-the-shelf components, such as tokenization, sentence splitting, and
part-of-speech tagging. Custom pipelines then extract the semantic entities mentioned
above and populate the OWL ontologies using the OwlExporter component. The
same pipeline can be run for automatic (batch) ontology population, embedded
in Teamware (described below) for manual annotation, or brokered to desktop
clients through Web services for literature mining and curation.
      </p>
      <p>
        Organism Recognition. Organism tagging and extraction rely on external
resources that are automatically translated for reuse in our system, thereby
providing users with the ability to update their installation when the NCBI
Taxonomy database changes. Additionally, a custom-built organism ontology,
presented in Fig. 1, formally describes the linguistic structure of organism entities
at different levels of the taxonomic hierarchy [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The GATE pipeline consists of
modules for organism entity detection based on pattern matching against the NCBI
reference taxonomy, providing scientific names and the NCBI Taxonomy Identifier.
Strain mentions are extracted using a specific text tokenization and a machine
learning-based approach.
      </p>
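      <p>The pattern matching against a reference taxonomy can be pictured with the short Python sketch below. It is not the GATE pipeline itself: the two-entry TAXONOMY table is a hypothetical excerpt standing in for resources generated from the NCBI Taxonomy database, the function name find_organisms is an assumption, and strain detection with machine learning is left out.</p>
      <preformat preformat-type="code">
```python
import re

# Hypothetical excerpt of a name-to-identifier table derived from the NCBI
# Taxonomy; a real installation would load the full, regularly updated data.
TAXONOMY = {
    "aspergillus niger": ("Aspergillus niger", 5061),
    "saccharomyces cerevisiae": ("Saccharomyces cerevisiae", 4932),
}


def find_organisms(text: str) -> list[dict]:
    """Return every taxonomy name found in the text, with offsets and taxid."""
    hits = []
    for key, (name, taxid) in TAXONOMY.items():
        # Word boundaries keep a name from matching inside a longer token.
        pattern = re.compile(r"\b" + re.escape(key) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append({
                "start": match.start(),
                "end": match.end(),
                "text": match.group(0),
                "scientific_name": name,
                "taxid": taxid,
            })
    # Report mentions in reading order.
    return sorted(hits, key=lambda hit: hit["start"])


if __name__ == "__main__":
    sentence = ("The lipase from Aspergillus niger was compared with "
                "that of Saccharomyces cerevisiae.")
    for hit in find_organisms(sentence):
        print(hit["text"], hit["taxid"])
```
      </preformat>
      <p>Because the lookup table is data rather than code, regenerating it after a taxonomy release updates the recognizer without touching the matching logic, which mirrors the update path described above.</p>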
      <p>
        Enzyme Recognition. Despite the standards published by the Enzyme
Commission [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], enzymes are often described by authors in various formats.
An enzyme-specific text tokenization, along with grammar rules written in the
JAPE language, analyses tokens with the -ase enzyme suffix. Then, the enzyme
entity recognition relies on knowledge automatically extracted from the BRENDA
database. A pattern matching approach provides enzyme name identification.
The detected enzyme mentions are associated with their EC number, their
Recommended Name, their Systematic Name and their URL on the BRENDA website.
      </p>
      <p>
        Temperature and pH Facts. Temperature and pH mentions are involved in
several biological facts, like the temperature and pH dependence/stability or
the description of the activity and kinetic assay conditions. Our GATE pipeline
contains processing resources (PRs) based on JAPE rules and gazetteer lists of specific vocabulary that
enable the detection of these key mentions at the sentence level.
      </p>
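      <p>The suffix-based enzyme detection and the sentence-level detection of temperature and pH mentions can be approximated in a few lines of Python. This is only a sketch under stated assumptions: regular expressions stand in for the JAPE grammars, the ENZYMES table is a hypothetical two-entry excerpt of BRENDA-derived knowledge, and annotate_sentence is an illustrative name, not part of the described system.</p>
      <preformat preformat-type="code">
```python
import re

# Hypothetical excerpt of a BRENDA-derived lookup table (name to EC number).
ENZYMES = {
    "cellulase": {"ec": "3.2.1.4", "recommended_name": "cellulase"},
    "laccase": {"ec": "1.10.3.2", "recommended_name": "laccase"},
}

# Candidate tokens ending in the -ase suffix, optionally in the plural.
ASE_TOKEN = re.compile(r"\b[A-Za-z][A-Za-z-]*ases?\b")
# A number followed by a degree sign (U+00B0) and "C".
TEMPERATURE = re.compile(r"\b\d+(?:\.\d+)?\s*\u00b0\s*C\b")
# "pH" followed by a number, with an optional "of" in between.
PH_VALUE = re.compile(r"\bpH\s*(?:of\s*)?\d+(?:\.\d+)?")


def annotate_sentence(sentence: str) -> dict:
    """Collect enzyme, temperature and pH mentions found in one sentence."""
    enzymes = []
    for match in ASE_TOKEN.finditer(sentence):
        token = match.group(0).lower()
        # Try the token as written, then without a plural "s".
        entry = ENZYMES.get(token) or ENZYMES.get(token.rstrip("s"))
        # The suffix alone over-generates ("case", "database"), so only
        # candidates confirmed by the lookup table are kept.
        if entry:
            enzymes.append({"text": match.group(0), **entry})
    return {
        "enzymes": enzymes,
        "temperatures": [m.group(0) for m in TEMPERATURE.finditer(sentence)],
        "ph_values": [m.group(0) for m in PH_VALUE.finditer(sentence)],
    }


if __name__ == "__main__":
    example = ("The purified cellulase showed maximal activity at pH 5.0 "
               "and 50 \u00b0C in this case.")
    print(annotate_sentence(example))
```
      </preformat>
      <p>Splitting the work into a permissive candidate pattern and a strict dictionary check follows the two stages described above: the suffix rule proposes tokens, and the curated enzyme knowledge decides which of them are enzyme names.</p>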
    </sec>
    <sec id="sec-4">
      <title>Intrinsic and Extrinsic Evaluation</title>
      <p>As explained above, text mining systems require an evaluation showing their
efficiency and effectiveness, both intrinsically and from an end user's point of
view. In this section, we first discuss the development of the gold standard corpus
and then present preliminary evaluation results of our system.</p>
      <sec id="sec-4-1">
        <title>The Manual Annotation Process</title>
        <p>
          For the intrinsic evaluation, we are building a gold standard corpus of freely
accessible full-text articles by manually annotating them using GATE Teamware [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], a
Web-based management platform for collaborative annotation and curation. The
annotation team is composed of four biology researchers. The researcher in charge
of the curation task and an annotator with a strong background in fungus
literature curation are considered expert annotators. Their inter-annotator
agreement is over 80%, hence their annotation sets are always defined as the
most reliable sets during the adjudication process. The corpus is composed of
ten papers, each related to a class of enzymes. Glycoside hydrolase papers and lipase
papers each represent 40% of the articles, whereas 20% are related to peroxidases.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Intrinsic Evaluation: Precision and Recall</title>
        <p>The correctness of our text mining pipelines is evaluated in terms of precision,
recall and F-measure. The reference is provided by the manually annotated (gold
standard) corpus. The preliminary results on the four most common entities
(Enzyme, Organism, pH and Temperature) are shown in Table 1.</p>
        <p>The impact of the system on the curation and annotation tasks is evaluated in
terms of required time (range and average) per paper, measured in minutes.</p>
        <p>Paper selection. Since the beginning of the curation task, approximately 1000
papers have been examined. The time needed to examine an unannotated full
paper and to decide whether to select it for curation, without any
semantic support, previously ranged from 2 to 3 minutes. With added support
through the text mining services, the required time decreased to 1–2 minutes.</p>
        <p>Paper curation. Among the 1000 examined papers, around 600 were already
selected for curation. The time needed to curate an unannotated full paper, i.e.,
extracting salient facts for entry into the mycoCLAP database, ranged from 30 to
45 minutes for the fully manual workflow. With added semantic support through
the text mining pipelines, the required time decreased to 20–30 minutes.</p>
        <p>Paper annotation. For full paper annotation, we investigated the impact of
different levels of semantic support on the time required to add annotations
(Table 2). All sets have been manually annotated by four annotators. The 4
papers of the first set (SET 1) were annotated without any semantic support.
The second set (SET 2) is composed of 3 papers, which have been pre-annotated
by a degraded version of the system, using only generic tools, such as simple
gazetteer lists, resulting in lower precision and recall. The third set (SET
3) contains 3 papers, pre-annotated using the complete text mining pipelines,
including the specialized tools and external resources as described above.</p>
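        <p>For reference, the intrinsic metrics named at the start of this section can be computed as in the following Python sketch. It assumes exact matching of (start, end, type) annotation tuples and the balanced F-measure (F1); the paper does not state which matching criterion or F-measure weighting was used, and the annotation sets in the usage example are made up for illustration, not taken from the gold standard corpus.</p>
        <preformat preformat-type="code">
```python
def precision_recall_f1(gold: set, predicted: set) -> tuple[float, float, float]:
    """Compare system annotations against gold annotations by exact match."""
    true_positives = len(gold.intersection(predicted))
    # Precision: share of system annotations that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold annotations that the system found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy annotations as (start offset, end offset, entity type).
    gold = {(0, 9, "Enzyme"), (20, 37, "Organism"),
            (50, 56, "pH"), (60, 65, "Temperature")}
    predicted = {(0, 9, "Enzyme"), (20, 37, "Organism"),
                 (70, 75, "Temperature")}
    p, r, f = precision_recall_f1(gold, predicted)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```
        </preformat>
        <p>Exact matching is the strictest option; a lenient variant that accepts overlapping spans would yield higher scores on the same annotations.</p>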
        <p>From the preliminary results, we can conclude that (1) there is a significant
reduction of the average time required for paper selection, curation and annotation
and (2) the level of support has a measurable impact as well.</p>
        <table-wrap id="tab-2">
          <label>Table 2</label>
          <table>
            <thead>
              <tr><th>Set and level of semantic support</th><th>Available tags</th></tr>
            </thead>
            <tbody>
              <tr><td>SET 1 (no semantic support)</td><td>∅</td></tr>
              <tr><td>SET 2 (partial semantic support)</td><td>enzyme, organism, pH, temperature</td></tr>
              <tr><td>SET 3 (full semantic support)</td><td>enzyme, organism, pH, temperature</td></tr>
            </tbody>
          </table>
        </table-wrap>
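        <p>To make the size of the reported time savings concrete, the sketch below compares the midpoints of the time ranges given in the text for paper selection and paper curation. Using the midpoint as a stand-in for the average is an assumption made here for illustration, since only the ranges are stated; the function names are likewise illustrative.</p>
        <preformat preformat-type="code">
```python
def midpoint(low: float, high: float) -> float:
    """Midpoint of a time range, used here as a rough stand-in for the mean."""
    return (low + high) / 2


def relative_reduction(before: tuple, after: tuple) -> float:
    """Fraction of time saved, comparing the midpoints of two ranges."""
    base = midpoint(*before)
    return (base - midpoint(*after)) / base


# Ranges in minutes per paper as stated in the text:
# (without semantic support, with semantic support).
REPORTED = {
    "paper selection": ((2, 3), (1, 2)),
    "paper curation": ((30, 45), (20, 30)),
}

if __name__ == "__main__":
    for task, (before, after) in REPORTED.items():
        saved = relative_reduction(before, after)
        print(f"{task}: {saved:.0%} less time at the midpoint")
```
        </preformat>
        <p>Under this midpoint assumption the reduction is about 40% for paper selection and about 33% for paper curation; the true averages may differ, because the distribution of times within each range is not reported.</p>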
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>We presented our ongoing development of a semantic infrastructure for enzyme
data management. In the context of biofuel research, our system targets the
automatic extraction of knowledge on fungal enzymes from genome research
literature. Preliminary experiments show that semantic support allows for a
significant decrease in manual curation time. However, future work is needed to
evaluate the impact of such a system on the quality of the curated data.</p>
      <p>Acknowledgments. Funding for this work was provided by Genome Canada
and Génome Québec.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alex</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grover</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haddow</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kabadjov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Assisted curation: does text mining really help</article-title>
          .
          <source>In: Pacific Symposium on Biocomputing</source>
          . vol.
          <volume>13</volume>
          , pp.
          <fpage>556</fpage>
          –
          <lpage>567</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ananiadou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McNaught</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Text Mining for Biology And Biomedicine</article-title>
          . Artech House, Inc., Norwood, MA, USA (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tablan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Web-based Collaborative Corpus Annotation: Requirements and a Framework Implementation</article-title>
          . In:
          <source>New Challenges for NLP Frameworks</source>
          . pp.
          <fpage>20</fpage>
          –
          <lpage>27</lpage>
          . ELRA, Valletta, Malta (May 22,
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Caporaso</surname>
            ,
            <given-names>J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deshpande</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fink</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bourne</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>K.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunter</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Intrinsic evaluation of text mining tools may not predict performance on realistic tasks</article-title>
          .
          <source>In: Pacific Symposium on Biocomputing</source>
          . vol.
          <volume>13</volume>
          , pp.
          <fpage>640</fpage>
          –
          <lpage>651</lpage>
          . World Scientific Publishing (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maynard</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tablan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications</article-title>
          .
          <source>In: Proc. 40th Anniversary Meeting of the ACL</source>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Federhen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The Taxonomy Project</article-title>
          . In:
          <string-name>
            <surname>McEntyre</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ostell</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (eds.)
          <source>The NCBI Handbook</source>
          , chap. 4. National Library of Medicine (US), National Center for Biotechnology Information
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <collab>International Union of Biochemistry and Molecular Biology</collab>
          :
          <source>Enzyme Nomenclature 1992</source>
          . Academic Press, San Diego, California (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Murphy</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Powlowski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butler</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsang</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Curation of characterized glycoside hydrolases of fungal origin</article-title>
          .
          <source>Database</source>
          <year>2011</year>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Scheer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grote</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schomburg</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Munaretto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rother</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Söhngen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stelzer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thiele</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schomburg</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>BRENDA, the enzyme information system in 2011</article-title>
          .
          <source>Nucleic Acids Res.</source>
          <volume>39</volume>
          (Database issue),
          <fpage>D670</fpage>
          –
          <lpage>676</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Shadbolt</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>W.:</given-names>
          </string-name>
          <article-title>The semantic web revisited</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>21</volume>
          (
          <issue>3</issue>
          ),
          <fpage>96</fpage>
          –
          <lpage>101</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <collab>The UniProt Consortium</collab>
          :
          <article-title>The Universal Protein Resource (UniProt)</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>37</volume>
          (D),
          <fpage>169</fpage>
          –
          <lpage>174</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Witte</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gitzinger</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Semantic Assistants – User-Centric Natural Language Processing Services for Desktop Clients</article-title>
          .
          <source>In: 3rd Asian Semantic Web Conference (ASWC 2008)</source>
          . LNCS, vol.
          <volume>5367</volume>
          , pp.
          <fpage>360</fpage>
          –
          <lpage>374</lpage>
          . Springer, Bangkok, Thailand (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Witte</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kappler</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>C.J.O.</given-names>
          </string-name>
          :
          <article-title>Ontology Design for Biomedical Text Mining</article-title>
          . In:
          <source>Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences</source>
          , chap. 13, pp.
          <fpage>281</fpage>
          –
          <lpage>313</lpage>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>