<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CSE Framework: A UIMA-based Distributed System for Configuration Space Exploration</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elmer Garduno</string-name>
          <email>elmerg@sinnia.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zi Yang</string-name>
          <email>ziy@cs.cmu.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Avner Maiberg</string-name>
          <email>amaiberg@cs.cmu.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Collin McCormack</string-name>
          <email>collin.w.mccormack@boeing.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yan Fang</string-name>
          <email>yan.fang@oracle.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Nyberg</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Boeing Company</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Carnegie Mellon University</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Oracle Corporation</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>To efficiently build data analysis and knowledge discovery pipelines, researchers and developers tend to leverage available services and existing components by plugging them into different phases of the pipelines, and then spend hours to days seeking the right components and configurations that optimize system performance. In this paper, we introduce the CSE framework, a distributed system for a parallel experimentation test bed based on UIMA and uimaFIT, which is general and flexible to configure, and powerful enough to sift through thousands of option combinations to determine which represents the best system configuration.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        To efficiently build data analysis and knowledge discovery “pipelines”, researchers and
developers tend to leverage available services and existing components by plugging them into different
phases of the pipelines [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and then spend hours seeking the components and configurations
that optimize the system performance. The Unstructured Information Management Architecture
(UIMA) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] provides a general framework for defining common types in the information system
(type system), designing pipeline phases (CPE descriptor), and further configuring the
components (AE descriptor) without changing the component logic. However, there is no easy way to
configure and execute a large set of combinations without repeated executions, while evaluating
the performance of each component and configuration.
      </p>
      <p>
        To fully leverage existing components, it must be possible to automatically explore the space
of system configurations and determine the optimal combination of tools and parameter settings
for a new task. We refer to this problem as configuration space exploration, which can be formally
defined as a constraint optimization problem. A particular information processing task is defined
by a configuration space, which consists of the <italic>m</italic><sub><italic>t</italic></sub> components that define each of the <italic>n</italic> phases with
corresponding configurations. Given a limited total resource capacity C and input set S,
configuration space exploration (CSE) aims to find the trace (a combination of configured components)
within the space that achieves the highest expected performance without exceeding C total cost.
Details on the mathematical definition and proposed greedy solutions can be found in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
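      <p>The objective sketched above can be written compactly. The following is our paraphrase of the formulation in [<xref ref-type="bibr" rid="ref6">6</xref>] (symbols as in the text: a trace picks one configured component per phase, C is the resource capacity, S the input set); the precise cost model is specified in that paper, not here:</p>

```latex
% A trace selects one configured component c_t for each phase t:
%   \tau = (c_1, c_2, \dots, c_n), \qquad c_t \in \{c_{t,1}, \dots, c_{t,m_t}\}
% CSE seeks the trace with the highest expected performance on input set S,
% subject to the total execution cost staying within the capacity C:
\tau^{*} = \arg\max_{\tau}\; \mathbb{E}\big[\, P(\tau, S) \,\big]
\quad \text{subject to} \quad \sum_{t=1}^{n} \mathrm{cost}(c_t, S) \;\le\; C
```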
      <p>
        In this paper, we introduce the CSE framework implementation, a distributed system for
parallel experimentation test bed based on UIMA and uimaFIT [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In addition, we highlight the results
from two case studies where we applied the CSE framework to the task of building biomedical
question answering systems.
      </p>
    </sec>
    <sec id="sec-1b">
      <title>2 The CSE Framework</title>
      <p>We highlight some features of the implementation in this section. Source code, examples,
documentation, and other resources are publicly available on GitHub (http://oaqa.github.io/). To benefit developers who are
already familiar with the UIMA framework, we have developed a CSE tutorial in alignment with the
examples in the official UIMA tutorial.</p>
      <p>Declarative descriptors. To leverage the CSE framework, users need to specify how the
components should be organized in the pipeline, which values need to be explored for each
component configuration, what the input set is, and which evaluation metrics should be applied.
Analogous to a typical UIMA CPE descriptor, components, configurations, and collection
readers in the CSE framework are declared in extended configuration descriptors based on
the YAML format. An example of the main pipeline descriptor and a component descriptor is
shown in Figure 1.</p>
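      <p>To illustrate the flavor of such descriptors, here is a minimal, hypothetical sketch; all class names, keys, and values below are our own illustrations, not taken from the actual distribution, whose exact syntax is documented in the GitHub repository:</p>

```yaml
# Hypothetical main pipeline descriptor in the YAML-based style described
# in the text. Component class names and parameter values are illustrative.
configuration:
  name: bioqa-example

collection-reader:
  class: example.TrecQuestionReader      # provides the input set

pipeline:
  - class: example.KeytermExtractor      # phase 1: a single component
  - options:                             # phase 2: alternative retrievers
      - class: example.LuceneRetriever
        cross-opts:
          hitlist-size: [50, 100, 200]   # parameter values to sweep
      - class: example.IndriRetriever

post-process:
  - class: example.DocMAPEvaluator       # phase-level evaluation metric
```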
      <p>Architecture. Each pipeline can contain an arbitrary number of AnalysisEngines, declared
using the class keyword or by inheriting configuration options from other components by name.
Combinations of components are configured using an options block, and parameter combinations
within a component are configured in a cross-opts block. To take full advantage of the CSE
framework's capabilities, users inherit from cse.phase, a CAS multiplier that provides option
multiplexing, intermediate resource persistence, and resource management for long-running
components. The architecture also supports grouping options into sub-pipelines as a convenient way
of reducing the configuration space for combinations whose performance is already known.</p>
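      <p>The expansion implied by options and cross-opts blocks is essentially a cross product. A minimal Python sketch of how such a space multiplies out (our illustration of the idea, not the framework's actual code; the dictionary shapes are our own):</p>

```python
from itertools import product

def expand_component(comp):
    """Expand one component's cross-opts block into concrete configurations."""
    opts = comp.get("cross-opts", {})
    keys = list(opts)
    # Cross product over every parameter's value list (one config if empty).
    for values in product(*(opts[k] for k in keys)):
        yield {"class": comp["class"], "params": dict(zip(keys, values))}

def expand_phase(components):
    """An options block: the union of every component's configurations."""
    return [cfg for comp in components for cfg in expand_component(comp)]

def expand_traces(phases):
    """A trace picks one configured component per phase (cross product)."""
    return [list(trace) for trace in product(*(expand_phase(p) for p in phases))]

phases = [
    [{"class": "KeytermExtractor"}],
    [{"class": "LuceneRetriever", "cross-opts": {"hits": [50, 100, 200]}},
     {"class": "IndriRetriever"}],
]
traces = expand_traces(phases)
# 1 keyterm config x (3 Lucene configs + 1 Indri config) = 4 traces
print(len(traces))
```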
      <p>Evaluation. Unlike a traditional scientific workflow management system, CSE emphasizes
the evaluation of component performance, based on user-specified evaluation metrics and
gold-standard outputs at each phase. In addition, the framework keeps track of the performance of all
executed traces, which allows inter-component evaluation and automatic tracking of performance
improvements over time.</p>
      <p>Automatic data persistence. To support further error analysis and reproduction of
experimental results, intermediate data (CASes) and evaluation results are kept in a repository
accessible from any trace at any point during the experiment. To prevent duplicate execution of traces, the
system keeps track of all the execution traces and recovers those CASes whose predecessors have
already been executed. The overall results from experiments are also kept in a historical database
to allow researchers to keep track of performance improvements over time.</p>
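      <p>The duplicate-avoidance idea can be sketched as a cache keyed by a stable hash of the trace prefix. This is our illustration only, not the framework's actual persistence layer; the class and function names are hypothetical:</p>

```python
import hashlib
import json

class TraceCache:
    """Sketch of duplicate-trace avoidance: a trace prefix maps to a stable
    key, so a CAS produced by an already-executed prefix is recovered
    instead of recomputed."""

    def __init__(self):
        self._store = {}  # key -> persisted intermediate result (CAS stand-in)

    @staticmethod
    def key(trace_prefix):
        # Deterministic serialization yields the same key for the same
        # configured-component prefix across runs.
        blob = json.dumps(trace_prefix, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()

    def run(self, trace_prefix, execute):
        k = self.key(trace_prefix)
        if k in self._store:
            # Predecessors already executed: recover the stored result.
            return self._store[k], True
        result = execute(trace_prefix)
        self._store[k] = result
        return result, False

cache = TraceCache()
calls = []

def execute(prefix):
    calls.append(prefix)                      # count real executions
    return "cas-%d" % len(prefix)

prefix = [{"class": "A"}, {"class": "B", "params": {"size": 100}}]
first, hit1 = cache.run(prefix, execute)
second, hit2 = cache.run(prefix, execute)     # identical prefix: recovered
print(hit1, hit2, len(calls))
```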
      <p>Configurable selection and pruning. If gold-standard data is provided for a certain phase,
then components up to that phase can be evaluated. Given the measured cost of executing the
provided components, components can be ranked, selected, or pruned for evaluation and
optimization of subsequent phases. The component ranking strategy can be configured by the user;
several heuristic strategies are implemented in the open-source software.</p>
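      <p>One plausible such heuristic, sketched in Python purely for illustration (the actual strategies shipped with the framework may differ), ranks evaluated components by performance per unit of measured cost and keeps the top k:</p>

```python
def rank_and_prune(components, k, strategy=None):
    """Rank evaluated components with a user-supplied strategy and keep
    the top k for exploration of subsequent phases (illustrative only)."""
    if strategy is None:
        # Default heuristic: measured performance per unit of cost.
        strategy = lambda c: c["score"] / c["cost"]
    ranked = sorted(components, key=strategy, reverse=True)
    return ranked[:k]

evaluated = [
    {"name": "retriever-a", "score": 0.42, "cost": 10.0},
    {"name": "retriever-b", "score": 0.40, "cost": 2.0},
    {"name": "retriever-c", "score": 0.10, "cost": 1.0},
]
survivors = rank_and_prune(evaluated, k=2)
print([c["name"] for c in survivors])
```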
      <p>Distributed architecture. We have extended the CSE framework implementation to execute
the task set in parallel on a distributed system using JMS. The components and configurations
are deployed to the cluster beforehand. Execution, fault tolerance, and bookkeeping are
managed by a master server. In addition, we leverage UIMA-AS capabilities to execute specific
configurations in parallel as separate services directly from the pipeline.</p>
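      <p>The master/worker execution model can be sketched as follows. This is our stand-in using threads and an in-process queue; the actual framework distributes tasks over JMS and UIMA-AS services, and the function names here are hypothetical:</p>

```python
import queue
import threading

def run_master(traces, num_workers=4):
    """The master enqueues every trace, workers pull and execute them,
    and results flow back to a central collection for bookkeeping."""
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            trace = tasks.get()
            if trace is None:        # sentinel: shut this worker down
                break
            # Stand-in for actually running the configured UIMA pipeline.
            results.put((trace, "executed:" + trace))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for trace in traces:
        tasks.put(trace)
    for _ in threads:
        tasks.put(None)              # one sentinel per worker
    for t in threads:
        t.join()

    bookkeeping = {}
    while not results.empty():
        trace, outcome = results.get()
        bookkeeping[trace] = outcome
    return bookkeeping

done = run_master(["trace-%d" % i for i in range(10)])
print(len(done))
```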
    </sec>
    <sec id="sec-2">
      <title>3 Building biomedical QA Systems via CSE</title>
      <p>As a case study, we apply the CSE framework to the problem of building effective biomedical
question answering (BioQA) systems on two different tasks.</p>
      <p>In one case, we employ the topic set and benchmarks, including gold-standard answers and
evaluation metrics, from the question answering task of the TREC Genomics Track 2006, as
well as commonly used tools, resources, and algorithms cited by participants. The implemented
components, benchmarks, and task-specific evaluation methods are included in a domain-specific layer
named BioQA, which was plugged into the BaseQA framework.</p>
      <p>
        The configuration space was explored with the CSE framework, automatically yielding an
optimal configuration of the given components which outperformed published results for the same
task. We compare the settings and results for the experiment with the official TREC 2006
Genomics test results for the participating systems in Table 1. We can see that the best system
derived automatically by the proposed CSE framework can outperform the best participating system
in terms of both DocMAP and PsgMAP, with fewer, more basic components. This experiment
ran on a 40-node cluster for 24 hours, allowing the execution of 200K components over 2,700
execution traces. More detailed analysis can be found in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        We also used the CSE framework to automatically configure a different type of biomedical
question answering system for the QA4MRE (Question Answering for Machine Reading
Evaluation) task at CLEF. The CSE framework identified a better combination, which achieved a 59.6%
performance gain over the original pipeline. Details can be found in the working notes paper [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>4 Related Work</title>
      <p>
        Previous work has been done in this area, in particular DKPro Lab [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a flexible, lightweight
framework for parameter-sweep experiments, and the U-Compare [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] framework, an evaluation
platform for running tools on text and comparing components, which generates statistics and
instance-based visualizations of outputs.
      </p>
      <p>One of the main advantages of the CSE framework is that it allows the exploration of very
large configuration spaces by distributing the experiments over a cluster of workers and collecting
the statistics in a centralized way. Another advantage of the CSE framework is that configurations
can have arbitrary nesting levels, as long as they form a DAG, by using sub-pipelines. Results
can also be compared end-to-end at a global level to understand overall performance trends over time.
One area where CSE could take advantage of the aforementioned frameworks is in adding a
graphical UI for pipeline configuration, better visualization tools for combinatorial and
instance-based comparison, and a more expressive language for workflow definition.</p>
    </sec>
    <sec id="sec-4">
      <title>5 Conclusion &amp; Future Work</title>
      <p>In this paper, we present a UIMA-based distributed system to solve a common problem in rapid
domain adaptation, referred to as configuration space exploration. It features declarative
descriptors, evaluation, automatic data persistence, global resource caching, configurable component
selection and pruning, and a distributed architecture. As a case study, we applied the CSE
framework to build a biomedical question answering system, which incorporated the benchmark from
the TREC Genomics QA task, and the results showed the effectiveness of the CSE framework.</p>
      <p>We are planning to adapt the system to a wide variety of interesting information processing
problems, to facilitate rapid domain adaptation, system building, and evaluation for the
community. For educational purposes, we are also interested in adopting the CSE framework as an experiment
platform to teach students principled ways to design, implement, and evaluate information
systems.</p>
      <p>Acknowledgement. We thank Leonid Boystov, Di Wang, Jack Montgomery, Alkesh Patel, Rui
Liu, Ana Cristina Mendes, Kartik Mandaville, Tom Vu, Naoki Orii, and Eric Riebling for their
contributions to the design and development of the system and valuable suggestions on the paper.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ferrucci</surname>
          </string-name>
          et al.
          <article-title>Towards the Open Advancement of Question Answering Systems</article-title>
          .
          <source>Technical report, IBM Research</source>
          , Armonk, New York,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. R. E. de Castilho and
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          .
          <article-title>A lightweight framework for reproducible parameter sweeping in information retrieval</article-title>
          .
          <source>In Proceedings of the DESIRE'11 workshop</source>
          , New York, NY, USA, Oct.
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ferrucci</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Lally</surname>
          </string-name>
          .
          <article-title>UIMA: an architectural approach to unstructured information processing in the corporate research environment</article-title>
          .
          <source>Nat. Lang. Eng.</source>
          ,
          <volume>10</volume>
          (
          <issue>3-4</issue>
          ), Sept.
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>P.</given-names>
            <surname>Ogren</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Bethard</surname>
          </string-name>
          .
          <article-title>Building test suites for UIMA components</article-title>
          .
          <source>In Proceedings of the SETQA-NLP 2009 workshop</source>
          , Boulder, Colorado,
          <year>June 2009</year>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Nyberg</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitamura</surname>
          </string-name>
          .
          <article-title>Building an optimal QA system automatically using configuration space exploration for QA4MRE'13 tasks</article-title>
          .
          <source>In Proceedings of CLEF 2013</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Garduno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Maiberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>McCormack</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Nyberg</surname>
          </string-name>
          .
          <article-title>Building optimal information systems automatically: Configuration space exploration for biomedical information systems</article-title>
          .
          <source>In Proceedings of the CIKM'13</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Baumgartner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>McCrohon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ananiadou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hunter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Tsujii</surname>
          </string-name>
          .
          <article-title>U-Compare: share and compare text mining tools with UIMA</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>25</volume>
          (
          <issue>15</issue>
          ):
          <fpage>1997</fpage>
          -
          <lpage>1998</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>