<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Active Curation of Artifacts and Experiments Is Changing the Way Digital Libraries Will Operate</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
<string-name>Bruce R. Childers</string-name>
          <email>childers@cs.pitt.edu</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jack W. Davidson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wayne Graves</string-name>
          <email>graves@acm.org</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernard Rous</string-name>
          <email>rous@acm.org</email>
        </contrib>
        <contrib contrib-type="author">
<string-name>David Wilkinson</string-name>
          <email>dwilk@cs.pitt.edu</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>Department of Computer Science, University of Virginia</institution>
          ,
          <addr-line>Charlottesville, Virginia 22904</addr-line>
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
<p>A new type of “active curation system” is becoming prevalent in computer science that provides executable access to artifacts and experiments behind published results and enables their reuse. These systems allow changing and repeating experiments to understand how an innovation behaves in conditions beyond the ones described in a paper. Active curation systems also enable accountability and accelerate research progress by giving access to complete experimental details. As these systems take hold, it is important to understand their capabilities and how digital libraries (DLs) should be integrated with them. In this position paper, we describe a study underway to explore how best to do this integration. The study uses case studies to understand how the systems and one particular DL, the Association for Computing Machinery's Digital Library, should work together to package, deliver and interact with artifacts and experiments as digital objects that can be executed. This study is a step toward developing the approaches required for production integration of active curation systems and digital libraries.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>Innovation in computer science (CS) typically relies heavily
on experimentation with software tools, data sets, parameters,
and a myriad of other information. These artifacts are used
to implement prototypes, empirically evaluate new ideas and
analyze implications. The artifacts are used in experiments to
produce results. An experiment incorporates all data,
workflows, setups and tools for an empirical study. A result is
the outcome of the study. Results are typically analyzed and
summarized for dissemination of a new idea to the broader
community through peer-reviewed articles.</p>
      <p>
        Within this context of experimentally-driven CS, a
far-reaching trend has started to emerge: there is recognition of
the importance and benefit of gaining access to the artifacts,
experiments and results underlying published articles for
research accountability and credibility (important aspects of the
scientific method). In the United States, federal mandates are
driving the need for this access, such as data management
plans spearheaded by the National Science Foundation (NSF)
and now adopted by the Department of Energy (DOE).
Likewise, the National Institutes of Health (NIH) has put in place
data access mandates. Similar mandates and efforts have
appeared in the European Union, such as the Horizon 2020 Open
Research Data Pilot, and elsewhere. Simultaneously, a
grassroots conversation is advocating and incentivizing improved
experimental quality to increase research productivity through
access to full information and tools behind publications. For
example, Artifact Evaluation (AE) rewards production of
high-quality artifacts and experiments [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In AE, authors
provide the materials used to generate experimental results
for peer review separate from paper review. Several efforts
are underway for curation of digital artifacts and experiments
to similar ends. This trend will gain even more momentum as
CS enters the age of open access and accountability.
      </p>
<p>Specifically, a new type of system, which we call an active
curation system, is emerging to curate digital objects that
can be executed (i.e., run on computational resources) to
facilitate all steps of research, including formation of new
ideas, implementation, experimentation, dissemination, and
archiving. These systems provide direct, executable access
to experiments, which may be shared, used in AE, archived
and/or accessed from a published article. They go beyond
code, result, and experiment repositories, which store and
share fixed/static information (like GitHub or Bitbucket),
to allow modifying and running experiments.</p>
      <p>
        Active curation systems (e.g., DataMill [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], OCCAM [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
and EUDAT [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]) offer a “try before you buy” approach to
online sharing of interactive and modifiable experiments as
digital objects; these systems allow reviewers and readers to
try different parameters, alternative data sets, etc., to create and
run new experiments. The systems provide the computational
capabilities to support executing the original experiments from
a paper, changing the experiments, and running entirely new
ones. The systems may be hosted in the cloud and
organized as centralized, distributed or federated services that
encompass underlying repositories (ideally built with existing
efforts for sharing research data, like MyExperiment [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
Research Object [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], Figshare [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], Zenodo [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], DataVerse [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
and Open Science Framework [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]) and computational
resources. The computational resources may be marshalled
on-demand from cloud providers (e.g., Microsoft Azure, Amazon
AWS, and Google Cloud Platform), community systems (e.g.,
XSEDE [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], CloudLab [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Chameleon [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and PSC
Bridges [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]), or institutional (private) resources. Controls may
be provided to restrict what can be done with an experiment,
such as limiting an experiment’s scale to be runnable in a
short time interval with minimal resources for a “quick look”,
rather than full experimental runs. Active curation systems are
a compelling step in the evolution of research repositories and
community computational systems to tie both together and
then link them to the scholarly record.
      </p>
      <p>Regardless of the specific capabilities and implementation
choices, active curation systems help users understand how
innovations behave under conditions beyond the ones
described in the original papers. By giving executable access to
complete experimental details of an innovation and allowing
interactive modification, active curation systems also enable
accountability and accelerate research. These systems allow
other researchers to extend the original experiments and to use
the original artifacts for comparison. Furthermore, they enable
a new style of scientific dissemination, where the products, i.e.,
the artifacts and experiments themselves, take center stage,
in addition to, or possibly even replacing, the peer-reviewed
article.</p>
      <p>Given the diversity of CS research, it is unlikely that
any one active curation system can serve all needs; every
community has different requirements, ranging from relatively
small algorithmic descriptions (say in R) for data analytics, to
complex, multi-stage workflows in computer systems research,
to instrumented systems in cyber-physical and embedded
computing, to large-scale parallel execution in high-performance
computing (HPC), and everything and anything in between!
DLs must be prepared for this change to handle diverse digital
entities and the ways in which they are identified, described,
reviewed, delivered and manipulated. The researchers using
the platforms and developing the active curation systems
must also be informed about the requirements, standards and
needs of the DLs. Interfaces and metadata will be needed for
interoperability between the DL and the curation systems.</p>
    </sec>
    <sec id="sec-2">
      <title>II. INTEGRATING ACTIVE CURATION SYSTEMS</title>
      <p>To help publishers, such as the Association for Computing
Machinery, the Institute of Electrical and Electronics
Engineers (IEEE), and the Society for Industrial and Applied
Mathematics (SIAM), prepare for active curation systems and
the objects they will curate, we are carrying out a study
to understand the landscape of community and commercial
efforts. The study is designed to be objective, inclusive and
systematic to consider a range of systems relevant to different
communities. Rooted in specific use cases, the study will
assist publishers in developing the approaches and procedures
needed for integration. The case studies will trial specific
active curation systems that are integrated with the Association
for Computing Machinery’s (ACM) digital library, a
preeminent discovery service and archive of CS publications. The
studies may also be used as part of AE at selected conferences
to learn about new review approaches for evaluating software
artifacts and experimental outcomes. The information
gathered will influence artifact review, the DL and the curation
platforms themselves to better support community evaluation,
dissemination of results and interoperability of digital libraries
with external active curation systems. The trials will help test
and refine approaches to determine what does and does not
work, and demonstrate active curation for feedback from the
CS community.</p>
      <p>The study, which is already underway and has made initial
progress (see Section III), has several steps:
1) Identify relevant active curation systems to catalog
capabilities and targeted audiences. The identification of
these systems considers AE and archiving experiments
for access through the DL.
2) Develop “use cases” to test/evaluate the active curation
systems for a spectrum of research communities, ranging
from simple use cases to more complex ones (e.g.,
workflows involving multiple tools and data sets).
3) Evaluate exemplar active curation systems in the context
of the use cases to learn requirements and capabilities,
with an eye toward the impact on both AE and the DL.
4) From the evaluation, identify the primary capabilities
and issues that the DL, developers and authors need to
consider to incorporate and use active curation systems
with digital libraries.
5) Develop guidelines for technical best practices for
active curation systems, preparing software artifacts and
experiments for curation as digital objects that can be
executed, integrating the DL with the systems, and
achieving service guarantees/standards.
6) Determine prototype interfaces to “plug-in” specific
active curation systems into the DL, e.g., readers can
access papers in the DL and click on experiments in
papers and/or associated with DL metadata to access
original experiments, which can then be modified and
re-evaluated.
7) Using the prototype interfaces, integrate the exemplar
systems with the DL to refine and improve interfaces,
learn what does and does not work, and demonstrate the
benefits to the broader community.
8) Seek feedback from the community to refine initial
approaches and interfaces.</p>
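<p>As a hypothetical illustration of step 6, the “plug-in” contract between the DL and an active curation system could be as small as three operations: fetch an experiment linked to a paper, start a (possibly modified) run, and retrieve results with provenance. The sketch below is our own invention for illustration; the class and method names are not an existing ACM DL or curation-system API.</p>

```python
from abc import ABC, abstractmethod

class CurationSystemPlugin(ABC):
    """Hypothetical interface a curation system would implement so the
    DL can hand off experiments associated with a published paper."""

    @abstractmethod
    def fetch_experiment(self, doi: str, experiment_id: str) -> dict:
        """Return metadata for an experiment attached to a paper."""

    @abstractmethod
    def run(self, experiment_id: str, parameters: dict) -> str:
        """Start a (possibly modified) run; return a run handle."""

    @abstractmethod
    def results(self, run_handle: str) -> dict:
        """Return results and provenance for a completed run."""

class InMemoryDemoPlugin(CurationSystemPlugin):
    """Toy implementation used only to show the control flow."""
    def __init__(self):
        self._runs = {}

    def fetch_experiment(self, doi, experiment_id):
        return {"doi": doi, "id": experiment_id, "parameters": {"scale": 1}}

    def run(self, experiment_id, parameters):
        handle = f"{experiment_id}-run-{len(self._runs)}"
        self._runs[handle] = {"status": "done", "parameters": parameters}
        return handle

    def results(self, run_handle):
        return self._runs[run_handle]
```

<p>In this picture, a reader clicking on an experiment in a paper’s DL page would trigger <monospace>fetch_experiment</monospace>, adjust parameters, and call <monospace>run</monospace>; the DL would later pull the outcome back via <monospace>results</monospace>.</p>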
      <p>Although the study’s aim is technically oriented towards
integrating DLs and active curation systems, policy and procedural
issues are inherently being considered to a limited extent.
Likewise, we recognize the importance of facilitating AE by
further integration between submission platforms, DLs and
active curation systems. Although these issues are outside this
study’s scope, they are important to consider in a future study.</p>
      <sec id="sec-2-1">
        <title>A. Study Outcomes</title>
        <p>Several case studies will be the primary outcome from the
study. The case studies will integrate selected exemplar active
curation systems with the DL in trial runs to access specific
digital objects that can be executed on specific computational
resources. The case studies will show how the exemplar
active curation systems can (1) package and deliver artifacts
and experiments as digital objects that can be modified and
executed in an interactive fashion, and (2) provide access and
interact with artifacts and experiments associated with papers.
The case studies will also give insight into how the end-to-end
process should be structured and what technical capabilities
are required.</p>
        <p>While these outcomes are useful to the ACM DL, more
broadly, upon completion of the study, the outcomes will
provide insight to software developers building active
curation systems and other publishers and providers of digital
library services (e.g., IEEE, SIAM, etc.) that wish to also
integrate the systems. The completed study will also assist
AE evaluators and authors in deciding how to package their
artifacts/experiments and what active curation systems are
appropriate to their particular situation. The study will give
understanding to the developers and maintainers of the ACM
DL and the curation systems about the technical capabilities
required to bring together AE, curating digital objects that
can be executed, and archiving that involves interoperability
of several systems. We note that many scientific communities
are facing the same challenges and opportunities as computer
science with software products and experimental results. The
outcomes from this study will be both informed by work in
these other communities, as well as providing insight to those
communities from our experience in experimental computer
science. Relevant outcomes will be made publicly available
for open use/modification by the broader community, including
the specific integrations developed for each use case.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>III. INITIAL WORK ON THE STUDY</title>
      <p>In this section, we give an overview of our initial work in
performing the study and the preliminary outcomes from that
work. We have examined the landscape of available active
curation systems to characterize their capabilities. With the
assistance of an Advisory Group appointed by ACM, we
determined common use cases where these systems might be
used by authors and readers to interact, modify or build upon
published results from a scholarly article in computer science.</p>
      <p>Below, we describe our use cases, why they were selected
and how we plan to implement them. The situations covered
by the use cases are not exhaustive; there are many other
ways in which authors and readers might want to deploy and
interact with software artifacts and experiments beyond these
particular case studies. They are, however, designed to cover
common ways in which readers of proceedings papers from
ACM Special Interest Group (SIG) conferences might access
and use artifacts and experimental results. This position paper
is intended to solicit feedback to improve and refine the use
cases and their implementation.</p>
      <p>Case Study 1: Repeating Comparisons with Varying Datasets</p>
      <p>In experimental CS, a typical evaluation compares different
algorithms (in the broadest sense, i.e., an algorithm might be a
“system”, a “software implementation”, etc.). The comparison
may consider different variants of an algorithm, or be a
comparison of a new technique versus a previous one. Many
metrics might be considered, such as performance, energy,
reliability, memory size, storage capacity, and topic-specific
ones. The purpose of the evaluation is to understand the
behavior, benefits and pitfalls of the different choices. The
evaluation may use several situations, such as different fault
injection scenarios for analysis of resilience. Authors usually
consider situations that achieve broad coverage and address
specific advantages/disadvantages of the algorithms.</p>
      <p>Given the possibly huge space of comparison, it is natural
that authors cannot cover every possibility, nor anticipate every
situation, including ones that are not yet known. After all,
research comparisons are usually about pointing to promising
directions (or avoiding false paths!), rather than specifics
or absolutes. Ideally, the algorithm implementation, with all
experimental situations, should be made available for other
researchers to study more carefully and build upon in new
ways. In this case study, we consider this scenario: a reader
of a published article wants to repeat comparisons under
the original published situations, as well as new ones for
the reader’s own work. We imagine that a reader comes to
the ACM DL, accesses a paper, and then decides to repeat
comparisons in the paper. Further, the reader wants to do the
comparisons with new data sets.</p>
      <p>There are many ways in which this capability might be
achieved. For this first case study, we consider what may
be the simplest approach: the reader downloads packaged
artifacts from the DL and re-runs comparisons on his/her
own computer. We add a twist to this simple situation by
further imagining that the reader works in high-performance
computing and has access to an HPC cluster to run a parallel
algorithm. This case study then covers 1) replicating algorithm
comparisons; 2) packaging, accessing and distributing artifacts
to local computational resources; 3) using unique hardware
resources (a cluster) to run packaged artifacts; and 4) repeating
experiments with different data sets.</p>
      <p>
        To implement this case study, we selected a candidate
application and active curation system. The application is a parallel
DNA assembly algorithm that processes very large graph
structures, which is similar to many big data applications [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
It is an interesting application because the associated article
describes a classic A vs. B algorithmic comparison. It also
operates on big data sets that exceed one terabyte, introducing
a challenge in setting up and repeating an experiment. The
software implementation of the algorithm is available and
uses the widely-adopted MPI parallel programming
environment. For the active curation system, we selected Collective
Knowledge (CK) [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. CK can “wrap” an experiment
into a packaged archive that, when extracted, automatically
re-runs experiments, pulling down data sets as required from
independent data repositories. The data sets required for the
DNA algorithm are contained in an open-access repository,
from which CK can retrieve them. CK also makes it easy to use
data sets other than the ones used in the original evaluation. It is
open source software with strong support from its development
team. Lastly, because the artifact requires particular hardware
resources (a cluster with sufficient memory and core count)
and software requirements (MPI library), the ACM DL is
being enhanced to display the detailed re-run requirements.
      </p>
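<p>The wrap-and-re-run behavior described above can be pictured with a small, schematic manifest. This is only an illustration of the general idea, under our own assumptions; it is not CK’s actual package format, command set, or repository URL scheme.</p>

```python
# Schematic manifest a packaged experiment archive might record so that
# extraction can re-run the experiment, pulling data sets on demand.
# (Illustrative only; not Collective Knowledge's real format.)
MANIFEST = {
    "program": "dna-assembly",
    "command": "mpirun -np {procs} ./assemble {dataset}",
    "datasets": [
        {"name": "contigs-1tb",
         "url": "https://data.example.org/contigs-1tb"},  # hypothetical URL
    ],
    "defaults": {"procs": 64, "dataset": "contigs-1tb"},
}

def build_command(manifest, **overrides):
    """Substitute recorded defaults, or reader-supplied overrides such as
    a new data set, into the recorded command line."""
    params = {**manifest["defaults"], **overrides}
    return manifest["command"].format(**params)
```

<p>Re-running the original comparison uses the defaults; repeating it with a different data set is a one-argument override, e.g. <monospace>build_command(MANIFEST, dataset="my-new-reads")</monospace>.</p>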
      <sec id="sec-3-1">
        <title>Case Study 2: Sharing and Modifying Experiments</title>
        <p>This second case study examines another common scenario:
modifying an experimental configuration to understand how
an innovation behaves in a situation different from what was
evaluated in the original paper. Access will be given to the
original software used for evaluation, along with all
experiment metadata (e.g., configurations, data sets, benchmarks,
etc.). A reader should then be able to modify and run the
experiments to test new conditions.</p>
        <p>This case study looks at the other end of the spectrum of
case study 1. It considers an active curation system that is
seamlessly and tightly integrated into the ACM DL. The DL
will extract and serve digital entities that can be manipulated
from a paper’s display web page. The extracted digital entities
might highlight critical results from the paper, such as a
summary graph of performance for some new algorithm. A
reader could then interact with that graph, changing initial
conditions and running new experiments, which would then
become part of the archive associated with that paper. The
provenance of experiments would be extracted from the active
curation system and kept with new experimental runs through
the DL. While the DL would provide this “overview access”,
to dig deeper, the reader would then access the artifacts
and experiments directly through the active curation system.
In essence, this case study considers an interactive paper
displayed and manipulated via the DL. Detailed access would
be supported by a hand-off to the active curation system
itself. Because the active curation system and the DL will
dynamically interact, the curation system needs to be online
(in the cloud), rather than locally hosted as in case study 1.</p>
        <p>
          We identified a candidate paper [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] from ACM SIGSAC
(security, audit and control) for this case study. The paper
describes probabilistic actor modeling to analyze the
expressiveness and overhead costs associated with various access
controls when used in specific applications. A simulator is
used that implements the probabilistic model. Because the
modeling has many design choices and the simulator generates
a huge number of results, it is a good candidate to illustrate
the benefits of interactive access. It is also a good candidate
due to the size of the design space; the paper cannot consider
nor anticipate every parameter choice that a reader may wish
to explore (which varies with the application of the access
control). For this case study, we need an active curation system
that supports interactivity, workflows, provenance, cloud
hosting, and results visualization. Our candidate is OCCAM [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ],
which is a prototype system to support the full life-cycle
of research from initial innovation, to paper reviewing, to
artifact evaluation, to preservation and subsequent access and
modification of experiments. The ACM DL will be enhanced
to interact with OCCAM to extract highlighted results and allow
changes to parameters, which are passed back to OCCAM for
new runs of the simulator. Results would then be delivered
from OCCAM to the DL as newly added digital entities (e.g.,
graphs that can be manipulated) accessible from the paper’s
display page but clearly distinguished from the author’s own
work via their own provenance display.
        </p>
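<p>One way to picture the hand-off just described is as a small request/response exchange between the DL and the curation system. The payload shapes below are purely illustrative assumptions of ours, not OCCAM’s actual API; the field names and the <monospace>occam://</monospace>-style result URI are hypothetical.</p>

```python
def make_rerun_request(paper_doi, figure_id, changed_params):
    """Build a (hypothetical) request the DL could send to the curation
    system when a reader edits a parameter behind an interactive graph."""
    return {
        "paper": paper_doi,
        "entity": figure_id,           # e.g., a summary performance graph
        "parameters": changed_params,  # reader's modified initial conditions
    }

def record_result(request, result_uri):
    """Attach a new run to the paper's archive with provenance that
    distinguishes reader-generated runs from the authors' originals."""
    return {
        "paper": request["paper"],
        "entity": request["entity"],
        "result": result_uri,
        "provenance": {"origin": "reader-run",
                       "parameters": request["parameters"]},
    }
```

<p>The provenance record is what lets the DL display reader-generated graphs alongside a paper while keeping them clearly separated from the authors’ own results.</p>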
        <p>Case Study 3: Deriving a New Artifact from an Existing One</p>
        <p>In this third case study, we consider the situation where
an existing software artifact is modified to derive a new
software artifact. When access is given to an innovation, the
software implementation can be changed to build on it with
new capabilities. Alternatively, a new idea might be compared
against a previous one, tackling the same issue. The original
implementation can be changed with the new idea, allowing
for fair comparison in the same software framework. The new
implementation may then be deployed along with the original
one through an active curation system. The integration of case
study 3 will be similar to case study 2, except support will
now be required to allow modifying the software artifact itself,
tracking and displaying provenance of source changes, and
adding the new artifact to the digital library.</p>
        <p>This case study will use a prototype commercial system,
which can host a software artifact and associated experiments
from a paper; facilitate and track modifications to the source
code and experiments; and store/retrieve the new changes. The
system also permits comparing results before and after the
source code change, delivering digital entities to the DL for
interactive display with the paper.</p>
        <p>
          Our candidate software artifact for this case study comes
from the computer architecture community (SIGMICRO and
SIGARCH). The artifact is a simulator of a computer memory
sub-system. The simulator will be modified to support a
technique that permutes memory addresses to reduce contention
in the memory for improved performance. The technique
is based on a classic XOR-address remapping method [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ],
which requires only a small number of source code changes
to implement. The modification will be made to the simulator
and new experiments will be run, both of which will be
recorded and made available by the active curation system.
For the simulator, we plan to use DRAMsim2 [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], which
can faithfully model a range of memory architectures, such
as DDR4 and DDR3 memory. Using the simulator, a new
memory sub-system will be created that has XOR-address
mapping. The simulator will be run with traces from several
benchmark programs to generate results that compare memory
performance without and with XOR remapping. Using the
commercial system, graphs will be generated and extracted
that can be incorporated into the ACM DL for interactivity and
further source and experiment modification and experimental
runs.
        </p>
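<p>The XOR remapping mentioned above is simple to sketch: the bank index bits of an address are XORed with an equal-width slice of higher-order (row) address bits, so strided accesses that would pile onto a single bank are spread across banks. The bit positions below are arbitrary choices for illustration, not DRAMsim2’s configuration.</p>

```python
def bank_index(addr, bank_bits=3, bank_shift=6, xor=False, row_shift=16):
    """Return the memory bank an address maps to.

    Conventional mapping uses bits [bank_shift, bank_shift + bank_bits).
    With xor=True, those bits are XORed with a same-width slice of the
    row address (starting at row_shift), permuting the bank assignment.
    """
    mask = (1 << bank_bits) - 1
    bank = (addr >> bank_shift) & mask
    if xor:
        bank ^= (addr >> row_shift) & mask
    return bank

# Addresses strided by the row size all collide on bank 0 conventionally...
addrs = [i << 16 for i in range(8)]
conventional = [bank_index(a) for a in addrs]        # [0, 0, ..., 0]
# ...but the XOR mapping spreads them across all 8 banks.
remapped = [bank_index(a, xor=True) for a in addrs]  # [0, 1, ..., 7]
```

<p>This is why the technique needs only a handful of source changes in a simulator: the bank-decode function is altered, while row and column decoding are untouched.</p>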
      </sec>
    </sec>
    <sec id="sec-4">
      <title>IV. COMMUNITY ENGAGEMENT AND ASSESSMENT</title>
      <p>This effort is engaging the computer science community
through the ACM’s Publications Board and Special Interest
Group Governing Board. The effort is being conducted in
concert with an ACM Task Force on Data, Software and
Reproducibility in Publication, convened by the Publications
Board. This task force aims to investigate and understand
the needs for new services and capabilities in the ACM DL
to support the changing landscape of review and scholarly
dissemination practices and methods. Through this task force,
community feedback and guidance is being incorporated in
the study. The ACM SIG Governing Board also established a
second task force on issues about reproducibility in published
results, which has also influenced the study. Feedback from
the broader community will also be sought through the SIG
Governing Board and the chairs of SIGs. We expect the study’s
outcomes will be reviewed by both the Publications Board and
the SIG Governing Board.</p>
      <p>Importantly, we will make the pilot integrations for the case
studies available to members of several SIGs for community
feedback in “trial runs”, where SIG members are invited
to try out the case studies. Through these trials, we will
solicit feedback with a questionnaire. The questionnaire will
be designed to gather information useful for assessing how
much effort authors expect and are willing to expend to
encapsulate artifacts and experiments as digital objects for
active curation, the degree to which authors believe that
executable, interactive access will accelerate research agendas
and improve accountability, and the benefits that authors and
readers expect from the access. We will also solicit input
from survey participants about what kinds of
capabilities are needed from new services in the DL that
incorporate digital objects that can be executed.</p>
      <p>Finally, we will make videos of demonstrations of the case
studies. The videos will show the perspective of both authors
and readers of papers carrying out actions in the case studies
(e.g., packaging materials into digital objects that can be
executed and interacting with the packaged digital objects).
We will use the videos to demonstrate the case studies for
further feedback from the ACM community.</p>
    </sec>
    <sec id="sec-5">
      <title>V. SUMMARY</title>
      <p>Traditional digital libraries and their publications will need
to change to accommodate new methods of identifying,
describing and disseminating research outcomes, including the
software, data sets, benchmarks, configurations and other
information that are used in experimental evaluation, right
along with user-generated content. These emerging methods
represent a type of “active curation system” that allows
interactively modifying, deriving and running new experiments
beyond the ones described by a paper. In this paper, we present
a study (currently underway) with the ACM Digital Library
to understand how to integrate active curation systems and
to demonstrate their benefits to the experimental computer
science research community. The study’s outcomes will
provide valuable insight to authors, developers of active curation
systems, and publishers for preparing articles of the future.</p>
    </sec>
    <sec id="sec-6">
      <title>VI. ACKNOWLEDGMENTS</title>
      <p>The ACM Advisory Group overseeing this effort includes
Jack Davidson, University of Virginia (ex officio); Juliana
Freire, New York University; Sheila Morrissey, Ithaka S+R;
Bernard Rous, ACM; Michela Taufer, University of Delaware;
and Alex Wade, Microsoft.</p>
      <p>The ACM Task Force on Data, Software and
Reproducibility in Publication is led by Michael Lesk of Rutgers and Alex
Wade of Microsoft Research. The ACM SIG Governing Board
Replication Taskforce is led by Simon Harper.</p>
      <p>The application for case study 1 was suggested by Michela
Taufer. Stephen Harrell, Paul Peltz and John Johnson also
gave helpful insight into this application for the case study.
Grigori Fursin provided software improvements to Collective
Knowledge that are useful for case study 1. Debashis Gangley,
Adam Lee, Daniel Mossé, and Bill Garrison contributed to
the integration of the access control simulator with OCCAM,
which will be used in case study 2.</p>
      <p>This project is funded by the Alfred P. Sloan Foundation and
conducted in conjunction with the Association for Computing
Machinery. Some material in this document is based in part
upon work supported by the National Science Foundation
under grant numbers ACI-1535232 and CNS-1305220. Any
opinions, findings, and conclusions or recommendations
expressed in this material are those of the author(s) and do not
necessarily reflect the views of the NSF.</p>
    </sec>
    <sec id="sec-7">
      <title>VII. DISCLOSURE</title>
      <p>Bruce Childers and David Wilkinson are the principal
investigator and developer, respectively, of OCCAM. Because
OCCAM is part of the study, an Advisory Group appointed
by ACM is overseeing the effort to manage potential conflict
of interest.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] “Artifact Evaluation,” http://www.artifact-eval.org.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Krishnamurthi</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Vitek</surname>
          </string-name>
          , “
          <article-title>The real software crisis: Repeatability as a core value</article-title>
          ,”
          <source>Commun. ACM</source>
          , vol.
          <volume>58</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>34</fpage>
          -
          <lpage>36</lpage>
          , Feb.
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-C.</given-names>
            <surname>Petkovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Reidemeister</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Fischmeister</surname>
          </string-name>
          , “
          <article-title>Datamill: Rigorous performance evaluation made easy</article-title>
          ,” in
          <source>Proc. of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE)</source>
          , Prague, Czech Republic,
          <year>April 2013</year>
          , pp.
          <fpage>137</fpage>
          -
          <lpage>149</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] “Open Curation for Computer Architecture Modeling,” http://occam.cs.pitt.edu.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] “
          <article-title>EUDAT: the collaborative Pan-European infrastructure providing research data services, training and consultancy for Researchers, Research Communities and Research Infrastructures and Data Centers</article-title>
          ,” https://www.eudat.eu/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Roure</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Goble</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Stevens</surname>
          </string-name>
          , “
          <article-title>The design and realisation of the virtual research environment for social sharing of workflows</article-title>
          ,”
          <source>Future Generation Computer Systems</source>
          , vol.
          <volume>25</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>561</fpage>
          -
          <lpage>567</lpage>
          ,
          <year>2009</year>
          . [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0167739X08000939
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Belhajjame</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Garijo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gamble</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hettne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Palma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Mina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gómez-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bechhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Goble</surname>
          </string-name>
          , “
          <article-title>Using a suite of ontologies for preserving workflow-centric research objects,”</article-title>
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          , vol.
          <volume>32</volume>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>42</lpage>
          ,
          <year>2015</year>
          . [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1570826815000049
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] “figshare - Credit for all your research,” https://figshare.com.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] “Zenodo. Research. Shared.” http://zenodo.org.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Crosas</surname>
          </string-name>
          , “
          <article-title>The dataverse network: An open-source application for sharing, discovering and preserving data</article-title>
          ,”
          <source>D-Lib Magazine</source>
          , vol.
          <volume>17</volume>
          ,
          <year>2011</year>
          . [Online]. Available: http://www.dlib.org/dlib/january11/crosas/01crosas.html
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          [11] “
          <article-title>Open Science Framework - A scholarly commons to connect the entire research cycle</article-title>
          ,” http://osf.org.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Towns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cockerill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Foster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Gaither</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Grimshaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hazlewood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lathrop</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lifka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Peterson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Roskies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Scott</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Wilkins-Diehr</surname>
          </string-name>
          , “
          <article-title>XSEDE: Accelerating scientific discovery</article-title>
          ,”
          <source>Computing in Science and Engineering</source>
          , vol.
          <volume>16</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>62</fpage>
          -
          <lpage>74</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Eide</surname>
          </string-name>
          , and The CloudLab Team, “
          <article-title>Introducing CloudLab: Scientific infrastructure for advancing cloud architectures and applications</article-title>
          ,”
          <source>USENIX ;login:</source>
          , vol.
          <volume>39</volume>
          , no.
          <issue>6</issue>
          , Dec.
          <year>2014</year>
          . [Online]. Available: https://www.usenix.org/publications/login/dec14/ricci
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14] “
          <article-title>Chameleon: A configurable experimental environment for large-scale cloud research</article-title>
          ,” https://www.chameleoncloud.org/.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N.</given-names>
            <surname>Nystrom</surname>
          </string-name>
          , “
          <article-title>Building Bridges to the Future</article-title>
          ,” insideHPC,
          <year>June 2016</year>
          . [Online]. Available: http://insidehpc.com/2016/06/building-bridges-to-the-future/
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Flick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Aluru</surname>
          </string-name>
          , “
          <article-title>A parallel connectivity algorithm for De Bruijn graphs in metagenomic applications</article-title>
          ,” in
          <source>Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '15</source>
          . New York, NY, USA: ACM,
          <year>2015</year>
          , pp.
          <fpage>15:1</fpage>
          -
          <lpage>15:11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] “Collective Knowledge,” https://github.com/ctuning/ck.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>G.</given-names>
            <surname>Fursin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Miceli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lokhmotov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gerndt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Baboulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Malony</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chamski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Novillo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Vento</surname>
          </string-name>
          , “
          <article-title>Collective Mind: Towards Practical and Collaborative Auto-Tuning</article-title>
          ,”
          <source>Scientific Programming</source>
          , vol.
          <volume>22</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>309</fpage>
          -
          <lpage>329</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Garrison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Hinrichs</surname>
          </string-name>
          , “
          <article-title>An actor-based, application-aware access control evaluation framework</article-title>
          ,” in
          <source>Proceedings of the 19th ACM Symposium on Access Control Models and Technologies</source>
          , ser.
          <source>SACMAT '14</source>
          . New York, NY, USA: ACM,
          <year>2014</year>
          , pp.
          <fpage>199</fpage>
          -
          <lpage>210</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , “
          <article-title>A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality</article-title>
          ,” in
          <source>Proceedings of the 33rd Annual ACM/IEEE International Symposium on Microarchitecture, ser. MICRO 33</source>
          . New York, NY, USA: ACM,
          <year>2000</year>
          , pp.
          <fpage>32</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosenfeld</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cooper-Balis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Jacob</surname>
          </string-name>
          , “
          <article-title>DRAMSim2: A cycle accurate memory system simulator</article-title>
          ,”
          <source>Computer Architecture Letters</source>
          , vol.
          <volume>10</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>19</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>