<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exposing Provenance Metadata Using Different RDF Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gang Fu</string-name>
          <email>gang.fu@nih.gov</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evan Bolton</string-name>
          <email>bolton@ncbi.nih.gov</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nuria Queralt-Rosinach</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura I. Furlong</string-name>
          <email>lfurlong@imim.es</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vinh Nguyen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amit Sheth</string-name>
          <email>amit@knoesis.org</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olivier Bodenreider</string-name>
          <email>olivier@nlm.nih.gov</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michel Dumontier</string-name>
          <email>michel.dumontier@stanford.edu</email>
        </contrib>
      </contrib-group>
      <abstract>
        <p>A standard model for exposing structured provenance metadata of scientific assertions on the Semantic Web would increase the interoperability, discoverability, reliability, and reproducibility of scientific discourse and evidence-based knowledge discovery. Several Resource Description Framework (RDF) models have been proposed to track provenance. However, provenance metadata may be not only verbose but also significantly redundant. Therefore, an appropriate RDF provenance model should be efficient for publishing, querying, and reasoning over Linked Data. In the present work, we collected millions of pairwise relations between chemicals, genes, and diseases from multiple data sources, and demonstrated the extent of redundancy of provenance information in the life science domain. We also evaluated the suitability of several RDF provenance models for this crowdsourced data set, including the N-ary model, the Singleton Property model, and the Nanopublication model. We examined query performance against three commonly used large RDF stores: Virtuoso, Stardog, and Blazegraph. Our experiments demonstrate that query performance depends on both the RDF store and the RDF provenance model.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Evidence and provenance are key aspects of a healthy scientific discourse. A
standard model to provide structured and interoperable metadata linked to scientific
assertions is of increasing interest [22,16]. The Resource Description Framework
(RDF), the lingua franca for the Semantic Web, offers the building blocks by
which statements can be provided along with their metadata. Structured
metadata, such as whether the resource was manually curated or automatically text
mined from scientific literature, is key to assessing the quality of information. Hence,
a scalable and well-designed RDF-based metadata model is crucial for knowledge
integration.</p>
      <p>
        Specifying the provenance of a single entity can be easily achieved using
existing RDF terminologies such as PROV. However, the specification of the
provenance of a binary or n-ary relation remains non-standard. Several
models for exposing the provenance metadata of relations have been
proposed, including adding provenance annotations to i) an instance of a class that
represents the n-ary relation (N-ary model) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]; ii) an instantiated property, i.e. the
Singleton Property (SP) model [18]; and iii) a graph that contains the relational
assertions, i.e. the Nanopublication model [12]. In the life sciences, the N-ary model
has been used to capture the provenance information for protein-protein
interactions (e.g. the iRefIndex database [20]) and text-mined gene-disease interactions
(e.g. DisGeNET [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]), while the recently proposed SP model [18] has been used
across elements of biomedical and material sciences. Although these models have
been used to represent various data, no study has yet examined the advantages
and disadvantages of all of them using a common dataset.
      </p>
      <p>
        In the present study, we aim to evaluate the consequences of using different
RDF models to capture provenance metadata for life science data. We examine
the number of triples generated and query performance on three RDF stores:
Virtuoso [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], Stardog [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and Blazegraph [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. As provenance
metadata for the relational assertions, we consider the data source, the supporting
scientific publication, and the biological species in which the given assertion holds
true. In addition to the three basic RDF models described above, we also
examine implementations of the so-called cardinal assertion model, which was first
introduced for Nanopublications [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], on the N-ary and SP models, to create a
non-redundant network of assertions. This consideration is particularly important as
there is substantial overlap in the assertions from multiple databases. For
instance, the asserted relation between dexamethasone (PubChem Compound
5743) and glucocorticoid receptor (GR) (NCBI Gene 2908) was mentioned by
four different data sources, but each data source cites an entirely different set
of scientific publications in support of the assertion. This work is crucial for
the efficient implementation of scalable, interoperable, and extensible knowledge
models for open data sources including PubChemRDF [10], Bio2RDF [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and
DisGeNET-RDF [19].
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <sec id="sec-2-1">
        <title>Dataset preparation</title>
        <p>
          We generated a reference dataset of pairwise relations between chemicals, genes,
and diseases from multiple data sources across the life science domain. The
chemical-disease relations were obtained from the National Drug File Reference Terminology
(NDFRT) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], CTD [9], KEGG [13], and SIDER [15]; chemical-gene relations
were obtained from CTD [9], DrugBank [14], KEGG [13], IUPHAR-DB [23], and
ChEMBL [11]; protein-protein relations were obtained from iRefIndex [20] and
BioGRID [24]; gene-disease relations were contributed by DisGeNET [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. All chemicals
were represented using PubChem Compound identifiers (CIDs), all genes were
represented using National Center for Biotechnology Information (NCBI) Gene
identifiers (GIDs), and all diseases were represented using the Unified Medical
Language System (UMLS) Concept Unique Identifiers. The pairwise relations
were normalized using the modified Semantic Network standard vocabulary [21].
The interrelations between biomedical entities (chemicals, genes, and diseases)
constitute a semantic network, and SPARQL queries were used to explore the
network topology in support of evidence-based hypothesis generation. However,
in such a consolidated knowledge base it is fairly common to collect the same
assertion from multiple sources. Hence, additional constraints
were applied in the search strategies.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>RDF model construction</title>
        <p>Five RDF models were studied: the N-ary model with and without cardinal
assertion (Fig. 1), the SP model with and without cardinal assertion (Fig. 2), and the
Nanopublication model (Fig. 3). Only the assertion graphs and the provenance
graphs were considered in the Nanopublication model. In both the N-ary and SP
cardinal assertion variants, the predicate cito:providesAssertionFor is used to
link the cardinal assertion of the pairwise relation to its multiple pieces of evidence (Fig.
1A and 2A). Without cardinal assertion, the pairwise relation is asserted
redundantly by multiple data sources (Fig. 1B and 2B). In the Nanopublication
model variant A, one assertion graph may correspond to one or more
provenance graphs (Fig. 3). In the following comparative analysis, Model
I refers to the N-ary model with cardinal assertion, Model II refers to the
N-ary model without cardinal assertion, Model III refers to the SP model with
cardinal assertion, Model IV refers to the SP model without cardinal assertion,
and Model V refers to the Nanopublication model.</p>
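        <p>The structural difference between the three basic models can be illustrated with a minimal sketch that encodes one pairwise relation and its data source in each of them. The ex: identifiers, the ex:source predicate, and the use of rdf:singletonPropertyOf are assumptions made for illustration only; sio:has-agent and sio:has-target are the predicates named in this paper, and the sketch is not the exact encoding shown in Figs. 1 to 3.</p>
        <preformat><![CDATA[
```python
# Sketch only: every ex: name and rdf:singletonPropertyOf are illustrative
# assumptions; sio:has-agent and sio:has-target are named in the text.

def nary_model(rel_id, agent, relation, target, source):
    """N-ary model: the relation is a node; agent and target hang off it."""
    node = f"ex:rel{rel_id}"
    return [
        (node, "rdf:type", relation),
        (node, "sio:has-agent", agent),
        (node, "sio:has-target", target),
        (node, "ex:source", source),       # provenance on the relation node
    ]


def singleton_property_model(rel_id, agent, relation, target, source):
    """SP model: a unique property instance carries the provenance."""
    prop = f"{relation}_{rel_id}"
    return [
        (agent, prop, target),             # binary structure kept in one triple
        (prop, "rdf:singletonPropertyOf", relation),
        (prop, "ex:source", source),       # provenance on the property instance
    ]


def nanopublication_model(rel_id, agent, relation, target, source):
    """Nanopublication model: quads; provenance describes the assertion graph."""
    assertion_graph = f"ex:assertion{rel_id}"
    provenance_graph = f"ex:provenance{rel_id}"
    return [
        (agent, relation, target, assertion_graph),
        (assertion_graph, "ex:source", source, provenance_graph),
    ]


if __name__ == "__main__":
    args = (1, "ex:chemA", "ex:inhibits", "ex:geneB", "ex:sourceX")
    for build in (nary_model, singleton_property_model, nanopublication_model):
        print(build.__name__, len(build(*args)))
```
]]></preformat>
        <p>In this sketch the N-ary encoding spends two statements on the agent and the target, whereas the SP encoding keeps the original subject-property-object shape in a single statement, and the Nanopublication encoding needs a fourth (graph) position for every statement.</p>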
      </sec>
      <sec id="sec-2-3">
        <title>Query formulation</title>
        <p>An interesting research topic in drug discovery is to determine which proteins
are responsible for eliciting particular drug side effects. We formulated SPARQL
queries to examine this question using different levels of complexity (Q1, Q2)
and provenance constraints (Q3, Q4). Q1 explores the hypothesis that if
chemical A inhibits gene B, gene B interacts with gene C, and gene C is linked to
disease D, then this path can be used to explain the disease or adverse side
effect D caused by chemical A. It should be noted that the observed side effect
can be explained in several ways: either by the aforementioned three-step indirect
path, or by the two-step indirect path involving only the chemical-gene interaction
and the gene-disease association. Therefore, we constructed another query,
Q2, to filter out the diseases that are associated with genes that directly
interact with the given chemical. The first two queries do not take into account
the provenance metadata, since usually only the integrated
assertions are considered for hypothesis generation and knowledge
discovery. Q3 narrows down the search results by applying data source constraints.
Q4 restricts Q1 by the amount of aggregated evidence, such that the query only
considers the pairwise relations in the indirect path that have more than one
supporting literature reference.</p>
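        <p>The logic of Q1, Q2, and Q4 can be sketched over a toy in-memory edge list. This is not the SPARQL used in the study: the entity names, relation labels, and reference identifiers below are hypothetical, interactions are treated as directed for brevity, and Q3 (a data source constraint) would be an analogous filter on a source field.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict

# Toy edges: (subject, relation, object, supporting references). All hypothetical.
EDGES = [
    ("chemA", "inhibits", "geneB", {"ref1", "ref2"}),
    ("geneB", "interacts_with", "geneC", {"ref3", "ref4"}),
    ("geneC", "associated_with", "diseaseD", {"ref5", "ref6"}),
    ("geneB", "associated_with", "diseaseE", {"ref7"}),
    ("geneC", "associated_with", "diseaseE", {"ref8"}),
]


def index(edges, min_refs=1):
    """Map (subject, relation) to objects, keeping edges with enough references."""
    out = defaultdict(set)
    for subj, rel, obj, refs in edges:
        if len(refs) >= min_refs:
            out[(subj, rel)].add(obj)
    return out


def q1(chem, edges, min_refs=1):
    """Three-step path: chemical inhibits B, B interacts with C, C linked to D."""
    idx = index(edges, min_refs)
    diseases = set()
    for gene_b in idx[(chem, "inhibits")]:
        for gene_c in idx[(gene_b, "interacts_with")]:
            diseases |= idx[(gene_c, "associated_with")]
    return diseases


def q2(chem, edges):
    """Q1 minus diseases already explained by the two-step path."""
    idx = index(edges)
    direct = set()
    for gene_b in idx[(chem, "inhibits")]:
        direct |= idx[(gene_b, "associated_with")]
    return q1(chem, edges) - direct


def q4(chem, edges):
    """Q1 restricted to relations with more than one supporting reference."""
    return q1(chem, edges, min_refs=2)


if __name__ == "__main__":
    print(sorted(q1("chemA", EDGES)), sorted(q2("chemA", EDGES)), sorted(q4("chemA", EDGES)))
```
]]></preformat>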
        <p>We carried out Q1 through Q4 on six chemicals that have extensive
biomedical annotations from multiple data sources: propranolol (CID4946), clotrimazole
(CID2812), mitoxantrone (CID4212), risperidone (CID5073), chlorpromazine
(CID2726), and haloperidol (CID3559). There are hundreds of similar compounds
in the integrated dataset and they are of key interest in the context of drug
repurposing and development.</p>
        <p>All queries were run against three RDF stores without further tuning:
open source Virtuoso 7.1, Stardog 2.2, and Blazegraph 1.5. The configuration
allowed up to 16 GB of memory for each RDF store to run queries, which were
performed on a cold cache. The log10-transformed execution times in
milliseconds are shown as boxplots; the averages and standard deviations
of the execution times in seconds are also summarized in the comparative
analysis.</p>
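        <p>The two summaries described above can be sketched as follows. The timings are hypothetical placeholders rather than measured values, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.</p>
        <preformat><![CDATA[
```python
import math
import statistics


def summarize(times_ms):
    """Return log10(ms) values for plotting plus mean and stdev in seconds."""
    log10_ms = [math.log10(t) for t in times_ms]
    seconds = [t / 1000.0 for t in times_ms]
    # Sample standard deviation; assumed, not confirmed by the text.
    return log10_ms, statistics.mean(seconds), statistics.stdev(seconds)


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, one per test chemical.
    example_ms = [120.0, 95.0, 143.0, 110.0, 101.0, 131.0]
    logs, mean_s, stdev_s = summarize(example_ms)
    print([round(v, 2) for v in logs], round(mean_s, 3), round(stdev_s, 3))
```
]]></preformat>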
        <p>The data sets and the SPARQL queries are available at:
http://figshare.com/articles/Provenance_RDF_Models/1399197.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results and Discussions</title>
      <sec id="sec-3-1">
        <title>Data set statistics</title>
        <p>We first compared the total number of triples that each RDF model contains. The
most compact RDF model is the SP model without cardinal assertion (Model IV),
which contains 17,239,427 triples; adding the cardinal assertion to the SP model (Model
III) increased the total number of triples by about 14% to 19,575,298. For the N-ary
models, the cardinal assertion also increased the total number of triples, by about
6%, from 21,445,348 (Model II) to 22,787,218 (Model I). The N-ary model
requires two triples (predicates sio:has-agent and sio:has-target) to represent
the agent and target in a biological process, while the SP model maintains the
previous binary relation structure in only one triple. Hence, with the cardinal
assertion, the N-ary model (Model I) contains 3,211,920 more triples (~16%) than
the SP model (Model III), and without the cardinal assertion,
the N-ary model (Model II) contains even more additional triples (4,205,921) than
the SP model (Model IV). The Nanopublication model is the most
verbose model in this regard, containing 27,605,782 triples distributed over
8,251,238 graphs.</p>
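        <p>The percentages and differences quoted above follow directly from the reported triple counts, as the short check below shows. The counts are copied from the text; the helper name pct_increase is introduced here only for illustration.</p>
        <preformat><![CDATA[
```python
# Triple counts per model, copied from the text.
TRIPLES = {
    "I": 22_787_218,    # N-ary with cardinal assertion
    "II": 21_445_348,   # N-ary without cardinal assertion
    "III": 19_575_298,  # SP with cardinal assertion
    "IV": 17_239_427,   # SP without cardinal assertion
    "V": 27_605_782,    # Nanopublication
}


def pct_increase(new, old):
    """Relative increase of new over old, in percent."""
    return 100.0 * (new - old) / old


sp_cardinal_cost = pct_increase(TRIPLES["III"], TRIPLES["IV"])    # about 14%
nary_cardinal_cost = pct_increase(TRIPLES["I"], TRIPLES["II"])    # about 6%
nary_vs_sp_cardinal = TRIPLES["I"] - TRIPLES["III"]               # extra triples
nary_vs_sp_plain = TRIPLES["II"] - TRIPLES["IV"]                  # extra triples
nary_vs_sp_cardinal_pct = pct_increase(TRIPLES["I"], TRIPLES["III"])  # about 16%

if __name__ == "__main__":
    print(round(sp_cardinal_cost, 1), round(nary_cardinal_cost, 1))
    print(nary_vs_sp_cardinal, nary_vs_sp_plain, round(nary_vs_sp_cardinal_pct, 1))
```
]]></preformat>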
        <p>We also studied the amount of evidence associated with each relational
assertion to illustrate the degree of redundancy of identical pairwise
relations in the life science domain. We only examined object property instances
representing the pairwise relations that were created in the SP models (Model
III and Model IV), as the degree of redundancy is the same across the other RDF
models. The total numbers of unique subjects in the SP models with and without
cardinal assertion are 7,654,605 (Model III) and 4,442,685 (Model IV),
respectively. The difference between the two numbers is the total number
of object property instances created for the cardinal assertions. If
there are multiple cases of evidence for a given assertion, the cardinal assertion
variant may reduce the total number of triples needed to express the same information;
however, if there is only one case of evidence for a given assertion, the cardinal
assertion will increase the total number of triples. Hence, whether the cardinal
assertion can reduce the total number of triples depends on the extent of
redundancy of identical pairwise relations in the data set. Among 3,211,920
cardinal assertions, 2,800,124 (~87%) are associated with only one case of
evidence, 238,558 (~7%) are associated with two cases of evidence, 67,088
(~2%) are associated with three cases of evidence, and 98,625 (~3%)
are associated with more than three cases of evidence. The pairwise
relation between PubChem compound CID5694 and NCBI gene GID5465 is
associated with the largest number of cases of evidence (3,096). Although there
were many redundant assertions from multiple data sources, the majority have
only one piece of supporting evidence. Hence, the increase in the total number of triples
was largely attributable to publication assertions.</p>
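        <p>As a quick consistency check on the figures above, the number of cardinal assertions and the share of each evidence bin can be recomputed from the reported counts. All numbers are copied from the text; the variable and function names are introduced here for illustration only.</p>
        <preformat><![CDATA[
```python
SUBJECTS_WITH_CARDINAL = 7_654_605     # unique subjects, Model III
SUBJECTS_WITHOUT_CARDINAL = 4_442_685  # unique subjects, Model IV
CARDINAL_ASSERTIONS = SUBJECTS_WITH_CARDINAL - SUBJECTS_WITHOUT_CARDINAL

# Cardinal assertions grouped by number of cases of evidence, as reported.
EVIDENCE_BINS = {
    "one": 2_800_124,
    "two": 238_558,
    "three": 67_088,
    "more than three": 98_625,
}


def shares(bins, total):
    """Percentage of the total that falls in each bin."""
    return {name: 100.0 * count / total for name, count in bins.items()}


BIN_SHARES = shares(EVIDENCE_BINS, CARDINAL_ASSERTIONS)
# Assertions not covered by the four bins as printed in the text.
UNBINNED = CARDINAL_ASSERTIONS - sum(EVIDENCE_BINS.values())

if __name__ == "__main__":
    print(CARDINAL_ASSERTIONS, UNBINNED)
    print({name: round(pct, 1) for name, pct in BIN_SHARES.items()})
```
]]></preformat>
        <p>With the counts as printed, the four bins account for 3,204,395 of the 3,211,920 cardinal assertions, and the rounded shares match the approximate percentages given in the text.</p>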
      </sec>
      <sec id="sec-3-2">
        <title>Query performance evaluation</title>
        <p>We undertook a performance evaluation using three RDF databases (see Table
1). With Virtuoso, the SP models with and without cardinal assertion (Model III
and IV) largely outperformed the other models. Q1 and Q2 executed roughly
100 times faster on the SP models than on the N-ary models. Although
Model V yielded performance comparable to Model III and IV in Q1, the
additional filtering constraint made it much slower in Q2. In Q4, Models III,
IV, and V performed similarly, running 10 times faster than Model II and
100 times faster than Model I. In general, Virtuoso performed best using the
SP models. With the Stardog RDF store, the N-ary models and the SP models
were comparable in performance, but they always outperformed the
Nanopublication model. In particular, when the aggregated evidence was considered in Q4,
queries on both the N-ary and SP models, with and without cardinal assertion, ran
over 10 times faster than on the Nanopublication model. Using Blazegraph, the
Nanopublication model generally outperformed the other models. In particular, Q1
and Q2 ran over 10 times faster on Model V than on the other
models.</p>
        <p>Without querying the provenance metadata, the models with cardinal
assertion (Model I and III) always yielded better performance than
the corresponding models without cardinal assertion (Model II and IV, respectively). Hence, if
we remove the redundant identical assertions from various data sources in both the
N-ary and SP models, graph traversal-like queries can be executed much
faster. If we think of conjunctive queries (i.e. graph traversal or inner join) as
performing Cartesian products, the computational costs go up exponentially as
the number of data items increases. Hence, the redundant pairwise relations cost
much more time than the cardinal assertions in Q1 and Q2. However, when
provenance restrictions were considered, the models without cardinal assertion
(Model II and IV) usually performed better, except for Q3 on the SP models
in Stardog and Q4 on both the N-ary and SP models in Blazegraph.
The differences in query performance were usually small, except for Q4
on the N-ary models in Virtuoso and Q3 on both the N-ary and SP
models in Blazegraph. So in general, when provenance restrictions were
considered, the models with and without cardinal assertion were comparable.</p>
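        <p>The cost of redundant assertions in a conjunctive query can be illustrated with a toy count of join results. In this sketch, which is an idealized model and not a description of how any of the tested RDF stores plans its joins, each hop of a path is asserted a given number of times, and every combination of duplicates yields a separate binding, so the result size is the product of the per-hop multiplicities.</p>
        <preformat><![CDATA[
```python
from itertools import product


def path_bindings(copies_per_hop):
    """Count join results for one path whose i-th hop is asserted
    copies_per_hop[i] times (one entry per hop)."""
    hops = [range(n) for n in copies_per_hop]
    # Every combination of one duplicate per hop is a distinct binding.
    return sum(1 for _ in product(*hops))


if __name__ == "__main__":
    # Three-hop path as in Q1: cardinal assertions give one copy per hop.
    print(path_bindings([1, 1, 1]))
    # The same path when each hop is asserted by four data sources.
    print(path_bindings([4, 4, 4]))
```
]]></preformat>
        <p>Under this idealization, a three-hop path whose relations are each asserted by four sources produces 64 bindings where the cardinal assertion variant produces one, which is consistent with the advantage observed for Model I and III in Q1 and Q2.</p>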
        <p>a The average execution times are in the first line, and the standard
deviations are in the second line within parentheses; the best performance has been
highlighted in bold.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this study, we evaluated three existing RDF models and two cardinal assertion
models for representing relations and exposing their provenance metadata. We
examined the effect of each model on overall graph size and query execution time
across three different RDF databases. Since our integrated life science dataset
contained many duplicate assertions, graph traversal can be accomplished
much more efficiently using the cardinal assertion. The redundant assertions
add considerable computational overhead when searching through the integrated
knowledge base for evidence-based hypothesis exploration. Surprisingly, we found
that each RDF store performed best using a different provenance model. A
previous analysis [17] demonstrated that SPARQL queries may be executed in an RDF
store-specific manner. Our results lead to a similar conclusion
and may have contentious implications for the standardization of a provenance
model, which should ideally be software, platform, and system agnostic. A more
extensive analysis with larger benchmark datasets and more query patterns would
be helpful in future studies.</p>
      <p>Acknowledgements This work was initiated at the 2014 BioHackathon in
Fukashima. This research was supported in part by the Intramural Research
Program of the National Library of Medicine. The research leading to these
results has received support from Instituto de Salud Carlos III-Fondo Europeo
de Desarrollo Regional (PI13/00082 and CP10/00524) and the Innovative Medicines
Initiative Joint Undertaking under grant agreement no. 115191 (Open PHACTS),
resources of which are composed of financial contribution from the European
Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA
companies' in kind contribution. The Research Programme on Biomedical Informatics
(GRIB) is a node of the Spanish National Institute of Bioinformatics (INB).</p>
      <p>9. Davis, A.P., Grondin, C.J., Lennon-Hopkins, K., Saraceni-Richards, C., Sciaky,
D., King, B.L., Wiegers, T.C., Mattingly, C.J.: The Comparative Toxicogenomics
Database's 10th year anniversary: update 2015. Nucleic Acids Res 43(Database
issue), D914–20 (2015)</p>
      <p>10. Fu, G., Batchelor, C., Dumontier, M., Hastings, J., Willighagen, E., Bolton, E.:
PubChemRDF: towards the semantic annotation of PubChem compound and
substance databases. J Cheminform 7, 34 (2015)</p>
      <p>11. Gaulton, A., Bellis, L.J., Bento, A.P., Chambers, J., Davies, M., Hersey, A., Light,
Y., McGlinchey, S., Michalovich, D., Al-Lazikani, B., Overington, J.P.: ChEMBL: a
large-scale bioactivity database for drug discovery. Nucleic Acids Res 40(Database
issue), D1100–7 (2012)</p>
      <p>12. Groth, P., Gibson, A., Velterop, J.: The anatomy of a nanopublication. Inf. Serv.
Use 30(1-2), 51–56 (2010)</p>
      <p>13. Kanehisa, M., Goto, S., Sato, Y., Furumichi, M., Tanabe, M.: KEGG for
integration and interpretation of large-scale molecular data sets. Nucleic Acids Res
40(Database issue), D109–14 (2012)</p>
      <p>14. Knox, C., Law, V., Jewison, T., Liu, P., Ly, S., Frolkis, A., Pon, A., Banco, K., Mak,
C., Neveu, V., Djoumbou, Y., Eisner, R., Guo, A.C., Wishart, D.S.: DrugBank
3.0: a comprehensive resource for 'omics' research on drugs. Nucleic Acids Res
39(Database issue), D1035–41 (2011)</p>
      <p>15. Kuhn, M., Campillos, M., Letunic, I., Jensen, L.J., Bork, P.: A side effect resource
to capture phenotypic effects of drugs. Mol Syst Biol 6, 343 (2010)</p>
      <p>16. Machado, C.M., Rebholz-Schuhmann, D., Freitas, A.T., Couto, F.M.: The semantic
web in translational medicine: current applications and future directions. Brief
Bioinform 16(1), 89–103 (2015)</p>
      <p>17. Mironov, V., Seethappan, N., Blonde, W., Antezana, E., Splendiani, A., Kuiper,
M.: Gauging triple stores with actual biological data. BMC Bioinformatics 13 Suppl
1, S3 (2012)</p>
      <p>18. Nguyen, V., Bodenreider, O., Sheth, A.: Don't like RDF reification?: Making
statements about statements using singleton property. In: Proceedings of the 23rd
International Conference on World Wide Web. pp. 759–770. WWW '14, ACM,
Republic and Canton of Geneva, Switzerland (2014),
http://dx.doi.org/10.1145/2566486.2567973</p>
      <p>19. Piñero, J., Queralt-Rosinach, N., Bravo, A., Deu-Pons, J., Bauer-Mehren, A.,
Baron, M., Sanz, F., Furlong, L.I.: DisGeNET: a discovery platform for the dynamical
exploration of human diseases and their genes. Database (Oxford) 2015 (2015)</p>
      <p>20. Razick, S., Magklaras, G., Donaldson, I.M.: iRefIndex: a consolidated protein
interaction database with provenance. BMC Bioinformatics 9, 405 (2008)</p>
      <p>21. Rosemblat, G., Shin, D., Kilicoglu, H., Sneiderman, C., Rindflesch, T.C.: A
methodology for extending domain coverage in SemRep. J Biomed Inform 46(6),
1099–107 (2013)</p>
      <p>22. Sahoo, S., Nguyen, V., Bodenreider, O., Parikh, P., Minning, T., Sheth, A.: A
unified framework for managing provenance information in translational research.
BMC Bioinformatics (2011)</p>
      <p>23. Sharman, J.L., Benson, H.E., Pawson, A.J., Lukito, V., Mpamhanga, C.P.,
Bombail, V., Davenport, A.P., Peters, J.A., Spedding, M., Harmar, A.J.: IUPHAR-DB:
updated database content and new features. Nucleic Acids Res 41(Database
issue), D1083–8 (2013)</p>
      <p>24. Stark, C., Breitkreutz, B.J., Reguly, T., Boucher, L., Breitkreutz, A., Tyers, M.:
BioGRID: a general repository for interaction datasets. Nucleic Acids Res 34(Database
issue), D535–9 (2006)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Blazegraph. http://www.systap.com/rdf/</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>RDF</surname>
          </string-name>
          n-ary. http://www.w3.org/TR/swbp-n-aryRelations/
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Stardog. http://stardog.com/</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. Virtuoso. http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Gibson, A., van Dam, J.C.J., Schultes, E.A., Roos, M., Mons, B.:
          <article-title>Towards computational evaluation of evidence for scientific assertions with nanopublications and cardinal assertions</article-title>
          .
          <source>In: 5th International Workshop on Semantic Web Applications and Tools for Life Sciences (SWAT4LS)</source>
          . pp.
          <volume>28</volume>
          –
          <fpage>30</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Bauer-Mehren</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rautschka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Furlong</surname>
            ,
            <given-names>L.I.</given-names>
          </string-name>
          :
          <article-title>DisGeNET: a Cytoscape plugin to visualize, integrate, search and analyze gene-disease networks</article-title>
          .
          <source>Bioinformatics</source>
          <volume>26</volume>
          (
          <issue>22</issue>
          ),
          <volume>2924</volume>
          –6 (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Belleau</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nolin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tourigny</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rigault</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morissette</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Bio2RDF: towards a mashup to build bioinformatics knowledge systems</article-title>
          .
          <source>Journal of biomedical informatics 41(5)</source>
          ,
          <volume>706</volume>
          –
          <fpage>716</fpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>S.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elkin</surname>
            ,
            <given-names>P.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosenbloom</surname>
            ,
            <given-names>S.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Husser</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bauer</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lincoln</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carter</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Erlbaum</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuttle</surname>
            ,
            <given-names>M.S.:</given-names>
          </string-name>
          <article-title>VA National Drug File Reference Terminology: a cross-institutional content coverage study</article-title>
          .
          <source>Stud Health Technol Inform 107(Pt 1)</source>
          ,
          <volume>477</volume>
          –
          <fpage>81</fpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>