<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using GVF for Clinical Annotation of Personal Genomes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Barry Moore</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shawn Rynearson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fiona Cunningham</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Graham Ritchie</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karen Eilbeck</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Biomedical Informatics, University of Utah</institution>
          ,
          <addr-line>Salt Lake City, Utah</addr-line>
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Human Genetics, University of Utah</institution>
          ,
          <addr-line>Salt Lake City, Utah</addr-line>
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Ensembl Variation Group, EMBL-EBI</institution>
          ,
          <addr-line>Genome Campus, Hinxton</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Accurately describing the contents of next-generation sequencing (NGS) results is vital to both research and clinical analysis of genomic data. Genomics and medicine use different, often incompatible terminologies and standards to describe sequence variants and their functional effects. This creates an information bottleneck that prevents efficient translation of genome-scale NGS information into the clinic. While the Variant Call Format (VCF) has met some of these challenges with regard to describing the results of variant calling pipelines, it lacks the structure needed for detailed annotation of the consequences of sequence alterations. To incorporate genomic results into electronic health records (EHR), the results must also be defined in ways that are compatible with existing medical informatics systems. The Genome Variation Format (GVF) is an extension of the existing genome annotation format GFF3, which uses ontologies to capture the semantic nature of the information on sequence features. GVF uses the Sequence Ontology (SO) to define the type of sequence alteration, the genomic features that are changed and the effect of the change. We have extended and remodeled the Sequence Ontology to include and define more terms that describe the consequence of a variant upon genomic features in support of the Ensembl variation databases. GVF represents genome annotations for clinical applications using existing EHR standards as defined by the international standards consortium Health Level 7. This means that GVF can describe the information that defines genetic tests, allowing seamless incorporation of genomic data into pre-existing EHR systems. Here we demonstrate the power of GVF to describe, to exchange, and to empower clinical interpretation of personal genome data through an extension of the GVF specification called GVFClin.
The Sequence Ontology Project maintains and updates the specification and provides the underlying structure that describes sequence features, sequence alterations and variant effects and their relationships to each other. The specification is available on the web at http://www.sequenceontology.org/resources/gvfclin.html.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Background</title>
      <p>
        Next generation sequencing (NGS) technologies have provided an enormous expansion in our
understanding of the landscape of genetic variation [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] as well as the impact of that variation on
human health [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3–5</xref>
        ]. These datasets create a significant burden in computational analysis and data
storage, but established workflows for analysis are emerging [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and well-established data formats
exist for each stage of the process. The original base calls from the sequencer are converted to
FASTQ files [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] that contain the sequence data; the SAM format [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] captures the alignment of the
sequence to a reference genome and the Variant Call Format [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] has become widely adopted by
variant calling tools to report variants and the information needed to call them. However, knowing
the type and genomic location of a sequence change is just the first step in understanding its clinical
or biological consequences. Variant annotation then begins the process of adding
knowledge about the structural and functional consequences of those variants through their impacts
on other sequence features and ultimately on phenotype.
      </p>
      <p>
        In the context of medicine, variant data must flow smoothly and reliably from the sequencer to the
physician, yet formidable barriers to this flow of information currently exist. Significant efforts
have been undertaken to standardize the description of genetic variants [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ] and the HGVS
nomenclature has done much to unify the notation of clinical variants in the literature. However, NGS
is providing a wealth of new information about the types of genetic variants that exist
[12] and the types of features that those variants impact [13], and thus ad hoc descriptions of variants
and their effects persist. There is a need for a file format that provides the structure necessary not
only to describe sequence variants, but also to bridge the gap between genomics and medicine by
providing the structure necessary to capture clinically relevant variant data in a format compatible
with EHR standards.
      </p>
      <p>The Genome Variation Format (GVF) [14] is a variant file format for the detailed annotation of
genetic variation. GVF is a community-supported format that uses established ontologies such as the
Sequence Ontology [15] to describe the variant data. GVF does not replace existing variant
nomenclature systems such as HGVS [16] and ISCN [17], which provide effective ways to
unambiguously describe individual variants in the literature. Instead, GVF provides the infrastructure to
support inclusion of these nomenclatures along with detailed variant annotation in a format capable
of supporting genome-scale variant data. GVF is used in the community for exchange of variant
annotations (Ensembl: ftp://ftp.ensembl.org/pub/release-67/variation/gvf/ and dbVar:
ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_assembly/NCBI36/gvf/) and is
compatible with existing GFF3 software [18, 19] as well as emerging domain-specific tools [20, 21].</p>
    </sec>
    <sec id="sec-2">
      <title>Implementation</title>
      <p>Structurally, GVF is a text-based, tab-delimited format modeled on the existing and widely adopted
Generic Feature Format, GFF3. GVF describes both file-wide metadata, through the use of
pragmas, and detailed information about individual variants. A very simple GVF file with a
single variant is shown in Figure 1. The first four lines contain file-wide directives, while the fifth line
describes an SNV on chromosome 16. Here, the reference sequence is C, with a heterozygous
variant genotype of T,C. The variant causes a missense_variant in the intersecting mRNA. The SO
is used three times to type this annotation: the sequence alteration (SNV), the effect
(missense_variant) and the intersected feature (mRNA).</p>
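      <p>As a rough illustration of the column layout described above, the following Python sketch splits
one GVF feature line into its nine GFF3-style columns and its attribute tags. The example line is not the
record from Figure 1: the source name, coordinates and identifiers are hypothetical placeholders, and the
layout assumed for the Variant_effect value (effect term, variant index, feature type, feature ID) is our
reading of the GVF specification and should be checked against it. URL-decoding of escaped characters is
omitted.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def parse_gvf_line(line):
    """Split one GVF feature line into its nine GFF3-style columns."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        raise ValueError("expected 9 tab-delimited columns")
    keys = ["seqid", "source", "type", "start", "end",
            "score", "strand", "phase"]
    record = dict(zip(keys, columns[:8]))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds semicolon-separated tag=value pairs;
    # a tag may carry several comma-separated values.
    attributes = {}
    for pair in columns[8].split(";"):
        if not pair:
            continue  # skip the empty piece after a trailing semicolon
        tag, _, value = pair.partition("=")
        attributes[tag] = value.split(",")
    record["attributes"] = attributes
    return record


# Hypothetical line: source, coordinates and IDs are placeholders.
EXAMPLE = "\t".join([
    "chr16", "example_caller", "SNV", "1000", "1000", ".", "+", ".",
    "ID=var_1;Variant_seq=T,C;Reference_seq=C;"
    "Variant_effect=missense_variant 0 mRNA example_mRNA_1;",
])

record = parse_gvf_line(EXAMPLE)
print(record["type"], record["attributes"]["Variant_seq"])
```
]]></preformat>
      <p>Keeping every attribute value as a list, even when only one value is present, lets heterozygous
genotypes such as T,C and single-valued tags such as Reference_seq be handled by the same code path.</p>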
      <sec id="sec-2-1">
        <title>Describing sequence alterations and their consequences</title>
        <p>Sequence alterations are the changes observed in biological sequence when compared to a reference
genome. In the SO there are 30 kinds of sequence_alteration, ranging from very general types such
as substitution to very specific types such as purine_to_pyrimidine_transversion. The relationships
defined by the ontology allow users and software to infer more general terms from more specific
ones; for example, a purine_to_pyrimidine_transversion is an SNV. Figure 2 shows a summary of the
most general sequence_alteration terms. The consequences of such alterations fall into two classes:
functional variants and structural variants. Annotating functional variants requires a deep
understanding of the underlying biology, while annotations of structural variants can typically be
inferred from the reference genome and other genomic features that intersect the variant. The
Ensembl Variation group has worked with the SO to produce a classification of the variants found in
their database that will allow their users to effectively search variants and their effects.</p>
        <p>In the SO, the sequence alteration and the effects of the alteration are separated, both in the
ontology and in the annotation. For example, historically a small deletion may be referred to as a
microindel, whereas a much larger deletion might be described as a copy number variant (CNV). In the SO,
however, a single term, deletion, is used to describe all instances where a region of sequence is
removed. The kinds of sequence_alteration are shown in Figure 2. The effect of the deletion on the
structure of the genome is either a kind of feature_variant, whereby the internals of a feature such
as an exon are changed, or a feature_ablation, where a region comprising one or more features is
removed. Thus the effect of a small deletion is annotated using the appropriate child of
feature_variant, such as frameshift_truncation, and the effect of a large deletion is annotated with the
appropriate child term of feature_ablation, such as transcript_ablation. The sequence_variant term and
the child terms that categorize the effect of a sequence alteration are depicted in Figure 3.</p>
        <p>The majority of the sequence alterations annotated by the EBI group cause feature_variants. These
feature variants are shown in Figure 4, where the terms used in EBI annotations are highlighted in
blue. There are four main subtypes: upstream_gene_variant, downstream_gene_variant,
gene_variant and regulatory_region_variant. Of these terms, gene_variant has 77 direct and
indirect subtypes and includes most of the terms that describe structural sequence variants caused by
substitutions and small insertions and deletions. This portion of the Sequence Ontology contains terms
with multiple parents, to allow for effective querying of the annotations. For example, the term
stop_retained_variant is both a synonymous_variant and a terminator_codon_variant. Users are
thus able to query the Ensembl databases for all terminator codon variants or all synonymous
variants. Variant genome annotations for 19 organisms, typed using the SO and formatted as GVF, are
available within the Ensembl databases (http://www.ensembl.org/) and for download
(ftp://ftp.ensembl.org/pub/release-67/variation/gvf/).</p>
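        <p>The inference described above, in which a specific term implies its more general parents and a
term with multiple parents matches a query on either parent, can be sketched as a walk over is_a links.
This is a minimal sketch and not a description of how the Ensembl databases implement querying. The IS_A
table holds only the two examples given in the text, and the direct link from
purine_to_pyrimidine_transversion to SNV may collapse intermediate terms present in the full ontology;
the function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# Toy is_a table: child term mapped to its direct parents.
# Only relationships stated in the text are included (an assumption
# made for illustration; the real ontology is far larger).
IS_A = {
    "purine_to_pyrimidine_transversion": ["SNV"],
    "stop_retained_variant": ["synonymous_variant",
                              "terminator_codon_variant"],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    found = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(is_a.get(parent, []))
    return found


def matches(annotated_term, query_term):
    """True if an annotation typed with `annotated_term` answers a query
    for `query_term`, either directly or through an ancestor."""
    return (annotated_term == query_term
            or query_term in ancestors(annotated_term))


print(matches("stop_retained_variant", "synonymous_variant"))
print(matches("stop_retained_variant", "terminator_codon_variant"))
```
]]></preformat>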
      </sec>
      <sec id="sec-2-2">
        <title>Electronic health record compliant data with GVFClin</title>
        <p>GVF was initially developed for exchange of variant annotations in personal genomes. To empower
clinical use of personal genomic data, we have specified the format to adhere to existing EHR
standards defined by the HL7 (http://www.hl7.org) clinical genomics working group, including
LOINC® (http://www.loinc.org) and the SNOMED [22], RxNorm [23] and HGVS [24]
vocabularies and nomenclature. Use of Locus Reference Genomic (LRG) [25] sequences provides a
stable genomic sequence reference set within these standards, which stabilizes the description of
variants relative to permanent sequence feature coordinates. We have added 14 additional attributes
(Table 1) to support annotation of clinical variants and refer to this extension of the standard as
GVFClin. The extensions that define a GVFClin document may be found online at
http://www.sequenceontology.org/resources/gvfclin.html.</p>
        <p>Clin_HGVS_protein=NP_001128727.1:p.Val209Ile;</p>
        <p>An interpretation of the pathogenicity of the given sequence_alteration with regard to the
assessed disease, with values constrained by the answer list associated with LOINC code 53037-8:
Positive, Negative, Inconclusive, Failure. Example: Clin_disease_interpret=Positive;</p>
        <p>An interpretation of the metabolism rate due to a given sequence_alteration with regard to the
assessed drug, with values constrained by the answer list associated with LOINC code 53040-2:
Ultrarapid metabolizer, Extensive metabolizer, Intermediate metabolizer, Poor metabolizer.
Example: Clin_disease_interpret=Ultrarapid metabolizer;</p>
        <p>An interpretation of the efficacy of a drug due to the sequence_alteration, with values
constrained by the answer list associated with LOINC code 51961-1: Resistant, Responsive, Presumed
resistant, Presumed responsive, Unknown Significance, Benign, Presumed Benign, Presumed
non-responsive. Example: Clin_drug_efficacy_interpret=non-responsive;</p>
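        <p>The answer-list constraint described above lends itself to a simple membership check. The sketch
below is illustrative only: the two value sets are transcribed from the text above rather than retrieved
from LOINC, and the names ANSWER_LISTS and is_allowed are hypothetical helpers, not part of any GVFClin
tool.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# Allowed values keyed by LOINC code, copied from the descriptions above.
# A production validator would load the answer lists from LOINC itself.
ANSWER_LISTS = {
    "53037-8": {"Positive", "Negative", "Inconclusive", "Failure"},
    "53040-2": {"Ultrarapid metabolizer", "Extensive metabolizer",
                "Intermediate metabolizer", "Poor metabolizer"},
}


def is_allowed(loinc_code, value):
    """True if `value` belongs to the answer list for `loinc_code`.

    Unknown codes have an empty answer list, so every value is rejected.
    """
    return value in ANSWER_LISTS.get(loinc_code, set())


print(is_allowed("53037-8", "Positive"))
print(is_allowed("53037-8", "Poor metabolizer"))
```
]]></preformat>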
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusions</title>
      <p>Next generation sequencing technologies have provided unprecedented opportunities for low-cost
and large-scale analysis of human genetic variation and its consequences. The ability of the
emerging field of personal genomics to provide genome-wide information on genetic variation for
an individual promises more accurate and effective health care. The ability to deliver on this
promise is currently hampered by the inability of existing formats to annotate genome-scale genetic
variation data in a way that is compatible with EHRs. The Genome Variation Format builds on an
established genome annotation standard, with additional structure for describing the genetic
variation in personal genomes.</p>
      <p>GVFClin provides an additional layer of constraints designed to make compliant documents
readily interpretable in a clinical context compatible with EHR standards. The terms used by GVF
and GVFClin to describe sequence alterations, their effects and the affected sequence features are
constrained by the Sequence Ontology through an open, community-supported process. The
Vertebrate Genomics group at the EBI and the dbVar group at the NCBI have adopted GVF for
distribution of genetic variation data. In addition to the existing software tools that support the GFF3
format (and thus by extension support the fully compatible GVF format), domain-specific software
tools have been published that natively support GVF files.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>This work was supported by the National Human Genome Research Institute [5R01HG004341 to
KE]. We would like to thank Matthew Hurles at the Wellcome Trust Sanger Institute for his insight
on the annotation of large structural variants.</p>
    </sec>
    <sec id="sec-5">
      <title>References</title>
      <p>12. Scherer SW, Lee C, Birney E, Altshuler DM, Eichler EE, Carter NP, Hurles ME, Feuk L: Challenges and standards in integrating surveys of structural variation. Nature Genetics 2007, 39:S7–15.</p>
      <p>13. Myers RM, Stamatoyannopoulos J, Snyder M, Dunham I, Hardison RC, Bernstein BE, Gingeras TR, Kent WJ, Birney E, Wold B, Crawford GE: A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biology 2011, 9:e1001046.</p>
      <p>14. Reese MG, Moore B, Batchelor C, Salas F, Cunningham F, Marth GT, Stein L, Flicek P, Yandell M, Eilbeck K: A standard variation file format for human genome sequences. Genome Biology 2010, 11:R88.</p>
      <p>15. Eilbeck K, Lewis SE, Mungall CJ, Yandell M, Stein L, Durbin R, Ashburner M: The Sequence Ontology: a tool for the unification of genome annotations. Genome Biology 2005, 6:R44.</p>
      <p>16. Cotton RGH, Horaitis O: Human Genome Variation Society. In Nature Encyclopedia of the Human Genome. London: Nature Publishing Group; 2003:361–362.</p>
      <p>17. ISCN 2009: An International System for Human Cytogenetic Nomenclature. Basel: Karger AG; 2009:138.</p>
      <p>18. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, Fuellen G, Gilbert JGR, Korf I, Lapp H, Lehväslaiho H, Matsalla C, Mungall CJ, Osborne BI, Pocock MR, Schattner P, Senger M, Stein LD, Stupka E, Wilkinson MD, Birney E: The Bioperl toolkit: Perl modules for the life sciences. Genome Research 2002, 12:1611–8.</p>
      <p>19. Generic Model Organism Database.</p>
      <p>20. Song T, Hwang K-B, Hsing M, Lee K, Bohn J, Kong SW: gSearch: a fast and flexible general search tool for whole-genome sequencing. Bioinformatics 2012.</p>
      <p>21. Yandell M, Huff C, Hu H, Singleton M, Moore B, Xing J, Jorde LB, Reese MG: A probabilistic disease-gene finder for personal genomes. Genome Research 2011, 21:1529–42.</p>
      <p>22. Stearns MQ, Price C, Spackman KA, Wang AY: SNOMED clinical terms: overview of the development process and project status. Proceedings / AMIA Annual Symposium 2001:662–6.</p>
      <p>23. Nelson SJ, Zeng K, Kilbourne J, Powell T, Moore R: Normalized names for clinical drugs: RxNorm at 6 years. Journal of the American Medical Informatics Association 2011, 18:441–8.</p>
      <p>24. Horaitis O, Cotton RGH: The challenge of documenting mutation across the genome: the human genome variation society approach. Human Mutation 2004, 23:447–52.</p>
      <p>25. Dalgleish R, Flicek P, Cunningham F, Astashyn A, Tully RE, Proctor G, Chen Y, McLaren WM, Larsson P, Vaughan BW, Béroud C, Dobson G, Lehväslaiho H, Taschner PE, den Dunnen JT, Devereau A, Birney E, Brookes AJ, Maglott DR: Locus Reference Genomic sequences: an improved basis for describing human DNA variants. Genome Medicine 2010, 2:24.</p>
      <p>26. Seal RL, Gordon SM, Lush MJ, Wright MW, Bruford EA: genenames.org: the HGNC resources in 2011. Nucleic Acids Research 2011, 39:D514–9.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <collab>1000 Genomes Project Consortium</collab>
          :
          <article-title>A map of human genome variation from population-scale sequencing</article-title>
          .
          <source>Nature</source>
          <year>2010</year>
          ,
          <volume>467</volume>
          :
          <fpage>1061</fpage>
          -
          <lpage>73</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>MacArthur</surname>
            <given-names>DG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balasubramanian</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frankish</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morris</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walter</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jostins</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Habegger</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pickrell</surname>
            <given-names>JK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montgomery</surname>
            <given-names>SB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Albers</surname>
            <given-names>CA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>ZD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conrad</surname>
            <given-names>DF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lunter</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zheng</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ayub</surname>
            <given-names>Q</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DePristo</surname>
            <given-names>MA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Banks</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Handsaker</surname>
            <given-names>RE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosenfeld</surname>
            <given-names>JA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fromer</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jin</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mu</surname>
            <given-names>XJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Khurana</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ye</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kay</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saunders</surname>
            <given-names>GI</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suner</surname>
            <given-names>M-M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunt</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnes</surname>
            <given-names>IHA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amid</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carvalho-Silva</surname>
            <given-names>DR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bignell</surname>
            <given-names>AH</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Snow</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yngvadottir</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bumpstead</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cooper</surname>
            <given-names>DN</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xue</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Romero</surname>
            <given-names>IG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gibbs</surname>
            <given-names>RA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCarroll</surname>
            <given-names>SA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dermitzakis</surname>
            <given-names>ET</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pritchard</surname>
            <given-names>JK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barrett</surname>
            <given-names>JC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harrow</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hurles</surname>
            <given-names>ME</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gerstein</surname>
            <given-names>MB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tyler-Smith</surname>
            <given-names>C</given-names>
          </string-name>
          :
          <article-title>A systematic survey of loss-of-function variants in human protein-coding genes</article-title>
          .
          <source>Science</source>
          <year>2012</year>
          ,
          <volume>335</volume>
          :
          <fpage>823</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bamshad</surname>
            <given-names>MJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            <given-names>SB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bigham</surname>
            <given-names>AW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tabor</surname>
            <given-names>HK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Emond</surname>
            <given-names>MJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nickerson</surname>
            <given-names>DA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shendure</surname>
            <given-names>J</given-names>
          </string-name>
          :
          <article-title>Exome sequencing as a tool for Mendelian disease gene discovery</article-title>
          .
          <source>Nature Reviews Genetics</source>
          <year>2011</year>
          ,
          <volume>12</volume>
          :
          <fpage>745</fpage>
          -
          <lpage>55</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ng</surname>
            <given-names>SB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buckingham</surname>
            <given-names>KJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bigham</surname>
            <given-names>AW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tabor</surname>
            <given-names>HK</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dent</surname>
            <given-names>KM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huff</surname>
            <given-names>CD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shannon</surname>
            <given-names>PT</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jabs</surname>
            <given-names>EW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nickerson</surname>
            <given-names>DA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shendure</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bamshad</surname>
            <given-names>MJ</given-names>
          </string-name>
          :
          <article-title>Exome sequencing identifies the cause of a mendelian disorder</article-title>
          .
          <source>Nature Genetics</source>
          <year>2010</year>
          ,
          <volume>42</volume>
          :
          <fpage>30</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Rope</surname>
            <given-names>AF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Evjenth</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xing</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnston</surname>
            <given-names>JJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swensen</surname>
            <given-names>JJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            <given-names>WE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huff</surname>
            <given-names>CD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bird</surname>
            <given-names>LM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carey</surname>
            <given-names>JC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Opitz</surname>
            <given-names>JM</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            <given-names>CA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schank</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fain</surname>
            <given-names>HD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robison</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dalley</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chin</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>South</surname>
            <given-names>ST</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pysher</surname>
            <given-names>TJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jorde</surname>
            <given-names>LB</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hakonarson</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lillehaug</surname>
            <given-names>JR</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Biesecker</surname>
            <given-names>LG</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yandell</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arnesen</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lyon</surname>
            <given-names>GJ</given-names>
          </string-name>
          :
          <article-title>Using VAAST to identify an X-linked disorder resulting in lethality in male infants due to N-terminal acetyltransferase deficiency</article-title>
          .
          <source>American Journal of Human Genetics</source>
          <year>2011</year>
          ,
          <volume>89</volume>
          :
          <fpage>28</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Koboldt</surname>
            <given-names>DC</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mardis</surname>
            <given-names>ER</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilson</surname>
            <given-names>RK</given-names>
          </string-name>
          :
          <article-title>Challenges of sequencing human genomes</article-title>
          .
          <source>Briefings in Bioinformatics</source>
          <year>2010</year>
          ,
          <volume>11</volume>
          :
          <fpage>484</fpage>
          -
          <lpage>498</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cock</surname>
            <given-names>PJA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fields</surname>
            <given-names>CJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goto</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heuer</surname>
            <given-names>ML</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rice</surname>
            <given-names>PM</given-names>
          </string-name>
          :
          <article-title>The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants</article-title>
          .
          <source>Nucleic Acids Research</source>
          <year>2010</year>
          ,
          <volume>38</volume>
          :
          <fpage>1767</fpage>
          -
          <lpage>1771</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Li</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Handsaker</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wysoker</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fennell</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruan</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Homer</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marth</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abecasis</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Durbin</surname>
            <given-names>R</given-names>
          </string-name>
          :
          <article-title>The Sequence Alignment/Map format and SAMtools</article-title>
          .
          <source>Bioinformatics</source>
          <year>2009</year>
          ,
          <volume>25</volume>
          :
          <fpage>2078</fpage>
          -
          <lpage>2079</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Danecek</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auton</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abecasis</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Albers</surname>
            <given-names>CA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Banks</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DePristo</surname>
            <given-names>MA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Handsaker</surname>
            <given-names>RE</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lunter</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marth</surname>
            <given-names>GT</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sherry</surname>
            <given-names>ST</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McVean</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Durbin</surname>
            <given-names>R</given-names>
          </string-name>
          :
          <article-title>The variant call format and VCFtools</article-title>
          .
          <source>Bioinformatics</source>
          <year>2011</year>
          ,
          <volume>27</volume>
          :
          <fpage>2156</fpage>
          -
          <lpage>2158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ogino</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gulley</surname>
            <given-names>ML</given-names>
          </string-name>
          ,
          <string-name>
            <surname>den Dunnen</surname>
            <given-names>JT</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilson</surname>
            <given-names>RB</given-names>
          </string-name>
          :
          <article-title>Standard mutation nomenclature in molecular diagnostics: practical and educational challenges</article-title>
          .
          <source>The Journal of Molecular Diagnostics</source>
          <year>2007</year>
          ,
          <volume>9</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Wildeman</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Ophuizen</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>den Dunnen</surname>
            <given-names>JT</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taschner</surname>
            <given-names>PEM</given-names>
          </string-name>
          :
          <article-title>Improving sequence variant descriptions in mutation databases and literature using the Mutalyzer sequence variation nomenclature checker</article-title>
          .
          <source>Human Mutation</source>
          <year>2008</year>
          ,
          <volume>29</volume>
          :
          <fpage>6</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>