<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Semantic Genome Graphs</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Swiss Institute of Bioinformatics</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Current methods of representing genomic variation are based on a linear reference, which limits our insight into the variation between genomes. Genome graphs are a set of techniques that can accurately represent large structural variation as well as single nucleotide polymorphisms. As any graph can be serialized as an RDF (Resource Description Framework) graph, we show some advantages and disadvantages of making a genome graph available on the Semantic Web in a FAIR (Findable, Accessible, Interoperable, Reusable) way. We demonstrate how SPARQL can be used to drive visualizations and to integrate a genome graph with knowledge from outside it.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>PREFIX rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt;
PREFIX vg: &lt;http://biohackathon.org/resource/vg#&gt;
SELECT DISTINCT ?node ?sequence
WHERE {
  {
    SELECT ?otherpath (MIN(?otherposition) AS ?moreoffset)
    WHERE {
      ?step vg:node ?sharednode ;
            vg:position ?position ;
            vg:path &lt;${path}&gt; .
      ?step2 vg:node ?sharednode ;
             vg:position ?otherposition ;
             vg:path ?otherpath .
      FILTER(!sameTerm(?otherpath, ?path))
      FILTER(?position &gt;= ?offset &amp;&amp; ?position &lt;= ?upto)
    } GROUP BY ?otherpath
  }
  ?step3 vg:node ?node ;
         vg:position ?position3 ;
         vg:path ?otherpath .
  ?node rdf:value ?sequence .
  FILTER(?position3 &gt;= ?moreoffset &amp;&amp; ?position3 &lt;= ?moreoffset + ?distance)
} VALUES ?path ?offset ?upto ?distance</p>
      <p>
Example 1: Find the nodes that lie in the same linear area of differing paths.
The area is selected by a section of a reference path, e.g. reference genome
chromosome X from its 1000th up to its 2000th nucleotide. This query finds the nodes
of the non-reference genomes that are also in that area of their chromosome X.</p>
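      <p>The logic of the query in Example 1 can also be sketched in plain Python over a toy list of path steps. This is an illustrative sketch only, not the implementation behind the SPARQL endpoint: the Step tuple, the function name nodes_in_same_area, and the toy paths "ref" and "alt" are all hypothetical, and positions are simply assumed to be integer offsets along each path.</p>
      <preformat>
```python
from collections import namedtuple

# One step of a path through the graph, mirroring vg:path, vg:node and
# vg:position. The tuple layout is an assumption made for this sketch.
Step = namedtuple("Step", ["path", "node", "position"])


def nodes_in_same_area(steps, sequences, path, offset, upto, distance):
    """Return sorted (node, sequence) pairs found on the other paths."""
    # Nodes that the selected path visits inside [offset, upto].
    shared = {
        s.node
        for s in steps
        if s.path == path and upto >= s.position >= offset
    }
    # Inner SELECT: the smallest position at which each other path
    # visits one of the shared nodes (MIN ... GROUP BY ?otherpath).
    more_offset = {}
    for s in steps:
        if s.path != path and s.node in shared:
            more_offset[s.path] = min(
                s.position, more_offset.get(s.path, s.position)
            )
    # Outer pattern: steps on those paths that lie within `distance`
    # of that smallest position, joined with the node sequences.
    found = set()
    for s in steps:
        start = more_offset.get(s.path)
        if start is None or s.node not in sequences:
            continue
        if start + distance >= s.position >= start:
            found.add((s.node, sequences[s.node]))
    return sorted(found)


# Toy graph (hypothetical data): two paths that differ in one node.
sequences = {"n1": "ACGT", "n2": "G", "n3": "A", "n4": "TTA", "n5": "CC"}
steps = [
    Step("ref", "n1", 0), Step("ref", "n2", 4),
    Step("ref", "n4", 5), Step("ref", "n5", 8),
    Step("alt", "n1", 0), Step("alt", "n3", 4),
    Step("alt", "n4", 5), Step("alt", "n5", 8),
]
result = nodes_in_same_area(steps, sequences, "ref", 4, 6, 3)
print(result)
```
      </preformat>
      <p>In this toy graph the selected section of "ref" covers nodes n2 and n4. Only n4 is shared with "alt", so the search on "alt" starts at the position of n4 and extends for the given distance.</p>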
    </sec>
  </body>
  <back>
  </back>
</article>