<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RDF2FS - A Unix File System RDF Store</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michael Sintek</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gunnar Aastrand Grimnes</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DFKI GmbH, Knowledge Management Department Kaiserslautern</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
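        <p>RDF2FS maps an RDF graph onto a Unix directory structure. All resources s in the graph become sub-directories of the root directory r (r/s). For each triple &lt;s, p, o&gt;, a literal-valued property p becomes a file r/s/p holding the values o, and an object-valued property p becomes a directory r/s/p holding symbolic links to the objects o.</p>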
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation</title>
      <p>RDF2FS is a script that translates an arbitrary RDF graph into a directory
structure in a (Unix) file system, which enables Semantic Web applications without
the need for a dedicated RDF store. There are several reasons why such
an approach makes sense:
1. To access the RDF store (incl. simple queries), one can use the scripting
(or any other) language of one's choice, as long as it has proper file system
support (incl. handling of symbolic links).
2. The RDF store can be browsed (and edited) with a normal file browser (see
screenshot).
3. Subversion showed that people are more comfortable with storing data in a
file system than in a dedicated database.
4. Our approach can also have a nice educational effect: "normal" file system
users and also "hackers" will be able to understand RDF more easily, without
ever seeing an RDF file.
5. If RDF is used as metadata for documents, one usually has two places where
metadata is stored: in the hierarchical directory structure where the
documents are placed, and in the RDF metadata graph. This means that
documents have to be "annotated" twice: by filing the documents into the
intended folders, and by creating the RDF metadata with a separate tool. Our
approach allows this to be done in one place and with only one tool that
everyone is using anyway: a file browser.
RDF2FS maps an RDF graph to a target root directory r as follows:
- All resources s in the graph become sub-directories of the root directory: r/s.
- For each triple &lt;s, p, o&gt;:
  - if p is literal-valued, we create the file r/s/p (if it does not yet exist) and
    append o to this file (i.e., each line of this file is a value for the property p
    for subject s)
  - if p is object-valued, we create a directory r/s/p, and the object o
    becomes a symbolic link r/s/p/o -&gt; ../../o, i.e., it links back to that
    resource on the root level (properties that are both literal- and
    object-valued are not yet supported; this could easily be done with some naming
    convention, but would make queries slightly more difficult)
- The names of the files and directories are not the full resource URIs but of
  the format namespace-abbrev#localname. RDF2FS keeps a list of commonly
  used namespace prefixes, such as rdfs, foaf, dc, etc., and will generate new
  mappings for unknown namespaces.
</p>
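      <p>The mapping rules can be tried out by hand. The following is a minimal sketch, not the script that RDF2FS generates: the temporary root directory, the resource name ex#libby, and the choice of three triples are assumptions made for illustration.</p>
      <preformat><![CDATA[
```shell
#!/bin/bash
# Hypothetical sketch: store three triples by hand, following the mapping rules.
set -e
r=$(mktemp -d)                 # target root directory r (assumed: a temp dir)

# Every resource becomes a sub-directory of the root: r/s
mkdir -p "$r/danbri-foaf.rdf#danbri" "$r/ex#libby"

# Literal-valued property: append the value o as one line of the file r/s/p
echo "Dan Brickley" >> "$r/danbri-foaf.rdf#danbri/foaf#name"
echo "Libby Miller" >> "$r/ex#libby/foaf#name"

# Object-valued property: r/s/p is a directory, and o is a symbolic link
# r/s/p/o -> ../../o that points back to the resource on the root level
mkdir -p "$r/danbri-foaf.rdf#danbri/foaf#knows"
ln -s "../../ex#libby" "$r/danbri-foaf.rdf#danbri/foaf#knows/ex#libby"

# Following the link reaches the linked resource's own properties
cat "$r/danbri-foaf.rdf#danbri/foaf#knows/ex#libby/foaf#name"
```
]]></preformat>
      <p>Running the sketch prints Libby Miller, the name reached through the foaf#knows link. Because the link target is relative (../../o), the whole store can be moved or copied without breaking the links.</p>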
    </sec>
    <sec id="sec-2">
      <title>Querying the Store</title>
      <p>The appendix shows that all simple queries are easily supported using
standard POSIX file system utilities (such as find, grep, ...), including all
combinations of statement queries, path expressions, and conjunctive queries.
</p>
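      <p>The queries in the appendix pipe their results through two helpers, $FIRST and $SECOND, which are only described as awk calls that select the first and second part of a path. One possible definition is sketched here; the exact awk programs are an assumption, not taken from the RDF2FS distribution.</p>
      <preformat><![CDATA[
```shell
#!/bin/bash
# Assumed definitions (the paper only describes them as awk calls).
# The variables are expanded unquoted, so the awk program must not contain spaces.
FIRST="awk -F/ {print\$1}"     # first path component: the subject directory
SECOND="awk -F/ {print\$2}"    # second path component: the property name

# A path as returned by grep -l in the store's root directory
path="danbri-foaf.rdf#danbri/foaf#name"

subject=$(echo "$path" | $FIRST)
property=$(echo "$path" | $SECOND)
echo "$subject"
echo "$property"
```
]]></preformat>
      <p>For the example path, the sketch prints danbri-foaf.rdf#danbri and then foaf#name. Defining the helpers as shell functions would be more robust, but plain variables match the way the appendix invokes them.</p>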
    </sec>
    <sec id="sec-3">
      <title>Future Work</title>
      <p>In a future version, RDF2FS should support translating back from a file system
RDF store to an RDF file. Real files that correspond to resources and that
should be stored in these resource directories (e.g., in some "." file) are not yet
supported. Furthermore, mounting an RDF file via FUSE instead of translating
to an existing file system would allow more efficient storage (esp. in the case of file
systems that do not handle many small files well), better handling of queries,
and checking for illegal operations.
</p>
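      <p>Translating back is not part of RDF2FS yet. As a possible starting point only, the following sketch walks a store and prints one line per triple, using the abbreviated names rather than full URIs. The function name dump_triples and the resource names ex#a and ex#b are hypothetical, and producing a real RDF file would additionally require expanding the namespace abbreviations.</p>
      <preformat><![CDATA[
```shell
#!/bin/bash
# Hypothetical sketch: list the triples stored below a root directory r.
dump_triples() {
  local r=$1 s p o
  for s in "$r"/*/; do
    s=${s%/}                                  # strip the trailing slash
    for p in "$s"/*; do
      [ -e "$p" ] || continue                 # resource without properties
      if [ -f "$p" ]; then
        # literal-valued: each line of the file is one value
        while IFS= read -r o; do
          printf '%s %s "%s"\n' "${s##*/}" "${p##*/}" "$o"
        done < "$p"
      else
        # object-valued: each symbolic link names one object resource
        for o in "$p"/*; do
          [ -L "$o" ] || continue
          printf '%s %s %s\n' "${s##*/}" "${p##*/}" "${o##*/}"
        done
      fi
    done
  done
}

# Tiny example store with one literal-valued and one object-valued triple
r=$(mktemp -d)
mkdir -p "$r/ex#a/foaf#knows" "$r/ex#b"
echo "Alice" >> "$r/ex#a/foaf#name"
ln -s "../../ex#b" "$r/ex#a/foaf#knows/ex#b"

triples=$(dump_triples "$r")
echo "$triples"
```
]]></preformat>
      <p>For the example store, the function prints one line for each of the two stored triples; the resource ex#b has no properties and therefore contributes no lines.</p>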
    </sec>
    <sec id="sec-4">
      <title>Demo Details</title>
      <p>The mapping is implemented twice, as a Java program and as a Python script.
Either version translates a set of RDF files into a bash script that
creates the necessary directories, files, and symbolic links. Please download
http://www.dfki.uni-kl.de/~sintek/SFSW2008/RDF2FS.tgz and follow the
instructions in README.</p>
      <p>Appendix: Bash Queries</p>
      <p>
# Literal-valued properties
# triple queries with one variable:
# s p ?o -&gt; cat s/p
echo '** danbri-foaf.rdf#danbri foaf#name ?o'
cat danbri-foaf.rdf#danbri/foaf#name
# ?s p o -&gt; grep -l o */p | $FIRST
# (where $FIRST is an awk call that selects the first part of a path)
echo -e '\n** ?s foaf#name "Dan Brickley"'
grep -l "Dan Brickley" */foaf#name | $FIRST
# s ?p o -&gt; grep -l o s/* | $SECOND
# (where $SECOND is an awk call that selects the second part of a path)
echo -e '\n** danbri-foaf.rdf#danbri ?p "Dan Brickley"'
grep -l "Dan Brickley" danbri-foaf.rdf#danbri/* | $SECOND
# triple queries with two variables:
# ?s ?p o -&gt; grep -l o */*
echo -e '\n** ?s ?p "Dan Brickley"'
grep -l "Dan Brickley" */*
# ?s p ?o -&gt; ls -1 */p
# plus retrieval of literals (simply with cat)
echo -e '\n** ?s foaf#name ?o'
for f in $(ls -1 */foaf#name ); do
  echo $( echo "$f" | $FIRST ) \"$( cat "$f" )\"
done
# s ?p ?o -&gt; file pattern s/* plus retrieval of literals (and resources)
echo -e '\n** danbri-foaf.rdf#danbri ?p ?o'
for f in danbri-foaf.rdf#danbri/*; do
  if [ -f "$f" ]; then
    echo $( echo "$f" | $SECOND ) \"$( cat "$f" )\"
  else # resources are just the files (links) in $f
    echo $( echo "$f" | $SECOND ) $( ls "$f" )
  fi
done
# Object-valued properties
# triple queries with one variable:
# s p ?o -&gt; ls s/p
echo -e '\n** danbri-foaf.rdf#danbri/foaf#knows ?o'
ls danbri-foaf.rdf#danbri/foaf#knows
# ... (left as an exercise to the reader :-) )
# Path expressions:
# s p1 _ p2 ?o -&gt; cat s/p1/*/p2
echo -e '\n** path: danbri-foaf.rdf#danbri foaf#knows ?_ foaf#name ?o'
cat danbri-foaf.rdf#danbri/foaf#knows/*/foaf#name
# ?s p1 _ p2 o -&gt; grep -l o */p1/*/p2
echo -e '\n** path: ?s foaf#knows ?_ foaf#name "Libby Miller"'
grep -l "Libby Miller" */foaf#knows/*/foaf#name | $FIRST
# Conjunction:
# ?s p1 o1 AND ?s p2 o2
# -&gt;
# using intersection (realized with sort and uniq)
# grep -l o1 */p1 | $FIRST | sort -u &gt; tmp1
# grep -l o2 */p2 | $FIRST | sort -u &gt; tmp2
# sort -m tmp1 tmp2 | uniq -d # = intersection
echo -e '\n** ?s foaf#plan "Save the world" AND ?s uranai#bloodtype "A+"'
grep -l "Save the world" */foaf#plan | $FIRST | sort -u &gt; tmp1
grep -l "A+" */uranai#bloodtype | $FIRST | sort -u &gt; tmp2
sort -m tmp1 tmp2 | uniq -d
rm tmp1 tmp2
</p>
      <p>Output of this script:</p>
      <p>
** danbri-foaf.rdf#danbri foaf#name ?o
Dan Brickley
** ?s foaf#name "Dan Brickley"
anon-64848a97%3A1187e661172%3A-7ffb
danbri-foaf.rdf#danbri
** danbri-foaf.rdf#danbri ?p "Dan Brickley"
foaf#name
** ?s ?p "Dan Brickley"
anon-64848a97%3A1187e661172%3A-7ffb/foaf#name
danbri-foaf.rdf#danbri/foaf#name
** ?s foaf#name ?o
anon-64848a97%3A1187e661172%3A-7fde "Pastor N Pizzor"
...
anon-64848a97%3A1187e661172%3A-7ffb "Dan Brickley"
card#i "Tim Berners-Lee"
** path: danbri-foaf.rdf#danbri foaf#knows ?_ foaf#name ?o
Damian Steer
...
Tim Berners-Lee
** path: ?s foaf#knows ?_ foaf#name "Libby Miller"
danbri-foaf.rdf#danbri
** ?s foaf#plan "Save the world" AND ?s vocab#uranaibloodtype "A+"
danbri-foaf.rdf#danbri</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>