<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RDFohloh, a RDF wrapper of Ohloh</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sergio Fernandez</string-name>
          <email>sergio.fernandez@fundacionctic.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fundacion CTIC Gijon</institution>
          ,
          <addr-line>Asturias</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>4</lpage>
      <abstract>
        <p>Data on the Semantic Web is modeled and represented in RDF. On the Social Web, people usually give little thought to this kind of formalism, although they do care about the content. It would therefore be useful to have tools able to export that content in machine-readable formats such as RDF. In this demo paper we present RDFohloh, an RDF wrapper of Ohloh, a Web 2.0 open source directory.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The original vision of the Semantic Web [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], as a layer on top of the current
Web, requires that data be published on the Web and, ideally, linked with other
useful resources. In recent years the Semantic Web community has made a
big effort to make more and more RDF datasets available. Data can come from
legacy sources, relational databases, or Web scraping, but also from
social sources. Web 2.0 applications commonly expose some of their content via
public APIs, which offers another big opportunity to extract data and
expose it as Linked Open Data [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Ohloh<sup>1</sup> is an open source directory. Its main goal is to aggregate projects and
developers from any Web site. By retrieving data from revision control
repositories (such as CVS, SVN, or Git), Ohloh provides statistics about the longevity of
projects, their licenses (including license conflict information) and software
metrics such as lines of source code and commit statistics. At the time of writing<sup>2</sup> Ohloh
lists 15,532 projects and 23,430 developers. Another goal of Ohloh is to provide
a public RESTful API<sup>3</sup> exposing the most important information and many of those
metrics.
      </p>
      <sec id="sec-1-1">
        <p>1 http://www.ohloh.net/</p>
        <p>2 Data retrieved on September 18th, 2008</p>
        <p>3 http://www.ohloh.net/api</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>RDFohloh, wrapping the wrapper</title>
      <p>
        RDFohloh<sup>4</sup> addresses the need described above for the case of Ohloh:
consuming the data provided by its API and publishing it in RDF [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] as Linked
Data. Since Ohloh could be considered a Web 2.0 wrapper for open source
projects and developers, RDFohloh could be deemed a wrapper of the wrapper:
the n-layer architecture in its purest form.
      </p>
      <sec id="sec-2-1">
        <title>Related work</title>
        <p>
          Obviously, the idea behind RDFohloh is not really new. Several
applications export FOAF or SIOC<sup>5</sup> data from other Web 2.0 applications. For
DOAP, however, the related work comes down mainly to two projects:
          <list list-type="bullet">
            <list-item><p>doap:store [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] is a collaboratively built online DOAP directory of computing projects. People do not need to register with the service, because it constantly retrieves decentralized project descriptions to build its database, thanks to the Ping The Semantic Web service.</p></list-item>
            <list-item><p>DOAPspace<sup>6</sup> is a registry/repository that contains DOAP data scraped from several sources, including SourceForge, Freshmeat and the Python Package Index.</p></list-item>
          </list>
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Data and links</title>
        <p>
          RDFohloh mainly uses three popular ontologies: SIOC [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and FOAF [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] for users
and DOAP [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] for projects. At the time of writing, RDFohloh
publishes 23,430 instances of sioc:User/foaf:Person and 15,532 of doap:Project.
The resulting dataset has skos:subject links to DBpedia [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] concepts (for now,
only for the programming language of each project) and owl:sameAs links
to DOAPspace projects.
        </p>
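        <p>As a minimal sketch of what such a description could look like, the following Python snippet serializes one project as N-Triples using the vocabularies named above. The project, DBpedia and DOAPspace URIs, the doap:name triple and all function names are hypothetical and serve only as illustration; they are not taken from RDFohloh's actual implementation or URI scheme.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# Namespaces of the vocabularies mentioned in the text.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
OWL = "http://www.w3.org/2002/07/owl#"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def describe_project(project_uri, name, language_concept, same_as):
    """Return an N-Triples description of one project (illustrative only)."""
    subject = uri(project_uri)
    triples = [
        (subject, uri(RDF + "type"), uri(DOAP + "Project")),
        (subject, uri(DOAP + "name"), literal(name)),
        # Link to the DBpedia concept of the project's programming language.
        (subject, uri(SKOS + "subject"), uri(language_concept)),
        # Link to the equivalent project in another dataset.
        (subject, uri(OWL + "sameAs"), uri(same_as)),
    ]
    return "\n".join(" ".join(triple) + " ." for triple in triples)


# Hypothetical URIs, chosen only for the example.
example = describe_project(
    "http://example.org/project/demo",
    "demo",
    "http://dbpedia.org/resource/Python_(programming_language)",
    "http://example.org/doapspace/demo",
)
print(example)
```
]]></preformat>
        <p>Each printed line is one triple; the last two lines correspond to the skos:subject and owl:sameAs links described above.</p>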
      </sec>
      <sec id="sec-2-3">
        <title>Publication</title>
        <p>
          Particular attention was paid in RDFohloh to how the data is
published. Using cool URIs [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] and content negotiation [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], it provides three views
(RDF/XML, N3 and XHTML+RDFa) of each resource. All the data is published
following the best practice recipes [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], and the final result was successfully tested
with Vapour [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
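        <p>The server-side decision behind this kind of content negotiation can be sketched in Python as follows. This is only an illustration under stated assumptions: the media types, the file suffixes, the default view and the function names are hypothetical, since the text does not say which ones RDFohloh uses, and partial wildcards such as text/* are ignored for brevity.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# The three views named in the text, keyed by an assumed media type.
# The suffixes are hypothetical and only illustrate a view-specific URI.
VIEWS = {
    "application/rdf+xml": "rdf",
    "text/n3": "n3",
    "application/xhtml+xml": "html",
}
DEFAULT = "application/xhtml+xml"  # assumed fallback view


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q value."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(header):
    """Pick the media type of the view to serve for an Accept header."""
    for media_type, q in parse_accept(header or ""):
        if q <= 0:
            continue  # q=0 marks a type as not acceptable
        if media_type in VIEWS:
            return media_type
        if media_type == "*/*":
            return DEFAULT
    return DEFAULT


def view_uri(resource_uri, header):
    """Build the URI of the selected view for a generic resource URI."""
    return resource_uri + "." + VIEWS[negotiate(header)]


print(view_uri("http://example.org/user/alice", "text/n3, application/rdf+xml;q=0.5"))
```
]]></preformat>
        <p>A client asking for N3 would be pointed to the N3 view, while a browser sending a generic Accept header would get the XHTML+RDFa view.</p>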
        <sec id="sec-2-3-1">
          <p>4 http://rdfohloh.wikier.org/</p>
          <p>5 http://sioc-project.org/applications</p>
          <p>6 http://doapspace.org/</p>
          <p>
            RDFohloh expands the current horizon of the Linked Data planet,
providing social data from a rich source of information. However, the project
still needs to be improved with some new features:
            <list list-type="bullet">
              <list-item><p>Providing dumps of all the data, properly described with Semantic Sitemaps [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ]. First, however, it is necessary to find a way to cope with the limit on the number of requests per day imposed by Ohloh's API. With those dumps, it would also be easier to provide a SPARQL endpoint to query the dataset.</p></list-item>
              <list-item><p>Including the source code metrics from Ohloh that are currently missing from the RDF export, enabling semantic analysis of them.</p></list-item>
              <list-item><p>Improving the current links and adding new ones to other open datasets.</p></list-item>
            </list>
          </p>
          <p>All those features have been on the project's roadmap since its beginning, so
hopefully they will be available soon.</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kobilarov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ives</surname>
          </string-name>
          .
          <article-title>DBpedia: A Nucleus for a Web of Open Data</article-title>
          . In Aberer et al. (Eds.):
          <source>The Semantic Web, 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC 2007</source>
          , volume
          <volume>4825</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>722</fpage>
          –
          <lpage>735</lpage>
          , Busan, Korea, November
          <year>2007</year>
          . Springer, 2007.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <article-title>Linked Data Design Issues</article-title>
          . Available at http://www.w3.org/DesignIssues/LinkedData.html,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Lassila</surname>
          </string-name>
          .
          <article-title>The Semantic Web</article-title>
          .
          <source>Scientific American</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>D.</given-names>
            <surname>Berrueta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Frade</surname>
          </string-name>
          .
          <article-title>Cooking HTTP content negotiation with Vapour</article-title>
          .
          <source>In Proceedings of the 4th Workshop on Scripting for the Semantic Web (SFSW2008)</source>
          , co-located with ESWC2008, Tenerife, Spain, June
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>D.</given-names>
            <surname>Berrueta</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Phipps</surname>
          </string-name>
          .
          <article-title>Best Practice Recipes for Publishing RDF Vocabularies</article-title>
          . Working Draft, W3C,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>U.</given-names>
            <surname>Bojars</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Breslin</surname>
          </string-name>
          .
          <article-title>SIOC Core Ontology Specification</article-title>
          .
          <source>Member submission, W3C</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>D.</given-names>
            <surname>Brickley</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>FOAF Vocabulary Specification</article-title>
          .
          <source>Technical report</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Delbru</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Tummarello</surname>
          </string-name>
          .
          <article-title>Semantic Web Crawling: A Sitemap Extension</article-title>
          .
          <source>Technical Report</source>
          , DERI,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>E.</given-names>
            <surname>Dumbill</surname>
          </string-name>
          .
          <article-title>DOAP: Description of a Project</article-title>
          . http://usefulinc.com/doap/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>K.</given-names>
            <surname>Holtman</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mutz</surname>
          </string-name>
          .
          <article-title>Transparent Content negotiation in HTTP</article-title>
          . RFC, IETF,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          .
          <article-title>Resource Description Framework (RDF): Concepts and abstract syntax</article-title>
          .
          <source>Technical report, W3C Recommendation</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>A.</given-names>
            <surname>Passant</surname>
          </string-name>
          .
          <article-title>A user-friendly interface to browse and find DOAP project with doap:store</article-title>
          .
          <source>In Proceedings of the 3rd workshop on Scripting for the Semantic Web (SFSW2007)</source>
          , co-located with ESWC2007, Innsbruck, Austria, May
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>L.</given-names>
            <surname>Sauermann</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          .
          <article-title>Cool URIs for the Semantic Web</article-title>
          . Interest Group Note, W3C, March
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>