=Paper= {{Paper |id=Vol-1766/om2016_poster5 |storemode=property |title=Lion's Den: feeding the LinkLion |pdfUrl=https://ceur-ws.org/Vol-1766/om2016_poster5.pdf |volume=Vol-1766 |authors=Mohamed Ahmed Sherif,Mofeed M. Hassan,Tommaso Soru,Axel-Cyrille Ngonga Ngomo,Jens Lehmann |dblpUrl=https://dblp.org/rec/conf/semweb/SherifHSNL16 }} ==Lion's Den: feeding the LinkLion== https://ceur-ws.org/Vol-1766/om2016_poster5.pdf
                                    Lion’s Den
                              Feeding the LinkLion

 Mohamed Ahmed Sherif, Mofeed M. Hassan, Tommaso Soru, Axel-Cyrille Ngonga
                        Ngomo, and Jens Lehmann

      Department of Computer Science, University of Leipzig, 04109 Leipzig, Germany
       {sherif,mounir,tsoru,ngonga,lehmann}@informatik.uni-leipzig.de


Introduction Over the last years, several tools have been developed with the aim of
efficiently supporting the link discovery process [5,7]. This process consists of two
steps: (1) discovering a link specification (LS) that retrieves high-quality links (i.e.,
achieves high precision and recall), and (2) executing the LS to compute the actual links.
Several frameworks such as LIMES [3] and SILK [1] have been developed to create
such links between different knowledge bases (KB). While the importance of links
between datasets is unequivocal, only a few efforts have aimed at making LS available.
Such a repository of LS would, however, enable a large number of applications, including
transfer learning for LS, the provision of provenance and justification information for
links, fuzzy inference on Linked Data sets and many more. The importance of links
is further underlined by the community efforts that have already led to the creation of link
repositories such as LinkLion and sameAs.org. In view of the dispersed availability
of LS in different formats (scripts, XML, RDF), we created Lion’s Den as a compan-
ion project to LinkLion. LinkLion is a store for the publication, retrieval and use of
links between KB. The portal provides functionality for the upload and the storage of
discovered links, as well as meta-information about these links. With Lion’s Den, we
introduce an extension of such meta-information by letting the portal user upload files
describing LS. We published the Lion’s Den dataset on the LinkLion link discovery
portal so as to make it accessible and queryable via a SPARQL endpoint.1

The Lion’s Den Dataset The dataset is now hosted within the LinkLion project at
http://linklion.org. Currently, Lion’s Den contains 436 LS that are described by
15 457 triples including the ontology. Metadata on the Lion’s Den dataset is available
on DataHub.2
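As an illustration of what querying the dataset could look like, the following minimal sketch builds a SPARQL query that counts the instances of one LDEN class. It only constructs the query string, since the endpoint address is not given in this text. The namespace is taken from the LDEN footnote and the class name LinkSpecs from the Ontology paragraph; combining them into a class IRI is an assumption that should be checked against the published vocabulary.

```python
# Namespace taken from the LDEN footnote of this paper.
LDEN_NS = "http://www.linklion.org/lden/"


def build_count_query(class_name="LinkSpecs"):
    """Return a SPARQL query that counts the instances of an LDEN class.

    Assumption: the class IRI is the LDEN namespace followed by the class
    name. This is not confirmed by the text.
    """
    return (
        f"PREFIX lden: <{LDEN_NS}>\n"
        "SELECT (COUNT(DISTINCT ?ls) AS ?n)\n"
        f"WHERE {{ ?ls a lden:{class_name} . }}"
    )


query = build_count_query()
print(query)
```

Without inference on the endpoint, LS typed only with one of the derived classes would not be counted by this query, so one query per derived class may be needed.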

Ontology To represent the LS in RDF and OWL, we developed the Lion’s Den vocabulary,
dubbed LDEN.3 LDEN was specified with the aim of supporting any type of LS
regardless of the way it was created. In its current version, LDEN contains a set of ten
classes. Each LS is an instance of the LinkSpecs class. The LinkSpecs class provides
properties that allow referencing the five basic components of any LS, which are
the source and target datasets, the metric used for linking, as well as the acceptance
and reviewing criteria. In addition, the LinkSpecs class provides metadata such as the
URL of the source LS and its creator, publisher, license and provenance information. Currently,
our ontology contains three classes derived from the LinkSpecs class (LimesSpecs,
SilkSpecs and ScriptSpecs), where each of the three classes contains special
attributes related to the framework it represents.
 1
   For more details, see the extended paper on the project website: https://svn.aksw.org/papers/2016/ISWC_OM_LionDen/public.pdf
 2
   http://datahub.io/dataset/lionsden
 3
   http://www.linklion.org/lden/
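The class structure described in the Ontology paragraph can be mirrored in a small data model. This is a minimal sketch, not the LDEN vocabulary itself: the four class names come from the text, whereas every field name and the framework-specific attributes are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkSpecs:
    """An LS with its five basic components and some metadata.

    All field names are hypothetical; the text names only the concepts.
    """
    source_dataset: str
    target_dataset: str
    metric: str
    acceptance: str      # acceptance criterion
    review: str          # reviewing criterion
    creator: str = ""
    publisher: str = ""
    license: str = ""


@dataclass
class LimesSpecs(LinkSpecs):
    execution_plan: str = ""    # hypothetical framework-specific attribute


@dataclass
class SilkSpecs(LinkSpecs):
    blocking: str = ""          # hypothetical framework-specific attribute


@dataclass
class ScriptSpecs(LinkSpecs):
    script_language: str = ""   # hypothetical framework-specific attribute


BASIC_COMPONENTS = ("source_dataset", "target_dataset", "metric",
                    "acceptance", "review")


def basic_components(ls):
    """Return the five basic components shared by every kind of LS."""
    return {name: getattr(ls, name) for name in BASIC_COMPONENTS}


example = LimesSpecs(
    source_dataset="http://example.org/source",
    target_dataset="http://example.org/target",
    metric="trigrams(x.label, y.label)",
    acceptance="similarity >= 0.9",
    review="similarity >= 0.7",
)
print(basic_components(example))
```

Keeping the five components in the base class means that code written against LinkSpecs works unchanged for LS from any framework, which is the property the ontology design aims for.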
Data Sources The original LS of Lion’s Den were collected from four different sources: (1)
The LATC project provides the 24/7 Interlinking Platform.4 (2) LinkedGeoData5 is a
project that converts spatial information provided by OpenStreetMap to the Web of Data.
(3) DBpedia-links6 is a repository that contains links, LS and link extraction scripts. (4)
The LIMES7 link discovery framework supports manual configuration of linking tasks
through XML-based specification files.
Conversion Process As the original configuration files for both SILK and LIMES were
in XML format, we built a specialized XML-to-RDF converter for each of them. The
source code of the dataset converters is available in the project repository.8
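To give an idea of what such an XML-to-RDF conversion involves, here is a minimal sketch that reads a simplified, LIMES-like configuration and emits N-Triples. It is not the converter from the project repository: the sample XML is a reduced illustration of a configuration file, and all property names under the LDEN namespace (sourceDataset, metricExpression and so on) are hypothetical.

```python
import xml.etree.ElementTree as ET

LDEN = "http://www.linklion.org/lden/"  # namespace from the LDEN footnote
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Simplified, LIMES-like configuration used only for illustration.
SAMPLE_XML = """
<LIMES>
  <SOURCE><ENDPOINT>http://example.org/source/sparql</ENDPOINT></SOURCE>
  <TARGET><ENDPOINT>http://example.org/target/sparql</ENDPOINT></TARGET>
  <METRIC>trigrams(x.rdfs:label, y.rdfs:label)</METRIC>
  <ACCEPTANCE><THRESHOLD>0.9</THRESHOLD></ACCEPTANCE>
  <REVIEW><THRESHOLD>0.7</THRESHOLD></REVIEW>
</LIMES>
"""

# XML path -> hypothetical LDEN property name.
MAPPING = {
    "SOURCE/ENDPOINT": "sourceDataset",
    "TARGET/ENDPOINT": "targetDataset",
    "METRIC": "metricExpression",
    "ACCEPTANCE/THRESHOLD": "acceptanceThreshold",
    "REVIEW/THRESHOLD": "reviewThreshold",
}


def literal(value):
    """Serialize a plain string as an N-Triples literal."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"'


def convert(xml_text, spec_iri):
    """Convert one configuration into a list of N-Triples lines."""
    root = ET.fromstring(xml_text.strip())
    subject = f"<{spec_iri}>"
    triples = [f"{subject} <{RDF_TYPE}> <{LDEN}LimesSpecs> ."]
    for path, prop in MAPPING.items():
        node = root.find(path)
        if node is not None and node.text:
            value = literal(node.text.strip())
            triples.append(f"{subject} <{LDEN}{prop}> {value} .")
    return triples


triples = convert(SAMPLE_XML, "http://example.org/spec/1")
print("\n".join(triples))
```

A SILK converter would follow the same pattern with a different path-to-property mapping, which is one plausible reason for building one specialized converter per framework.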
Provenance The LinkLion dataset reuses properties and classes from the PROV W3C
recommendation9 to keep track of data provenance.
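Provenance statements of this kind can be sketched with a few PROV terms. The text does not say which PROV classes and properties the dataset actually uses, so the choice of prov:Entity, prov:wasDerivedFrom and prov:wasAttributedTo below is an assumption, and all IRIs in the usage example are placeholders.

```python
PROV = "http://www.w3.org/ns/prov#"  # namespace from the PROV footnote
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def provenance_triples(spec_iri, original_file_iri, agent_iri):
    """Link a converted LS to its original file and to its creator."""
    return [
        f"<{spec_iri}> <{RDF_TYPE}> <{PROV}Entity> .",
        f"<{spec_iri}> <{PROV}wasDerivedFrom> <{original_file_iri}> .",
        f"<{spec_iri}> <{PROV}wasAttributedTo> <{agent_iri}> .",
    ]


prov_lines = provenance_triples(
    "http://example.org/spec/1",
    "http://example.org/configs/spec1.xml",
    "http://example.org/people/alice",
)
print("\n".join(prov_lines))
```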
Use Cases Having the LS of Lion’s Den together with the links of LinkLion in a
machine-readable format and serving them from one portal offers many opportunities,
including, but not limited to: benchmarking link discovery algorithms, automatic Linked
Data enrichment [6], key discovery [8], unification of LS, LS transfer learning [2] and
link discovery over n knowledge bases [4].

References
1. R. Isele, A. Jentzsch, and C. Bizer. Efficient Multidimensional Blocking for Link Discovery
   without losing Recall. In WebDB, 2011.
2. A.-C. Ngonga Ngomo, J. Lehmann, and M. Hassan. Transfer learning of link specifications. In
   Seventh IEEE International Conference on Semantic Computing (ICSC), 2013.
3. A.-C. Ngonga Ngomo. A time-efficient hybrid approach to link discovery. In Proceedings of the 6th
   International Workshop on Ontology Matching, Bonn, Germany, October 24, 2011.
4. A.-C. Ngonga Ngomo, M. A. Sherif, and K. Lyko. Unsupervised link discovery through
   knowledge base repair. In Extended Semantic Web Conference (ESWC 2014), 2014.
5. G. Papadakis, E. Ioannou, C. Niederée, T. Palpanas, and W. Nejdl. Eliminating the redundancy
   in blocking-based entity resolution methods. In JCDL, 2011.
6. M. Sherif, A.-C. Ngonga Ngomo, and J. Lehmann. Automating RDF dataset transformation
   and enrichment. In 12th Extended Semantic Web Conference, Portoroz, Slovenia, 31st May -
   4th June 2015. Springer, 2015.
7. J. Sleeman and T. Finin. Computing foaf co-reference relations with rules and machine learn-
   ing. In Proceedings of the Third International Workshop on Social Data on the Web, 2010.
8. T. Soru, E. Marx, and A.-C. Ngonga Ngomo. ROCKER – a refinement operator for key
   discovery. In Proceedings of the 24th International Conference on World Wide Web, 2015.

 4
   https://www.assembla.com/wiki/show/silk/Link_Specification_Language
 5
   http://linkedgeodata.org/
 6
   https://github.com/dbpedia/dbpedia-links/
 7
   https://github.com/AKSW/LIMES
 8
   https://github.com/AKSW/LionDen
 9
   http://www.w3.org/ns/prov#