=Paper=
{{Paper
|id=None
|storemode=property
|title=Towards Interoperable Metadata Provenance
|pdfUrl=https://ceur-ws.org/Vol-670/paper_3.pdf
|volume=Vol-670
|dblpUrl=https://dblp.org/rec/conf/semweb/EckertPV10
}}
==Towards Interoperable Metadata Provenance==
<pdf width="1500px">https://ceur-ws.org/Vol-670/paper_3.pdf</pdf>
<pre>
     Towards Interoperable Metadata Provenance
                   Kai Eckert, Magnus Pfeffer                                          Johanna Völker
                      University Library                                  KR&KM Research Group
                   University of Mannheim                                 University of Mannheim
                    Mannheim, Germany                                       Mannheim, Germany
          Email: eckert/pfeffer@bib.uni-mannheim.de              Email: voelker@informatik.uni-mannheim.de


    Abstract—Linked data has finally arrived. But with the     Library (Deutsche Nationalbibliothek, DNB) provides
availability and actual usage of linked data, data from        skos:match statements to LoC subject headings (LCSH).
different sources gets quickly mixed and merged. While
there is a lot of fundamental work about the provenance            A bad thing is that the service is totally source-
of metadata and the commonly recognized demand for ex-         agnostic (apart from the data-space notion). For example,
pressing provenance information, there still is no standard    the DNB states on its website that the data is provided
or at least best-practice recommendation. In this paper,
we summarize our own requirements based on experi-             only as a prototype, should only be used after a consulta-
ences at the Mannheim University Library for metadata          tion and not for commercial applications. The LCSH data
provenance, examine the feasibility to implement these         is public domain and freely available. But also within
requirements with currently available (de-facto) standards,    our triple store, there are different datasets. The MUL
and propose a way to bridge the missing gaps. By this          catalog is currently provided without a specific license,
paper, we hope to obtain additional feedback, which we
will feed back into ongoing discussions within the recently    as questions about the proper licensing still are discussed.
founded DCMI task-group on metadata provenance.                The data from the Cologne University Library has been
                                                               processed by us and the processed data is provided under
                       I. INTRODUCTION                         a creative commons CC-0 license, too.

     At the Mannheim University Library (MUL), we              A. Motivation
recently announced a Linked Data Service1 (LDS). Our
complete catalog with about 1.4 million records is made            Our predicament: We do want the LDS to be source-
available as RDF, with proper dereferenceable URIs and a       agnostic. But at the same time we want to know about
human-readable presentation of the data as HTML pages. The     the license of the data that is displayed to the user, and
title records are linked to classification systems, subject    we want to present him with this information. Moreover,
headings and to other title records. The Cologne Uni-          besides license and source information, we also have
versity Library made its catalog data available under a        other information that we would like to make available to
creative commons CC-0 license, so we converted it to           the user or other applications in a reusable way. But the
RDF and made it available along our own catalog.               current state of the art is that this information is either
    The HTML view2 provides browsable pages for all            not made available within the RDF datasets yet – the
resources described in the RDF data. It fetches additional     case for DNB, LoC and our own data – or not in a
statements when users click on the URIs, provided that         consistent way. For example, the data from the OCLC
they are available by URI dereferencing. The resulting         service dewey.info3 contains licensing statements as part
statements are presented to the user within the LDS            of the RDF statements about a given resource (Ex. 1).
layout and cannot be easily distinguished from the data
                                                               <http://dewey.info/class/641/2009/08/about.en>
that is made available by the Mannheim University               a <http://www.w3.org/2004/02/skos/core#Concept>;
Library itself. There is only a note about the “data space”,    xhv:license
                                                                  <http://creativecommons.org/licenses/by-nc-nd/3.0/>;
basically indicating the domain where the dereferenced          cc:attributionName
URI resides.                                                      "OCLC Online Computer Library Center, Inc.";
                                                                cc:attributionURL <http://www.oclc.org/dewey/>;
                                                                ...
   A good thing is that the service is totally source-          skos:prefLabel "Food & drink";
agnostic and fetches and presents everything that is            skos:broader
                                                                  <http://dewey.info/class/64/2009/08/about.en>;
available. With two clicks, the user gets subject data          cc:morePermissions
from the Library of Congress (LoC), just because we use           <http://www.oclc.org/dewey/about/licensing/>.
the German subject headings and the German National
                                                                     Example 1: Provenance in dewey.info dataset
  1 http://data.bib.uni-mannheim.de
  2 currently implemented with Virtuoso RDF-Mapper               3 http://dewey.info
   As another example, the New York Times expresses             created, ... – in general additional statements to further
provenance outside the actual data record, more precisely       qualify and describe the metadata. Thus we will refer
by means of statements about the data record (Ex. 2).           to this kind of additional information unambiguously as
                                                                “metametadata”.
<http://data.nytimes.com/46234942819259373803.rdf>
  foaf:primaryTopic
    <http://data.nytimes.com/46234942819259373803>
  dcterms:rightsHolder
    "The New York Times Company"
  ...                                                           C. Metametadata Principles
  cc:license
    <http://creativecommons.org/licenses/by/3.0/us/>
  cc:attributionURL
    <http://data.nytimes.com/46234942819259373803>
  cc:attributionName                                                To achieve interoperability for accessing metameta-
    "The New York Times Company"                                data, choosing a representation of the metametadata is
<http://data.nytimes.com/46234942819259373803>                  only the first, merely technical step. In our opinion, the
  ...                                                           following principles and requirements have to be met to
  a <http://www.w3.org/2004/02/skos/core#Concept>
  ...                                                           achieve this type of interoperability:
  skos:prefLabel "Faircloth, Lauch"
  ...
                                                                   1) Arbitrary metametadata statements about a set of
 Example 2: Provenance in New York Times dataset                      statements.
                                                                   2) Arbitrary metametadata statements about single
    Our goal is to make this kind of information available            statements.
to the user in a consistent way. We respect all the                3) Metametadata on different levels for each state-
different licenses and do not want to make users believe              ment or sets of statements.
that all this data is provided by ourselves, without any           4) Applications to retrieve, maintain and republish the
licensing information.                                                metametadata without data loss or corruption.
    Besides provenance information, we also need to pro-           5) Data processing applications to store the metameta-
vide other information that further qualifies single state-           data about the original RDF data.
ments of the datasets. For example, in a past project we
automatically created classifications and subject headings          Requirements 1 - 3 address the technical requirements
for bibliographic resources. We provide this data also via      that have to be met by the metadata format(s) in use.
the LDS which is very convenient and greatly facilitates        They are met by RDF, but in RDF there are two distinct
the reuse of the data. But automatically created results        approaches that can be used to represent metametadata:
often lack the desired quality, moreover the processes
usually provide further information, like a weight, rank            Reification: RDF provides a means for the formula-
or other measures of confidence [1]. All this information       tion of statements about statements, called reification. In
should also be provided to the user in a well-defined way.      the RDF model, this means that a complete statement
                                                                consisting of subject, predicate and object becomes the
B. Data, Metadata, Metametadata, ...                            subject of a new statement that adds the desired infor-
                                                                mation.4
    Data provided as RDF is not necessarily metadata in
a strict sense; in general it is data about resources. But in
                                                                     Named Graphs: Another technique that can be used
many cases – and especially in the context of this paper
                                                                to provide statements about statements are the “Named
– the resources are data themselves, like books, articles,
                                                                Graphs”, introduced by Carroll et al. [2]. The Named
websites or databases. In the library domain, the term
                                                                Graphs are not yet officially standardized as part of
“metadata” is thus established for all the data about
                                                                RDF. They have to be considered work in progress,
the resources a librarian is concerned with – including,
                                                                but are already widely used by the community and can
but not restricted to bibliographic resources, persons and
                                                                already be considered as a kind of de-facto standard that
subjects. This is why one cannot distinguish
                                                                is likely to have a big impact on future developments in
easily between data and metadata in the context of RDF.
We therefore regard them as synonyms.
   Metadata is itself data and there are a lot of use-cases        4 As a statement cannot be identified uniquely in RDF beside the

where one wants to make further statements about meta-          notion of S, P and O, a reification statement refers to all triples with
                                                                the given S, P and O. In our context, this ambiguity has no substantial
data, just as well as metadata provides statements about        effects, as identical triples are semantically equivalent to duplicated
data: who created the metadata, how was the metadata            metadata that can be safely discarded as redundant information.
the RDF community.5 Named Graphs are an extension                       ones. Example 3 shows a DC metadata record with
of RDF, both on the model and syntax level. They allow                  subject annotations from different sources and additional
the grouping of RDF statements into a graph. The graph                  information about the assignments via RDF reification.
is a resource on its own and can thus be further described              Note that we present the triples in a table and give them
by RDF statements, just like any other resource. There                  numbers that are then used to reference them.
are extensions for SPARQL and N3 to represent and
                                                                              Subject                 Predicate    Object
query Named Graphs, but they are for example not                         1    ex:docbase/doc1         dc:subject   ex:thes/sub20
representable in RDF-XML.6                                               2    #1                      ex:source    ex:sources/autoindex1
                                                                         3    #1                      ex:rank      0.55
    To meet requirements 4 and 5, further conventions                    4    ex:docbase/doc1         dc:subject   ex:thes/sub30
                                                                         5    #4                      ex:source    ex:sources/autoindex1
among interoperable applications are needed that have                    6    #4                      ex:rank      0.8
to be negotiated on a higher level and are (currently)                   7    ex:docbase/doc1         dc:subject   ex:thes/sub30
                                                                         8    #7                      ex:source    ex:sources/pfeffer
beyond the scope of RDF. By virtue of the following use-                 9    #7                      ex:rank      1.0
cases, we demonstrate that the technical requirements are                10   ex:docbase/doc1         dc:subject   ex:thes/sub40
                                                                         11   #10                     ex:source    ex:sources/pfeffer
already met and that we only need some conventions to                    12   #10                     ex:rank      1.0
represent such information in a consistent way – at least                13   ex:sources/autoindex1   ex:type      ex:types/auto
                                                                         14   ex:sources/pfeffer      ex:type      ex:types/manual
as long as the official RDF standard does not address the
metametadata issue.                                                      Example 3: Subject assignments by different sources
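
As a minimal sketch of how a numbered row of Example 3 relates to standard RDF reification, the following Python snippet expands one annotated row into the four reification triples of the RDF vocabulary (rdf:type rdf:Statement, rdf:subject, rdf:predicate, rdf:object) plus the metametadata statements attached to the statement node. The blank node label `_:s1` and the helper name `reify` are assumptions made for illustration; they do not appear in the original data.

```python
# Sketch: expand a numbered row of Example 3 into RDF reification triples.
# The blank node label (_:s1) and the helper name are hypothetical.

def reify(node, triple, annotations):
    """Return the triples that describe `triple` via the statement node `node`."""
    s, p, o = triple
    triples = [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    ]
    # Metametadata: further statements whose subject is the statement node.
    triples += [(node, prop, value) for prop, value in annotations]
    return triples

# Rows 1-3 of Example 3: an automatically assigned subject heading with rank 0.55.
row1 = ("ex:docbase/doc1", "dc:subject", "ex:thes/sub20")
reified = reify("_:s1", row1, [("ex:source", "ex:sources/autoindex1"),
                               ("ex:rank", "0.55")])

for s, p, o in reified:
    print(s, p, o, ".")
```

The rows written as "#1 ex:source ..." in the table correspond to the last two triples produced here; the first four are what makes "#1" addressable in RDF.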

                 II. EXAMPLE USE-CASES                                      There is one document (ex:docbase/doc1) with
                                                                        assigned subject headings from two different sources. For
    The following use cases7 are meant to be illustrating               each subject assignment, we see that a source is specified
examples, especially to emphasize the need for the repre-               via a URI. Additionally, a rank for every assignment is
sentation of arbitrary information – not only provenance                provided, as automatic indexers usually provide such a
– about data on various levels, from whole datasets                     rank. For example, a document retrieval system can make
over records to single statements or arbitrary groups of                direct use of it for the ranking of retrieval results. For
statements.                                                             manual assignments, where usually no rank is given, this
                                                                        could be used to distinguish between high quality subject
    In this section, we develop a scenario where such
                                                                        assignments from a library and, for example, assignments
metametadata can be used to prevent information loss
                                                                        from a user community via tagging.
while merging subject annotations from different sources.
We show that this is the key to make transparent use                        The statements #13 and #14 are used to further
of different annotation sources without compromises                     qualify the source, more precisely, to indicate whether the
regarding the quality of your metadata. In line with our                assignments were performed manually (ex:types/manual)
argumentation in this paper, we propose the storage of                  or automatically (ex:types/auto).
metametadata to mitigate any information loss and allow
the usage of this information to achieve a better retrieval             A. Use-case 1: Merging annotation sets
experience for the users. With various queries, we show
that we can access and use the additional pieces of                         Usually, the statements from Example 3 are avail-
information to regain a specific set of annotations that                able from different sources (as indicated) and might
fulfills our specific needs.                                            also belong to different shells in the shell model. The
                                                                        integration requires merging them into a single store. An
    This scenario focuses on the merging of manually                    interesting side-effect of the use of RDF and reification
assigned subject headings with automatically assigned                   is that the merged data is still accessible from every
   5 See http://www.w3.org/2004/03/trix/ for a summary. There are       application that is able to use RDF data, even if it is not
already further extensions or generalizations of Named Graphs, like     possible to make reasonable use of our metametadata.
Networked Graphs [3] that allow the expression of views in RDF          This is demonstrated by the first query in Example 4,
graphs in a declarative way. Flouris et al. propose a generalization
to maintain the information associated with graphs, when different
                                                                        which retrieves all subject headings that are assigned to
graphs are mixed [4]: Here, colors are used to identify the origin of   a document. As in RDF all statements are considered
a triple, instead of names. A notion of “Color1+Color2” is possible     identical that have the same subject, predicate and object,
and the paper demonstrates, how reasoning can be used together with
these colored triples. Gandon and Corby published a position paper
                                                                        every subject heading is returned that is assigned by at
[5] about the need for a mechanism like Named Graphs and a proper       least one source. In most cases, these completely merged
standardization as part of RDF.                                         statements are not wanted. As promised, we show with
   6 You can see the grouping of statements in a single RDF-XML file
                                                                        the second query in Example 4 that we are able to regain
as the notion of an implicit graph and use the URI of the RDF-XML
file to specify further statements about this graph, just like Ex. 2    all annotations that were assigned by a specific source
   7 First published at the DC 2009 conference [6].                     (here ex:sources/pfeffer).
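
The behaviour of the two queries discussed in this use-case can be imitated without a triple store. The following Python sketch holds the four subject assignments of Example 3 together with their source and rank, then derives (a) the fully merged set of subject headings and (b) the headings assigned by one specific source. The dictionary layout and the function names are illustrative assumptions; only the identifiers and values are taken from Example 3.

```python
# Sketch: the merged store of Example 3, each plain triple kept together
# with its metametadata (source and rank). Layout and names are hypothetical.
DOC = "ex:docbase/doc1"
statements = [
    {"triple": (DOC, "dc:subject", "ex:thes/sub20"),
     "source": "ex:sources/autoindex1", "rank": 0.55},
    {"triple": (DOC, "dc:subject", "ex:thes/sub30"),
     "source": "ex:sources/autoindex1", "rank": 0.8},
    {"triple": (DOC, "dc:subject", "ex:thes/sub30"),
     "source": "ex:sources/pfeffer", "rank": 1.0},
    {"triple": (DOC, "dc:subject", "ex:thes/sub40"),
     "source": "ex:sources/pfeffer", "rank": 1.0},
]

def all_subjects(store):
    """Merged view: identical triples from different sources collapse into one."""
    pairs = set()
    for st in store:
        s, p, o = st["triple"]
        if p == "dc:subject":
            pairs.add((s, o))
    return pairs

def subjects_from(store, source):
    """Regain only the assignments made by one source."""
    return all_subjects([st for st in store if st["source"] == source])

merged = all_subjects(statements)
pfeffer_only = subjects_from(statements, "ex:sources/pfeffer")
print(sorted(merged))
print(sorted(pfeffer_only))
```

Because the merged view is a set, ex:thes/sub30 appears once even though two sources assigned it, while the per-source view still separates the assignments.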
SELECT ?document ?value WHERE {
  ?t rdf:subject ?document .
  ?t rdf:predicate dc:subject .
  ?t rdf:object ?value .
}
  document        subject
  ex:docbase/doc1 ex:thes/sub40
  ex:docbase/doc1 ex:thes/sub30
  ex:docbase/doc1 ex:thes/sub20

SELECT ?document ?value WHERE {
  ?t rdf:subject ?document .
  ?t rdf:predicate dc:subject .
  ?t rdf:object ?value .
  ?t ex:source <ex:sources/pfeffer> .
}
  document        subject       source
  ex:docbase/doc1 ex:thes/sub40 ex:sources/pfeffer
  ex:docbase/doc1 ex:thes/sub30 ex:sources/pfeffer

      Example 4: Querying the merged statements

B. Use-case 2: Extended queries on the merged annotations

SELECT DISTINCT ?document ?subject WHERE {
  ?t rdf:subject ?document .
  ?t rdf:predicate dc:subject .
  ?t rdf:object ?subject .
  ?t ex:source ?source .
  ?source ex:type ?type .
  FILTER ( ?type = <ex:types/manual> )
}
  document        subject       type
  ex:docbase/doc1 ex:thes/sub40 http://example.org/types/manual
  ex:docbase/doc1 ex:thes/sub30 http://example.org/types/manual

SELECT DISTINCT ?document ?subject WHERE {
  ?t rdf:subject ?document .
  ?t rdf:predicate dc:subject .
  ?t rdf:object ?subject .
  ?t ex:source ?source .
  ?source ex:type ?type .
  ?t ex:rank ?rank .
  FILTER ( ?type = <ex:types/manual> || ?rank > 0.7 )
}
  document        subject       rank
  ex:docbase/doc1 ex:thes/sub40 1.0
  ex:docbase/doc1 ex:thes/sub30 1.0
  ex:docbase/doc1 ex:thes/sub30 0.8

      Example 5: Ranked assignments and additional source information
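
The two filters used in Example 5 (restriction to manually working sources, and a rank threshold of 0.7 that also admits automatic assignments) can be imitated in a few lines of Python. This is only a sketch over the values of Example 3; the tuple layout, the lookup table and the function name `select` are assumptions made for illustration.

```python
# Sketch: the two filters of Example 5 over the assignments of Example 3.
# Data layout and function name are hypothetical.
SOURCE_TYPE = {
    "ex:sources/autoindex1": "ex:types/auto",
    "ex:sources/pfeffer": "ex:types/manual",
}

# (subject heading, source, rank) for document ex:docbase/doc1
ASSIGNMENTS = [
    ("ex:thes/sub20", "ex:sources/autoindex1", 0.55),
    ("ex:thes/sub30", "ex:sources/autoindex1", 0.8),
    ("ex:thes/sub30", "ex:sources/pfeffer", 1.0),
    ("ex:thes/sub40", "ex:sources/pfeffer", 1.0),
]

def select(assignments, min_rank=None):
    """Keep manual assignments; if min_rank is given, also keep any
    assignment whose rank is strictly greater than min_rank."""
    rows = set()
    for subject, source, rank in assignments:
        manual = SOURCE_TYPE[source] == "ex:types/manual"
        if manual or (min_rank is not None and rank > min_rank):
            rows.add((subject, rank))
    return rows

# First filter: manually assigned headings only (favours precision).
manual_only = {subject for subject, _ in select(ASSIGNMENTS)}

# Second filter: manual assignments or any assignment ranked above 0.7.
ranked = select(ASSIGNMENTS, min_rank=0.7)

print(sorted(manual_only))
print(sorted(ranked))
```

Lowering `min_rank` widens the result towards recall; with 0.7 the automatic assignment of ex:thes/sub20 (rank 0.55) stays excluded.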
    In the following we show two extended queries that
make use of the metametadata provided in our data store.
Usually, one does not simply want to separate annotation     enough to enable applications to integrate this
sets that have been merged, but instead wants to make        information automatically without knowing how the
further use of these merged annotations. For example,        information is actually represented from a data model
we can provide data for different retrieval needs.           perspective.
    The first query in Example 5 restricts the subject           An implementation with clear semantics for metadata
headings to manually assigned ones, but they still can       provenance statements is included in the protocol for
originate from different sources. This would be useful if    metadata harvesting by the Open Archives Initiative [10]
we are interested in a high retrieval precision and assume   (OAI-PMH). But the provenance information
that the results of the automatic indexers decrease the      can only be provided for a whole set of metadata and
precision too much.                                          there is no easy way to extend it with other additional
    The second query, on the other hand, takes automatic     information. The Open Archives Initiative provides with
assignments into account, but makes use of the rank that     Object Reuse and Exchange (ORE) another, more ab-
is provided with every subject heading. This way, we         stract approach that addresses the requirement of prove-
can decide to which degree the retrieval result should be    nance information for aggregations of metadata [11].
extended by lower ranked subject headings, be they as-       ORE particularly introduces and motivates the idea to
signed by untrained people (tagging) or some automatic       give metadata aggregations specific URIs to identify
indexer.                                                     them as independent resources. Essentially, ORE pos-
                                                             tulates the clear distinction between URIs identifying
                     III. RELATED WORK                       resources and URIs identifying the description of the
                                                             resources. This is in line with the general postulation
    Early initiatives to define a vocabulary and usage       of “Cool URIs” [12] and the proposed solution to the
guidelines for the provenance of metadata were the A-Core    so-called httpRange-14 issue8.
[7] and, based on it, the proposal [8] for the DCMI             Hillmann et al. [13] considered the problem of
Administrative Metadata Working Group                        metadata quality in the context of metadata aggregation.
(http://dublincore.org/groups/admin/). The working group     While mainly focused on the practical problems of
finished its work in 2003 and presented the Administrative
Components (AC) in [9], that addressed metadata for             8 httpRange-14             (http://www.w3.org/2001/tag/issues.html#
the entire record, for update and change and for batch       httpRange-14) was one (the 14th) of several issues that the Technical
interchange of records. Both initiatives focused more on     Architecture Group (TAG) of the W3C had to deal with: “What
                                                             is the range of the HTTP dereference function?” Basically, the
the definition of specific vocabularies to describe the      problem is that if a URI identifies a resource other than a webpage
provenance of metadata. There was not yet a concise          (non-information resource), then under this URI, no information
model to relate the metametadata with the metadata.          about the resource can be provided, because in this case, the URI
                                                             would also be the identifier for this information. The solution
For example, there was only an example given, how            is to use HTTP redirects in this case, as described in this mail:
to use the AC in an XML representation. This is not          http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
aggregation, the paper addresses the aspect of subsequent          The most powerful means to dealing with metameta-
augmentation with subject headings and changes the             data in OWL is the use of higher-order logics, which
emphasis from the record to the individual statement.          is supported, e.g., by OWL Full. However, as this type
Noting provenance and means of creation on this level          of metamodeling comes at the expense of decidability
of detail is considered necessary by the authors. They         [20], weaker forms of metamodeling such as punning,
proposed an extension of OAI-PMH to implement their            a restricted way of using identical names for different
solution. Hillmann [14] further expands on quality issues      types of entities (e.g. classes and individuals), have been
and notes inconsistent use of metadata fields and the lack of  proposed by the OWL community. In OWL 2, annotation
bibliographic control among the major problems. Pre-           properties can be used to make statements about entities,
serving provenance information at the repository, record       axioms and even annotations, but as annotation properties
or statement level is one of the proposed methods to           do not have a defined semantics in OWL, integrated
ensure consistent metadata quality.                            reasoning over the various layers of metadata requires ad-
                                                               ditional implementation effort [21]. Vrandecic et al. [22]
    Currently, the W3C Provenance Incubator Group              discuss different metamodeling options by virtue of sev-
(Prov-XG, http://www.w3.org/2005/Incubator/prov/) ad-          eral use cases, including the representation of uncertainty
dresses the general issue of provenance on the web.            in ontology learning [1], as well as ontology evaluation
The requirements abstracted from various use-cases are         based on OntoClean (see also [23]). In addition to these
summarized and further explained by Zhao et al.                application scenarios, weak forms of metamodeling in
[15]. The conclusion of this paper is basically ours:          OWL are used, e.g., for including linguistic information
We need further standardization for the representation         in ontologies [24], but only few of these approaches are
of provenance information for interoperable provenance-        able to leverage the full power of logical inference over
aware applications. They recommend that a possible next        both metadata and metametadata [25].
RDF standard should address the provenance issue.

    Lopes et al. [16] emphasize the need for additional                               IV. CONCLUSION
information as well, they refer to them as annotations and
examine the need for annotations without consideration             This paper is meant as a discussion paper. We have
of the actual implementation - be it reification or named      proposed five principles for the proper representation of
graphs. They come up with five types of annotations            metametadata which, in our opinion, have to be met by
– time, spatial, provenance, fuzzy and trust – that can        all source-agnostic, yet provenance-aware, linked data
be seen as the most obvious use-cases for additional           applications.
information.
                                                                   We have demonstrated that the technical requirements
    A general model for the representation of provenance       can already be met, and that the remaining problem is
information as well as a review of provenance-related          concerned with the establishment of conventions which
vocabularies is provided in [17]. The model aims to            define best-practice recommendations. In particular, these
represent the whole                                            conventions should clarify how the metametadata is
process of data creation and access, as well as the            actually represented – so that an application can become
publishing and obtaining of the associated provenance          aware of this metametadata, retrieve, maintain and repub-
information.                                                   lish it in a proper way. Currently, there is no accepted
                                                               best-practice that follows our principles. We are involved
    The Open Provenance Model (OPM,                            in the Metadata Provenance Taskgroup of the Dublin
http://openprovenance.org/) is a specification for a           Core Metadata Initiative9 which aims to develop such
provenance model that meets the following requirements [18]:   best-practice recommendations in an as-open-as-possible
Exchange of provenance information, building of applications   way. This is why we are seeking feedback, ideas
on top of OPM, definition of provenance independent from a     and contributions to the ongoing discussions and the
technology, general applicability, multiple                    outcomes of this task group – because we want metadata
levels of descriptions. Additionally, a core set of rules is   provenance. Now!
defined that allows identifying valid inferences that can be
made on the provenance representation.                            Acknowledgements. Johanna Völker is financed by
                                                               a Margarete-von-Wrangell scholarship of the European
   Finally, a comprehensive survey about publications          Social Fund (ESF) and the Ministry of Science, Research
on provenance on the web is provided in [19], which also       and the Arts Baden-Württemberg.
mentions approaches to modeling provenance in OWL
ontologies.                                                      9 http://dublincore.org/groups/provenance/
                          REFERENCES                                      [14] D. I. Hillmann, “Metadata Quality: From Evaluation to
                                                                               Augmentation,” Cataloging & Classification Quarterly, vol. 46,
                                                                               no. 1, 2008. [Online]. Available: http://ecommons.library.cornell.
 [1] P. Haase and J. Völker, “Ontology learning and reasoning – deal-         edu/bitstream/1813/7899/1/Metadata Quality rev.pdf
     ing with uncertainty and inconsistency,” in Uncertainty Reasoning
     for the Semantic Web I, ser. Lecture Notes in Computer Science.      [15] J. Zhao, C. Bizer, Y. Gil, P. Missier, and S. Sahoo, “Provenance
     Springer, 2008, vol. 5327, pp. 366–384.                                   Requirements for the Next Version of RDF,” in Proceedings of
                                                                               the W3C Workshop - RDF Next Steps, June 26-27 2010, hosted by
 [2] J. J. Carroll, C. Bizer, P. Hayes, and P. Stickler, “Named Graphs,        the National Center for Biomedical Ontology (NCBO), Stanford,
     Provenance and Trust,” in Proceedings of the 14th International           Palo Alto, CA, USA, 2010.
     Conference on World Wide Web (WWW) 2005, May 10-14, 2005,            [16] N. Lopes, A. Zimmermann, A. Hogan, G. Lukacsy, A. Polleres,
     Chiba, Japan, 2005, pp. 613–622.                                          U. Straccia, and S. Decker, “RDF Needs Annotations,” in Pro-
                                                                               ceedings of the W3C Workshop - RDF Next Steps, June 26-27
 [3] S. Schenk and S. Staab, “Networked Graphs: A Declarative                  2010, hosted by the National Center for Biomedical Ontology
     Mechanism for SPARQL Rules, SPARQL Views and RDF Data                     (NCBO), Stanford, Palo Alto, CA, USA, 2010.
     Integration on the Web,” in Proceedings of the 17th International
     Conference on World Wide Web (WWW) 2008, April 21-25, 2008,          [17] O. Hartig, “Provenance Information in the Web of Data,” in
     Beijing, China, 2008.                                                     Proceedings of the Workshop on Linked Data on the Web (LDOW)
                                                                               2009, April 20, 2009, Madrid, Spain. CEUR-WS, 2009, pp.
 [4] G. Flouris, I. Fundulaki, P. Pediaditis, Y. Theoharis, and                1–9. [Online]. Available: http://www.dbis.informatik.hu-berlin.
     V. Christophides, “Coloring RDF triples to capture provenance,”           de/fileadmin/research/papers/conferences/2009-ldow-hartig.pdf
     in ISWC ’09: Proceedings of the 8th International Semantic Web
     Conference. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 196–      [18] L. Moreau, B. Clifford, J. Freire, Y. Gil, P. Groth, J. Futrelle,
     212.                                                                      N. Kwasnikowska, S. Miles, P. Missier, J. Myers, Y. Simmhan,
                                                                               E. Stephan, and J. V. den Bussche, “The Open Provenance
 [5] F. Gandon and O. Corby, “Name That Graph - or the                         Model Core Specification (v1.1),” 2009. [Online]. Available:
     need to provide a model and syntax extension to specify                   http://openprovenance.org/
     the provenance of RDF graphs,” in Proceedings of the                 [19] L. Moreau, “The Foundations for Provenance on the Web,”
     W3C Workshop - RDF Next Steps, June 26-27 2010, hosted                    Foundations and Trends in Web Science, November 2009.
     by the National Center for Biomedical Ontology (NCBO),                    [Online]. Available: http://eprints.ecs.soton.ac.uk/18176/
     Stanford, Palo Alto, CA, USA, 2010. [Online]. Available:
     http://www.w3.org/2009/12/rdf-ws/papers/ws06/                        [20] B. Motik, “On the properties of metamodeling in OWL,” in
                                                                               International Semantic Web Conference, ser. Lecture Notes in
 [6] K. Eckert, M. Pfeffer, and H. Stuckenschmidt, “A Unified Ap-              Computer Science, Y. Gil, E. Motta, V. R. Benjamins, and M. A.
     proach For Representing Metametadata,” in DC-2009 Interna-                Musen, Eds., vol. 3729. Springer, 2005, pp. 548–562.
     tional Conference on Dublin Core and Metadata Applications,
     2009.                                                                [21] D. T. Tran, P. Haase, B. Motik, B. C. Grau, and I. Horrocks,
                                                                               “Metalevel information in ontology-based applications,” in Pro-
 [7] R. Iannella and D. Campbell, “The A-Core: Metadata about                  ceedings of the 23rd AAAI Conference on Artificial Intelligence
     Content Metadata,” 1999, Internet-Draft document. [Online].               (AAAI), Chicago, USA, July 2008.
     Available: http://metadata.net/admin/draft-iannella-admin-01.txt
                                                                          [22] D. Vrandecic, J. Völker, P. Haase, D. T. Tran, and P. Cimiano,
 [8] J. Hansen and L. Andresen, “Administrative Dublin Core (A-                “A metamodel for annotations of ontology elements in OWL DL,”
     Core) Element,” 2001.                                                     in Proceedings of the 2nd Workshop on Ontologies and Meta-
                                                                               Modeling. Karlsruhe, Germany: GI Gesellschaft für Informatik,
 [9] ——, “AC - Administrative Components: Dublin Core                          October 2006.
     DCMI Administrative Metadata,” 2003, final release of the            [23] C. Welty, “OntOWLClean: Cleaning OWL ontologies with OWL,” in
     Dublin Core Metadata Initiative Administrative Metadata                   Proceedings of the 2006 Conference on Formal Ontology in Infor-
     Working Group. [Online]. Available: http://www.bs.dk/standards/           mation Systems. Amsterdam, The Netherlands:
     AdministrativeComponents.htm                                              IOS Press, 2006, pp. 347–359.
[10] Open Archives Initiative, “The Open Archives Initiative Protocol     [24] P. Buitelaar, P. Cimiano, P. Haase, and M. Sintek, “Towards
     for Metadata Harvesting,” 2008, protocol Version 2.0 of                   linguistically grounded ontologies,” in ESWC, ser. Lecture Notes
     2002-06-14, Edited by Carl Lagoze, Herbert Van de Sompel,                 in Computer Science, L. Aroyo, P. Traverso, F. Ciravegna,
     Michael Nelson and Simeon Warner. [Online]. Available:                    P. Cimiano, T. Heath, E. Hyvönen, R. Mizoguchi, E. Oren,
     http://www.openarchives.org/OAI/openarchivesprotocol.html                 M. Sabou, and E. P. B. Simperl, Eds., vol. 5554. Springer,
                                                                               2009, pp. 111–125.
[11] ——, “Open Archives Initiative - Object Reuse and Exchange:
     ORE User Guide - Primer,” Open Archives Initiative, 2008,            [25] B. Glimm, S. Rudolph, and J. Völker, “Integrated metamodeling
     edited by: Carl Lagoze, Herbert van de Sompel, Pete Johnston,             and diagnosis in OWL 2,” in Proceedings of the International
     Michael Nelson, Robert Sanderson and Simeon Warner. [Online].             Semantic Web Conference (ISWC), November 2010, to appear.
     Available: http://www.openarchives.org/ore/1.0/primer

[12] W3C SWEO Interest Group, “Cool URIs for the Semantic Web:
     W3C Interest Group Note 03 December 2008,” 2008, edited
     by: Leo Sauermann and Richard Cyganiak. [Online]. Available:
     http://www.w3.org/TR/cooluris/

[13] D. I. Hillmann, N. Dushay, and J. Phipps, “Improving Metadata
     Quality: Augmentation and Recombination,” in Proceedings of
     the International Conference on Dublin Core and Metadata
     Applications. Dublin Core Metadata Initiative, 2004. [Online].
     Available: http://hdl.handle.net/1813/7897

</pre>