<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>select * where { :I :trust :you } How to Trust Interlinked Multimedia Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Wolfgang Halb</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Hausenblas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Information Systems &amp; Information Management, JOANNEUM RESEARCH</institution>
          ,
          <addr-line>Steyrergasse 17, 8010 Graz</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <fpage>59</fpage>
      <lpage>65</lpage>
      <abstract>
        <p>Finding, accessing and consuming multimedia content on the Web is still a challenge. In this position statement we discuss three still widely neglected issues arising when one is interacting with multimedia content in social media environments: provenance, trust, and privacy. We will introduce a generic model allowing us to identify potential risks and problems, further discuss this model regarding multimedia content and finally outline how Semantic Web technologies can help.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation</title>
      <p>
        It is a trivial truth that, in order to use any kind of content or service on the Web, one
must know how to access it (that is, know its URI). What is true for the Web of
Documents is equally true for the Web of Data. With the rise of linked data [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] the situation has changed: publishing and consuming data is now straightforward.
Nevertheless, a number of issues remain in the discovery process. For example, with
our multimedia interlinking demonstrator CaMiCatzee [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] we have identified issues
around trust in, and belief of, information regarding linked data in general and
multimedia content in particular. CaMiCatzee allows FOAF-profile-based search for person
depictions on Flickr. However, when looking at Fig. 1, how can we find out whether one of the
depicted persons actually is Sir Tim Berners-Lee?
      </p>
      <p>Motivated by this observation, we address, from a practical point of view,
widely neglected issues that arise when one interacts with multimedia content on
social platforms:</p>
      <list list-type="bullet">
        <list-item><p>Provenance. Where does content stem from? Who provided annotations?</p></list-item>
        <list-item><p>Trust. Is a person who provided an annotation trustworthy? Is the interlinking legitimate?</p></list-item>
        <list-item><p>Privacy. When interacting with content, what are the consequences?</p></list-item>
      </list>
      <p>In the following, we first introduce a generic model addressing the three issues
listed above. Based on our experience with interlinked multimedia data, we then take a
detailed look at the consequences of applying this model to multimedia content.</p>
    </sec>
    <sec id="sec-2">
      <title>The Abstract Provenance-Trust-Privacy Model</title>
      <p>In order to identify issues with the usage of content on the Web, we have developed the
“Provenance-Trust-Privacy” (PTP) model (Fig. 2). The model covers two aspects:
real life and the online world (the Web). In the PTP model we deal with
three orthogonal yet interdependent dimensions: (i) the social dimension, (ii)
the interaction dimension, and (iii) the content dimension.</p>
      <p>The Social Dimension. The dotted arrows in Fig. 2 represent social relations between
humans, either in real life or online. While it is straightforward to deal with the
case in which people know each other in real life (and maybe continue this relation in the
online world), the other way round can cause substantial problems. For example, merely
having added someone to my “buddy list” on a social platform such as LinkedIn
does not mean that I really know the person or that this person also knows (or wants to know)
me.</p>
      <p>The Interaction Dimension. In the realm of the PTP model we understand the
interaction dimension as taking place online only. Referring again to Fig. 2, it covers everything that
happens between the user-agent (being instructed by a human) and other participants
(server, etc.) on the Web. In general, two interaction patterns for the discovery and
access of resources on the Web can be observed:</p>
      <list list-type="bullet">
        <list-item><p>A direct access of the content. In this case the URI of the content is known in
advance by the end-user, who instructs her user-agent to access the content. The URI
may originally stem from a newspaper advertisement, a friend may have pointed it
out in an e-mail, etc.</p></list-item>
        <list-item><p>An indirect access of the content by means of consulting an intermediary such as
a search engine, a recommendation system, etc.</p></list-item>
      </list>
      <p>Based on these two interaction patterns, four possible paths can be identified:</p>
      <list list-type="order">
        <list-item><p>The user-agent, equipped with the end-user’s profile and her desire, consults an
intermediary. For example, a user enters a search string into Google and is presented
with a list of URIs. The end-user happens to select a trustworthy source.</p></list-item>
        <list-item><p>The user-agent, equipped with a URI from the end-user, accesses a trustworthy
source.</p></list-item>
        <list-item><p>The user-agent, equipped with the end-user’s profile and her desire, consults an
intermediary. This time, the end-user happens to select a troublesome source.</p></list-item>
        <list-item><p>The user-agent, equipped with a URI from the end-user, accesses a troublesome
source.</p></list-item>
      </list>
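      <p>The four paths can be read as the combination of two access patterns with two kinds of source. The following is a minimal sketch of that reading; the names Access, Source, Path and enumerate_paths are our own illustrative choices and not part of the PTP model itself.</p>
      <preformat>
```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Access(Enum):
    INDIRECT = "via an intermediary (e.g., a search engine)"
    DIRECT = "via a URI known in advance"


class Source(Enum):
    TRUSTWORTHY = "trustworthy"
    TROUBLESOME = "troublesome"


@dataclass(frozen=True)
class Path:
    number: int
    access: Access
    source: Source

    @property
    def desirable(self) -> bool:
        # Only paths that end at a trustworthy source are desirable.
        return self.source is Source.TRUSTWORTHY


def enumerate_paths() -> list[Path]:
    # Numbering follows the text: for each kind of source,
    # the indirect access comes before the direct access.
    combos = product(Source, Access)
    return [Path(i, access, source)
            for i, (source, access) in enumerate(combos, start=1)]


for path in enumerate_paths():
    verdict = "desirable" if path.desirable else "to be avoided"
    print(f"{path.number}. {path.access.value}, "
          f"{path.source.value} source: {verdict}")
```
      </preformat>
      <p>In this sketch the access pattern alone never determines whether a path is desirable; only the kind of source reached does.</p>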
      <p>Obviously, the first two situations are desirable. The end-user has found (for example,
based on previous experience or trust in the search results) some content that
she can use and that does not cause her trouble (a virus, Trojan horse, etc.). Equally,
we aim to avoid the latter two cases, where the end-user finds
herself using content and/or services that are harmful and/or violate her privacy.</p>
      <p>The Content Dimension. Regarding the content dimension, one can generally state that
the more is known about the content, the easier it is to assess its usefulness and its
potential to cause damage. Wherever possible, we aim for self-descriptive
resources; that is, we require a minimum level of metadata to be available. In our case,
we focus on multimedia content. In the next section we therefore discuss
this kind of content, along with its metadata, in greater detail.</p>
    </sec>
    <sec id="sec-3">
      <title>Multimedia Content in the Provenance-Trust-Privacy Model</title>
      <p>In this position paper we focus on multimedia content. In the
following, we discuss characteristics of spatio-temporal multimedia content in the context of the
emerging interlinking multimedia effort<sup>1</sup>. Further, we apply the PTP
model introduced above to multimedia content and try to derive requirements for it.</p>
      <p>
        Characteristics of Multimedia Content and Multimedia Metadata. Multimedia content
has some specific characteristics that allow or require special treatment. We have
reported on this in detail elsewhere [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A basic observation, however, which affects
many parts of the interaction process, is that with multimedia content we are
almost always dealing with spatio-temporal dimensions.
      </p>
      <p>From the prosumer’s point of view, multimedia content is cheap to produce and
available in high volumes (mobile phones, etc.). Further, most of the current content
of this kind is publicly and freely available (Flickr, YouTube). Business models remain
vague. On the other hand, for professionally created content in very specific domains,
such as broadcasters’ archives, adult content, etc., the fees are considerable.</p>
      <p>Multimedia content is in general well suited to consumption in mobile environments (as
opposed to reading long text on a mobile device).</p>
      <p>In general it is hard and expensive to create good and detailed multimedia content
descriptions (for example in MPEG-7). This leads to problems with
fine-grained search and automated summaries.</p>
      <p>In Table 1 the characteristics discussed above are summarised and weighted
with regard to the content itself on the one hand and the metadata on the other hand.</p>
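      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <p>Overview of Multimedia Content Characteristics.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Issue</th><th>Content</th><th>Metadata</th><th>Remark</th></tr>
          </thead>
          <tbody>
            <tr><td>production (prosumers)</td><td>++</td><td>-</td><td>easy to produce high-volume content (e.g., mobile phone)</td></tr>
            <tr><td>production (professionals)</td><td>++</td><td>-</td><td>esp. high-level semantic content descriptions expensive</td></tr>
            <tr><td>consumption</td><td>++</td><td>-</td><td>easy to consume (also in mobile environments)</td></tr>
            <tr><td>search</td><td colspan="2">- - -</td><td>practically, only global descriptions are available</td></tr>
            <tr><td>summaries</td><td>- -</td><td>- -</td><td>little automation possible</td></tr>
          </tbody>
        </table>
      </table-wrap>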
      <p>Applying the PTP Model to Multimedia Content. With the characteristics
of multimedia content listed above in mind, we claim the following regarding the application of the
PTP model:</p>
      <list list-type="bullet">
        <list-item><p>Any solution addressing PTP issues must at least avoid accessing “bad” content
and should support the discovery and consumption of “good” content.</p></list-item>
        <list-item><p>Existing and deployed multimedia metadata formats (such as Exif, ID3, etc.) have
to be taken into account.</p></list-item>
        <list-item><p>The solution at hand needs to scale to the size of the Web.</p></list-item>
        <list-item><p>It must be practically relevant in terms of availability in widely used platforms such
as Drupal, MediaWiki, etc. (for example as a plug-in; in any case, it needs to be
integrated).</p></list-item>
        <list-item><p>Providers must be able to offer and administer it easily (e.g., “enable” it with little
configuration effort).</p></list-item>
        <list-item><p>Consumers must be able to use it in a non-disruptive way, for example as a part of
their everyday tools.</p></list-item>
      </list>
      <p><sup>1</sup> http://www.interlinkingmultimedia.info/</p>
      <p>In the next section we report on how Semantic Web technologies can be used in
combination with other, already deployed technologies (such as those for identification and
authentication) in order to address the requirements listed above.</p>
    </sec>
    <sec id="sec-4">
      <title>How Semantic Web Technology Can Help</title>
      <p>We strongly believe that Semantic Web technologies can help to realise a PTP model for
multimedia content. A great deal of research is already available<sup>2</sup>, though with little practical
impact so far. Apart from avoiding unreliable or even malicious content, the main
aim of applying PTP to multimedia content is to help the user find trustworthy
information sources. As a first step, we consider all content created by a trusted person
or authority to be trustworthy. This implies that techniques are needed
that can establish both the provenance of a piece of content and the identity of its producer.
Consequently, means must be provided for deciding which persons to trust.</p>
      <p>In the case of multimedia content it also has to be taken into account that
information associated with a single content item can potentially have a multitude of
contributors. A photo along with some metadata (title, description) on Flickr, for instance, might
be uploaded by the fictitious trusted user “T. Rustworthy”. Another user, “B. Adguy”,
could add a fake note about who is depicted in the photo. When accessing the photo and
the associated metadata it is thus not sufficient to consider only who contributed the
image; we also need to be able to determine who contributed the metadata about
it. Taking this further to video content, it might also be relevant to take into account who
contributed which parts of a video (consider, for example, advertisements inserted into
a video stream).</p>
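      <p>The point that the uploader and the metadata contributors must be assessed separately can be illustrated with a small data structure. This is only a sketch under our own assumptions: the class names, the property names, the example URI and the title value are invented for illustration, and the white list is a plain set of user names.</p>
      <preformat>
```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """One piece of information about a content item, with its contributor."""
    prop: str          # e.g. "title" or "depicts" (hypothetical property names)
    value: str
    contributor: str   # who asserted this statement


@dataclass
class ContentItem:
    uri: str
    uploader: str
    statements: list[Statement] = field(default_factory=list)

    def trusted_statements(self, trusted: set[str]) -> list[Statement]:
        # Keep only statements whose contributor is on the white list;
        # trusting the uploader alone is not sufficient.
        return [s for s in self.statements if s.contributor in trusted]


photo = ContentItem(
    uri="http://example.org/photo/1",  # hypothetical URI
    uploader="T. Rustworthy",
    statements=[
        Statement("title", "Conference dinner", "T. Rustworthy"),
        Statement("depicts", "Sir Tim Berners-Lee", "B. Adguy"),
    ],
)

trusted_users = {"T. Rustworthy"}
for s in photo.trusted_statements(trusted_users):
    print(f"{s.prop}: {s.value} (by {s.contributor})")
```
      </preformat>
      <p>Here the note added by “B. Adguy” is filtered out even though the photo itself was uploaded by a trusted user.</p>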
      <p>In the following we discuss already available technologies that may be able
to address the PTP issues along the three identified dimensions. To date, however, only
isolated solutions exist, and there is still a lack of systems that incorporate all
available technologies for the user’s benefit. We envision a framework that would allow
combining the technologies listed below and developing plug-ins for widely used platforms
(Flickr, YouTube, etc.) and systems (Drupal, etc.).</p>
      <p><sup>2</sup> http://www4.wiwiss.fu-berlin.de/bizer/SWTSGuide/</p>
      <p>
        Social Dimension. For identification as well as for authentication, several
technologies are available. A user can, for example, provide her OpenID<sup>3</sup> to identify herself to a
service. Further, OAuth<sup>4</sup> can be used for publishing and interacting with protected data.
Big players such as Google already offer support for the above-mentioned
technologies<sup>5</sup>. It seems advisable to build on this and to consider what might be missing
in order to align it with the Web of Data, which is based on RDF [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        While FOAF-based white listing and other related approaches [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ] are available,
there is still a need for an agreed-upon way to deploy them in widely used systems. The
same issue can be observed with privacy: there are proposals on the table (for example
P3P<sup>6</sup>), but no measurable uptake in terms of users, systems that offer it, etc. can be
observed.
      </p>
      <p>
        Interaction Dimension. Especially for data provenance, it seems to us that named graphs [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
offer a solid and scalable solution. With the rise of RDFa<sup>7</sup> one can think of new
provenance mechanisms, as the hosting document can actually be understood as the “name”
of the graph. Just imagine Flickr (which already offers licensing information in RDFa)
including provenance information on both the content and the metadata, based on
vocabularies such as the “Semantic Web Publishing Vocabulary” [
        <xref ref-type="bibr" rid="ref9">9</xref>
        , Chapter 6]. Finally, we
note that, regarding the discovery and usage of linked (multimedia) data, we are
currently working on VoiD, the “Vocabulary of Interlinked Datasets”<sup>8</sup>; again, provenance
and trust issues are in scope here.
      </p>
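      <p>To make the idea of the hosting document acting as the “name” of a graph more concrete, the following sketch stores statements as quads whose first component is the URI of the document they were found in, and records provenance as statements about those graph names. It is an illustration under our own assumptions only: it uses plain Python tuples rather than an RDF library, and all URIs and the ex: property names are hypothetical placeholders, not terms from the Semantic Web Publishing Vocabulary.</p>
      <preformat>
```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    graph: str      # name of the graph, here the URI of the hosting document
    subject: str
    predicate: str
    obj: str


# All URIs and property names below are hypothetical placeholders.
PHOTO_PAGE = "http://example.org/photos/42"
NOTE_PAGE = "http://example.org/photos/42/notes/7"
IMAGE = "http://example.org/photos/42#img"

quads = [
    Quad(PHOTO_PAGE, IMAGE, "ex:title", "Group photo"),
    Quad(NOTE_PAGE, IMAGE, "ex:depicts", "ex:SomePerson"),
    # Provenance is stated about the graph names themselves.
    Quad("ex:provenance", PHOTO_PAGE, "ex:assertedBy", "ex:T.Rustworthy"),
    Quad("ex:provenance", NOTE_PAGE, "ex:assertedBy", "ex:B.Adguy"),
]


def statements_by_source(data: list[Quad]) -> dict[str, list[Quad]]:
    """Group content statements by the agent who asserted their graph."""
    asserted_by = {q.subject: q.obj
                   for q in data if q.predicate == "ex:assertedBy"}
    grouped: dict[str, list[Quad]] = defaultdict(list)
    for q in data:
        if q.graph in asserted_by:
            grouped[asserted_by[q.graph]].append(q)
    return dict(grouped)


for agent, stmts in statements_by_source(quads).items():
    for q in stmts:
        print(f"{agent} asserts: {q.subject} {q.predicate} {q.obj}")
```
      </preformat>
      <p>Because every statement carries the name of its graph, a consumer can decide per hosting document, rather than per content item, whose assertions to accept.</p>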
      <p>
        Content Dimension. In our understanding, the content dimension of the PTP model
requires the most attention. Basic mechanisms proposed to represent the type of
multimedia content in RDF<sup>9</sup> are available. Still, practical ways of creating and consuming rich
multimedia content descriptions are missing. Recently, we have proposed ramm.x [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
which allows existing multimedia metadata formats such as MPEG-7, Exif, ID3,
etc. to be used in the Web of Data. However, we expect that a fair amount of further research will be
required to address provenance and trust issues properly and to make tools and applications
available in a real-world setup.
      </p>
      <p><sup>3</sup> http://openid.net/</p>
      <p><sup>4</sup> http://oauth.net/</p>
      <p><sup>5</sup> http://googledataapis.blogspot.com/2008/06/oauth-for-google-data-apis.html</p>
      <p><sup>6</sup> http://www.w3.org/P3P/</p>
      <p><sup>7</sup> http://www.w3.org/TR/xhtml-rdfa-primer/</p>
      <p><sup>8</sup> http://community.linkeddata.org/MediaWiki/index.php?VoiD</p>
      <p><sup>9</sup> http://www.w3.org/TR/Content-in-RDF/</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ayers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Raimond</surname>
          </string-name>
          .
          <article-title>Interlinking Open Data on the Web (Poster)</article-title>
          .
          <source>In 4th European Semantic Web Conference (ESWC2007)</source>
          , pages
          <fpage>802</fpage>
          -
          <lpage>815</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          .
          <article-title>How to Publish Linked Data on the Web</article-title>
          . http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausenblas</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Halb</surname>
          </string-name>
          .
          <article-title>Interlinking Multimedia Data</article-title>
          .
          <source>In Linking Open Data Triplification Challenge at the International Conference on Semantic Systems (I-Semantics08)</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>T.</given-names>
            <surname>Bürger</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausenblas</surname>
          </string-name>
          .
          <article-title>Why Real-World Multimedia Assets Fail to Enter the Semantic Web</article-title>
          .
          <source>In International Workshop on Semantic Authoring, Annotation and Knowledge Markup (SAAKM07)</source>
          , Whistler, Canada,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>G.</given-names>
            <surname>Klyne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>McBride</surname>
          </string-name>
          .
          <article-title>RDF/XML Syntax Specification (Revised)</article-title>
          .
          <source>W3C Recommendation</source>
          , RDF Core Working Group,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>J.</given-names>
            <surname>Golbeck</surname>
          </string-name>
          .
          <article-title>Combining Provenance with Trust in Social Networks for Semantic Web Content Filtering</article-title>
          .
          <source>In Provenance and Annotation of Data</source>
          ,
          <source>International Provenance and Annotation Workshop</source>
          , IPAW 2006, Chicago, IL, USA, May 3-5,
          <year>2006</year>
          , Revised Selected Papers, volume
          <volume>4145</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>101</fpage>
          -
          <lpage>108</lpage>
          . Springer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A.</given-names>
            <surname>Harth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Decker</surname>
          </string-name>
          .
          <article-title>Towards a social provenance model for the Web</article-title>
          .
          <source>In Workshop on Principles of Provenance (PrOPr)</source>
          , Edinburgh, Scotland,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hayes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Stickler</surname>
          </string-name>
          .
          <article-title>Named Graphs, Provenance and Trust</article-title>
          .
          <source>In WWW '05: Proceedings of the 14th international conference on World Wide Web</source>
          , pages
          <fpage>613</fpage>
          -
          <lpage>622</lpage>
          . ACM Press,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>Quality-Driven Information Filtering in the Context of Web-Based Information Systems</article-title>
          .
          <source>PhD thesis</source>
          , Freie Universität Berlin,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausenblas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Bailer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bürger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Troncy</surname>
          </string-name>
          .
          <article-title>Deploying Multimedia Metadata on the Semantic Web (Poster)</article-title>
          .
          <source>In 2nd International Conference on Semantics And digital Media Technologies (SAMT 07)</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>