<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Lifting Media Fragment URIs to the next level</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Thomas Kurz</string-name>
          <email>thomas.kurz@salzburgresearch.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Harald Kosch</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Salzburg Research</institution>
          ,
          <addr-line>Salzburg</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Passau</institution>
          ,
          <addr-line>Passau</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Media Fragment URI specification was released in 2012 and has been taken up by research and industry to some extent. Nevertheless its impact is weak in comparison to other W3C recommendations. Missing features, under-specified parts and a weak integration into common standards could be key issues for that. In this paper we describe possible extensions that strengthen the specification in these points in order to make Media Fragment URIs attractive for a broader community.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Media</kwd>
        <kwd>Media Fragment URI</kwd>
        <kwd>Media</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>handled directly on the media server. For Media Fragments this is important in
particular, as any preprocessing, e.g. spatial-temporal cropping, can
drastically decrease the size of the file that has to be transmitted to the client.
The working group describes four fragment dimensions, namely temporal,
spatial, track and id. The temporal dimension is specified as "an interval with a
begin time and an end time" and can be given in Normal Play Time (npt),
SMPTE timecodes, or as real-world clock time. E.g. video.mp4#t=1,10 defines
a temporal fragment of video.mp4 that starts at second 1 and ends at second
10.</p>
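      <p>A minimal sketch of temporal fragment parsing in Python may illustrate the npt syntax above. It handles only the plain npt seconds form; SMPTE timecodes and wall-clock times are out of scope, and the function name and error handling are ours:</p>

```python
from urllib.parse import urlparse

def parse_temporal_fragment(uri):
    """Parse an npt temporal fragment like video.mp4#t=1,10.

    Returns (start, end) in seconds; end is None for open intervals,
    and the result is None if no temporal fragment is present.
    """
    fragment = urlparse(uri).fragment
    for part in fragment.split('&'):
        name, _, value = part.partition('=')
        if name != 't':
            continue
        # npt is the default time scheme, so 'npt:' may be omitted
        value = value[4:] if value.startswith('npt:') else value
        start, _, end = value.partition(',')
        return (float(start) if start else 0.0,
                float(end) if end else None)
    return None

print(parse_temporal_fragment('video.mp4#t=1,10'))  # (1.0, 10.0)
```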
      <p>The spatial dimension "selects an area of pixels from visual media streams."
In the current version only the identification of rectangular regions is supported.
The units that are considered for spatial clipping are pixel and percentage. The
track and id dimensions are only weakly specified regarding semantics, but can
be used to identify a specific media layer and/or a specific predefined, named
section of the media item. In this paper we focus on extensions for
spatial-temporal fragments. Fragment dimensions can be combined in several ways to
describe e.g. spatial-temporal fragments, as outlined in Figure 1.
The Media Fragment URI standard is a compelling choice for media section
identification because of its ease of use and its seamless integration into well-known Web
infrastructure. Nevertheless its limitations often cause problems:
Imprecise spatial fragments: Spatial regions often cannot be sufficiently
specified with rectangles. This may cause problems when calculating relations
between fragments, e.g. if the bounding boxes of spatial objects overlap while
the objects themselves do not.</p>
      <p>Lacking support for moving objects: Spatial regions in videos rarely stay
on the same position during longer temporal sections (e.g. actors moving
around within a scene). To sufficiently describe such scenarios many short
spatial-temporal fragments have to be used, which leads to a big overhead
in data transfer and recombination.</p>
      <sec id="sec-1-1">
        <title>Missing styling support for fragment display</title>
        <p>While common web component styling is well supported via Cascading Style
Sheets (CSS), this support is currently lacking for Media Fragments. This limitation
causes a major programming overhead for projects and raises the barrier for using
the Media Fragment URI standard in productive web projects.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Media Fragment URI Extensions</title>
      <p>In the following we present how Media Fragment URIs can be extended in various
directions. A demo based on a basic implementation is available at
http://tkurz.github.io/media-fragment-uris-ideas/.</p>
      <sec id="sec-2-1">
        <title>Shape Extension</title>
        <p>
          Currently the Media Fragment URI spatial dimension is limited to rectangular
shapes (xywh). An extension to basic geometric shapes, such as circles, ellipses, etc.,
would allow a more fine-grained fragment description. In [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] we recommended,
inspired by the SVG Basic Shapes specification in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], four shapes in addition to or in substitution of xywh:
Rectangle: rect=x,y,w,h[,rx,ry]
The integers denote x, y, width, height and (optionally) the x and y radii (rx
and ry) of the ellipse used to round off the corners of the rectangle.
Circle: circle=x,y,r
The integers denote x and y as the center of the circle and r as the radius.
Ellipse: ellipse=cx,cy,rx,ry
The integers denote cx, cy (the center of the ellipse) and rx, ry (the radii of
the ellipse).
        </p>
        <p>Polygon: polygon=x1,y1(,xn,yn)*
The value is composed of 2*n comma-separated integers (with n ∈ ℕ). The
integers denote x1, y1 as the starting point and xn, yn as further points on the
polyline that borders the polygon; the polygon is closed automatically.</p>
        <p>The value may be prefixed with an optional format specifier pixel: or percent:;
the default format is pixel. We give an example of an ellipse fragment in Figure 2;
all other shapes work accordingly.</p>
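      <p>A parser for the proposed shape syntax could look as follows. This is a sketch over our reading of the grammar above; the arity checks (e.g. requiring at least three polygon points, and rx/ry as a pair for rect) are assumptions and not part of the current W3C recommendation:</p>

```python
import re

# Allowed integer counts per shape: rect may carry the optional
# rx,ry pair, circle takes x,y,r, ellipse takes cx,cy,rx,ry.
ARITY = {'rect': {4, 6}, 'circle': {3}, 'ellipse': {4}}

def parse_shape(value):
    """Parse e.g. 'circle=percent:50,50,25' into (name, unit, ints)."""
    name, _, rest = value.partition('=')
    unit = 'pixel'                          # pixel is the default format
    m = re.match(r'(pixel|percent):', rest)
    if m:
        unit, rest = m.group(1), rest[m.end():]
    coords = [int(v) for v in rest.split(',')]
    if name == 'polygon':
        # 2*n integers; we require n >= 3 so a real polygon results
        if len(coords) < 6 or len(coords) % 2:
            raise ValueError('polygon needs 2*n integers, n >= 3')
    elif len(coords) not in ARITY[name]:
        raise ValueError(f'wrong number of integers for {name}')
    return name, unit, coords

print(parse_shape('circle=percent:50,50,25'))
# ('circle', 'percent', [50, 50, 25])
```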
      </sec>
      <sec id="sec-2-2">
        <title>Transformation Extension</title>
        <p>Even with these shape extensions the identification of spatial fragments is
limited. Additionally, with regard to further extensions, for example animations, a
proper representation of shape transformation and translation is lacking. We aim
to overcome these limitations by introducing four shape transformations:
Translate: translate=x[,y] The integers denote x for horizontal and y
(optionally) for vertical translation.</p>
        <p>Scale: scale=x[,y] The integers denote x for horizontal and y (optionally)
for vertical scale.</p>
        <p>Rotate: rotate=a[,x,y] The integers denote a as rotation angle and x,y as
center of rotation. The default center is denoted by the center of the bounding
box of the region to rotate.</p>
        <p>Skew: skew=x[,y] The integers denote x for horizontal and y (optionally) for
vertical skew.</p>
        <p>Transformations in Media Fragment URIs are only considered if one and only
one shape is defined. Transformations can be stacked. If a transformation type
occurs more than once, only the first value is considered. As for shapes, the
value has an optional format specifier pixel: or percent:, the default being
pixel. Figure 3 shows a transformed shape.</p>
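        <p>The stacking and first-occurrence rules can be sketched as a small collector over a fragment's name-value pairs. The function and variable names are illustrative, not part of any specification:</p>

```python
def parse_transforms(params):
    """Collect stacked transformations from a fragment's name=value
    pairs. Per the proposed extension, only the first occurrence of
    each transformation type is considered; later duplicates are
    silently ignored.
    """
    known = {'translate', 'scale', 'rotate', 'skew'}
    transforms = {}
    for param in params:
        name, _, value = param.partition('=')
        if name in known and name not in transforms:
            unit = 'pixel'                  # pixel is the default format
            if value.startswith(('pixel:', 'percent:')):
                unit, _, value = value.partition(':')
            transforms[name] = (unit, [int(v) for v in value.split(',')])
    return transforms

params = ['circle=50,50,20', 'rotate=45', 'rotate=90', 'translate=pixel:10,5']
print(parse_transforms(params))
# {'rotate': ('pixel', [45]), 'translate': ('pixel', [10, 5])}
```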
      </sec>
      <sec id="sec-2-3">
        <title>Animated Transformation Extension</title>
        <p>The static shapes and transformations mainly target still images. But spatial
fragments often need to transform over time, e.g. in videos or interactive charts.
To satisfy this need we introduce animated transformations as a temporal extension
of the static ones.</p>
        <p>Animated Translate: aTranslate=d1,x1[,y1]*[;dn,xn[,yn]]
The value is an optional format specifier pixel: or percent: (defaulting to pixel) plus a
semicolon-separated list of comma-separated numbers. The first number of each
number set (d) is defined as a duration and may be given in percent (for videos)
or milliseconds (for images). The other numbers represent the translation as
specified above.</p>
        <p>Animated Scale: aScale=d1,x1[,y1]*[;dn,xn[,yn]]
Analogous to animated translate.</p>
        <p>Animated Rotate: aRotate=d1,r1[,x1,y1]*[;dn,rn[,xn,yn]]
Analogous to animated translate.</p>
        <p>Animated Skew: aSkew=d1,x1[,y1]*[;dn,xn[,yn]]
Analogous to animated translate.</p>
        <p>
          Animated transformations in Media Fragment URIs are only considered if one
and only one shape is defined. Animated transformations can be stacked. If an
animated transformation type occurs more than once, only the first value is
considered. Figure 4 shows how a spatial fragment is animated over time in
both scale and translation. In this case there is no transformation until 45% of
the temporal fragment (3.5 seconds overall); in the next 10% of the time the shape
translates to the south-west and scales to 70%. During the remaining time there is
no transformation.
The current standard as well as the proposed extensions still lack a proper formal
description, which makes it hard to apply set operations like intersection, union,
etc. to Media Fragments. In [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] we already worked out such a model, which can
be a basis for further specification.
        </p>
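        <p>The semantics can be illustrated by evaluating an aTranslate value at a point in time. Reading each segment as animating linearly from the previous segment's end value to its own target is our interpretation of the proposal, and the segment values below are hypothetical (loosely modeled on the Figure 4 scenario, with y growing downwards as in screen coordinates):</p>

```python
def a_translate_at(spec, t):
    """Evaluate an aTranslate value such as '45,0,0;10,-20,20;45,-20,20'
    at relative time t (0..100, durations in percent).

    Each segment (d, x, y) interpolates linearly from the previous
    segment's end translation to (x, y) over duration d.
    """
    segments = []
    for part in spec.split(';'):
        nums = [float(v) for v in part.split(',')]
        d, x = nums[0], nums[1]
        y = nums[2] if len(nums) > 2 else 0.0   # y is optional
        segments.append((d, x, y))
    elapsed, px, py = 0.0, 0.0, 0.0             # start untranslated
    for d, x, y in segments:
        if t <= elapsed + d:
            f = (t - elapsed) / d if d else 1.0
            return (px + f * (x - px), py + f * (y - py))
        elapsed, px, py = elapsed + d, x, y
    return (px, py)                             # hold the last value

# Halfway through the middle segment: half of the (-20, 20) shift.
print(a_translate_at('45,0,0;10,-20,20;45,-20,20', 50))  # (-10.0, 10.0)
```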
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Related approaches</title>
      <p>At http://github.com/tomayac/dynamic-media-fragments the author describes
how spatial Media Fragments (xywh) can be extended with temporal dynamics by
stringing together quadruples, each of which identifies a rectangular shape.
The shapes are equally distributed in time (represented by a temporal fragment
or the whole video play time). The approach is aligned with CSS transitions
and as such fits smoothly into current browser animation implementations. To
extend the approach from equal to fixed distribution, the author suggests
extending the quadruples with a micro syntax representing the time in percent.
Another interesting approach is described at
https://github.com/oaubert/mediafragment-prototype. The author introduces a
new fragment parameter shape, which represents the spatial dimension and uses
SVG path definitions as values. The main difference to our approach is the fact
that shapes are not first-class entities (defined by a name-value pair) but values
of one spatial dimension descriptor. For the temporal dynamics the author
introduces a trajectory parameter with an SVG path value, which makes the defined
shape follow the given path within a given temporal fragment. The author also
suggests extending both the shape and the trajectory values to basic SVG shapes.</p>
    </sec>
    <sec id="sec-4">
      <title>Styling Media Fragments</title>
      <p>To make Media Fragment URIs more attractive for Web designers and
developers, an integration into common Web standards is essential. In this section we
sketch how CSS can be adapted to Media Fragments for both the spatial and the
temporal dimension. In our approach the fragment can be seen as an element that
is contained within a layer on top of the original media item (image or video). To
access this element we introduce a pseudo selector ::fragment. To allow access
to the layer itself without the fragment (e.g. in order to set the opacity of the
media item without influencing the styling of the fragment), we define a second
pseudo selector ::fragment-outer. In the example, we set the opacity of this
outer fragment and a red border on the fragment itself:</p>
      <p>img {
  background-color: white;
}
img::fragment {
  border: 1px solid red;
}
img::fragment-outer {
  opacity: 0.5;
}</p>
      <p>Figure 5 shows the result of the styling. A special case is the clipping of a
fragment, which means that the fragment becomes the first-class entity regarding
display. This can be solved by setting the display attribute of the media item
to none while keeping the fragment shown with display: block. The pseudo
selector also allows accessing and altering the fragment programmatically, e.g.
with JavaScript. As the fragment is handled like a common HTML element, it is
possible to add sub-elements, e.g. spans with description text.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper we proposed extensions of the current Media Fragment URI
specification regarding additional shapes, transformations and dynamics. Additionally
we sketched an extension of the CSS standard to properly add style information
to Media Fragments. The whole work is still in an early phase, but it can be seen as
a starting point for discussion, to present the opportunities of Media Fragment
URIs to a broader community and to trigger a process that lifts the standard to
the next level.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The paper is mainly inspired by the discussions at the WWW conference in
2015 (https://lists.w3.org/Archives/Public/public-media-fragment/2015May/0003.html).
This paper was developed within MICO, a research project partially funded
by the European Commission 7th FP (grant agreement no: 610480).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Aichroth</surname>
          </string-name>
          , Johanna Björklund, Kai Schlegel, Thomas Kurz, and Thomas Köllmer.
          <article-title>Specifications and Models for Cross-Media Extraction, Metadata Publishing, Querying and Recommendations - Final Version</article-title>
          .
          <source>Technical report</source>
          , Media in Context - MICO,
          <year>December 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Werner</given-names>
            <surname>Bailer</surname>
          </string-name>
          , Chris Poppe, WonSuk Lee,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Höffernig</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Florian</given-names>
            <surname>Stegmaier</surname>
          </string-name>
          .
          <article-title>Metadata API for media resources 1.0</article-title>
          . W3C recommendation, W3C,
          <year>March 2014</year>
          . http://www.w3.org/TR/2014/REC-mediaont-api-1.0-20140313/.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fielding</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Masinter</surname>
          </string-name>
          .
          <article-title>Uniform Resource Identifier (URI): Generic Syntax</article-title>
          .
          <source>RFC 3986</source>
          , Network Working Group,
          <year>January 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Jon</given-names>
            <surname>Ferraiolo</surname>
          </string-name>
          .
          <article-title>Scalable Vector Graphics (SVG) 1.0 specification</article-title>
          . W3C recommendation, W3C,
          <year>September 2001</year>
          . http://www.w3.org/TR/2001/REC-SVG-20010904.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Kurz</surname>
          </string-name>
          , Georg Güntner, Violeta Damjanovic,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Schaffert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Manuel</given-names>
            <surname>Fernandez</surname>
          </string-name>
          .
          <article-title>Semantic enhancement for media asset management systems</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          ,
          <year>2012</year>
          . doi:10.1007/s11042-012-1197-7.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Kurz</surname>
          </string-name>
          , Kai Schlegel, and
          <string-name>
            <given-names>Harald</given-names>
            <surname>Kosch</surname>
          </string-name>
          .
          <article-title>Enabling access to Linked Media with SPARQL-MM</article-title>
          .
          <source>In Proceedings of the 24th International Conference on World Wide Web (WWW2015) Companion (LIME15)</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Lyndon J. B.</given-names>
            <surname>Nixon</surname>
          </string-name>
          , Matthias Bauer, Cristian Bara, Thomas Kurz, and
          <string-name>
            <given-names>John</given-names>
            <surname>Pereira</surname>
          </string-name>
          .
          <article-title>ConnectME: Semantic tools for enriching online video with web content</article-title>
          . In Steffen Lohmann and Tassilo Pellegrini, editors, I-SEMANTICS (Posters &amp; Demos), volume
          <volume>932</volume>
          of
          <source>CEUR Workshop Proceedings</source>
          , pages
          <fpage>55</fpage>
          -
          <lpage>62</lpage>
          . CEUR-WS.org,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Robert</given-names>
            <surname>Sanderson</surname>
          </string-name>
          , Paolo Ciccarese, and
          <string-name>
            <given-names>Benjamin</given-names>
            <surname>Young</surname>
          </string-name>
          .
          <article-title>Web annotation data model</article-title>
          . W3C Working Draft,
          <year>October 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Raphaël</given-names>
            <surname>Troncy</surname>
          </string-name>
          , Davy Van Deursen, Erik Mannens, and
          <string-name>
            <given-names>Silvia</given-names>
            <surname>Pfeiffer</surname>
          </string-name>
          .
          <source>Media Fragments URI 1.0 (basic)</source>
          . W3C recommendation, W3C,
          <year>September 2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>