<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>XSLT Conversion between XLIFF and RDF</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Dimitra Anastasiou SFB/TR8 Computer Science/Languages Science University of Bremen Bremen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>86</fpage>
      <lpage>91</lpage>
      <abstract>
        <p>This paper focuses on the conversion between the open standard XML Localisation Interchange File Format (XLIFF) and the Resource Description Framework (RDF). XLIFF is a localisation standard supported by proprietary as well as free and open source software (FOSS) localisation tools, while RDF is a standard data model and a basic ingredient of the Semantic Web. We developed a converter based on the Saxon XSLT processor which translates XLIFF to RDF.</p>
      </abstract>
      <kwd-group>
        <kwd>Conversion</kwd>
        <kwd>Localisation</kwd>
        <kwd>Semantic Web</kwd>
        <kwd>Standards</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Generally speaking, standards incorporate a solid body of knowledge and provide a
unified framework. In addition, when metadata is standardised, resources can be
identified, catalogued, and processed faster and more efficiently. Although standards
as such benefit information management, in recent years too
many standards have evolved in information science. In our opinion, the existence of too
many standards, in tandem with the inflexible structure of some of them, adds
complexity and leads to a lack of interoperability; yet interoperability between Web
resources is crucial for communication between application components.</p>
      <p>This paper focuses on XLIFF<sup>1</sup> and RDF<sup>2</sup> and on the Saxon-based conversion from
the former to the latter. Our work is motivated by the insight that Web resources
should be multilingual and that XLIFF, as a localisation standard, can help localise
ontologies and thus create multilingual linked data. A wider range of users and
applications can then be reached. The automatic conversion from XLIFF into RDF
can be used as an API by both localisation tools and Semantic Web applications.</p>
      <p>In section 2 we describe related work on combining multilinguality with the
Semantic Web. Sections 3 and 4 provide some examples of XLIFF and RDF.
Section 5 discusses XLIFF-RDF interoperability, after which we conclude the paper.</p>
      <p>
        In 2004, [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] stated that Human Language Technology faces new multilingual and
multicultural challenges for the Semantic Web and presented relevant ongoing
initiatives. One year later, [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] pointed out the usefulness of a multilingual Semantic
Web, particularly to help translate websites through the use of ontologies, manage
group knowledge in multilingual form, and create an international communication base
for industry and commerce. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] used the Universal Networking Language (UNL) as a
step between the process of acquiring knowledge from textual sources and translating
it into one of the state-of-the-art knowledge representation formalisms for building
multilingual ontologies.
      </p>
      <p>The Multilingual Semantic Web workshop started in 2010 and continues with
annual workshops; the same holds for the XLIFF International Symposium. Several
research projects, namely the Multilingual Web<sup>3</sup>, Flarenet<sup>4</sup>, META-NET<sup>5</sup>, and Monnet<sup>6</sup>, recognise
the symbiotic relationship between multilingual resources and the Semantic Web.</p>
      <p>
        As far as the conversion between XLIFF and other standards is concerned, the
Okapi Framework provides XLIFF conversion utilities, e.g. to Translation Memory
eXchange (TMX). [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] describes how to convert documents to XLIFF and back to the
original format through text extraction, pre-translation, translation, reverse
conversion, and translation memory improvement. A framework which combines
many localisation standards is the MultiLingual Information Framework (MLIF) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ];
an overview of localisation standards can be found in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. A model that has been
proposed to associate linguistic data to ontologies is the ‘Linguistic Information
Repository’ (LIR) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], designed to account for cultural and linguistic differences
among languages. Lemon<sup>7</sup> is another model for sharing lexical information on the
Semantic Web; noteworthy is the converter between lemon and the Lexical Markup
Framework (LMF).
      </p>
      <p>
        Our main motivation for XLIFF2RDF conversion is the concept of ‘ontology
localization’, a term coined by [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]: “Ontology Localization is the adaptation of an
ontology to a particular language and culture”. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] state that ontology localisation is
an activity with both pragmatic and economic goals. The former consists in
fostering the reuse of ontologies already available for the domain in question instead of
building them from scratch; the latter, a result of the former, consists in
reducing costs compared to building a completely new ontology.
      </p>
    </sec>
    <sec id="sec-2">
      <title>3 XLIFF</title>
      <p>XLIFF is an open localisation standard supported by proprietary and FOSS
localisation tools. It is under the auspices of OASIS and is understood by many
actors: software providers, localisation service providers, and localisation tools
providers. Semantic localisation metadata is very important in a localisation workflow
to distinguish between the responsibilities of each stakeholder (project manager,
engineer, translator, proofreader) and between translatable and non-translatable content,
to annotate the status of the strings (in the case of translatable content), and so on.
Particularly in software localisation, the coordinates of menus and dialogue boxes, version
control, and the number of screenshots are among the most important metadata. The following
example contains an XLIFF file with three translation units (TUs). TU elements
include a &lt;source&gt;, a &lt;target&gt;, and associated elements.</p>
      <p>3 http://www.multilingualweb.eu/en, 12/09/11</p>
      <p>4 http://www.flarenet.eu/, 12/09/11</p>
      <p>5 http://www.meta-net.eu/, 12/09/11</p>
      <p>6 http://www.monnet-project.eu/Monnet/Monnet/English?init=true, 12/09/11</p>
      <p>7 http://lexinfo.net/, 12/09/11</p>
      <preformat>1.  &lt;?xml version="1.0" encoding="UTF-8" ?&gt;
2.  &lt;xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2"&gt;
3.    &lt;file original="minimal_XLIFF.html" source-language="en-us" target-language="de-de" datatype="html"&gt;
4.      &lt;body&gt;
5.        &lt;trans-unit id="#1"&gt;
6.          &lt;source&gt;book&lt;/source&gt;
7.          &lt;target&gt;Buch&lt;/target&gt;
8.        &lt;/trans-unit&gt;
9.        &lt;trans-unit id="#2"&gt;
10.         &lt;source&gt;book publisher&lt;/source&gt;
11.         &lt;target&gt;Buchverlag&lt;/target&gt;
12.       &lt;/trans-unit&gt;
13.       &lt;trans-unit id="#3"&gt;
14.         &lt;source&gt;This book is good!&lt;/source&gt;
15.         &lt;target&gt;Dieses Buch ist gut!&lt;/target&gt;
16.       &lt;/trans-unit&gt;
17.     &lt;/body&gt;
18.   &lt;/file&gt;
19. &lt;/xliff&gt;</preformat>
      <p>Example 1. XLIFF file with three translation units. Line 1: XML declaration; Line 2: XLIFF root element with version and namespace; Line 3: file metadata; Lines 5-16: file data (three TUs).</p>
    </sec>
    <sec id="sec-2-rdf">
      <title>4 RDF</title>
      <p>RDF is a family of W3C specifications which describe Web resources. Here is a brief
explanation of Resource, Property, and Property value by means of the XLIFF file in Example 1:</p>
      <list list-type="bullet">
        <list-item><p>A Resource is anything that can have a URI, e.g. minimal_XLIFF.html;</p></list-item>
        <list-item><p>A Property is a Resource that has a name, such as trans-unit or source;</p></list-item>
        <list-item><p>A Property value is the value of a Property, such as This book is good!</p></list-item>
      </list>
      <p>Example 1 can be represented as an RDF graph as follows:</p>
      <sec id="sec-2-1">
        <title>Diagram 1. RDF graph of Example 1</title>
        <p>Accordingly, every XLIFF file can be represented as an RDF graph. The circles are
the resources, the labels on the arrows are the properties, and the contents of the
rectangles are the property values. idX is a placeholder for a resource representing
the body.</p>
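        <p>The reading of Example 1 as resources, properties, and property values can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the converter described in this paper: the function name xliff_to_triples is hypothetical, the graph is flattened (the intermediate body resource idX is skipped), and each translation unit is named by appending its id to the file's original attribute. The XML declaration of Example 1 is omitted from the embedded sample.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Example 1 from the text, without the XML declaration.
EXAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="minimal_XLIFF.html" source-language="en-us"
        target-language="de-de" datatype="html">
    <body>
      <trans-unit id="#1">
        <source>book</source>
        <target>Buch</target>
      </trans-unit>
      <trans-unit id="#2">
        <source>book publisher</source>
        <target>Buchverlag</target>
      </trans-unit>
      <trans-unit id="#3">
        <source>This book is good!</source>
        <target>Dieses Buch ist gut!</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def xliff_to_triples(xliff_text):
    """Return (resource, property, value) tuples for one XLIFF document.

    Hypothetical helper: the naming of resources is an assumption made
    for this sketch, not a mapping defined by XLIFF or RDF.
    """
    ns = {"x": XLIFF_NS}
    root = ET.fromstring(xliff_text)
    triples = []
    for file_el in root.findall("x:file", ns):
        resource = file_el.get("original")
        # File-level metadata becomes properties of the file resource.
        for name in ("source-language", "target-language", "datatype"):
            value = file_el.get(name)
            if value is not None:
                triples.append((resource, name, value))
        # Each translation unit becomes a resource of its own.
        for unit in file_el.findall("x:body/x:trans-unit", ns):
            unit_resource = resource + unit.get("id")
            triples.append((resource, "trans-unit", unit_resource))
            for child in ("source", "target"):
                el = unit.find("x:" + child, ns)
                if el is not None:
                    triples.append((unit_resource, child, el.text))
    return triples


if __name__ == "__main__":
    for triple in xliff_to_triples(EXAMPLE):
        print(triple)
```
]]></preformat>
        <p>In this sketch the string values of source and target play the role of the rectangles in the graph, while the file and the translation units play the role of the circles.</p>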
        <p>Building a bridge for interoperability between RDF and other standards is
common: WSDL-RDF, RDF-Topic Maps, OWL-RDF, and others.
However, the standards which RDF can be converted from and into also come
from the Semantic Web world and not from the localisation scene.</p>
        <p>
          As far as the representation of multilingual information in RDF is concerned, RDF
uses the RFC 3066 standard (published in 2001) for the language tags of literals in
natural languages. The revision RFC 3066bis included productive use of language,
country and script codes. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] suggested a small change to the RDF model theory to
permit access to the language tag in the formal semantics, giving this ontology a
precise formal meaning; their approach defined a new property called rdflg:lang.
        </p>
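        <p>How a translation pair could surface as language-tagged literals may be illustrated with a short Python sketch that writes N-Triples lines. It rests on several assumptions: the function names are hypothetical, rdfs:label is chosen here only as a familiar labelling property, the subject IRI under example.org is a placeholder, and the escaping covers just backslash, double quote, and newline. The sketch uses plain language tags on literals and does not implement the rdflg:lang property mentioned above.</p>
        <preformat><![CDATA[
```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Apply minimal N-Triples string escaping (backslash, quote, newline)."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def tagged_label(subject_iri, text, lang):
    """Return one N-Triples line attaching a language-tagged label."""
    return '<%s> <%s> "%s"@%s .' % (
        subject_iri, RDFS_LABEL, escape_literal(text), lang)


def translation_pair(subject_iri, source, source_lang, target, target_lang):
    """Return two lines: the source text and its translation as labels."""
    return [
        tagged_label(subject_iri, source, source_lang),
        tagged_label(subject_iri, target, target_lang),
    ]


if __name__ == "__main__":
    # Placeholder IRI; the pair is the first translation unit of Example 1.
    subject = "http://example.org/minimal_XLIFF.html#1"
    for line in translation_pair(subject, "book", "en-us", "Buch", "de-de"):
        print(line)
```
]]></preformat>
        <p>The language tags are taken unchanged from the source-language and target-language attributes of the XLIFF file element, which already follow the same tagging scheme.</p>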
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5 Interoperability</title>
      <p>The greatest contribution of XLIFF is the nature of its content, i.e. the capture of
translation pairs, rather than the formalisation vehicle of the knowledge, be it XML or
RDF. We do not intend to reify XLIFF, but to make XLIFF portable to RDF. An
XLIFF2RDF mapping and conversion are useful for the following reasons:</p>
      <list list-type="roman-lower">
        <list-item><p>Any file format which can be converted into XLIFF can then be converted to RDF;</p></list-item>
        <list-item><p>RDF ontology labels can be translated using XLIFF;</p></list-item>
        <list-item><p>Web resources can be described by XLIFF metadata.</p></list-item>
      </list>
      <p>
        A practical implementation of interoperability between the standards XLIFF and
RDF(S) consists of two parts: mapping XLIFF elements and attributes
to RDF, and automatically converting from XLIFF into RDF. The mapping of three
XLIFF files has been described in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In order to cover more than three use cases,
automatic conversion is needed. We created XLIFF files for different use cases
and, accordingly, incremental EXtensible Stylesheet Language Transformations
(XSLTs) to translate them: a file with three translation units, a file with file
processing metadata, a file with alternative translations, a document containing two files,
and a modularised file containing a lot of metadata and inline markup.
A sample of an XSLT follows:
      </p>
      <preformat>&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:a="urn:oasis:names:tc:xliff:document:1.2"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xliff="http://docs.oasis-open.org/xliff/xliff-core/xliff-core.html#"&gt;
  &lt;xsl:template match="/"&gt;
    &lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"&gt;
      &lt;xliff:file&gt;
        &lt;xsl:attribute name="rdf:about"&gt;
          &lt;xsl:value-of select="a:xliff/a:file/@original"/&gt;
        &lt;/xsl:attribute&gt;
        &lt;xsl:attribute name="source-language"&gt;
          &lt;xsl:value-of select="a:xliff/a:file/@source-language"/&gt;</preformat>
      <p>Example 2. Sample of the XSLT</p>
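      <p>For readers less familiar with XSLT, the two steps visible in the sample (copying the original attribute of the file element into rdf:about and carrying over source-language) can be mirrored in a short Python sketch. It is an illustration only and not part of our converter: the function name file_to_rdf is hypothetical, the standard library module xml.etree.ElementTree stands in for the XSLT processor, and only the part of the transformation shown in Example 2 is covered. The namespace URIs are the ones declared in the sample.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

XLIFF_DOC_NS = "urn:oasis:names:tc:xliff:document:1.2"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XLIFF_VOCAB_NS = "http://docs.oasis-open.org/xliff/xliff-core/xliff-core.html#"

SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="minimal_XLIFF.html" source-language="en-us"
        target-language="de-de" datatype="html">
    <body/>
  </file>
</xliff>"""


def file_to_rdf(xliff_text):
    """Build an rdf:RDF element holding one xliff:file node (hypothetical helper)."""
    # Register prefixes so the serialised output uses rdf: and xliff:.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("xliff", XLIFF_VOCAB_NS)

    source_root = ET.fromstring(xliff_text)
    file_el = source_root.find("{%s}file" % XLIFF_DOC_NS)
    if file_el is None:
        raise ValueError("no file element found in the XLIFF document")

    rdf_root = ET.Element("{%s}RDF" % RDF_NS)
    node = ET.SubElement(rdf_root, "{%s}file" % XLIFF_VOCAB_NS)
    # Same role as xsl:attribute name="rdf:about" in the sample.
    node.set("{%s}about" % RDF_NS, file_el.get("original", ""))
    # Same role as xsl:attribute name="source-language" in the sample.
    source_language = file_el.get("source-language")
    if source_language is not None:
        node.set("source-language", source_language)
    return ET.tostring(rdf_root, encoding="unicode")


if __name__ == "__main__":
    print(file_to_rdf(SAMPLE))
```
]]></preformat>
      <p>As in the sample, the source-language attribute is written without a namespace prefix; a complete conversion would also have to handle the remaining file attributes and the translation units.</p>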
      <p>It should be mentioned that interoperability between
data based on standards differs from interoperability between the standards themselves. Conversion between
standards plays a small part within the wider scope of interoperability, which includes,
among others, supporting relevant standards and conforming to specifications.</p>
    </sec>
    <sec id="sec-4">
      <title>5.1 Converter</title>
      <p>The development of a conversion tool to translate from XLIFF into RDF automates
and thus accelerates the process. We used the NetBeans IDE to create a GUI for the
conversion tool (see Screenshot 1). For our conversion utilities we used the Saxon
Home Edition, version 9.3<sup>8</sup>. The Home Edition is an open source product available
under the Mozilla Public License. It provides implementations of XSLT 2.0, XQuery
1.0, and XPath 2.0 and is available for both Java and .NET. The user can input one or
more XLIFF files to the tool, convert them to RDF, and preview them.</p>
      <sec id="sec-4-1">
        <title>Screenshot 1. XLIFF2RDF conversion tool</title>
        <p>The converter is hosted on the Google code hosting<sup>9</sup> website. There users can freely get a local copy of the tool or create their own clone.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6 Discussion and Conclusion</title>
      <p>In this paper we discussed the interoperability between the localisation standard
XLIFF and RDF. We showed ongoing initiatives, projects, and tools combining
multilinguality with the Semantic Web. We developed a converter from XLIFF to RDF
by using and adapting the Java API of the XSLT processor Saxon. We wrote some
sample XLIFF files and adopted a modular transitional file provided in the
latest XLIFF specification in order to create corresponding XSLTs.</p>
      <p>In our opinion, localisation is often regarded only as a business strategy to increase
return on investment and not as a research field which can both enrich and gain from
the Semantic Web and Linked Data. Localisation standards, and particularly XLIFF,
have received little attention although they cover many actors’ needs.</p>
      <p>In the Semantic Web context, the natural language in which
ontology labels are provided is an arbitrary decision, and thus many researchers see the need for multilingual
ontologies; challenges such as cross-lingual mapping and translation follow from the existence
of multilingual ontologies. Our conversion tool contributes to building a bridge
between localisation and Semantic Web resources, so that localisation tools can
localise ontologies and Semantic Web resources can be populated with
localisation-related metadata. After the XLIFF2RDF conversion, metadata can be reused in the
Semantic Web to represent multilingual ontologies. The XLIFF2RDF conversion tool
is hosted on the Google code hosting website. There other users can freely get a local
copy of the tool; thus replication of the tool is allowed. The conversion tool fulfils its
basic requirement, i.e. XLIFF files are represented in RDF. Not only minimal XLIFF
examples with one TU, but also files with more TUs, with file processing metadata,
alternative translations, etc. can be successfully converted. Five use cases have been
successfully tested; however, we plan to convert more examples, both in number and in variety.
We also plan to extend the conversion API to other standards. In the first
place, we plan to translate from XLIFF into OWL. Interoperability between
other localisation and internationalisation standards is also among our future prospects. In
terms of quality assurance, existing validation tools will become part of our tool.</p>
      <p>8 http://saxon.sourceforge.net/, 12/09/11</p>
      <p>9 http://code.google.com/p/xliff-rdf/, 28/03/11</p>
      <p>Acknowledgment. We gratefully acknowledge the support of the Deutsche
Forschungsgemeinschaft (DFG) through the Collaborative Research Center SFB/TR 8
Spatial Cognition - Subproject I5-DiaSpace.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Declerck</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calzolari</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Lenci</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>Towards A Language Infrastructure for the Semantic Web</article-title>
          .
          <source>Proceedings of LREC</source>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Hahn</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Vertan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>Challenges for the Multilingual Semantic Web</article-title>
          .
          <source>Proceedings of the International MT Summit X</source>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cardeñosa</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gallardo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iraola</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>De la Villa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>A New Knowledge Representation Model to Support Multilingual Ontologies. A Case study</article-title>
          .
          <source>Proceedings of the International Conference on Information and Knowledge Engineering</source>
          ,
          <fpage>313</fpage>
          -
          <lpage>319</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Raya</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>XML Localisation Interchange File Format as an intermediate file format</article-title>
          .
          <source>IBM developerWorks</source>
          (
          <year>2004</year>
          ) http://www.maxprograms.com/articles/xliff.html
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cruz-Lara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bellalem</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Ducret</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Kramer</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <article-title>Standardizing the management and the representation of multilingual data: the MultiLingual Information Framework</article-title>
          .
          <source>International Workshop on Language Resources for Translation work, Research and Training</source>
          ,
          <fpage>35</fpage>
          --
          <lpage>38</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Anastasiou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morado Vázquez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>Localisation Standards and Metadata</article-title>
          .
          <source>Proceedings of the 4th Metadata and Semantics Research Conference (MTSR 2010)</source>
          ,
          <source>Communications in Computer and Information Science</source>
          , Springer,
          <fpage>255</fpage>
          --
          <lpage>276</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montiel-Ponsoda</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Aguado de Cea</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>Localizing Ontologies in OWL</article-title>
          .
          <source>Proceedings of the ISWC07 OntoLex workshop</source>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Suarez-Figueroa</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gomez-Perez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>First attempt towards a standard glossary of ontology engineering terminology</article-title>
          .
          <source>Proceedings of the 8th International Conference on Terminology and Knowledge Engineering</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montiel-Ponsoda</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Espinoza</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Gomez-Perez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>A note on ontology localization</article-title>
          .
          <source>Journal of Applied Ontology (JAO)</source>
          ,
          <volume>5</volume>
          (
          <issue>2</issue>
          ),
          <fpage>127</fpage>
          --
          <lpage>137</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Carroll</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Phillips</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <article-title>Multilingual RDF and OWL</article-title>
          .
          <source>The Semantic Web: Research and Applications, Lecture Notes in Computer Science</source>
          , Vol.
          <volume>3532</volume>
          /
          <year>2005</year>
          ,
          <fpage>15</fpage>
          --
          <lpage>19</lpage>
          (
          <year>2005</year>
          ). doi: 10.1007/11431053_8
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Anastasiou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>XLIFF Mapping to RDF</article-title>
          .
          <source>JIAL (The Journal of Internationalisation and Localisation)</source>
          , to appear (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>