<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>XKOS: Extending SKOS for Describing Statistical Classifications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Franck Cotton</string-name>
          <email>franck.cotton@insee.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel W. Gillman</string-name>
          <email>Gillman.Daniel@bls.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yves Jaques</string-name>
          <email>jaques@unfpa.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institut National de la Statistique et des Études Économiques</institution>
          ,
          <addr-line>Paris</addr-line>
          ,
          <country>France</country>
          ;
          <institution>US Bureau of Labor Statistics</institution>
          ,
          <addr-line>Washington</addr-line>
          ,
          <country>USA</country>
          ;
          <institution>UN Population Fund</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2004</year>
      </pub-date>
      <abstract>
        <p>A statistical classification scheme is used in statistics to place units in one and only one category from that scheme. These categories are used as units of analysis; dimensions in databases, tables, and time series; criteria for subdividing or aggregating populations; etc. SKOS provides a simple model for rendering a statistical classification scheme in Linked Open Data (LOD) format. However, it is too simple for some purposes, so XKOS (eXtended Knowledge Organization System) was designed to extend it in order to allow richer representations of statistical classifications. This paper describes the statistical needs to address and the results of the XKOS design work, including some examples on well-known statistical classifications. Concrete implementations of the vocabulary are also provided, with examples of real use cases.</p>
      </abstract>
      <kwd-group>
        <kwd>SKOS</kwd>
        <kwd>XKOS</kwd>
        <kwd>classification</kwd>
        <kwd>scheme</kwd>
        <kwd>statistics</kwd>
        <kwd>linked data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        This paper contains a brief description of the eXtended Knowledge Organization
System (XKOS) and a rationale for why it was developed. In particular, there is a focus
on describing statistical classifications with XKOS. XKOS is an extension of the
Simple Knowledge Organization System (SKOS) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] applicable to the needs of
statistical offices and social science data users. As we show in this paper, some limitations
in SKOS leave it inadequate to the task of describing statistical classifications.
XKOS is designed to fill these gaps.
      </p>
      <p>
        The original SKOS is used widely in LOD applications, as seen in the SKOS
Implementation Report. As a result, a group was formed at the Dagstuhl Workshops
held at Schloß Dagstuhl in Germany on Semantic Statistics for Social, Behavioural,
and Economic Sciences: Leveraging the DDI<sup>3</sup> Model for the Linked Data Web in
September 2011<sup>4</sup> and October 2012<sup>5</sup> to look at the suitability of using SKOS in the
statistical data community for LOD work [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>In this paper, we provide introductory remarks to set the stage for discussion,
provide a short primer on statistical classifications, describe limitations of SKOS to
statistical classifications, and lay out the extensions to SKOS that form the XKOS
specification. In particular, we show how the semantics of classification systems in our
own offices are represented more faithfully by extending SKOS with XKOS through
the use of examples. In a last section, we give examples of concrete implementations
of the vocabulary.</p>
    </sec>
    <sec id="sec-2">
      <title>Statistical Classifications</title>
      <p>A statistical classification scheme (SCS) is representable as a SKOS Concept Scheme
and used for statistical purposes. In the next section, we illustrate the need for
extensive semantics to account for the meanings conveyable in SCSs. Here, we say what
an SCS is used for by statistical organizations.</p>
      <p>An SCS is a hierarchical skos:ConceptScheme which includes concepts, associated
codes (numeric string labels), short textual names (also labels), definitions, and longer
descriptions that include rules for their use. Examples of SCSs used in the US
statistical agencies are
• Standard Occupational Classification<sup>6</sup> (SOC)
• North American Industry Classification System<sup>7</sup> (NAICS)</p>
      <p>By a hierarchy, we mean a system of concepts where each has zero or one parent and
zero or more children. The root, or top concept, has no parent; the leaves, or bottom
concepts, have no children; and the rest have one parent and one or more children.</p>
      <p>An SCS may be a flat list (i.e., one level) or a hierarchy; if it is a hierarchy,
all its concepts are additionally grouped into levels. In every
SCS, the root concept is implicit and is often referred to as the defining concept for
the SCS. All the concepts in one level are the same number of relationships away
from the root, and this number is known as the depth of the level. The concepts at
each level are mutually exclusive and exhaustive (ME&amp;E), meaning each unit can be
classified to one and only one concept per level. For example, the first level in
NAICS is known as Sectors, and these sectors are the broadest industry categories.</p>
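      <p>The notions of level and depth can be made concrete with a short sketch. The Python fragment below is only an illustration under stated assumptions: the parent map and its NAICS-like codes are hypothetical, the implicit root is written as None, and the function names are ours rather than part of SKOS or of any statistical standard.</p>
      <preformat>
```python
from collections import defaultdict

# Hypothetical excerpt of a hierarchy: code -> parent code (None = implicit root).
PARENT = {
    "11": None,
    "111": "11",
    "112": "11",
    "1111": "111",
    "21": None,
    "211": "21",
}


def depth(code, parent=PARENT):
    """Number of broader-than steps from the implicit root down to `code`."""
    steps = 1
    while parent[code] is not None:
        code = parent[code]
        steps += 1
    return steps


def levels(parent=PARENT):
    """Group concepts by depth; each group plays the role of one level."""
    grouped = defaultdict(list)
    for code in parent:
        grouped[depth(code, parent)].append(code)
    return {d: sorted(codes) for d, codes in sorted(grouped.items())}


LEVELS = levels()
```
      </preformat>
      <p>In this toy input, the concepts with no parent form the level of depth 1, their children the level of depth 2, and so on. A published SCS would normally state its levels explicitly rather than leave them to be derived.</p>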
      <p>From the perspective of SKOS, the relation between a concept and its parent is one
of the broader than and narrower than kinds, depending on which direction one is
looking. A parent is a broader concept, whereas a child is narrower. We have more to say
about this in the next section.</p>
      <p><sup>3</sup> http://www.ddialliance.org</p>
      <p><sup>4</sup> http://www.dagstuhl.de/en/program/calendar/evhp/?semnr=11372</p>
      <p><sup>5</sup> http://www.dagstuhl.de/en/program/calendar/evhp/?semnr=12422</p>
      <p><sup>6</sup> http://www.bls.gov/soc</p>
      <p><sup>7</sup> http://www.census.gov/eos/www/naics/</p>
      <p>Each level is defined by its own concept, as the collection of concepts at a
particular level have a common overall meaning. For instance, the first level of NAICS
below the root (i.e., industry or economic activity) is the sector, and Government is one
of the sectors.</p>
      <p>
        The most important use of a classification scheme is to classify and organize units
within some domain, for example business establishments by industry. NAICS is
used for this in the US, but due to limitations in coverage for some geographic areas,
establishment sizes, or NAICS concepts, not enough data might be available to report
meaningfully at all levels. So, the lowest level with meaningful data in most of the
concepts is used. However, what determines “meaningful” is a statistical
consideration, not germane to SKOS, and out of scope for this paper. The interested reader can
consult a text on statistical sampling [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>Practically speaking, the rows in tables<sup>8</sup> and the dimensions used to specify
measures in a time series (for instance the US Consumer Price Index for specific
expenditures and municipalities<sup>9</sup>) are major uses of SCSs in data dissemination. Tables and
time series report aggregated data that are classified by SCSs. Some of the SCSs may
be hierarchies, so one level is chosen to report data. If the data are sparse at that level,
then there is a danger that data about individuals (people or businesses) are
recoverable. In this case, the next higher level is used, and the data are aggregated further
into the broader categories. This is an important reason for the hierarchical design of
SCSs.</p>
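      <p>The roll-up to a broader level described above can be sketched as follows. This is a simplified illustration, not a disclosure-control method: the counts and the minimum cell size are invented, and the sketch assumes a hypothetical coding convention in which the code of a parent is a prefix of the codes of its children.</p>
      <preformat>
```python
from collections import Counter

# Hypothetical unit counts at the most detailed level (code -> number of units).
COUNTS = {"1111": 2, "1112": 40, "1121": 1, "2111": 25}


def roll_up(counts, digits):
    """Aggregate counts to the broader level given by the first `digits` characters."""
    totals = Counter()
    for code, n in counts.items():
        totals[code[:digits]] += n
    return dict(totals)


def most_detailed_safe_level(counts, min_units, max_digits=4):
    """Move up one level at a time until every category holds at least
    `min_units` units; return the code length used and the aggregated table."""
    for digits in range(max_digits, 0, -1):
        table = roll_up(counts, digits)
        if min(table.values()) >= min_units:
            return digits, table
    # No level is safe: report a single grand total.
    return 0, {"total": sum(counts.values())}


DIGITS, TABLE = most_detailed_safe_level(COUNTS, min_units=3)
```
      </preformat>
      <p>With these invented figures, the four-digit and three-digit levels both contain a category with fewer than three units, so the data are reported at the two-digit level.</p>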
      <p>Another use of SCSs is in data collection. The possible answers to questions on a
form or in a questionnaire are the categories in SCSs. When the range of answers
covers all that are possible, and since answer choices must be mutually exclusive, the
ME&amp;E criterion for SCS is satisfied. Another way SCSs are used in data collection is
through classifying textual responses. For instance, the US American Community
Survey asks respondents to briefly describe their jobs, and these descriptions are
classified to NAICS and SOC. The levels selected are based on the detail the questions
are designed to elicit. The US Survey of Occupational Injuries and Illnesses asks
respondents to describe incidents in their workplaces that resulted in loss of work
time. These incidents are classified into the nature of the injury or illness, the affected
body part, the source of the malady, and the event that caused it (see section 3.2).</p>
      <p>Statistical agencies that manage SCSs sometimes make versions to reflect changes
in the subject matter domain, and these versions are separate SCSs. However, they
belong to the same family, and that is known as a classification. For instance, NAICS
is updated every 5 years, and each version (a separate SCS) is known by its year.</p>
      <p>
        A model for describing and managing SCSs developed by the international
statistical community can be found in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p><sup>8</sup> http://www.bls.gov/news.release/empsit.a.htm</p>
      <p><sup>9</sup> http://www.bls.gov/cpi/cpid1305.pdf</p>
    </sec>
    <sec id="sec-3">
      <title>SKOS Limitation</title>
      <sec id="sec-3-1">
        <title>SKOS and What Is Missing</title>
        <p>SKOS provides a means for representing knowledge organization systems using RDF,
and this makes the use of SKOS applicable to SCSs. It is beyond the scope of this
paper to provide a detailed description of SKOS. We direct the interested reader to
the SKOS web site<sup>10</sup>. However, SKOS contains the following basic ideas, whose
definitions we paraphrase here:
• Concept Scheme – any knowledge organization system (including SCSs)
• Concept – any abstract idea or unit of thought
• Definition – formal statement conveying the meaning of a concept
• Label – lexical representation for a concept, may be preferred or alternate; provides
means to communicate the concept
• Notation – a symbolic notation for the concept (such as a code) that is typically
data-typed.
• Semantic Relation – broad category for relations between concepts, such as
broader than, narrower than, and related to (these relations can include relations to
concepts found in other concept schemes).</p>
        <p>The basic ideas listed above are the minimum required to describe an SCS. We can
account for the scheme itself (concept scheme), all its underlying concepts
(concept), what each concept means (definition), the labels and codes associated with a
category (label / notation), and relationships with its parent and all its children
(semantic relation).</p>
        <p>
          SKOS is based on ISO 25964-1 [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. This standard describes three basic kinds of
hierarchical relations between concepts: generic, partitive, and instantiation.
        </p>
        <p>Interestingly, SKOS provides only the more generic broader than and narrower
than, which are often referred to in more technical settings as super-ordinate and
subordinate, respectively. Both the generic and partitive relations are specializations of
broader than / narrower than. In the SKOS Primer<sup>11</sup>, this simplification is
acknowledged.</p>
        <p>
          SKOS also specifies an association relation between concepts, but this is not made
any more detailed. Possible refinements include sequential, temporal, and causal
relations. The sequential relation refers to ideas where one is the antecedent of the
other, either temporally or spatially, such as between production and consumption.
The specialized temporal relation is based on time, such as between spring and
summer. Finally, the causal relation relates cause and effect, such as the detonation of a
hydrogen bomb and nuclear fall-out. See [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] for further explanation of the
relations described here and above.
        </p>
        <p><sup>10</sup> http://www.w3.org/2004/02/skos/</p>
        <p><sup>11</sup> http://www.w3.org/TR/skos-primer/</p>
        <p>Because levels have a concept associated with them and they have depth (from the
root), there is no satisfactory way to account for them in SKOS. Thus, we need to add
the notion of level. It is a kind of skos:Collection.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Examples</title>
        <p>Below are some examples that illustrate the need for the extensions we have
identified above:
        </p>
        <sec id="sec-3-2-1">
          <title>1. The US Standard Occupational Classification System<sup>12</sup> (SOC – 2012)</title>
          <p>Take, for example</p>
          <p>27-2000 – Entertainers and Performers, Sports and Related Workers</p>
          <p>27-2040 – Musicians, Singers, and Related Workers</p>
          <p>27-2042 – Musicians and Singers</p>
          <p>The appropriate relation between 27-2000 and 27-2040 is generic, i.e., Musicians,
Singers, and Related Workers is a specialization of Entertainers and Performers,
Sports and Related Workers. The same relation is found between 27-2040 and
27-2042, i.e., Musicians and Singers is a specialization of Musicians, Singers, and Related
Workers. So, the generic relation is needed to specify the semantics of the US SOC.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>2. The US Occupational Injury and Illness Classification<sup>13</sup> (OIICS – 2012)</title>
          <p>Occupational injury and illness is a four-facet classification: nature, body part,
source, and event. In the body part facet, for example
3 – Trunk
31 – Chest
313 – Heart
315 – Lungs
32 – Back, including spine, spinal cord
321 – Thoracic
322 – Lumbar</p>
          <p>Going from broader to finer detail in this snippet of the body part classification
illustrates the partitive relation. The chest and back are parts of the trunk. The heart and
lungs are parts of the chest. Finally, the thoracic and lumbar regions are parts of the
back and spine. Note that it would not be proper to use the generic relation here.
Therefore, the partitive relation is needed to specify the semantics of the US OIICS.</p>
          <p><sup>12</sup> http://www.bls.gov/soc/</p>
          <p><sup>13</sup> http://www.bls.gov/iif/oshoiics.htm</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3. The American Time Use Survey (ATUS) Activity Lexicon<sup>14</sup></title>
          <p>The classification is a hierarchy, but some activity categories depend on what has
occurred before. For instance,
04 – Caring For &amp; Helping non-Household Members
0401 – Caring For &amp; Helping non-Household Children
040104 – Arts &amp; Crafts with non-Household Children
040112 – Dropping Off/Picking Up non-Household Children</p>
          <p>Dropping off non-household children is a sequential activity related, in this
example, to having previously supervised arts-and-crafts activities (or some other activity in the 04
group). So, there are associations between some pairs of activities within
this classification, though they are not intrinsic to the SCS. In this case, the
sequential or possibly the temporal relation is needed to convey the additional semantics that
some activities depend on the triggering of other prior activities.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>XKOS</title>
      <p>
        In the following, we list some of the extensions XKOS contains and refer the
interested reader to another paper that contains more detail [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
• xkos:belongsTo is used to attach a classification scheme to its classification
• xkos:follows or its sub-property xkos:supersedes is used to relate classification
schemes that are successive versions
• xkos:classifiedUnder is used to indicate that a unit is classified by some concept
• xkos:ClassificationLevel (subclass of skos:Collection) is the level; the levels of a
classification scheme are structured as an RDF List, starting with the most
aggregated, and the list is attached to the classification scheme by the xkos:levels property
• xkos:depth property expresses the distance of a given level from the root node of
the hierarchy
• xkos:organizedBy property can be used to record the generic name of the items of a
given level (e.g. “section”, “division”, etc.)
• explanatory notes use xkos:coreContentNote, xkos:additionalContentNote,
xkos:inclusionNote, and xkos:exclusionNote, which are sub-properties of
skos:scopeNote and correspond to a typology of explanatory notes widely used for
statistical classifications
• xkos:ConceptAssociation is a class that can be used to represent correspondences
between classification items across SCSs through input or source skos:Concept(s)
and output or target skos:Concept(s)
• xkos:Correspondence groups a set of xkos:ConceptAssociation(s) to represent a
concordance or correspondence table between two classification schemes (for
example two versions of a classification)
• xkos:specializes and xkos:generalizes represent each side of the generic relation
• xkos:isPartOf and xkos:hasPart represent each side of the partitive relation
• xkos:disjoint property is used to explicitly state that two given concepts do not
overlap
• xkos:causal is subdivided into the directional xkos:causes and xkos:causedBy, to
express causality
• xkos:sequential indicates that two concepts in a scheme are in a sequential
relationship
• xkos:succeeds and xkos:precedes are used when a sequence has a known order and
are further refined by xkos:previous and xkos:next, the immediate successor or
predecessor
• xkos:temporal is used when a sequence is of a temporal nature and is subdivided
into the directional xkos:before and xkos:after
      </p>
      <p><sup>14</sup> http://www.bls.gov/tus/lexicons.htm</p>
    </sec>
    <sec id="sec-5">
      <title>Implementing XKOS</title>
      <p>A first example of XKOS use can be found on INSEE<sup>15</sup>’s linked data site<sup>16</sup>.
The Institute has published several statistical code lists and classifications there,
notably the NAF, the French refinement of the European NACE.</p>
      <p>This publication makes use of several XKOS features, for example classification
levels, maximum-length labels and explanatory notes. This makes it possible to run useful
queries that would be more difficult, or even impossible, using only a SKOS
representation. As an illustration, the following SPARQL query gives the list of all the
NAF divisions, which is a very common use case for classifications.</p>
      <preformat>PREFIX skos: &lt;http://www.w3.org/2004/02/skos/core#&gt;
PREFIX xkos: &lt;http://purl.org/linked-data/xkos#&gt;

SELECT ?code ?label WHERE {
  ?item skos:notation ?code .
  ?item skos:prefLabel ?label .
  ?item skos:inScheme &lt;http://id.insee.fr/codes/nafr2/naf&gt; .
  ?level skos:member ?item .
  ?level xkos:organizedBy &lt;http://id.insee.fr/concepts/nafr2/division&gt; .
  FILTER langMatches(lang(?label), 'en')
} ORDER BY ?code</preformat>
      <p>A second example was created especially for this paper and is based on the
international classifications published by the United Nations Statistics Division<sup>17</sup>. We chose
the ISIC (International Standard Industrial Classification), which plays a central role
in the international system of economic classifications<sup>18</sup>. The objective was to
represent in XKOS the last two revisions of the ISIC and the historical
correspondences between them.</p>
      <p><sup>15</sup> INSEE, Institut National de la Statistique et des Études Économiques, is the French National
Statistical Institute.</p>
      <p><sup>16</sup> http://rdf.insee.fr/</p>
      <p>The main challenge for this operation was to be able to analyze the explanatory
notes and to split them into the specific categories defined by XKOS (core or
additional content, exclusions, etc.). This can be done by recognizing patterns in the note
text (“This group contains”, “This division excludes”, etc.), but, although the ISIC
notes are usually well structured, there are variations of these patterns and numerous
special cases, so that the process cannot be fully automated. For example<sup>19</sup>, the note
for division 23 combines in a single paragraph the descriptions of the core and
additional contents for this division.</p>
      <p>The ISIC is available in different formats on the UNSD web site, but for the
explanatory notes, the only possibilities are PDF, MS Access, and the online HTML
publication. Correspondences between ISIC revisions or with other classifications are also
available as HTML, or in downloadable plain text files. For the sake of simplicity and
coherence, we decided to use the HTML online publication as the reference source of
data.</p>
      <p>As a consequence, a simple Java application was developed in order to retrieve the
data from the web site, based on the Jericho HTML parser<sup>20</sup>. Though ad hoc, this
application was designed to be modular and should be easily adaptable to other use
cases.</p>
      <p>In a first step, which is repeated for ISIC revisions 3.1 and 4, the web pages
describing the classification items are requested recursively over HTTP, starting from
the page describing the top structure, and the relevant information about classification
hierarchy, codes, labels, and notes is extracted and copied into a simple XML file. This
first step is also the occasion to make an automated categorization of the notes based
on regular expressions.</p>
      <p>A paragraph starting with “This class also contains:” will be considered an XKOS
note of type “additional content”. Another starting with “This section includes” will
be a “core content note”, etc. Further pattern matching is then performed to verify that
two types of notes are not grouped in one paragraph (in the first example, we will
check whether the text contains “exclude”; in the second, we will also check whether it contains
“also”). If any of these additional matches is positive, a log entry will be recorded and
manual verification will ensue.</p>
      <p>A paragraph that does not match any predefined expression will take the type of
its predecessor if it has one, or be categorized as “general”. Here again, the
assumption will be logged for further human checking, because some exceptions may occur
(see for example Section B, where general notes can be found between the
descriptions of the included and excluded contents).</p>
      <p><sup>17</sup> http://unstats.un.org/unsd/cr/registry/regct.asp?Lg=1</p>
      <p><sup>18</sup> This system is described for example in the official NACE Rev.2 publication at
http://epp.eurostat.ec.europa.eu/cache/ITY_OFFPUB/KS-RA-07-015/EN/KS-RA-07-015EN.PDF, pp 13-14.</p>
      <p><sup>19</sup> All reference material for the example can be accessed through the ISIC main page at
http://unstats.un.org/unsd/cr/registry/regcst.asp?Cl=27&amp;Lg=1</p>
      <p><sup>20</sup> http://jericho.htmlparser.net</p>
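      <p>The categorization rules for explanatory notes (match on the opening words, check for mixed content, fall back to the type of the predecessor, and log every doubtful case) can be sketched in a few lines. The Python fragment below is not the Java program used for this work: the regular expressions are illustrative guesses, and the type codes A (additional content) and G (general) are our own additions to the C (core content) and X (exclusion) codes.</p>
      <preformat>
```python
import re

# Illustrative patterns only; the expressions used by the actual program are
# not reproduced here. Order matters: "also" is tested before plain "includes".
PATTERNS = [
    ("X", re.compile(r"^This \w+ excludes", re.IGNORECASE)),
    ("A", re.compile(r"^This \w+ also (contains|includes)", re.IGNORECASE)),
    ("C", re.compile(r"^This \w+ (contains|includes)", re.IGNORECASE)),
]
# Words hinting that a second note type is mixed into the same paragraph.
MIXED_HINTS = {"C": ("exclude", "also"), "A": ("exclude",), "X": ()}


def categorize(paragraphs):
    """Return (types, warnings): one type code per paragraph plus log messages."""
    types, warnings = [], []
    for index, text in enumerate(paragraphs):
        found = next((code for code, rx in PATTERNS if rx.search(text)), None)
        if found is None:
            # No pattern matched: inherit from the predecessor, else "general".
            found = types[-1] if types else "G"
            warnings.append(f"{index}: no pattern matched, assumed {found}")
        else:
            lowered = text.lower()
            for hint in MIXED_HINTS[found]:
                if hint in lowered:
                    warnings.append(f"{index}: {found} note also mentions '{hint}'")
        types.append(found)
    return types, warnings


NOTES = [
    "This class includes:",
    "- manufacture of hosiery, including socks, tights and pantyhose",
    "This class excludes:",
    "- manufacture of knitted and crocheted fabrics, see 1391",
]
TYPES, WARNINGS = categorize(NOTES)
```
      </preformat>
      <p>Every inherited type produces a warning, so the list of warnings doubles as the worklist for manual verification.</p>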
      <p>The code below is an extract of the XML file produced by the Java program,
showing a simple example of note analysis:</p>
      <preformat>&lt;Item code="1430" parent="143"&gt;
  &lt;Label lang="en"&gt;Manufacture of knitted and crocheted apparel&lt;/Label&gt;
  &lt;Notes lang="en"&gt;
    &lt;Sequence&gt;C X&lt;/Sequence&gt;
    &lt;Structure&gt;0:C 3:X&lt;/Structure&gt;
    &lt;Elements&gt;
      &lt;Element&gt;&lt;div&gt;This class includes:&lt;/div&gt;&lt;/Element&gt;
      &lt;Element&gt;&lt;div class='item'&gt;- manufacture of knitted or crocheted wearing apparel and other made-up articles directly into shape: pullovers, cardigans, jerseys, waistcoats and similar articles&lt;/div&gt;&lt;/Element&gt;
      &lt;Element&gt;&lt;div class='item'&gt;- manufacture of hosiery, including socks, tights and pantyhose&lt;/div&gt;&lt;/Element&gt;
      &lt;Element&gt;&lt;div&gt;This class excludes:&lt;/div&gt;&lt;/Element&gt;
      &lt;Element&gt;&lt;div class='item'&gt;- manufacture of knitted and crocheted fabrics, see 1391&lt;/div&gt;&lt;/Element&gt;
    &lt;/Elements&gt;
  &lt;/Notes&gt;
&lt;/Item&gt;</preformat>
      <p>The HTML code extracted from the web pages is placed in the
&lt;Elements&gt; node. In the ISIC case, and more generally for classifications published by
the UNSD, explanatory notes are organized in &lt;div&gt; tags with a class attribute
indicating the list level, but other organizations may use different conventions.</p>
      <p>The &lt;Sequence&gt; element gives an overall view of the note structure as
determined by the program: in this case a “core content” (C) note is followed by an
“exclusion note” (X). The &lt;Sequence&gt; element values can easily be queried with XPath to
detect anomalous structures. The &lt;Structure&gt; element completes the
&lt;Sequence&gt; information by giving the start index of each part in the &lt;Element&gt; list:
this will be used by the following processing step.</p>
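      <p>To show how the start indexes can drive the next step, the sketch below splits a list of elements into typed parts from a structure string such as “0:C 3:X”. It is a hypothetical Python helper written for this explanation, not code taken from the Java application, and the element texts are abbreviated.</p>
      <preformat>
```python
def split_by_structure(structure, elements):
    """Split `elements` into typed parts using 'start:type' tokens such as '0:C 3:X'."""
    starts = []
    for token in structure.split():
        index, note_type = token.split(":")
        starts.append((int(index), note_type))
    parts = []
    for position, (start, note_type) in enumerate(starts):
        # A part ends where the next one starts, or at the end of the list.
        if position + 1 == len(starts):
            end = len(elements)
        else:
            end = starts[position + 1][0]
        parts.append((note_type, elements[start:end]))
    return parts


# Abbreviated element texts, for illustration only.
ELEMENTS = [
    "This class includes:",
    "- manufacture of knitted or crocheted wearing apparel",
    "- manufacture of hosiery, including socks, tights and pantyhose",
    "This class excludes:",
    "- manufacture of knitted and crocheted fabrics, see 1391",
]
PARTS = split_by_structure("0:C 3:X", ELEMENTS)
```
      </preformat>
      <p>Correcting a misplaced boundary then only requires editing the structure string, which is why the two elements are kept separate from the note text itself.</p>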
      <p>The &lt;Structure&gt; and &lt;Sequence&gt; elements can be edited in the XML file to
correct possible errors made by the program. As explained above, this step can be
guided by the detection of unusual sequences and by the warnings logged by the
program. For this first experiment, the corrections were made manually with an
XML editor, but it is easy to see how a simple GUI could be developed to improve
this step.</p>
      <p>Once the XML files (one for ISIC Rev.4 and one for ISIC Rev.3.1) are correct, the
last step is to transform them into XKOS representations of the two classification
schemes. XSLT transformations are the tool of choice here, and it is relatively
straightforward to design the general structure of a transformation that produces an
RDF/XML serialization of the expected result.</p>
      <p>A much more difficult question, though, is how to transform the HTML code used
in the notes into an RDF literal: should we take only the plain text content, or should
we keep in whole or in part the note structure as it is represented in HTML?</p>
      <p>See for example the explanatory notes for ISIC Rev.4 class 1030<sup>21</sup>: the description
of the core content is organized in nested unordered lists, and this structure is
important to understanding the description. The exclusions are shown in italics (this
can probably be left out), and include pointers to other classes (“… see 1061”) which
are not rendered as HTML links but would be desirable to capture in formats oriented
towards automated processing, like RDF.</p>
      <p>
        For the time being, we chose simplicity and put only plain text in the explanatory
notes. INSEE’s publication of the NAF uses a more refined solution developed for the
EuroVoc thesaurus [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], where the notes are typed as rdf:XMLLiteral<sup>22</sup> and
contain XHTML+RDFa fragments, with links to other concepts represented through a
sub-property of dcterms:references<sup>23</sup> (see [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for a more complete
description).
      </p>
      <p>The structure of explanatory notes and the links that they contain are a very
important feature for statisticians, and this question of how to represent them in LOD
formats is clearly one of the areas where common good practice should be developed in
the statistical community.</p>
      <p>Once the XML files corresponding to the two ISIC versions are produced, they can
be uploaded into a Sesame RDF triple store, completed by the information on
correspondences, which is parsed directly from the UNSD web site with Jericho. The data is
then ready to be queried. Examples of interesting queries are:</p>
      <preformat>PREFIX skos: &lt;http://www.w3.org/2004/02/skos/core#&gt;
PREFIX xkos: &lt;http://purl.org/linked-data/xkos#&gt;

SELECT ?code ?label WHERE {
  ?class skos:inScheme &lt;http://unstats.un.org/codes/isic/4/cs&gt; .
  ?class skos:notation ?code .
  ?class xkos:coreContentNote ?note .
  FILTER regex(?note, "wholesale of office furniture")
}</preformat>
      <p>This query searches ISIC Rev.4 for “wholesale of office furniture” in a core
content note. It returns only the class that includes this activity (4659), whereas the
equivalent query would return two answers, mixing inclusions and exclusions (4669),
if the notes were only generic SKOS scope notes.</p>
      <p><sup>21</sup> http://unstats.un.org/unsd/cr/registry/regcs.asp?Cl=27&amp;Lg=1&amp;Co=1030</p>
      <p><sup>22</sup> http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral</p>
      <p><sup>23</sup> See an example at http://id.insee.fr/codes/nafr2/sousClasse/27.11Z/noteExclusions</p>
      <preformat>PREFIX skos: &lt;http://www.w3.org/2004/02/skos/core#&gt;
PREFIX xkos: &lt;http://purl.org/linked-data/xkos#&gt;

SELECT ?source ?target WHERE {
  ?source skos:inScheme &lt;http://unstats.un.org/codes/isic/3.1/cs&gt; .
  ?target skos:inScheme &lt;http://unstats.un.org/codes/isic/4/cs&gt; .
  ?association xkos:sourceConcept ?source .
  ?association xkos:targetConcept ?target .
  ?association skos:note ?note .
  FILTER regex(?note, "Repair of weapons")
}</preformat>
      <p>This query searches the correspondences between Rev.3.1 and Rev.4 for what
happened to the repair of weapons activity (the answer is: it moved from class 2927 to
class 3311). This is an example of a query on notes concerning classification
concordances, which is very useful for statisticians, but would not be easily doable without
the XKOS extensions.</p>
      <p>The figure below summarizes the overall logic of this example.</p>
      <p>It should be noted that once the data is in XKOS format, it is easy to produce
output formats like HTML or PDF.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>We explained in this paper the rationale for XKOS and described how a simple
process could be developed to transform existing information on statistical
classifications into XKOS in order to be able to use this information in improved ways. This
process can easily be adapted to other situations.</p>
      <p>The paper contains a description of statistical classification schemes (SCSs),
important ways they are used, limitations of SKOS in its ability to describe an SCS, and
the several ways we thought SKOS should be extended. From the exposition, it
should be clear that the extensions address these limitations. It is SKOS together with the
extensions that we call XKOS.</p>
      <p>Particular emphasis was placed on the ability to convey semantics, so we were
careful to add significantly more relations to XKOS. The examples in statistical
offices make clear the need for the additional semantics.</p>
      <p>The work to define XKOS is not complete, however. Identifying new relations is
a priority, as is building a typology of them. The biggest hurdle is to persuade
the statistical agencies to use XKOS and build a base of applications as the statistical
offices around the world move to adopt Linked Open Data principles.</p>
      <p>Finally, we note that although XKOS was developed with the purpose of
representing statistical classifications, some elements of the vocabulary can be used
outside of the context of classifications. These new applications need further
exploration and possible refinement of XKOS.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The authors wish to thank the organizers of the Dagstuhl workshops – Richard
Cyganiak, Arofan Gregory, Wendy Thomas, and Joachim Wackerow – for their support
and encouragement in developing the XKOS ideas. The authors also wish to thank
the participants not already mentioned in the XKOS development group: Thomas
Bosh, Rob Grim, and Jannik Jensen.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Isaac</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Summers</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>SKOS Simple Knowledge Organization System Primer</article-title>
          . W3C Working Group Note
          (
          <year>2009</year>
          ), http://www.w3.org/TR/skos-primer/
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <source>XKOS - DDI/RDF Vocabularies</source>
          , downloaded from the Web on 19 July
          <year>2013</year>
          at http://www.ddialliance.org/Specification/RDF#xkos
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Sharon</given-names>
            <surname>Lohr</surname>
          </string-name>
          ,
          <source>Sampling: Design and Analysis</source>
          , Brooks/Cole (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <source>Neuchâtel Terminology Model - Classification database object types and their attributes</source>
          , version 2.1, downloaded from the Web on 19 July
          <year>2013</year>
          at http://www1.unece.org/stat/platform/download/attachments/14319930/Part+I+Neuchatel_version+2_1.pdf?version=1&amp;modificationDate=1265695896952
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. ISO 25964-1 -
          <article-title>Thesauri and interoperability with other Vocabularies, Part 1: Thesauri for information retrieval</article-title>
          ,
          <source>ISO</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. ISO 1087-1:2000 -
          <article-title>Terminology work - Vocabulary, Part 1: Theory and application</article-title>
          ,
          <source>ISO</source>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cotton</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gillman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Jaques</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2013</year>
          )
          <article-title>XKOS - An RDF Vocabulary for Describing Statistical Classifications</article-title>
          ,
          <source>IASSIST Quarterly</source>
          (to appear)
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <source>EuroVoc, the EU's multilingual thesaurus</source>
          , downloaded from the Web on 19 July
          <year>2013</year>
          at http://eurovoc.europa.eu/drupal/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>De Smedt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vatant</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>The EUROVOC Thesaurus Ontology Schema</article-title>
          , downloaded from the Web on 19 July
          <year>2013</year>
          at http://lists.w3.org/Archives/Public/public-esw-thes/2010Feb/att-0023/Ontology.html
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>