<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>EUROSENTIMENT: Linked Data Sentiment Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>J. Fernando Sánchez-Rada</string-name>
          <email>jfernando@gsi.dit.upm.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriela Vulcu</string-name>
          <email>gabriela.vulcu@insight-centre.org</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carlos A. Iglesias</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Buitelaar</string-name>
          <email>paul.buitelaar@insight-centre.org</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. Ing. Sist. Telemáticos, Universidad Politécnica de Madrid</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Insight, Centre for Data Analytics at National University of Ireland</institution>
          ,
          <addr-line>Galway</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sentiment and Emotion Analysis depend strongly on quality language resources, especially sentiment dictionaries. These resources are usually scattered, heterogeneous and limited to specific application domains by simple algorithms. The EUROSENTIMENT project addresses these issues by 1) developing a common language resource representation model for sentiment analysis, together with APIs for sentiment analysis services based on established Linked Data formats (lemon, Marl, NIF and Onyx), and 2) creating a Language Resource Pool (LRP) that makes existing scattered language resources and services for sentiment analysis available to the community in an interoperable way. In this paper we describe the language resources and services available in the LRP and some sample applications that can be developed on top of it.</p>
      </abstract>
      <kwd-group>
        <kwd>Language Resources</kwd>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Emotion Analysis</kwd>
        <kwd>Linked Data</kwd>
        <kwd>Ontologies</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>This paper reports on our ongoing work in the European R&amp;D project
EUROSENTIMENT, in which we have created a multilingual Language Resource Pool (LRP)
for Sentiment Analysis based on a Linked Data approach to modelling linguistic
resources.</p>
      <p>Sentiment Analysis requires language resources such as dictionaries that
assign a sentiment or emotion value to each word. Just as words have different
meanings in different domains, the associated sentiment or emotion also varies.
Hence, every domain has its own dictionary. The information about what each
domain represents, or how the entries for each domain are related, is usually
undocumented or merely implied by the name of each dictionary. Moreover,
dictionaries from different providers commonly use different representation formats.
As a result, it is very difficult to use several dictionaries at the same time.</p>
      <p>
        In order to overcome these limitations, we have defined a Linked Data Model
for Sentiment and Emotion Analysis, which is based on the combination of
several vocabularies: the NLP Interchange Format (NIF) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], to represent
information about texts, referencing text in the web with unique URIs; the Lexicon
Model for Ontologies (lemon) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], to provide lexical information, and
differentiate between different domains and senses of a word; Marl [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], to link lexical
entries or senses with a sentiment; and Onyx [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], to add emotion information.
      </p>
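      <p>As a rough illustration of how the four vocabularies can be combined, the sketch below builds a JSON-LD style lexical entry and a NIF string annotation as plain Python dictionaries. The namespace URIs are those of the published vocabularies; the base URI, the helper names (lexical_entry, nif_string), the exact nesting of properties and the sample polarity and emotion values are assumptions made for this example and are not taken from the EUROSENTIMENT specification.</p>
      <preformat><![CDATA[
```python
import json

# Namespace prefixes of the four vocabularies named in the text.
CONTEXT = {
    "nif": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#",
    "lemon": "http://lemon-model.net/lemon#",
    "marl": "http://www.gsi.dit.upm.es/ontologies/marl/ns#",
    "onyx": "http://www.gsi.dit.upm.es/ontologies/onyx/ns#",
}


def lexical_entry(base, word, domain, polarity, value, emotion=None):
    """Build one lexical entry (illustrative layout, not the official schema)."""
    entry_id = f"{base}/{domain}/{word}"
    sense = {
        "@id": entry_id + "#sense",
        # lemon: the sense is tied to a domain through its context.
        "lemon:context": {"@id": f"{base}/domain/{domain}"},
        # Marl: the sense is linked to a sentiment.
        "marl:hasPolarity": {"@id": "marl:" + polarity},
        "marl:polarityValue": value,
    }
    if emotion is not None:
        # Onyx: optional emotion information (attachment point is assumed).
        sense["onyx:hasEmotion"] = {
            "@type": "onyx:Emotion",
            "onyx:hasEmotionCategory": emotion,
        }
    return {
        "@context": CONTEXT,
        "@id": entry_id,
        "@type": "lemon:LexicalEntry",
        "lemon:canonicalForm": {"lemon:writtenRep": word},
        "lemon:sense": sense,
    }


def nif_string(doc_uri, text, begin, end):
    """Reference a substring of a text with a unique, offset-based URI."""
    return {
        "@id": f"{doc_uri}#char={begin},{end}",
        "@type": "nif:String",
        "nif:beginIndex": begin,
        "nif:endIndex": end,
        "nif:anchorOf": text[begin:end],
        "nif:referenceContext": {"@id": f"{doc_uri}#char=0,{len(text)}"},
    }


if __name__ == "__main__":
    base = "http://example.org/lexicon"  # hypothetical base URI
    # The same word carries a different (made-up) sentiment in each domain.
    hotel = lexical_entry(base, "long", "hotel", "Negative", -0.5, "annoyance")
    electronics = lexical_entry(base, "long", "electronics", "Positive", 0.6)
    print(json.dumps(hotel, indent=2))
    print(json.dumps(electronics, indent=2))
    print(json.dumps(nif_string("http://example.org/review/1",
                                "The wait was long", 13, 17), indent=2))
```
]]></preformat>
      <p>Because each entry carries its domain in the sense context, the two entries for the same written form can coexist in one dataset without ambiguity.</p>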
      <p>The use of a semantic format not only eliminates the interoperability issue,
but also makes information from other Linked Data sources available to the
sentiment analysis process. The EUROSENTIMENT LRP generates language
resources from legacy corpora, links them with other Linked Data sources, and
shares this enriched version with other users.</p>
      <p>In addition to language resources, the pool also offers access to sentiment
analysis services with a unified interface and data format. This interface builds on
the NIF Public API, adding several extra parameters that are used in Sentiment
Analysis. Results are formatted using JSON-LD and the same vocabularies as
for language resources. The NIF-compatible API allows for the aggregation of
results from different sources.</p>
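      <p>The following sketch shows, under stated assumptions, what a call to such a service and the aggregation of results might look like. The parameter names input, informat, intype and outformat follow the NIF 2.0 public API; the endpoint URL, the extra parameters language and domain, and the "entries" key of the response are hypothetical and only stand in for whatever the actual services define.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urlencode

# Hypothetical endpoint; real service URLs are listed in the portal.
ENDPOINT = "http://example.org/sentiment"


def build_request(text, language=None, domain=None):
    """Build a GET URL with NIF parameters plus assumed sentiment extras."""
    params = {
        "input": text,           # NIF: the content to analyse
        "informat": "text",      # NIF: format of the input
        "intype": "direct",      # NIF: input is passed inline
        "outformat": "json-ld",  # NIF: requested serialisation
    }
    # Assumed sentiment-specific parameters (names are illustrative).
    if language is not None:
        params["language"] = language
    if domain is not None:
        params["domain"] = domain
    return ENDPOINT + "?" + urlencode(params)


def aggregate(*responses):
    """Merge results from several services by the URI of the annotated text.

    Assumes each response is a dict with an "entries" list whose items
    carry an "@id"; annotations with the same "@id" are combined.
    """
    merged = {}
    for response in responses:
        for entry in response.get("entries", []):
            target = merged.setdefault(entry["@id"], {"@id": entry["@id"]})
            for key, value in entry.items():
                if key == "@id":
                    continue
                values = value if isinstance(value, list) else [value]
                target.setdefault(key, []).extend(values)
    return list(merged.values())


if __name__ == "__main__":
    print(build_request("The room was great", language="en", domain="hotel"))
    uri = "http://example.org/doc#char=0,18"
    sentiment = {"entries": [
        {"@id": uri, "opinions": [{"marl:hasPolarity": "marl:Positive"}]}]}
    emotion = {"entries": [
        {"@id": uri, "emotions": [{"onyx:hasEmotionCategory": "joy"}]}]}
    print(aggregate(sentiment, emotion))
```
]]></preformat>
      <p>Merging by URI works here only because both hypothetical services identify the analysed text with the same NIF-style URI, which is the property the unified interface is meant to provide.</p>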
      <p>The project documentation<sup>3</sup> contains further information about the
EUROSENTIMENT format, APIs and tools.</p>
    </sec>
    <sec id="sec-2">
      <title>Language Resources</title>
      <p>The EUROSENTIMENT LRP contains a set of language resources (lexicons and
corpora). The available EUROSENTIMENT language resources are listed on the
project portal,<sup>4</sup> which shows the domain and the language of each resource.
At the moment the LRP contains resources for the electronics and hotel domains in
six languages (Catalan, English, Spanish, French, Italian and Portuguese), and
we are currently working on adding more language resources from other domains
such as telco, movies, food and music. Table 1 shows the number of reviews in each
available corpus and the number of lexical entries in each available lexicon.</p>
      <p>
        A detailed description of the methodology for creating the domain-specific
sentiment lexicons and corpora to be added in the EUROSENTIMENT LRP
was presented at LREC 2014 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>The EUROSENTIMENT demonstrator<sup>5</sup> shows how users can benefit from
the LRP, including an interactive SPARQL query editor to access the resources
and a faceted browser.</p>
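      <p>To give an idea of the kind of query such an editor could run, the sketch below assembles a SPARQL query over lemon and Marl properties and encodes it as a GET request following the SPARQL 1.1 Protocol (a "query" parameter). The endpoint URL, the function names and the assumed graph shape (a polarity value attached to each lexical sense, language-tagged written forms) are illustrative guesses and may not match the actual LRP data.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urlencode

# Hypothetical SPARQL endpoint, used only for illustration.
SPARQL_ENDPOINT = "http://example.org/sparql"


def polarity_query(language, limit=10):
    """Return a SPARQL query listing written forms with their polarity value."""
    if not language.isalpha():
        # Reject anything that is not a plain language code before
        # interpolating it into the query text.
        raise ValueError("language must be a plain code such as 'en'")
    return f"""
PREFIX lemon: <http://lemon-model.net/lemon#>
PREFIX marl: <http://www.gsi.dit.upm.es/ontologies/marl/ns#>
SELECT ?form ?polarity WHERE {{
  ?entry a lemon:LexicalEntry ;
         lemon:canonicalForm ?cf ;
         lemon:sense ?sense .
  ?cf lemon:writtenRep ?form .
  ?sense marl:polarityValue ?polarity .
  FILTER (lang(?form) = "{language}")
}}
ORDER BY DESC(?polarity)
LIMIT {int(limit)}
""".strip()


def query_url(query):
    """Encode the query as a SPARQL 1.1 Protocol GET request URL."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query})


if __name__ == "__main__":
    q = polarity_query("es", limit=5)
    print(q)
    print(query_url(q))
```
]]></preformat>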
    </sec>
    <sec id="sec-3">
      <title>Sentiment Services</title>
      <p>In addition to a model for language resources, EUROSENTIMENT also provides
an API for sentiment and emotion analysis services. Several existing
services in different languages have been adapted to expose this API. Any user can
benefit from these services, which are conveniently listed in the
EUROSENTIMENT portal. At the moment, the following services are provided in several
languages: language detection, domain detection, sentiment and emotion
detection, and text analysis.</p>
      <p>To demonstrate the capabilities of the EUROSENTIMENT LRP, we have
open-sourced the code of several applications that make use of its services and
resources. The applications are written in
different programming languages and are thoroughly documented. Using these
applications as templates, it is straightforward to start consuming
the services and resources. The code can be found in the EUROSENTIMENT
GitHub repositories.<sup>6</sup></p>
      <sec id="sec-3-1">
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <caption>
            <p>Summary of the resources in the LRP</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th colspan="3">Lexicons</th>
              </tr>
              <tr>
                <th>Language</th>
                <th>Domains</th>
                <th>#Entities</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>German</td>
                <td>General</td>
                <td>107417</td>
              </tr>
              <tr>
                <td>English</td>
                <td>Hotel, Electronics</td>
                <td>8660</td>
              </tr>
              <tr>
                <td>Spanish</td>
                <td>Hotel, Electronics</td>
                <td>1041</td>
              </tr>
              <tr>
                <td>Catalan</td>
                <td>Hotel, Electronics</td>
                <td>1358</td>
              </tr>
              <tr>
                <td>Portuguese</td>
                <td>Hotel, Electronics</td>
                <td>1387</td>
              </tr>
              <tr>
                <td>French</td>
                <td>Hotel, Electronics</td>
                <td>651</td>
              </tr>
            </tbody>
          </table>
          <table>
            <thead>
              <tr>
                <th colspan="2">Corpora</th>
              </tr>
              <tr>
                <th>Language</th>
                <th>Domains</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>English</td>
                <td>Hotel, Electronics</td>
              </tr>
              <tr>
                <td>Spanish</td>
                <td>Hotel, Electronics</td>
              </tr>
              <tr>
                <td>Catalan</td>
                <td>Hotel, Electronics</td>
              </tr>
              <tr>
                <td>Portuguese</td>
                <td>Hotel, Electronics</td>
              </tr>
              <tr>
                <td>French</td>
                <td>Electronics</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-3-2">
        <fn-group>
          <fn id="fn3">
            <label>3</label>
            <p>http://eurosentiment.readthedocs.org</p>
          </fn>
          <fn id="fn4">
            <label>4</label>
            <p>http://portal.eurosentiment.eu/home_resources</p>
          </fn>
          <fn id="fn5">
            <label>5</label>
            <p>http://eurosentiment.eu/demo</p>
          </fn>
          <fn id="fn6">
            <label>6</label>
            <p>http://github.com/eurosentiment</p>
          </fn>
        </fn-group>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>This work has been funded by the European project EUROSENTIMENT under
grant no. 296277.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nitzschke</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>NIF Combinator: Combining NLP tool output</article-title>
          .
          <source>In: Knowledge Engineering and Knowledge Management</source>
          , pp.
          <fpage>446</fpage>
          -
          <lpage>449</lpage>
          . Springer (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spohr</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Linking lexical resources and ontologies on the semantic web with lemon</article-title>
          . In:
          <string-name>
            <surname>Antoniou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grobelnik</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simperl</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plexousakis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Leenheer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (eds.)
          <source>The Semantic Web: Research and Applications, Lecture Notes in Computer Science</source>
          , vol.
          <volume>6643</volume>
          , pp.
          <fpage>245</fpage>
          -
          <lpage>259</lpage>
          . Springer Berlin Heidelberg (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Sánchez-Rada</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iglesias</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          :
          <article-title>Onyx: Describing emotions on the web of data</article-title>
          .
          <source>In: ESSEM@AI*IA</source>
          . pp.
          <fpage>71</fpage>
          -
          <lpage>82</lpage>
          .
          <publisher-name>Citeseer</publisher-name>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Vulcu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Negi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pereira</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arcan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coughlan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sánchez-Rada</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iglesias</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          :
          <article-title>Generating Linked-Data based Domain-Specific Sentiment Lexicons from Legacy Language and Semantic Resources</article-title>
          . In:
          <source>International Workshop on EMOTION, SOCIAL SIGNALS, SENTIMENT &amp; LINKED OPEN DATA, co-located with LREC 2014</source>
          . Reykjavik, Iceland (May
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Westerski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iglesias</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tapia</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Linked Opinions: Describing Sentiments on the Structured Web of Data</article-title>
          .
          <source>In: Proceedings of the 4th International Workshop Social Data on the Web</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>