<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Developing a Semantic Content Analyzer for L'Aquila Social Urban Network</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cataldo Musto</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Semeraro</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pasquale Lops</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco de Gemmis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fedelucio Narducci</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mauro Annunziato</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luciana Bordoni</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudia Meloni</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Franco F. Orsucci</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giulia Paoloni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bari "A. Moro"</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Humanities and Territory Sciences, University of Chieti-Pescara</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Psychology and Language Sciences, University College London</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>ENEA - Unita' Tecnica Tecnologie Avanzate per l'Energia e l'Industria - Roma</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Murex CS s.r.l</institution>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Milano - Bicocca</institution>
        </aff>
      </contrib-group>
      <fpage>34</fpage>
      <lpage>38</lpage>
      <abstract>
        <p>This paper presents the preliminary results of a joint research project on Smart Cities. The project adopts a multidisciplinary approach that combines artificial intelligence techniques with psychology research to monitor the current state of the city of L'Aquila after the dreadful earthquake of April 2009. This work focuses on the description of a semantic content analysis module. This component, integrated into L'Aquila Social Urban Network (SUN), combines Natural Language Processing (NLP) and Artificial Intelligence (AI) to deeply analyze the content produced by citizens on social platforms in order to map social data to social indicators such as cohesion and sense of belonging. The research builds on the insight that social data can supply a lot of information about people's latent feelings, opinions and sentiments. Within the project, this trustworthy snapshot of the city is used by community promoters to proactively propose initiatives aimed at empowering the social capital of the city and recovering the urban structure, which was disrupted after the 'diaspora' of citizens to the so-called "new towns".</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Smart Cities may be applied. Specifically, at ENEA an interdisciplinary team
(researchers, architects and engineers) is working on the design of a Social Urban
Network (SUN), a hybrid city model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] developed with the aim of monitoring
and revitalizing the traditional social capital and urban heritage with new plans
for an integrated future.
      </p>
      <p>
        The SUN architecture is shown in Figure 1. The input for the whole pipeline is
the content produced by citizens on social platforms such as
Facebook, Twitter and so on. Next, a Semantic Content Analyzer deeply processes
and analyzes all the content, in order to map users' contributions
to some well-defined social capital indicators [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] (see Figure 2). Finally, given
a snapshot of the current social capital obtained through the semantic analysis
module, a community promoter can identify activities or specific interventions
aimed at recovering and empowering the social capital of the city
through social events, a Web portal or a Smart Node, an interactive installation
placed in a key area of the city with the aim of creating a meeting place for
people who want to share (and get) information about their town.
      </p>
    </sec>
    <sec id="sec-2">
      <title>A semantic content analyzer for the SUN</title>
      <p>
The semantic analysis module is the core of the whole SUN: it takes as input
the information coming from social networks and tries to organize the plethora
of data produced by citizens to provide the community promoters with valuable
information about the current state of the town. The general architecture of the
module is shown in Figure 3.</p>
      <p>
        First, a Social Extractor exploits social APIs, such as those of Facebook
(http://developed.facebook.com) and Twitter (http://dev.twitter.com), to feed a
database of community contributions. This database is fed
according to specific heuristics (e.g. all the tweets containing specific hashtags
or coming from a specific geo-location, all the posts crawled from specific
Facebook pages, and so on). Next, the database of contributions is processed through
three enrichment steps (highlighted with a red dashed arrow): in the first one, a
Semantic Tagger tries to associate each piece of content with the topic it is about.
For this step we will implement a hybrid approach that combines techniques
such as LDA with approaches exploiting open knowledge sources (Wikipedia,
DBpedia, etc.) such as Tag.me [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] or DBpedia Spotlight [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The goal of these
techniques is to extract from the text the high-level concepts the content is
about. As an example, we can consider the tweet in Figure 4. In this case we
will process the content of the tweet and extract meaningful high-level
concepts such as earthquake (terremoto, in Italian), suicides (suicidi) and
researchers (ricercatori). Moreover, it is also possible to further connect the
concepts with Wikipedia categories: in this case, for example, 'earthquake' can
be connected with the Wikipedia category 'natural disaster'. In this way it is
possible to build abstract, high-level connections between different pieces of
content written by the community.
      </p>
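      <p>The collection heuristics and the concept tagging step described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the <monospace>Post</monospace> record, the hashtag and page names, the three-word concept dictionary and the category table are all hypothetical stand-ins, a plain dictionary lookup takes the place of LDA, Tag.me or DBpedia Spotlight, and the sample sentence is invented (it is not the tweet in Figure 4).</p>
      <preformat>
```python
import re
from dataclasses import dataclass

# Hypothetical heuristics: the hashtags and page names are illustrative only.
TRACKED_HASHTAGS = {"#laquila", "#terremoto"}
TRACKED_PAGES = {"example-community-page"}

# Toy concept dictionary (Italian surface form to English concept) and a toy
# concept-to-category table standing in for Wikipedia/DBpedia lookups.
CONCEPTS = {
    "terremoto": "earthquake",
    "suicidi": "suicides",
    "ricercatori": "researchers",
}
CATEGORIES = {"earthquake": "natural disaster"}


@dataclass
class Post:
    text: str
    source: str = "twitter"  # "twitter" or "facebook"
    page: str = ""           # Facebook page the post was crawled from, if any


def tokens(text):
    """Lowercase word and hashtag tokens of a post."""
    return re.findall(r"#?\w+", text.lower())


def keep(post):
    """Heuristic filter: a tracked hashtag in a tweet, or a post from a tracked page."""
    if post.source == "facebook":
        return post.page in TRACKED_PAGES
    return any(tok in TRACKED_HASHTAGS for tok in tokens(post.text))


def tag(post):
    """Return (concept, category) pairs; category is None when no link is known."""
    found = []
    for tok in tokens(post.text):
        concept = CONCEPTS.get(tok.lstrip("#"))
        if concept and concept not in [c for c, _ in found]:
            found.append((concept, CATEGORIES.get(concept)))
    return found


if __name__ == "__main__":
    # Invented sample sentence, used only to exercise the two functions.
    post = Post("Dopo il #terremoto i ricercatori studiano i suicidi")
    if keep(post):
        print(tag(post))
```
      </preformat>
      <p>On the invented sentence, the sketch keeps the post because of its hashtag and returns the three concepts, with only 'earthquake' linked to a category. The geo-location heuristic is left out because it would require coordinates that this toy record does not carry.</p>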
      <p>
        Next, these semantically-enriched pieces of text are processed through
sentiment analysis techniques. For this step we will combine machine learning
techniques with lexicon-based models that exploit annotated vocabularies (e.g.
SentiWordNet [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) that associate a polarity (positive, negative or neutral) with the
terms of a language. Thanks to these lexicons it is easy to assign a sentiment
to each extracted tweet, since it is typically calculated as the weighted sum of
the polarities of the terms the tweet contains. Finally, the Social Capital Mapper
builds a classification model able to map each tweet to the social indicators
(described in 2) the tweet refers to. Clearly, the social indicator is influenced
by the sentiment conveyed by the tweet: the more positive the
sentiment, the higher the social indicator score. Such a simple pipeline, based on the
combination of several state-of-the-art machine learning techniques, can provide
valuable, meaningful and trustworthy information about people's sentiments and
opinions.
      </p>
      <p>
        In this work we sketched the preliminary design of a semantic content
analysis module developed for the SUN of the city of L'Aquila. We outlined a
framework where the combined use of techniques for semantic representation
and sentiment analysis can help community promoters to react rapidly to
people's feelings and to design the best initiatives to improve the quality of life of
L'Aquila's citizens. However, the project is still ongoing, so there is a lot of room
for future work: in the next steps most of the effort will be focused on the
mapping between the content produced by citizens and the social indicators defined
by the psychologists, in order to provide the community promoters with the
best possible snapshot of the current situation of the city. Furthermore, we will
also work on the comparison of different (semantic) content representations, in
order to identify the one that best represents and conveys user sentiments and
opinions. Finally, we will integrate a semantic indexer module that implements
a word sense disambiguation algorithm [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and compare the performance of the
canonical keyword-based indexing to a more sophisticated semantic one which
addresses typical problems related to natural language processing, such as
synonymy, polysemy and multi-word expressions.
      </p>
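      <p>The lexicon-based scoring and the link between sentiment and indicator scores described earlier in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the four-word lexicon, its polarity values, the default weight of 1.0, the indicator keyword lists and the additive update rule are all hypothetical, and a keyword match stands in for the classification model of the Social Capital Mapper. SentiWordNet itself is not queried.</p>
      <preformat>
```python
from collections import defaultdict

# Hypothetical polarity lexicon with values in [-1, 1]; a real system would
# rely on an annotated vocabulary such as SentiWordNet.
LEXICON = {"good": 0.6, "together": 0.4, "alone": -0.5, "destroyed": -0.8}

# Hypothetical keywords tying a tweet to a social indicator; the paper plans
# a learned classification model for this mapping instead.
INDICATOR_KEYWORDS = {
    "cohesion": {"together", "community"},
    "sense of belonging": {"home", "town"},
}


def sentiment(words, weights=None):
    """Weighted sum of the polarities of the lexicon terms found in words."""
    weights = weights or {}
    total = 0.0
    for word in words:
        if word in LEXICON:
            # Terms without an explicit weight count once (weight 1.0).
            total += weights.get(word, 1.0) * LEXICON[word]
    return total


def update_indicators(scores, words):
    """Add the tweet sentiment to every indicator the tweet refers to."""
    score = sentiment(words)
    for indicator, keywords in INDICATOR_KEYWORDS.items():
        if keywords.intersection(words):
            scores[indicator] += score
    return scores


if __name__ == "__main__":
    scores = defaultdict(float)
    # Invented sample tweets, already tokenised by whitespace.
    update_indicators(scores, "we rebuild our town together".split())
    update_indicators(scores, "my home is destroyed".split())
    print(dict(scores))
```
      </preformat>
      <p>With this update rule a positive tweet raises the score of every indicator it touches and a negative one lowers it, which mirrors the statement that the more positive the sentiment, the higher the social indicator score. How the weights and the mapping would actually be learned is left open here, as it is in the text.</p>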
      <p>Acknowledgments. This work fulfills the research objectives of the project
PON 01 00850 ASK-Health (Advanced System for the interpretation and
sharing of knowledge in health care) funded by the Italian Ministry of University and
Research (MIUR).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>P.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Gemmis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.L.</given-names>
            <surname>Gentile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lops</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Semeraro</surname>
          </string-name>
          .
          <article-title>UNIBA: JIGSAW algorithm for Word Sense Disambiguation</article-title>
          .
          <source>In Proc. of the 4th ACL 2007 Int. Workshop on Semantic Evaluations (SemEval-2007)</source>
          , Prague, Czech Republic, pages
          <fpage>398</fpage>
          -
          <lpage>401</lpage>
          . Association for Computational Linguistics,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Esuli</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fabrizio</given-names>
            <surname>Sebastiani</surname>
          </string-name>
          .
          <article-title>SentiWordNet: A publicly available lexical resource for opinion mining</article-title>
          .
          <source>In Proceedings of LREC</source>
          , volume
          <volume>6</volume>
          , pages
          <fpage>417</fpage>
          -
          <lpage>422</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Ferragina</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ugo</given-names>
            <surname>Scaiella</surname>
          </string-name>
          .
          <article-title>TAGME: on-the-fly annotation of short text fragments (by Wikipedia entities)</article-title>
          .
          <source>In Proceedings of the 19th ACM international conference on Information and knowledge management</source>
          , pages
          <fpage>1625</fpage>
          -
          <lpage>1628</lpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
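          <string-name>
            <given-names>Pablo N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Max</given-names>
            <surname>Jakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrés</given-names>
            <surname>García-Silva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Christian</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>DBpedia spotlight: shedding light on the web of documents</article-title>
          .
          <source>In Proceedings of the 7th International Conference on Semantic Systems</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>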
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
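          <string-name>
            <given-names>Franco</given-names>
            <surname>Orsucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Giulia</given-names>
            <surname>Paoloni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mario</given-names>
            <surname>Fulcheri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mauro</given-names>
            <surname>Annunziato</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Claudia</given-names>
            <surname>Meloni</surname>
          </string-name>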
          .
          <article-title>Smart Communities: social capital and psycho-social factors in Smart Cities</article-title>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Norbert</given-names>
            <surname>Streitz</surname>
          </string-name>
          .
          <article-title>Smart hybrid cities: progettare ambienti urbani a prova di futuro</article-title>
          .
          <source>Fondazione Ugo Bordoni</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>