Developing a Semantic Content Analyzer for L'Aquila Social Urban Network

Cataldo Musto (1,3), Giovanni Semeraro (1), Pasquale Lops (1), Marco de Gemmis (1), Fedelucio Narducci (2,3), Mauro Annunziato (4), Luciana Bordoni (4), Claudia Meloni (4), Franco F. Orsucci (5), and Giulia Paoloni (6)

(1) Department of Computer Science, University of Bari "A. Moro"
(2) University of Milano - Bicocca
(3) Murex CS s.r.l.
(4) ENEA - Unità Tecnica Tecnologie Avanzate per l'Energia e l'Industria - Roma
(5) Department of Psychology and Language Sciences, University College London
(6) Department of Humanities and Territory Sciences, University of Chieti-Pescara

Abstract. This paper* presents the preliminary results of a joint research project about Smart Cities. The project adopts a multidisciplinary approach that combines artificial intelligence techniques with psychology research to monitor the current state of the city of L'Aquila after the dreadful earthquake of April 2009. This work focuses on the description of a semantic content analysis module. This component, integrated into L'Aquila Social Urban Network (SUN), combines Natural Language Processing (NLP) and Artificial Intelligence (AI) to deeply analyze the content produced by citizens on social platforms, in order to map social data to social indicators such as cohesion, sense of belonging, and so on. The research builds on the insight that social data can supply a great deal of information about people's latent feelings, opinions and sentiments. Within the project, this trustworthy snapshot of the city is used by community promoters to proactively propose initiatives aimed at empowering the social capital of the city and recovering the urban structure disrupted by the 'diaspora' of citizens into the so-called 'new towns'.

* This paper summarizes the results already presented in the AI*IA 2013 workshop on Smart Cities - URL: http://ai.unibo.it/SmartCityAIIA2013

1 L'Aquila Social Urban Network: a hybrid city model

The city of L'Aquila is located in the mountain region of Abruzzi, in central Italy. In 2009 the life of the town was severely disrupted by an earthquake that damaged between 3,000 and 11,000 buildings in the medieval city of L'Aquila; 297 people died in the earthquake. The severe trauma to physical and psycho-social structures is still in the recovery phase, since the urban centre has not been fully restored and almost half of all citizens are still displaced from their pre-earthquake homes. The specificity of the situation represents a perfect scenario where the principles and insights behind the concept of Smart Cities may be applied. Specifically, at ENEA an interdisciplinary team (researchers, architects and engineers) is working on the design of a Social Urban Network (SUN), a hybrid city model [6] developed with the aim of monitoring and revitalizing the traditional social capital and urban heritage with new plans for an integrated future.

Fig. 1. Social Urban Network (SUN) Architecture

The SUN architecture is shown in Figure 1. The input for the whole pipeline is the content produced by citizens on social platforms such as Facebook, Twitter, and so on. Next, a Semantic Content Analyzer deeply processes and analyzes all the content, in order to map users' contributions to a set of well-defined social capital indicators [5] (see Figure 2).
Finally, given the snapshot of the current social capital obtained through the semantic analysis module, a community promoter can identify activities or specific interventions aimed at recovering and empowering the social capital of the city through social events, a Web portal, or a Smart Node, an interactive installation placed in a key area of the city with the aim of creating a meeting place for people who want to share (and to get) information about their town.

2 A semantic content analyzer for the SUN

The semantic analysis module is the core of the whole SUN: it takes as input the information coming from social networks and tries to organize the plethora of data produced by citizens in order to provide the community promoters with valuable information about the current state of the town. The general architecture of the module is shown in Figure 3.

Fig. 2. Social Capital Indicators

Fig. 3. The architecture of the Semantic Content Analysis Module

First, a Social Extractor exploits Social APIs, such as Facebook's (http://developers.facebook.com) and Twitter's (http://dev.twitter.com), to feed a database of community contributions. This database is fed according to specific heuristics (e.g. all the tweets containing specific hashtags or coming from a specific geo-location, all the posts crawled from specific Facebook pages, and so on). Next, the database of contributions is processed through three enrichment steps (highlighted with a red dashed arrow): in the first one, a Semantic Tagger tries to associate each piece of content with the topic it is about. For this step we will implement a hybrid approach that combines techniques such as LDA with approaches exploiting open knowledge sources (Wikipedia, DBpedia, etc.) such as Tag.me [3] or DBpedia Spotlight [4]. The goal of these techniques is to extract from the text the high-level concepts the content is about. As an example, we can consider the tweet in Figure 4. In this case we will process the content of the tweet and extract high-level meaningful concepts such as earthquake (terremoto, in Italian), suicides (suicidi) and researchers (ricercatori). Moreover, it is also possible to further connect the concepts with Wikipedia categories: in this case, for example, 'earthquake' can be connected with the Wikipedia category 'natural disaster'. In this way it is possible to build abstract and high-level connections between different pieces of content written by the community.

Fig. 4. A tweet about L'Aquila

Next, these semantically enriched pieces of text are processed through sentiment analysis techniques. For this step we will combine machine learning techniques with lexicon-based models that exploit annotated vocabularies (e.g. SentiWordNet [2]) which associate a polarity (positive, negative or neutral) with all the terms of a language. Thanks to these lexicons it is easy to assign a sentiment to the extracted tweets, since it is typically calculated as the weighted sum of the polarities of the terms they contain. Finally, the Social Capital Mapper builds a classification model able to map each tweet to the social indicators (see Figure 2) the tweet refers to. Clearly, the social indicator score is influenced by the sentiment conveyed by the tweet: the more positive the sentiment, the higher the social indicator score.
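To make the last two enrichment steps more concrete, the following Python sketch shows how the sentiment of a tweet could be computed as a weighted sum of the polarities of its terms, and how that sentiment could then raise or lower the score of the social capital indicators the tweet refers to. The mini-lexicon, the indicator keywords and the normalization used here are purely illustrative assumptions; the actual module relies on full lexical resources such as SentiWordNet and on a trained classification model.

from __future__ import annotations

# Hypothetical polarity lexicon: term -> polarity in [-1, 1]
# (in the real module this would come from a SentiWordNet-style resource)
LEXICON = {
    "ricostruzione": 0.6,   # "reconstruction"
    "insieme": 0.5,         # "together"
    "speranza": 0.7,        # "hope"
    "terremoto": -0.7,      # "earthquake"
    "paura": -0.8,          # "fear"
}

# Hypothetical keywords standing in for the trained Social Capital Mapper:
# each social indicator is triggered by a small set of terms
INDICATOR_KEYWORDS = {
    "sense_of_belonging": {"insieme", "comunita", "casa"},
    "cohesion": {"ricostruzione", "aiuto", "volontari"},
}

def tokenize(text: str) -> list[str]:
    """Very rough tokenizer: lowercase and strip punctuation/hashtag markers."""
    return [t.strip(".,;!?#@").lower() for t in text.split()]

def sentiment_score(text: str) -> float:
    """Sentiment as the weighted sum of the polarities of the terms found in
    the lexicon, normalized by the number of matched terms."""
    polarities = [LEXICON[t] for t in tokenize(text) if t in LEXICON]
    return sum(polarities) / len(polarities) if polarities else 0.0

def map_to_indicators(text: str) -> dict[str, float]:
    """Assign the tweet to the indicators whose keywords it mentions; the
    indicator score grows with the positivity of the sentiment."""
    tokens = set(tokenize(text))
    sentiment = sentiment_score(text)
    scores = {}
    for indicator, keywords in INDICATOR_KEYWORDS.items():
        if tokens & keywords:
            # rescale the sentiment from [-1, 1] to a score in [0, 1]
            scores[indicator] = (sentiment + 1.0) / 2.0
    return scores

if __name__ == "__main__":
    tweet = "Ricostruzione insieme dopo il terremoto #laquila"
    print(sentiment_score(tweet))    # mildly positive (about 0.13)
    print(map_to_indicators(tweet))  # both illustrative indicators are touched

In this sketch, simple keyword matching stands in for the classification model, and the sentiment-to-score rescaling is just one possible way of encoding the idea that more positive content should strengthen the corresponding indicator.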
Such a simple pipeline, based on the combination of several state-of-the-art machine learning techniques, can provide valuable, meaningful and trustworthy information about people's sentiments and opinions.

3 Conclusions and Future Work

In this work we sketched the preliminary design of a semantic content analysis module developed for the SUN of the city of L'Aquila. We outlined a framework where the combined use of techniques for semantic representation and sentiment analysis can help community promoters to rapidly react to people's feelings and to design the best initiatives to improve the quality of life of L'Aquila's citizens. However, the project is still ongoing, so there is a lot of space for future work: in the next steps most of the effort will be focused on the mapping between the content produced by citizens and the social indicators defined by the psychologists, in order to provide the community promoters with the best possible snapshot of the current situation of the city. Furthermore, we will also work on the comparison of different (semantic) content representations, in order to identify the one best able to represent and convey user sentiments and opinions. Finally, we will integrate a semantic indexer module that implements a word sense disambiguation algorithm [1] and compare the performance of canonical keyword-based indexing with a more sophisticated semantic one which addresses typical problems related to natural language processing, such as synonymy, polysemy and multi-word expressions.

Acknowledgments. This work fulfils the research objectives of the project PON 01 00850 ASK-Health (Advanced System for the interpretation and sharing of knowledge in health care) funded by the Italian Ministry of University and Research (MIUR).

References

1. P. Basile, M. de Gemmis, A. L. Gentile, P. Lops, and G. Semeraro. UNIBA: JIGSAW algorithm for Word Sense Disambiguation. In Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, pages 398–401. Association for Computational Linguistics, 2007.
2. Andrea Esuli and Fabrizio Sebastiani. SentiWordNet: a publicly available lexical resource for opinion mining. In Proceedings of LREC, volume 6, pages 417–422, 2006.
3. Paolo Ferragina and Ugo Scaiella. TAGME: on-the-fly annotation of short text fragments (by Wikipedia entities). In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pages 1625–1628. ACM, 2010.
4. Pablo N. Mendes, Max Jakob, Andrés García-Silva, and Christian Bizer. DBpedia Spotlight: shedding light on the web of documents. In Proceedings of the 7th International Conference on Semantic Systems, pages 1–8. ACM, 2011.
5. Franco Orsucci, Giulia Paoloni, Mario Fulcheri, Mauro Annunziato, and Claudia Meloni. Smart Communities: social capital and psycho-social factors in Smart Cities. 2012.
6. Norbert Streitz. Smart hybrid cities: progettare ambienti urbani a prova di futuro. Fondazione Ugo Bordoni, 2010.