Workshop "From Objects to Agents" (WOA 2019)


                       Intelligent Collection and Analysis
                               of Citizens’ Reports
                     Giulio Angiani, Paolo Fornacciari, Gianfranco Lombardo, Monica Mordonini,
                                          Agostino Poggi, Michele Tomaiuolo
                                           Dipartimento di Ingegneria e Architettura
                                                 Università degli Studi di Parma
                                                          Parma, Italy
   {giulio.angiani, paolo.fornacciari, gianfranco.lombardo, monica.mordonini, agostino.poggi, michele.tomaiuolo}@unipr.it


   Abstract—The widespread diffusion of technology among citizens is creating the ideal
conditions for realizing the “Smart Community” concept. In this kind of socio-technical
context, it is possible to create distributed applications for the administration of cities
and neighborhoods, using data provided directly by citizens. In particular, it is possible to
connect users and integrate their actions into the whole system, thanks to widely available
instant messaging apps, used by the majority of mobile users. The combined use of public
APIs, Web systems and automated bots allowed us to build a comprehensive framework for
managing the reports sent to the local government by citizens through their already installed
and well-known instant messaging apps, such as Whatsapp, Telegram and Messenger.
   In this paper we show the techniques used for retrieving and classifying the texts and
images of the reports, for their management by the most appropriate branch of local
administration. Our results show that an automatic classification system of this kind can
reach an accuracy of over 90%.
   Index Terms—Text analysis, Image classification, Government 2.0, Chatbot.

                       I. INTRODUCTION
   The widespread distribution of mobile devices and fast network coverage in recent years
has created the conditions to allow all citizens to access and send information at any time
of the day. The concept of “smart community” arises precisely from the possibility, for
users, to interact quickly and in real time on different aspects of real life.
   One of these is sending reports to local public administrations. In almost all cases, this
service is offered by private companies that use proprietary apps to send and classify
reports.
   Typically these apps allow users to send geo-localized text and images after specifying
the category to which they belong. The user who wants to use them must therefore download
the app from the appropriate store, register, and learn how to use it correctly. In many
cases this user overhead discourages the use of the reporting system by citizens who are
used to other messaging channels.
   Our project has instead reversed the approach, starting from the use, by citizens, of free
and widely used apps (Whatsapp, Telegram and Messenger) to send the same information to
public administrations. The user is therefore free to send text and images with the app
they prefer, by interacting with various bots dedicated to the different messaging systems.
   Furthermore, the user is not responsible for categorizing the report, because this
information is derived by an AI system through text analysis, image recognition and object
detection. Depending on the inserted text and the content of the attached image, this
automatic system associates the message with one of four predefined classes (environment,
lighting, maintenance and security), corresponding to branches of the local administration.
   The rest of the manuscript is structured as follows: the Literature Review section presents
some of the most interesting works conducted in this field; Section II describes the data
collection and methodology used for this study; results and discussions are presented in
Section III; finally, Section IV provides some concluding remarks.

                       LITERATURE REVIEW
   The widespread use of devices that are always connected makes it possible for citizens to
actively participate in the collection of local information and news, which has been used for
very different purposes. This changing landscape of technology-enabled engagement with
communities, cities and spaces was first illustrated in the book From social butterfly to
engaged citizen [1].
   In [2] and [3], the authors show the first results of studies on the importance of citizen
participation in increasing knowledge and sharing information in many fields, such as the
tagging of road maps, the spread of diseases [4], the dissemination of emergency news, the
review of hotels and restaurants, and the creation of 3D models based on user images. In [5]
a mechanism based on crowdsourcing and portable smart devices is used to enable real-time,
location-based crime incident searching and reporting. This participatory approach, well
presented in [6], clarifies the concept of Government 2.0, where knowledge is shared between
citizens and institutions.
   In [7] the authors explore recent research regarding the potential of ICT, social media and
mobile technologies to foster citizen engagement and participation in urban planning. ICT
and urban planning are important aspects of the so-called socio-technical systems, in which
the technology
complexity is amplified by the organisational and procedural complexity of the application
domain [8]. A survey on existing approaches for fostering citizen participation is presented
in [9].
   In light of this, several e-Government services have recently been introduced by
governments and administrations in the form of conversational chatbots. Singapore’s
administration operates a bot [10] that can answer a broad spectrum of questions, providing
citizens with links to its web portal like a traditional search engine. Another interesting
case is the WienBot [11], operated by the Vienna city administration. Using a rather small
knowledge base, it provides efficient conversational capabilities, but limited to its core
domain. Some of the limitations of the reported cases are related to the absence of a Natural
Language Processing step on the contents. In fact, one of the most advanced e-Government
chatbots, the Burgeramter chatbot [12], provided by the Berlin city administration, is based
on a multi-staged framework that combines Sentiment Analysis and POS-Tagging of the
questions with a knowledge base. This bot provides only information and recommendations
concerning public offices and services, but does not accept citizens’ reports.
   It should also be noted that nowadays users are, virtually, always connected to the
network. Thus, they are able to send and share information in various ways: via social
networks (Twitter, Facebook, Instagram, and others), through dedicated apps, or through
instant messaging systems (Telegram, Whatsapp, Messenger). For example, in [13] a mobile app
allows walkers to map urban accessibility barriers and facilities while wandering around. In
[14] there is a description of a mobile app which allows users to take geo-tagged photos of
road faults, attach a brief description, and submit the information as a maintenance request
to the local government organisation of their city. A number of cities in the USA have
created apps that allow users to report graffiti or, in general, code violations, such as the
My-Delaware app or Boston’s Citizens Connect mobile app [15].
   In these cases, the information is typically sent through natural language or self-produced
images. Thus, the application needs to classify the reports based on the results of both text
and image analysis. The analysis of natural language on social networks is carried out for
different purposes, including: (i) analysis of users’ sentiment [16] [17] [18], (ii) analysis
of discussed topics [19], and (iii) analysis of the text structure [20]. Image analysis and
object recognition are also used for several goals, such as real-time object detection [21]
or 3D modeling [22]. In our work, the object-detection process has been used for
understanding what a user intends to send to the institutional partner.

                     II. METHODOLOGY
   As mentioned in Section I, the overall project aimed at the automatic classification of
reports sent via instant messaging systems. Each message could contain text and images; by
analyzing them, it was possible to automatically assign the correct category among the four
possible ones: environment, lighting, maintenance and security. These categories are then
useful to address each report to the most appropriate branch of the local administration.
   As a fundamental consequence of the nature of reports, a requirement for the whole system
is the ability to correctly interpret both the text and any attached image. The overall
architecture of the system is shown in Figure 1. It has been realized over ActoDES, a
software framework which adopts the actor model. In particular, it simplifies the development
of complex distributed systems [23] and it already integrates modules for gathering online
data from social networks and for the automatic classification of such data [24].

[Figure 1: Telegram, Whatsapp and Messenger bots -> intermediate representation (text +
images) -> text analysis (bag of words) and image analysis (image tags) -> classification
algorithm -> message class.]
                Figure 1. Data flow representation of the whole system.

   The first layer includes the implementation of specific bots for the different messaging
systems and the appropriate components to make the representations of the various messages
homogeneous. Then, each message is treated internally with a representation in JSON format
containing information on text, images, sender user data, date and time of the message, and
any information related to geolocation. Reports are then managed in a Web-based system,
which we have developed
for playing the role of an ad-hoc Customer Relationship Management (CRM) system, as shown
in Figure 2.

A. Dataset collection
   The first phase of the project is therefore focused on the definition and implementation
of the text classifier. For this purpose, 7758 citizens’ reports have been downloaded from
the institutional websites of several Italian municipalities. These data represent the
initial working dataset. These reports, publicly visible on the administrations’ websites,
are associated with categories directly by the users who have sent them or by the offices in
charge.
   Since these data reside on different systems and are managed in a non-homogeneous way, we
have counted 27 categories with which the various reports are associated. Many of these
categories, however, differ only by designation and not by concept (e.g. “road safety”
versus “road maintenance”).
   The first operation is therefore to reduce the numerous classes to the four chosen for the
project. These four categories (Environment, Lighting, Maintenance, Security) have been
identified because they are conceptually connected with the corresponding administrative
offices that will manage the citizens’ reports themselves.
   However, for comparing the different possible types of analysis, we have limited the
dataset to the messages containing both text and images. After manual analysis, we have found
that the least represented category in the reduced dataset is Lighting, with little more
than 200 instances. We have then proceeded to balance the dataset using around 200 instances
for each class, obtaining 804 instances in total, which we have used for further analysis
and comparisons. Table I shows the number of reports associated with the different classes,
after this selection.

                              Table I
            NUMBER OF SELECTED REPORTS PER CLASS.
                Class         Number of messages
                Environment                  201
                Lighting                     201
                Maintenance                  200
                Security                     202

   1) Text analysis:
For the text analysis branch, a preprocessing operation is carried out on the text to
eliminate elements not useful for classification, for example punctuation, reports without
information content, and emoticons. Stemming is applied and the text is filtered through a
list of stop-words. Finally, the text is vectorized according to the Bag of Words approach.
   2) Training the classifier:
At this point the dataset is ready to be used for training a text classifier. As proposed in
[18], the Multinomial Naive-Bayes classification algorithm is chosen for the analysis of
natural language. However, we also compare it with other well-known automatic classification
algorithms, namely: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest
Neighbors (KNN).
   Therefore, the next step is the creation of a four-way text classifier whose output is the
class to which a message belongs. In fact, the system also emits a dictionary with the
confidence values for each class. The process is represented in Figure 3.
   An example of the classifier’s output regarding confidence for each class is as follows:

        {
             "environment": 0.8731,
             "lighting": 0.1023,
             "maintenance": 0.0092,
             "security": 0.0154
        }

B. Image classification
   The second branch of the project is focused on the analysis of the images present in the
reports sent by the users. In order to associate the correct class with the entire report,
it is necessary to understand which entities are represented within the associated image.
   This is done using the Clarifai [25] object-recognition service (Clarifai Inc., API
Reference, https://clarifai.com/developer/reference/). The service is available via a REST
API and a convenient Python module.
   For each image associated with a message, a call is made to the service to retrieve
information related to the content of the image. Among other information, the results
returned by Clarifai in JSON format contain the list of entities (or concepts) associated
with the image. Additionally, a probability value is associated with each entity. As an
example, a list of entities like the following one can be obtained:

        "entities": [
             {
                  "name": "train",
                  "value": 0.9989112
             },
             {
                  "name": "railway",
                  "value": 0.9975532
             },
             {
                  "name": "station",
                  "value": 0.992573
             }
        ]

   Here, value is the confidence value of the entity indicated in name being represented
within the image.
   After analyzing the response of the external service, we construct a sequence of words
using only the entities with a confidence value greater than 0.9. This sequence of entities
is analyzed exactly in the same way as the bag of words obtained from the text. The same
classification algorithms described
above are also compared in the case of image analysis. The process is represented in
Figure 4.

                                                  Figure 2. The Web-based interface.

C. Classification approaches
   After realizing the subsystems for the analysis of text and images, we have performed a
comparison between the classification results obtained from the text and those obtained from
the associated images. More precisely, we have compared three cases:
  • Classification through text analysis only (using the bag of words approach)
  • Classification through image analysis only (using the image entities)
  • Classification through both text and image analyses (concatenating their lists of
    features)
   In Section III, we show the accuracy values of the classifier, using various features and
algorithms. For those evaluations, the dataset is split and then used for both training and
validating the classifier. For improving the consistency and reproducibility of results, we
have adopted the well-known ten-fold Cross Validation technique.

                                     III. RESULTS
   After creating the whole dataset, we compare the three different approaches described in
the previous section. In the following, these approaches are identified as: Text, based only
on text analysis; Image, based only on image analysis; and Text+Image, based on both text
and image analyses.
   We also compare various well-known classification algorithms, namely: RF, Random Forest;
NBM, Naïve Bayes Multinomial; SMO, Sequential Minimal Optimization, based on the principles
of support vectors; KNN, K-Nearest Neighbors. For this last algorithm, we use K=1, which
gives its best results.
   It can be observed in Figure 5 and Table II that all algorithms perform better on the image
entities than on the text. Moreover, all algorithms improve their results using all
available features, with the exception of KNN, which is known to work better on a limited
set of features.
   Overall, the best classification accuracy is obtained by the NBM algorithm, using the
features of both text and images. In this case, the classification is correct in over 90% of
cases (90.67%). However, using the RF algorithm, a very close value of accuracy (88.18%) can
be obtained even using only the image features.
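
   The three feature configurations (Text, Image, Text+Image) and a Multinomial Naive Bayes
classifier that emits per-class confidences can be sketched in a few lines of Python. This is
only a minimal sketch under stated assumptions, not the system's implementation: the
stop-word list, the toy reports and all helper names (text_features, image_features,
combined_features, MultinomialNB, k_fold_accuracy) are hypothetical, stemming is omitted, and
the entity dictionaries only mimic the name/value fields of the entity list shown in the
methodology.

```python
import math
import re
from collections import Counter, defaultdict

# Illustrative stop-word list (assumption); a real one would be language-specific.
STOP_WORDS = {"the", "a", "is", "in", "on", "of", "and", "to"}

def text_features(text):
    """Text case: lowercase, drop punctuation and stop words (stemming omitted)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def image_features(entities, threshold=0.9):
    """Image case: keep entity names whose confidence is greater than the threshold."""
    return [e["name"] for e in entities if e["value"] > threshold]

def combined_features(text, entities):
    """Text+Image case: concatenate the two feature lists."""
    return text_features(text) + image_features(entities)

class MultinomialNB:
    """Multinomial Naive Bayes over word lists, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for words, label in zip(docs, labels):
            self.word_counts[label].update(words)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def confidences(self, words):
        """Return a {class: probability} dictionary whose values sum to 1."""
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for w in words:
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[w] + self.alpha) / denom)
            log_scores[label] = score
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        norm = sum(exp.values())
        return {k: v / norm for k, v in exp.items()}

    def predict(self, words):
        conf = self.confidences(words)
        return max(conf, key=conf.get)

def k_fold_accuracy(docs, labels, k=10):
    """k-fold cross-validation without shuffling: item i is held out in fold i % k."""
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))
        train = [i for i in range(len(docs)) if i not in test_idx]
        model = MultinomialNB().fit([docs[i] for i in train], [labels[i] for i in train])
        correct += sum(model.predict(docs[i]) == labels[i] for i in test_idx)
    return correct / len(docs)

# Toy reports invented for illustration: (text, image entities, class).
toy_reports = [
    ("Broken street lamp near the school", [{"name": "lamp", "value": 0.97}], "lighting"),
    ("Rubbish abandoned in the park", [{"name": "garbage", "value": 0.95}], "environment"),
    ("Deep pothole on the main road", [{"name": "asphalt", "value": 0.93}], "maintenance"),
    ("Vandalism and broken glass at the station", [{"name": "graffiti", "value": 0.91}], "security"),
]
model = MultinomialNB().fit(
    [combined_features(text, entities) for text, entities, _ in toy_reports],
    [label for _, _, label in toy_reports],
)
new_report = combined_features(
    "The lamp in my street is dark",
    [{"name": "lamp", "value": 0.96}, {"name": "tree", "value": 0.4}],
)
print(model.predict(new_report), model.confidences(new_report))
```

   Swapping combined_features for text_features or image_features gives the other two
configurations, and k_fold_accuracy with k=10 mirrors the ten-fold protocol on a dataset
large enough for every fold to see all classes in training. The paper's actual figures come
from the authors' own classifiers and data, which this sketch does not reproduce.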


[Figure 3: report text -> text preprocessing (punctuation removal, stemming, stop words,
vectorization) -> bag of words -> text classification -> Environment, Security, Lighting or
Maintenance.]
                Figure 3. Text classification process.

[Figure 5: bar chart of accuracy (0% to 100%) for RF, NBM, SMO and KNN on the Text, Image
and Text+Image features.]
   Figure 5. Accuracy of classifiers, using different features and different algorithms.

                                   Table II
                     NUMERICAL ACCURACY OF CLASSIFIERS.

                                Text      Image     Text+Image
                     RF        71.52%    88.18%        88.93%
                     NBM       80.22%    84.08%        90.67%
                     SMO       75.99%    86.07%        89.25%
                     KNN       57.58%    81.21%        78.85%
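
   As a cross-check, the NBM Text+Image accuracy reported in Table II can be recomputed from
the confusion matrix in Table III. The short sketch below does that arithmetic; it assumes
that rows are the true classes and columns the assigned classes, which is consistent with
the row sums matching the class sizes in Table I. The per-class recall values are an extra
derived quantity, not figures stated in the paper.

```python
# Confusion matrix from Table III (rows: true class, columns: assigned class).
# Class order: a=Environment, b=Security, c=Maintenance, d=Lighting.
labels = ["Environment", "Security", "Maintenance", "Lighting"]
confusion = [
    [175, 15, 11, 0],
    [8, 172, 8, 14],
    [1, 6, 193, 0],
    [0, 10, 2, 189],
]

total = sum(sum(row) for row in confusion)                  # all classified reports
correct = sum(confusion[i][i] for i in range(len(labels)))  # diagonal entries
accuracy = correct / total

# Row sums should match the class sizes in Table I.
class_sizes = {labels[i]: sum(confusion[i]) for i in range(len(labels))}

# Recall per class: correctly classified reports over all reports of that class.
recall = {labels[i]: confusion[i][i] / sum(confusion[i]) for i in range(len(labels))}

print(f"{correct}/{total} correct, accuracy = {accuracy:.2%}")
for name in labels:
    print(f"{name}: {class_sizes[name]} reports, recall = {recall[name]:.2%}")
```

   With 729 of the 804 reports on the diagonal, the result is 90.67%, matching Table II.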
Classifying reports from their images alone would greatly ease the burden on users when they
issue reports about local problems.

[Figure 4: report image -> image analysis (Clarifai API) -> image entities -> image
classification -> Environment, Security, Lighting or Maintenance.]
               Figure 4. Image classification process.

A. Accuracy of the classifiers
   Finally, Table III shows the confusion matrix for NBM on Text+Image. Few errors occur.
Among those, it is worth noting that: (i) 15 Environment reports are misclassified as
Security ones, possibly due to the presence of instances about unsafe parks in the dataset;
and (ii) 14 Security reports are misclassified as Lighting ones, since the two issues often
coexist.

                                   Table III
          CONFUSION MATRIX FOR NBM ON TEXT+IMAGE; CLASSES ARE:
          A=ENVIRONMENT, B=SECURITY, C=MAINTENANCE, D=LIGHTING.

                     Actual \ Classified as      a      b      c      d
                     a                         175     15     11      0
                     b                           8    172      8     14
                     c                           1      6    193      0
                     d                           0     10      2    189

                        IV. CONCLUSIONS AND FUTURE WORK
   Our project shows an implementation of an automatic classification system for the reports
sent by citizens to public administrations through instant messaging. This was carried out
by separately analyzing the text and images of a report and then comparing the results of
these analyses. The accuracy of the final classification exceeds 90% when using features
from both text and associated images. However, very similar results can be obtained using
only the entities found through the image analysis. These results suggest that it is
possible to collect citizens’ reports in a very simplified way, receiving just geolocalized
images, which can be classified automatically in most cases. The use of automated bots for
interacting with the users allows them to correct wrong results in a very convenient way,
only when necessary.
   The future developments of the project will concern different kinds of analysis, in cases
of discrepancy between the classifiers. Furthermore, after the deployment of the service in
some public administrations, it will be possible to carry out content analyses that can also
be based on data related to the map of the territory and the history of reports.


                     V. ACKNOWLEDGEMENTS
   This project has been developed in collaboration and agreement with the local
administration of Montecchio Emilia (Italy), following the concepts of Government 2.0. The
local administration has helped us to better understand the management process for the main
types of reports and has suggested some guidelines for users’ operations, which can be
better handled by public offices.

                              REFERENCES
 [1] M. Foth, L. Forlano, C. Satchell, and M. Gibbs, From social butterfly to engaged
     citizen: urban informatics, social media, ubiquitous computing, and mobile technology
     to support citizen engagement. MIT Press, 2011.
 [2] A. Hermida, “Twittering the news: The emergence of ambient journalism,” Journalism
     practice, vol. 4, no. 3, pp. 297–308, 2010.
 [3] M. N. Kamel Boulos, B. Resch, D. N. Crowley, J. G. Breslin, G. Sohn, R. Burtner,
     W. A. Pike, E. Jezierski, and K.-Y. S. Chuang, “Crowdsourcing, citizen sensing and
     sensor web technologies for public and environmental health surveillance and crisis
     management: trends, ogc standards and application examples,” International Journal of
     Health Geographics, vol. 10, no. 1, p. 67, Dec 2011. [Online]. Available:
     https://doi.org/10.1186/1476-072X-10-67
[10] “5 reasons to use the gov.sg bot, march 2017. blog of singapore government,”
     https://www.gov.sg/news/content/5-reasons-to-use-the-gov-sg-bot.
[11] “Magistrat der stadt wien. wienbot - der chatbot der stadt, june 2017,”
     https://www.wien.gv.at/bot/.
[12] A. Lommatzsch, “A next generation chatbot-framework for the public administration,” in
     Innovations for Community Services, M. Hodoň, G. Eichler, C. Erfurth, and
     G. Fahrnberger, Eds. Cham: Springer International Publishing, 2018, pp. 127–141.
[13] P. Salomoni, C. Prandi, M. Roccetti, V. Nisi, and N. J. Nunes, “Crowdsourcing urban
     accessibility: Some preliminary experiences with results,” in Proceedings of the 11th
     Biannual Conference on Italian SIGCHI Chapter. ACM, 2015, pp. 130–133.
[14] M. Foth, R. Schroeter, and I. Anastasiu, “Fixing the city one photo at a time: mobile
     logging of maintenance requests,” in Proceedings of the 23rd Australian Computer-Human
     Interaction Conference. ACM, 2011, pp. 126–129.
[15] J. Evans-Cowley, “There is an app for that: mobile applications for urban planning,”
     International Journal of E-Planning Research (IJEPR), vol. 1, no. 2, pp. 79–87, 2012.
[16] B. Liu, “Sentiment analysis and opinion mining,” Synthesis lectures on human language
     technologies, vol. 5, no. 1, pp. 1–167, 2012.
                                                                                      [18] G. Angiani, L. Ferrari, T. Fontanini, P. Fornacciari, E. Iotti, F. Magliani,
 [4] M. N. K. Boulos, S. Wheeler, C. Tavares, and R. Jones, “How smart-
                                                                                           and S. Manicardi, “A comparison between preprocessing techniques for
     phones are changing the face of mobile and participatory healthcare: an
                                                                                           sentiment analysis in twitter.” in KDWeb, 2016.
     overview, with example from ecaalyx,” Biomedical engineering online,
     vol. 10, no. 1, p. 24, 2011.                                                     [19] C. C. Aggarwal and C. Zhai, Mining text data.          Springer Science &
 [5] S. Shah, F. Bao, C.-T. Lu, and I.-R. Chen, “Crowdsafe: crowd sourcing                 Business Media, 2012.
     of crime incidents and safe routing on mobile devices,” in Proceedings           [20] A. Athar, “Sentiment analysis of citations using sentence structure-based
     of the 19th ACM SIGSPATIAL International Conference on Advances in                    features,” in Proceedings of the ACL 2011 student session. Association
     Geographic Information Systems. ACM, 2011, pp. 521–524.                               for Computational Linguistics, 2011, pp. 81–87.
 [6] S. Chun, S. Shulman, R. Sandoval, and E. Hovy, “Government 2.0:
     Making connections between citizens, data and government,” Informa-              [21] P. Piccinini, A. Prati, and R. Cucchiara, “Real-time object detection and
     tion Polity, vol. 15, no. 1, 2, pp. 1–9, 2010.                                        localization with sift-based clustering,” Image and Vision Computing,
 [7] R. Kleinhans, M. V. Ham, and J. Evans-Cowley, “Using social media                     vol. 30, no. 8, pp. 573–587, 2012.
     and mobile technologies to foster engagement and self-organization in            [22] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, “Rgb-d mapping:
     participatory urban planning and neighbourhood governance,” Planning                  Using depth cameras for dense 3d modeling of indoor environments,”
     Practice & Research, vol. 30, no. 3, pp. 237–247, 2015.                               in Experimental robotics. Springer, 2014, pp. 477–491.
 [8] G. Cabri, M. Cossentino, E. Denti, P. Giorgini, A. Molesini, M. Mor-
     donini, M. Tomaiuolo, and L. Sabatucci, “Towards an integrated plat-             [23] F. Bergenti, A. Poggi, and M. Tomaiuolo, “An actor based software
     form for adaptive socio-technical systems for smart spaces,” in Enabling              framework for scalable applications,” Lecture Notes in Computer Science
     Technologies: Infrastructure for Collaborative Enterprises (WETICE),                  (LNCS), vol. 8729, pp. 26–35, 2015, proc. 7th International Conference
     2016 IEEE 25th International Conference on. IEEE, 2016, pp. 3–8.                      on Internet and Distributed Computing Systems (IDCS 2014); Calabria;
 [9] F. Salim and U. Haque, “Urban computing in the wild: A survey on large                Italy; 2014-09-22/24 [MT].
     scale participation and citizen engagement with ubiquitous computing,            [24] P. Fornacciari, M. Mordonini, A. Poggi, L. Sani, and M. Tomaiuolo,
     cyber physical systems, and internet of things,” International Journal of             “A holistic system for troll detection on twitter,” Computers in Human
     Human-Computer Studies, vol. 81, pp. 31–48, 2015.                                     Behavior, vol. 89, pp. 258–268, 2018.
[17] P. Fornacciari, M. Mordonini, and M. Tomaiuolo, “Social network and
     sentiment analysis on twitter: Towards a combined approach.” in KDWeb,           [25] C. Inc. (2018) Clarifa.com            web    site.   [Online].   Available:
     2015, pp. 53–64.                                                                      https://clarifai.com/

