Workshop "From Objects to Agents" (WOA 2019)

Intelligent Collection and Analysis of Citizens’ Reports

Giulio Angiani, Paolo Fornacciari, Gianfranco Lombardo, Monica Mordonini, Agostino Poggi, Michele Tomaiuolo
Dipartimento di Ingegneria e Architettura
Università degli Studi di Parma
Parma, Italy
{giulio.angiani, paolo.fornacciari, gianfranco.lombardo, monica.mordonini, agostino.poggi, michele.tomaiuolo}@unipr.it

Abstract—The wide and capillary diffusion of technology among citizens is creating the ideal conditions for realizing the “Smart Community” concept. In this kind of socio-technical context, it is possible to create distributed applications for the administration of cities and neighborhoods, using data provided directly by citizens. In particular, it is possible to connect users and integrate their actions into the whole system, thanks to widely available instant messaging apps, used by most mobile users. The combined use of public APIs, Web systems and automated bots allowed us to build a comprehensive framework for managing the reports sent to the local government by citizens through their already installed and well-known instant messaging apps, such as Whatsapp, Telegram and Messenger. In this paper we show the techniques used for retrieving and classifying the texts and images of the reports, for their management by the most appropriate branch of the local administration. Our results show that an automatic classification system of this kind can reach an accuracy of over 90%.

Index Terms—Text analysis, Image classification, Government 2.0, Chatbot.

I. INTRODUCTION

The widespread distribution of mobile devices and fast network coverage in recent years has created the conditions to allow all citizens to access and send information at any time of the day. The concept of “smart community” arises precisely from the possibility, for users, to interact quickly and in real time on different aspects of real life.

One of these is sending reports to local public administrations. In almost all cases, this service is offered by private companies that use proprietary apps to send and classify reports. Typically these apps allow users to send geo-localized text and images after specifying the category to which they belong. The user who wants to use them must therefore download the app from the appropriate store, register, and learn how to use it correctly. In many cases this user overhead discourages the use of the reporting system by citizens who are used to other messaging channels.

Our project has instead reversed the approach, starting from the use, by citizens, of free and widely used apps (Whatsapp, Telegram and Messenger) to send the same information to public administrations. The user is therefore free to send text and images with the app they prefer, by interacting with various bots dedicated to the different messaging systems. Furthermore, the user is not responsible for categorizing the report, because this information is derived by an AI system through text analysis, image recognition and object detection. Depending on the inserted text and the content of the attached image, this automatic system associates the message with one of four predefined classes (environment, lighting, maintenance and security), corresponding to branches of the local administration.

The rest of the manuscript is structured as follows: Section I presents some of the most interesting works conducted in this field; Section II describes the data collection and methodology used for this study; results and discussions are presented in Section III; finally, Section IV provides some concluding remarks.

LITERATURE REVIEW

The widespread use of devices that are always connected makes it possible for citizens to actively participate in the collection of local information and news, which has been used for very different purposes. This changing landscape of technology-enabled engagement with communities, cities and spaces was first illustrated in the book From social butterfly to engaged citizen [1].

In [2] and [3], the authors show the first results of studies on the importance of citizen participation in increasing knowledge and sharing information in many fields, such as the tagging of road maps, the spread of diseases [4], the dissemination of emergency news, the review of hotels and restaurants, and the creation of 3D models based on user images. In [5], crowdsourcing and portable smart devices are combined to enable real-time, location-based searching and reporting of crime incidents. This participatory approach, well presented in [6], clarifies the concept of Government 2.0, where knowledge is shared between citizens and institutions.

In [7] the authors explore recent research regarding the potential of ICT, social media and mobile technologies to foster citizen engagement and participation in urban planning. ICT and urban planning are important aspects of the so-called socio-technical systems, in which the technological complexity is amplified by the organisational and procedural complexity of the application domain [8]. A survey of existing approaches for fostering citizen participation is presented in [9].

In light of this, several e-Government services have recently been introduced by governments and administrations in the form of conversational chatbots. Singapore’s administration operates a bot [10] that can answer a broad spectrum of questions, providing citizens with links to its web portal like a traditional search engine. Another interesting case is the WienBot [11], operated by the city administration of Vienna. Using a rather small knowledge base, it provides efficient conversational capabilities, but limited to its core domain. Some of the limitations of the reported cases are related to the absence of a Natural Language Processing step on the contents. In fact, one of the most advanced e-Government chatbots, the Burgeramter chatbot [12], provided by the Berlin city administration, is based on a multi-staged framework that combines Sentiment Analysis and POS-Tagging of the questions with a knowledge base. This bot provides only information and recommendations concerning public offices and services, but does not accept citizens’ reports.

It should also be noted that nowadays users are, virtually, always connected to the network. Thus, they are able to send and share information in various ways: via social networks (Twitter, Facebook, Instagram, and others), through dedicated apps, or through instant messaging systems (Telegram, Whatsapp, Messenger). For example, in [13] a mobile app allows walkers to map urban accessibility barriers and facilities while wandering around. In [14] there is a description of a mobile app which allows users to take a geo-tagged photo of a road fault, attach a brief description, and submit the information as a maintenance request to the local government organisation of their city. A number of cities in the USA have worked to create apps that allow users to report graffiti or, in general, code violations, such as the MyDelaware app or Boston’s Citizens Connect mobile app [15].

In these cases, the information is typically sent through natural language or self-produced images. Thus, the application needs to classify the reports based on the results of both text and image analysis. The analysis of natural language on social networks is carried out for different purposes, including: (i) analysis of users’ sentiment [16] [17] [18], (ii) analysis of discussed topics [19], and (iii) analysis of the text structure [20]. Image analysis and object recognition are also used for several goals, such as real-time object detection [21] or 3D modeling [22]. In our work, the object-detection process has been used for understanding what a user would send to the institutional partner.

II. METHODOLOGY

As mentioned in Section I, the overall project aims at the automatic classification of reports sent via instant messaging systems, so that each report is assigned the correct category among the four possible ones: environment, lighting, maintenance and security. These categories are then used to address each report to the most appropriate branch of the local administration.

As a fundamental consequence of the nature of reports, a requirement for the whole system is the ability to correctly interpret both the text and any attached image. The overall architecture of the system is shown in Figure 1. It has been realized over ActoDES, a software framework which adopts the actor model. In particular, ActoDES simplifies the development of complex distributed systems [23] and already integrates modules for gathering online data from social networks and for the automatic classification of such data [24].

[Figure 1: data flow diagram. The Telegram, Whatsapp and Messenger bots feed an intermediate representation (text + images); text analysis produces a bag of words and image analysis produces image tags; both feed the classification algorithm, which outputs the message class.]
Figure 1. Data flow representation of the whole system.

The first layer includes the implementation of specific bots for the different messaging systems, together with the components needed to make the representations of the various messages homogeneous. Each message is then handled internally through a representation in JSON format containing information on text, images, sender user data, and the date and time of the message.
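To make the idea of a homogeneous intermediate representation more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the `Report` schema, its field names and the `from_telegram` adapter are hypothetical, and the input payload is a simplified, made-up structure rather than the full format of any messaging platform. The source only states that the JSON representation carries text, images, sender data, date and time, and optional geolocation.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Report:
    """Platform-independent view of one citizen report (hypothetical schema)."""
    platform: str                      # e.g. "telegram", "whatsapp", "messenger"
    sender_id: str
    sent_at: str                       # ISO 8601 timestamp
    text: str = ""
    image_urls: List[str] = field(default_factory=list)
    latitude: Optional[float] = None   # geolocation is optional
    longitude: Optional[float] = None

    def to_json(self) -> str:
        """Serialize the report to the JSON form used by later stages."""
        return json.dumps(asdict(self), ensure_ascii=False)


def from_telegram(update: dict) -> Report:
    """Map a simplified, made-up Telegram-like payload onto a Report."""
    msg = update["message"]
    loc = msg.get("location") or {}
    return Report(
        platform="telegram",
        sender_id=str(msg["from"]["id"]),
        sent_at=datetime.fromtimestamp(msg["date"], tz=timezone.utc).isoformat(),
        text=msg.get("text", ""),
        image_urls=list(msg.get("image_urls", [])),  # assumed field name
        latitude=loc.get("latitude"),
        longitude=loc.get("longitude"),
    )


# Invented example payload, for illustration only.
example_update = {
    "message": {
        "from": {"id": 42},
        "date": 1561975200,
        "text": "Street lamp broken in the park",
        "image_urls": ["https://example.org/lamp.jpg"],
        "location": {"latitude": 44.7, "longitude": 10.45},
    }
}
report = from_telegram(example_update)
report_json = report.to_json()
print(report_json)
```

With one adapter function per messaging system, the text and image analysis stages only ever need to read the `Report` fields, regardless of which bot received the message.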
Each message can contain both text and images, which are analyzed to assign its category automatically; the internal representation also carries any information related to geolocation. Reports are then managed in a Web-based system, which we have developed to play the role of an ad-hoc Customer Relationship Management (CRM) system, as shown in Figure 2.

Figure 2. The Web-based interface.

A. Dataset collection

The first phase of the project is focused on the definition and implementation of the text classifier. For this purpose, 7758 citizens’ reports have been downloaded from the institutional websites of several Italian municipalities. These data represent the initial working dataset. The reports, publicly visible on the administrations’ websites, are associated with categories directly by the users who sent them or by the offices in charge.

Since these data reside on different systems and are managed in a non-homogeneous way, we have counted 27 categories with which the various reports are associated. Many of these categories, however, differ only by designation and not by concept (e.g. “road safety” versus “road maintenance”). The first operation is therefore to reduce the numerous classes to the four chosen for the project. These four categories (Environment, Lighting, Maintenance, Security) have been identified because they are conceptually connected with the corresponding administrative offices that will manage the citizens’ reports.

However, in order to compare the different possible types of analysis, we have limited the dataset to the messages containing both text and images. After manual analysis, we have found that the least represented category in the reduced dataset is Lighting, with little more than 200 instances. We have then balanced the dataset using around 200 instances for each class, obtaining 804 instances in total, which we have used for further analysis and comparisons. Table I shows the number of reports associated with the different classes after this selection.

Table I
NUMBER OF SELECTED REPORTS PER CLASS.

Class          Number of messages
Environment    201
Lighting       201
Maintenance    200
Security       202

1) Text analysis:
For the text analysis branch, a preprocessing step removes elements that are not useful for classification, such as punctuation and emoticons, and discards reports without information content. Stemming is then applied and the text is filtered through a list of stop-words. Finally, the text is vectorized according to the Bag of Words approach.

2) Training the classifier:
At this point the dataset is ready to be used for training a text classifier. As proposed in [18], the Multinomial Naive-Bayes classification algorithm is chosen for the analysis of natural language. However, we also compare it with other well-known automatic classification algorithms, namely: Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN).

Therefore, the next step is the creation of a four-way text classifier whose output is the class to which a message belongs. The system also emits a dictionary with the confidence values for each class. The process is represented in Figure 3.

An example of the classifier’s output, with the confidence for each class, is as follows:

    {
        "environment": 0.8731,
        "lighting": 0.1023,
        "maintenance": 0.0092,
        "security": 0.0154
    }

[Figure 3: Report text → Text preprocessing (punctuation, stemming, stop words, vectorization) → Bag of words → Text classification → Environment, Security, Lighting, Maintenance.]
Figure 3. Text classification process.

B. Image classification

The second branch of the project is focused on the analysis of the images present in the reports sent by the users. In order to associate the correct class with the entire report, it is necessary to understand which entities are represented within the associated image.

This is done using the Clarifai¹ [25] object-recognition service. The service is available via a REST API and a convenient Python module.

¹ Clarifai Inc., API Reference, https://clarifai.com/developer/reference/

For each image associated with a message, a call is made to the service to retrieve information related to the content of the image. Among other information, the results returned by Clarifai in JSON format contain the list of entities (or concepts) associated with the image. Additionally, a probability value is associated with each entity. As an example, a list of entities like the following one can be obtained:

    "entities": [
        {
            "name": "train",
            "value": 0.9989112
        },
        {
            "name": "railway",
            "value": 0.9975532
        },
        {
            "name": "station",
            "value": 0.992573
        }
    ]

Here, value is the confidence that the entity indicated in name is represented within the image.

After analyzing the response of the external service, we construct a sequence of words using only the entities with a confidence value greater than 0.9. This sequence of entities is analyzed in exactly the same way as the bag of words obtained from the text. The same classification algorithms described above are also compared in the case of image analysis. The process is represented in Figure 4.

C. Classification approaches

After realizing the subsystems for the analysis of text and images, we have performed a comparison between the classification results of the text and those of the associated images. More precisely, we have compared three cases:

• Classification through text analysis only (using the bag of words approach)
• Classification through image analysis only (using the image entities)
• Classification through both text and image analyses (concatenating their lists of features)

In Section III, we show the accuracy values of the classifier, using various features and algorithms. For those evaluations, the dataset is split and then used for both training and validating the classifier. To improve the consistency and reproducibility of results, we have adopted the well-known ten-fold Cross Validation technique.

III. RESULTS

After creating the whole dataset, we compare the three different approaches described in the previous section. In the following, these approaches are identified as: Text, based only on text analysis; Image, based only on image analysis; and Text+Image, based on both text and image analyses.

We also compare various well-known classification algorithms, namely: RF, Random Forest; NBM, Naïve Bayes Multinomial; SMO, Sequential Minimal Optimization, based on the principles of support vectors; KNN, K-Nearest Neighbors. For the latter, we use K=1, which gave its best results.

It can be observed in Figure 5 and Table II that all algorithms perform better on the image entities than on the text. Moreover, all algorithms improve their results using all available features, with the exception of KNN, which is known to work better on a limited set of features.

[Figure 5: bar chart of accuracy (0% to 100%) of RF, NBM, SMO and KNN on the Text, Image and Text+Image feature sets.]
Figure 5. Accuracy of classifiers, using different features and different algorithms.

Table II
NUMERICAL ACCURACY OF CLASSIFIERS.

        Text      Image     Text+Image
RF      71.52%    88.18%    88.93%
NBM     80.22%    84.08%    90.67%
SMO     75.99%    86.07%    89.25%
KNN     57.58%    81.21%    78.85%

Overall, the best classification accuracy is obtained by the NBM algorithm, using the features of both text and images. In this case, the classification is correct in over 90% of cases. However, using the RF algorithm, a very close value of accuracy (88.18%) can be obtained even using only the image features.
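The Text+Image approach evaluated above can be illustrated with a small, self-contained sketch: text tokens and image entities above the 0.9 confidence threshold are concatenated into one bag of words and fed to a Multinomial Naive Bayes classifier. This is only an illustration under stated assumptions, not the authors' code: the source does not name the toolkit it used, so the classifier is written here in plain Python with add-one smoothing; the stop-word list and the training reports are invented toy data; stemming and ten-fold cross validation are omitted for brevity.

```python
import math
import re
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "is", "in", "of", "on", "and", "at"}  # toy list (assumption)


def text_tokens(text):
    """Lowercase, drop punctuation and emoticons, remove stop-words (no stemming)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def image_tokens(entities, threshold=0.9):
    """Keep only entity names whose confidence is greater than the threshold."""
    return [e["name"] for e in entities if e["value"] > threshold]


def features(text, entities):
    """Text+Image case: concatenate the two lists of features."""
    return text_tokens(text) + image_tokens(entities)


class MultinomialNB:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict_proba(self, tokens):
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for w in tokens:
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[w] + 1) / denom)
            log_scores[label] = score
        # Turn log scores into confidences that sum to 1.
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        norm = sum(exp.values())
        return {k: v / norm for k, v in exp.items()}

    def predict(self, tokens):
        proba = self.predict_proba(tokens)
        return max(proba, key=proba.get)


# Invented toy reports: (text, image entities, class).
train = [
    ("Street lamp is broken", [{"name": "lamp", "value": 0.97}], "lighting"),
    ("Dark road, lamp off at night",
     [{"name": "street", "value": 0.95}, {"name": "lamp", "value": 0.93}], "lighting"),
    ("Rubbish dumped in the park", [{"name": "garbage", "value": 0.98}], "environment"),
    ("Garbage bags near the river",
     [{"name": "garbage", "value": 0.96}, {"name": "water", "value": 0.5}], "environment"),
    ("Deep pothole on the road", [{"name": "asphalt", "value": 0.99}], "maintenance"),
    ("Broken pavement and pothole", [{"name": "asphalt", "value": 0.94}], "maintenance"),
    ("Vandals damaged the bus stop", [{"name": "graffiti", "value": 0.92}], "security"),
    ("Suspicious people and graffiti", [{"name": "graffiti", "value": 0.97}], "security"),
]
docs = [features(text, entities) for text, entities, _ in train]
labels = [label for _, _, label in train]
model = MultinomialNB().fit(docs, labels)

new_report = features("Lamp not working!",
                      [{"name": "lamp", "value": 0.95}, {"name": "tree", "value": 0.4}])
proba = model.predict_proba(new_report)
predicted = model.predict(new_report)
print(predicted, proba)
```

The `predict_proba` dictionary plays the same role as the per-class confidence dictionary shown in Section II. Restricting `features` to only `text_tokens` or only `image_tokens` gives the Text and Image cases of the comparison.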
Since image features alone already give a comparable accuracy, it is possible to greatly ease the burden on users when they have to issue reports about local problems.

[Figure 4: Report image → Image analysis (Clarifai API) → Image entities → Image classification → Environment, Security, Lighting, Maintenance.]
Figure 4. Image classification process.

A. Accuracy of the classifiers

Finally, Table III represents the confusion matrix. It can be observed that few errors occur: the diagonal sums to 729 correctly classified reports out of 804, i.e. 90.67%, consistent with Table II. Among the errors, it is worth noting that: (i) 15 Environment reports are misclassified as Security ones, possibly due to the presence of instances about unsecure parks in the dataset; and (ii) 14 Security reports are misclassified as Lighting ones, since the two issues often coexist.

Table III
CONFUSION MATRIX FOR NBM ON TEXT+IMAGE; CLASSES ARE: A=ENVIRONMENT, B=SECURITY, C=MAINTENANCE, D=LIGHTING.

Classified as →    a      b      c      d
a                175     15     11      0
b                  8    172      8     14
c                  1      6    193      0
d                  0     10      2    189

IV. CONCLUSIONS AND FUTURE WORKS

Our project shows an implementation of an automatic classification system for citizens’ reports to public administrations through the use of instant messaging. This operation was carried out by analyzing the text and the images of a report separately and then comparing the results of these analyses. The accuracy of the final classification is greater than 90% overall, using features from both text and associated images. However, very similar results can be obtained using only the entities found through the image analysis. These results suggest that it is possible to collect citizens’ reports in a very simplified way, receiving just geolocalized images, which can be classified automatically in most cases. The use of automated bots for interacting with the users allows them to correct wrong results in a very convenient way, only when necessary.

The future developments of the project will concern different kinds of analysis for cases in which the classifiers disagree. Furthermore, after the deployment of the service in some public administrations, it will be possible to carry out content analyses that can also be based on data related to the map of the territory and the history of reports.

V. ACKNOWLEDGEMENTS

This project has been developed in collaboration and agreement with the local administration of Montecchio Emilia (Italy), following the concepts of Government 2.0. The local administration has helped us to better understand the management process for the main types of reports and has suggested some guidelines for users’ operations, so that reports can be better handled by public offices.

REFERENCES

[1] M. Foth, L. Forlano, C. Satchell, and M. Gibbs, From social butterfly to engaged citizen: urban informatics, social media, ubiquitous computing, and mobile technology to support citizen engagement. MIT Press, 2011.
[2] A. Hermida, “Twittering the news: The emergence of ambient journalism,” Journalism practice, vol. 4, no. 3, pp. 297–308, 2010.
[3] M. N. Kamel Boulos, B. Resch, D. N. Crowley, J. G. Breslin, G. Sohn, R. Burtner, W. A. Pike, E. Jezierski, and K.-Y. S. Chuang, “Crowdsourcing, citizen sensing and sensor web technologies for public and environmental health surveillance and crisis management: trends, ogc standards and application examples,” International Journal of Health Geographics, vol. 10, no. 1, p. 67, Dec 2011. [Online]. Available: https://doi.org/10.1186/1476-072X-10-67
[4] M. N. K. Boulos, S. Wheeler, C. Tavares, and R. Jones, “How smartphones are changing the face of mobile and participatory healthcare: an overview, with example from ecaalyx,” Biomedical engineering online, vol. 10, no. 1, p. 24, 2011.
[5] S. Shah, F. Bao, C.-T. Lu, and I.-R. Chen, “Crowdsafe: crowd sourcing of crime incidents and safe routing on mobile devices,” in Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 2011, pp. 521–524.
[6] S. Chun, S. Shulman, R. Sandoval, and E. Hovy, “Government 2.0: Making connections between citizens, data and government,” Information Polity, vol. 15, no. 1, 2, pp. 1–9, 2010.
[7] R. Kleinhans, M. V. Ham, and J. Evans-Cowley, “Using social media and mobile technologies to foster engagement and self-organization in participatory urban planning and neighbourhood governance,” Planning Practice & Research, vol. 30, no. 3, pp. 237–247, 2015.
[8] G. Cabri, M. Cossentino, E. Denti, P. Giorgini, A. Molesini, M. Mordonini, M. Tomaiuolo, and L. Sabatucci, “Towards an integrated platform for adaptive socio-technical systems for smart spaces,” in Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), 2016 IEEE 25th International Conference on. IEEE, 2016, pp. 3–8.
[9] F. Salim and U. Haque, “Urban computing in the wild: A survey on large scale participation and citizen engagement with ubiquitous computing, cyber physical systems, and internet of things,” International Journal of Human-Computer Studies, vol. 81, pp. 31–48, 2015.
[10] “5 reasons to use the gov.sg bot, march 2017. blog of singapore government,” https://www.gov.sg/news/content/5-reasons-to-use-the-gov-sg-bot.
[11] “Magistrat der stadt wien. wienbot - der chatbot der stadt, june 2017,” https://www.wien.gv.at/bot/.
[12] A. Lommatzsch, “A next generation chatbot-framework for the public administration,” in Innovations for Community Services, M. Hodoň, G. Eichler, C. Erfurth, and G. Fahrnberger, Eds. Cham: Springer International Publishing, 2018, pp. 127–141.
[13] P. Salomoni, C. Prandi, M. Roccetti, V. Nisi, and N. J. Nunes, “Crowdsourcing urban accessibility: Some preliminary experiences with results,” in Proceedings of the 11th Biannual Conference on Italian SIGCHI Chapter. ACM, 2015, pp. 130–133.
[14] M. Foth, R. Schroeter, and I. Anastasiu, “Fixing the city one photo at a time: mobile logging of maintenance requests,” in Proceedings of the 23rd Australian Computer-Human Interaction Conference. ACM, 2011, pp. 126–129.
[15] J. Evans-Cowley, “There is an app for that: mobile applications for urban planning,” International Journal of E-Planning Research (IJEPR), vol. 1, no. 2, pp. 79–87, 2012.
[16] B. Liu, “Sentiment analysis and opinion mining,” Synthesis lectures on human language technologies, vol. 5, no. 1, pp. 1–167, 2012.
[17] P. Fornacciari, M. Mordonini, and M. Tomaiuolo, “Social network and sentiment analysis on twitter: Towards a combined approach.” in KDWeb, 2015, pp. 53–64.
[18] G. Angiani, L. Ferrari, T. Fontanini, P. Fornacciari, E. Iotti, F. Magliani, and S. Manicardi, “A comparison between preprocessing techniques for sentiment analysis in twitter.” in KDWeb, 2016.
[19] C. C. Aggarwal and C. Zhai, Mining text data. Springer Science & Business Media, 2012.
[20] A. Athar, “Sentiment analysis of citations using sentence structure-based features,” in Proceedings of the ACL 2011 student session. Association for Computational Linguistics, 2011, pp. 81–87.
[21] P. Piccinini, A. Prati, and R. Cucchiara, “Real-time object detection and localization with sift-based clustering,” Image and Vision Computing, vol. 30, no. 8, pp. 573–587, 2012.
[22] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, “Rgb-d mapping: Using depth cameras for dense 3d modeling of indoor environments,” in Experimental robotics. Springer, 2014, pp. 477–491.
[23] F. Bergenti, A. Poggi, and M. Tomaiuolo, “An actor based software framework for scalable applications,” Lecture Notes in Computer Science (LNCS), vol. 8729, pp. 26–35, 2015, proc. 7th International Conference on Internet and Distributed Computing Systems (IDCS 2014); Calabria; Italy; 2014-09-22/24 [MT].
[24] P. Fornacciari, M. Mordonini, A. Poggi, L. Sani, and M. Tomaiuolo, “A holistic system for troll detection on twitter,” Computers in Human Behavior, vol. 89, pp. 258–268, 2018.
[25] C. Inc. (2018) Clarifai.com web site. [Online]. Available: https://clarifai.com/