=Paper= {{Paper |id=Vol-1696/invited3 |storemode=property |title=None |pdfUrl=https://ceur-ws.org/Vol-1696/invited3.pdf |volume=Vol-1696 }} ==None== https://ceur-ws.org/Vol-1696/invited3.pdf
         Exploiting Social Media to Address Fundamental Human Rights Issues

         Massimo Poesio♠, Ayman Alhelbawy♣,♠, Chris Fox♠, Udo Kruschwitz♠
                   ♣ Minority Rights Group, London, UK
                   ♠ University of Essex, Colchester, UK
                                                                 Abstract
This invited talk gave an overview of our work on extracting meaningful knowledge from social media feeds to help address human rights issues. It highlighted the potential that the rise of ‘big data’ offers in this respect and looked at both sides of the coin: how big data can help human rights work, but also the dangers that can arise from the ability to analyse massive amounts of data very quickly. The primary focus of our work is on applying natural language processing methods to turn large-scale unstructured and partially structured data streams into actionable knowledge.

Keywords: Human Rights, Social Media, NLP, Arabic NLP


                    1.    Overview
Vast amounts of social media data are being generated every second. This represents a paradigm shift in publishing from largely carefully edited data to user-generated content which, as a result, has rapidly changed the way people exchange and consume information as well as how they communicate. Managing such data streams comes with many challenges, as has been discussed extensively in the research literature. Nevertheless, it also offers new opportunities. One such opportunity is the potential to more easily detect and document human rights violations. In fact, these developments have already resulted in changes to how human rights organisations work. The ‘investigator on the ground’ will not be completely replaced but there are many new modes of identifying evidence of human rights violations. Social media such as Facebook, YouTube and Twitter are ideal platforms to push content to the world. Obviously, there is a big challenge in validating any such postings.
Progress in natural language processing (NLP) means that off-the-shelf tools can now be used to quickly assemble a processing pipeline that takes social media data and turns it into structured knowledge. We are primarily interested in this type of processing pipeline but that needs to be seen as part of a bigger picture. Two research projects we are involved in illustrate the point.

                    2.    Human Rights, Big Data and Technology
The Human Rights, Big Data and Technology (HRBDT) project¹ is an interdisciplinary research project funded by ESRC² based mainly at the Human Rights Centre³ of the University of Essex with partners that include the World Health Organisation⁴, the Harvard FXB Center for Health and Human Rights⁵ and the Geneva Academy for International Humanitarian Law and Human Rights⁶. A core activity of one of the four workstreams is to explore and apply the potential of natural language processing to the automated analysis of ‘big data’ and social media and develops new approaches to humanitarian and human rights work.

                    3.    Knowledge Transfer Partnership
The second part of our keynote talk focussed on a practical application of NLP techniques to support human rights work in a collaboration between the University of Essex and Minority Rights Group International (MRG)⁷. This project is funded by InnovateUK⁸ through a Knowledge Transfer Partnership (KTP) project. The aim of this project is to provide support to civilian-led reporting of human rights violations, in the context of MRG’s involvement in the Ceasefire Centre, and in particular in the Ceasefire Iraq project⁹. This project complements the objectives of the more general HRBDT project, exploring the contributions of big data, and in particular social media, to the identification of human rights violations.
Specific objectives of the collaboration with MRG are, first of all, to develop a portal that will make it possible to collate reports of human rights violations sent by civilians using a variety of formats, from SMS to emails to social media. The portal¹⁰, currently undergoing beta-testing and soon to go live, will allow personnel at MRG and associated organizations to view and analyse reports of human rights violations sent by civilians.
Second, the project aims to develop tools to filter and analyse this type of information. The analysis techniques developed so far, and at the moment tested with tweets, include methods for detecting reports of human rights violations using machine learning-based text categorization to classify text (e.g., tweets) according to a classification scheme which, in our case, includes categories such as human rights violation reports (for tweets such as “The army of Assad in Damascus committed a terrible massacre claiming the lives of dozens of children in their school”), reporting of general violence (as in “Four people injured as a result of a brawl in Darb Alarbaeen”), or reporting of an accident (e.g., “At least 24 dead in the sinking of boat for illegal immigrants off the coast of Istanbul”). As part of the project, a dataset of over 15,000 Arabic tweets was collected and annotated according to these categories, and used to train a classifier to recognize such categories in text (Alhelbawy et al., 2016). The objective is to apply classifiers of this type to filter the data collected through the portal and/or to gather additional evidence not directly sent to the portal.

¹ http://www.hrbdt.ac.uk
² http://www.esrc.ac.uk
³ http://www.essex.ac.uk/hrc/
⁴ http://www.who.int/
⁵ https://fxb.harvard.edu
⁶ http://www.geneva-academy.ch
⁷ http://minorityrights.org
⁸ https://www.gov.uk/government/organisations/innovate-uk
⁹ http://minorityrights.org/what-we-do/ceasefire-project/
¹⁰ http://iraq.ceasefire.org/
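
The text categorization and filtering step described in Section 3 can be illustrated with a small sketch. It is not the classifier used in the project: the talk does not state which learning algorithm was applied, so a multinomial Naive Bayes model with add-one smoothing is used here purely as a stand-in. The training set consists only of the three English example tweets quoted in the text, not the annotated Arabic corpus, and the names `TRAIN`, `NaiveBayes` and `filter_reports` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Assumption: three toy training examples, taken from the tweets quoted in
# the text. The real system was trained on >15,000 annotated Arabic tweets.
TRAIN = [
    ("violation", "The army of Assad in Damascus committed a terrible massacre "
                  "claiming the lives of dozens of children in their school"),
    ("violence", "Four people injured as a result of a brawl in Darb Alarbaeen"),
    ("accident", "At least 24 dead in the sinking of boat for illegal "
                 "immigrants off the coast of Istanbul"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, examples):
        for label, text in examples:
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        # Tokens never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            score = math.log(self.doc_counts[label] / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def filter_reports(clf, texts, keep=("violation",)):
    """Keep only the texts whose predicted category is in `keep`."""
    return [t for t in texts if clf.predict(t) in keep]


if __name__ == "__main__":
    clf = NaiveBayes().fit(TRAIN)
    incoming = [
        "Dozens of children killed in a massacre at a school",
        "Two people injured in a street brawl",
        "Boat sinking off the coast leaves many dead",
    ]
    for text in incoming:
        print(clf.predict(text), "<-", text)
    print(filter_reports(clf, incoming))
```

With so little training data the model only reflects word overlap with the three examples. A realistic setup would need a large annotated corpus, language-specific preprocessing for Arabic, and evaluation on held-out data.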

                    4.    Conclusions
The emergence of ‘big data’ in the form of social media is affecting all parts of life. This development brings many new challenges but also opportunities, such as the application of natural language processing techniques to detect and document human rights violations. NLP tools have matured to the point where they are easy to apply, scalable and robust. This stream of work offers the additional benefit that it applies state-of-the-art technology to practical applications that will have a measurable impact on the quality of life of many people.

                  Acknowledgements
The Human Rights, Big Data and Technology project is funded by Economic and Social Research Council grant ES/M010236/1. We also acknowledge support from InnovateUK through a Knowledge Transfer Partnership (KTP) project between MRG and the University of Essex, partnership number 9488.

                     5.   References
Alhelbawy, A., Poesio, M., and Kruschwitz, U. (2016). Towards a corpus of violence acts in Arabic social media. In Proceedings of LREC.