<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Do Not Feel The Trolls</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Erik Cambria</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Praphul Chandra</string-name>
          <email>praphul.chandra@hp.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Avinash Sharma</string-name>
          <email>sharma@hp.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amir Hussain</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HP Labs India</institution>
          ,
          <addr-line>Bangalore</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Stirling</institution>
          ,
          <addr-line>Stirling</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The passage from a read-only to a read-write Web gave people the possibility to freely interact, share and collaborate through social networks, online communities, blogs, wikis and other online collaborative media. The democracy of the Web is what made it so popular in the past decades, but such a high degree of freedom of expression also gave birth to negative side effects: the so-called `dark side' of the Web. An example of this is trolling, i.e. the exploitation of the anonymity of the Web to post inflammatory and outrageous messages directed to one specific person or community to provoke them into a desired emotional response. Online community masters usually warn users against trolls with messages such as DNFTT (Do Not Feed The Trolls), but so far this has not been enough to stop trolls trolling. The aim of this work is to use Sentic Computing, a new paradigm for the affective analysis of natural language text, to detect trolls and hence prevent web users from being emotionally hurt by malicious posts.</p>
      </abstract>
      <kwd-group>
        <kwd>Sentic Computing</kwd>
        <kwd>AI</kwd>
        <kwd>Semantic Web</kwd>
        <kwd>NLP</kwd>
        <kwd>Opinion Mining and Sentiment Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In Internet slang, a troll is someone who posts inflammatory, extraneous, or
off-topic messages in an online community, such as an online discussion forum, chat
room, or blog, with the primary intent of provoking other users into a desired
emotional response or of otherwise disrupting normal on-topic discussion [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>The amount of social data on the Web is growing steadily, and online
social networking is becoming one of the most prevalent means of expression
worldwide. Websites like Twitter, YouTube and Blogger provide a channel
that links different parts of the world and different classes of global society.</p>
      <p>The flipside of the coin, on the other hand, is rather dark, fractious and
bizarre. The social web is inherently democratic and user anonymity comes for free in
this space. Be it in the real world or on the social web, the existence of a malicious
faction among inhabitants and users is inevitable.</p>
      <p>In the social web context, emotional attacks on a person or a group through
malicious and vulgar comments meant to provoke a response are referred to as
`trolling', and their author is called `a troll'. The term was first used in the early
1990s, and since then much concern has been raised about how to contain or curb trolls.</p>
      <p>
        This work proposes a technique based on Sentic Computing [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a novel
paradigm for the affective analysis of natural language text, to automatically
detect and check web trolls. We present results suggesting that the technique
detects troll posts effectively. To the best of our knowledge, this is the first
work of its kind.
      </p>
      <p>The structure of the paper is the following: Section 2 discusses the
phenomenon of Internet trolling, Section 3 presents the state of the art of malicious
post detection, Section 4 and Section 5 explain in detail the techniques used
within this work, Section 6 illustrates the overall process for filtering trolls,
Section 7 demonstrates the potential of this process through an evaluation study,
and Section 8 comprises concluding remarks and a description of future work.</p>
    </sec>
    <sec id="sec-2">
      <title>The Internet Trolling Phenomenon</title>
      <p>Trolling is a method of fishing where some baited fishing lines are drawn through
the water, usually from a slow-moving boat, with the purpose of hooking unwary
fish. An online troll does pretty much the same.</p>
      <p>The trend of trolling, where anonymous online users bombard victims with
offensive messages or abuse, appears to have spread considerably in recent years and is
alarming most of the biggest social networking sites since, in extreme cases, such
abuse has led some teenagers to commit suicide. These attacks usually address
not only individuals but also entire communities. For example, reports have
claimed that a growing number of Facebook tribute pages have been targeted,
including those in memory of the Cumbria shootings victims and soldiers who
died in Afghanistan.</p>
      <p>At present, users can do little other than manually delete abusive
messages. Current anti-trolling methods, in fact, mainly consist of identifying
additional accounts that use the same IP address and blocking fake accounts based
on name and anomalous site activity, e.g. users who send lots of messages to
non-friends or whose friend requests are rejected at a high rate.</p>
      <p>
        In July 2010 Facebook launched an application that gives users a direct link
to advice, help and the ability to report cyber problems to the Child Exploitation
and Online Protection Centre (CEOP) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Reporting trouble through a link or
a button, however, is too slow a process, since social networking websites usually
cannot react instantly to these alarms. A button, moreover, does not stop users
from being emotionally hurt by trolls, and it is more likely to be pushed by people
who do not actually need help than by, for instance, children who are being
sexually groomed and do not realize it.
      </p>
      <p>For these reasons, we need systems able to automatically analyze semantics
and sentics, i.e. the cognitive and affective information associated with natural
language, in order to filter out inopportune messages and, hence, stop users from
`feeling' the trolls.</p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        A prior analysis of the trustworthiness of statements published on the Web has
been presented by Rowe and Butters [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Their approach takes a contextual
trust value, determined for the person who asserted a statement, as the
trustworthiness of the statement itself. This study, however, does not focus on the
problem of trolling but rather on defining a contextual accountability for the
detection of web, email and opinion spam.
      </p>
      <p>
        Existing approaches in these fields, in particular, can be grouped into three
main categories: keyword spotting [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ][
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], in which text is classified according
to the presence of fairly unambiguous spam words, lexical affinity [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ][
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which
assigns arbitrary words a probabilistic affinity for spam content, and statistical
methods [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ][
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which consist of calculating the valence of keywords,
punctuation and word co-occurrence frequencies on the basis of a large training corpus.
      </p>
      <p>The problem with these approaches is that they mainly rely on parts of text
in which web, email and opinion spam is explicitly expressed through spam links,
commercial terms or abusive words. More generally, however, spam manifests itself
implicitly through context- and domain-dependent concepts, which makes
keyword-based approaches extremely ineffective.</p>
      <p>To overcome this problem we need to use natural language processing (NLP)
techniques that rely on semantics rather than syntax. Within this work, in
particular, we exploit two Sentic Computing tools to extract semantics and
sentics from web posts and then process the results in order to detect and
filter trolls.</p>
    </sec>
    <sec id="sec-4">
      <title>Sentic Computing</title>
      <p>Sentic Computing is a new opinion mining and sentiment analysis paradigm
which exploits AI and Semantic Web techniques to better recognize, interpret
and process opinions and sentiments in natural language text.</p>
      <p>The term Sentic Computing derives from the Latin `sentire' (the root of words
such as sentiment and sensation) and `sense' (intended as common sense) and
concerns a kind of computing that relates to, arises from and influences opinions
and sentiments in natural language text.</p>
      <p>
        In Sentic Computing the analysis of text is not based on statistical learning
models but rather on common sense reasoning tools [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and domain-specific
ontologies [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Unlike statistical classification, which generally requires
large inputs and thus cannot appraise texts with satisfactory granularity, Sentic
Computing enables the analysis of documents not only at the page or
paragraph level but also at the sentence level.
      </p>
      <p>Within this work, in particular, we exploit the combination of two Sentic
Computing tools for the extraction of semantics and sentics from web posts, i.e.
a multi-dimensional vector space of common sense and affective knowledge
(Section 4.1) coupled with a novel emotion categorization model born from the idea
that our mind consists of four independent emotional spheres, whose different
levels of activation make up the total emotional state of the mind (Section 4.2).</p>
      <sec id="sec-4-1">
        <title>AffectiveSpace</title>
        <p>
          AffectiveSpace [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] is a language visualization system which transforms natural
language from a linguistic form into a multi-dimensional space. AffectiveSpace is
built by blending ConceptNet [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], a semantic network of common sense
knowledge, and WordNet-Affect [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], a linguistic resource for the lexical representation
of emotions. This alignment operation yields AffectNet: a new dataset in which
common sense and affective knowledge coexist, i.e. a 14,301 × 117,365 matrix
whose rows are concepts (e.g. `dog' or `bake cake'), whose columns are either
common sense or affective features (e.g. `isA-pet' or `hasEmotion-joy'), and
whose values indicate truth values of assertions.
        </p>
        <p>Therefore, in AffectNet, each concept is represented by a vector in the space of
possible features whose values are positive for features that produce an assertion
of positive valence (e.g. `a penguin is a bird'), negative for features that produce
an assertion of negative valence (e.g. `a penguin cannot fly') and zero when
nothing is known about the assertion. The degree of similarity between two
concepts, then, is the dot product between their rows in AffectNet. The value of
such a dot product increases whenever two concepts are described with the same
feature and decreases when they are described by features that are negations of
each other.</p>
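        <p>As a minimal sketch of this similarity measure, the following Python fragment computes dot products between rows of a toy matrix. The concepts, features and values are made up for illustration and are not taken from AffectNet; the function name <monospace>similarity</monospace> is likewise hypothetical.</p>
        <preformat><![CDATA[
```python
# Toy rows of an AffectNet-like matrix (illustrative values only):
# +1 = the assertion holds, -1 = the assertion is negated, 0 = nothing is known.
FEATURES = ["isA-bird", "capableOf-fly", "isA-pet", "hasEmotion-joy"]

AFFECTNET_ROWS = {
    "penguin": [1, -1, 0, 0],
    "sparrow": [1, 1, 0, 0],
    "dog": [0, -1, 1, 1],
}


def similarity(concept_a, concept_b, rows=AFFECTNET_ROWS):
    """Dot product between the feature vectors of two concepts."""
    return sum(x * y for x, y in zip(rows[concept_a], rows[concept_b]))


# A shared feature adds to the score, a negated one subtracts from it.
print(similarity("penguin", "sparrow"))  # 0: same 'isA-bird', opposite 'capableOf-fly'
print(similarity("penguin", "dog"))      # 1: both negate 'capableOf-fly'
```
]]></preformat>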
        <p>
          When performed on AffectNet, however, these dot products have very high
dimensionality (as many dimensions as there are features) and are difficult to
work with. In order to approximate these dot products in a useful way, we project
all of the concepts from the space of features into a space with many fewer
dimensions, i.e. we reduce the dimensionality of AffectNet by means of principal
component analysis (PCA). In particular, we perform truncated singular value
decomposition (TSVD) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] on AffectNet and obtain a new matrix, AffectNet*,
which forms a low-rank approximation of the original data. This estimation
is based on minimizing the Frobenius norm of the difference between AffectNet
and AffectNet* under the constraint rank(AffectNet*) = k, and it represents the
best approximation of AffectNet in the least-squares sense (by the Eckart-Young
theorem [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]).
        </p>
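        <p>To illustrate the low-rank idea on a small scale, the sketch below computes the best rank-1 approximation (k = 1) of a tiny matrix and the Frobenius norm of the residual. It uses plain power iteration rather than the method used for AffectNet, it assumes a non-zero matrix whose largest singular value is unique, and all function names are hypothetical.</p>
        <preformat><![CDATA[
```python
import math


def matvec(a, x):
    """Multiply matrix a (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def rank1_approximation(a, iterations=200):
    """Return (sigma, approx): the top singular value of a and the
    rank-1 matrix sigma * u * v^T, found by power iteration on A^T A."""
    a_t = [list(col) for col in zip(*a)]
    v = [1.0] * len(a[0])
    for _ in range(iterations):
        w = matvec(a_t, matvec(a, v))            # (A^T A) v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = matvec(a, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av]
    approx = [[sigma * u_i * v_j for v_j in v] for u_i in u]
    return sigma, approx


def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))
```
]]></preformat>
        <p>For the diagonal matrix with entries 3 and 1, the residual norm equals the discarded singular value, 1, which is what the Eckart-Young theorem predicts for k = 1.</p>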
        <p>In particular, we choose to discard all but the first 100 principal
components and hence obtain AffectiveSpace (Fig. 1), a 100-dimensional space in which
different vectors represent different ways of making binary distinctions among
concepts and emotions. In AffectiveSpace common sense and affective
knowledge are in fact combined, not just concomitant, i.e. everyday life concepts like
`have breakfast', `meet people' or `watch tv' are linked to a hierarchy of affective
domain labels.</p>
        <p>Thanks to the information sharing property of TSVD, concepts with the
same affective valence are likely to have similar features, i.e. concepts
conveying the same opinion tend to fall near each other in the vector space. Concepts
and emotions are represented by vectors of 100 coordinates: these coordinates
can be seen as describing concepts in terms of `eigenmoods' that form the axes
of AffectiveSpace, i.e. the basis e<sub>0</sub>,...,e<sub>99</sub> of the vector space. For example, the
most significant eigenmood, e<sub>0</sub>, represents concepts with positive affective
valence. That is, the larger a concept's component in the e<sub>0</sub> direction is, the more
affectively positive it is likely to be. Consequently, concepts with negative e<sub>0</sub>
components have negative affective valence.</p>
      </sec>
      <sec id="sec-4-2">
        <title>The Hourglass of Emotions</title>
        <p>
          This model is a variant of Plutchik's emotion categorization [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] and constitutes
an attempt to emulate Marvin Minsky's conception of emotions. Minsky sees the
mind as made of thousands of different resources and believes that our emotional
states result from turning some set of these resources on and turning another
set of them off [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Each such selection changes how we think by changing
our brain's activities: the state of anger, for example, appears to select a set of
resources that help us react with more speed and strength while also suppressing
some other resources that usually make us act prudently.
        </p>
        <p>The Hourglass of Emotions (Fig. 2) is specifically designed to recognize,
understand and express emotions in the context of human-computer interaction
(HCI). In the model, in fact, affective states are not classified, as often happens
in the field of emotion analysis, into basic emotional categories, but rather into
four concomitant but independent dimensions in order to understand how much,
respectively:</p>
        <list list-type="order">
          <list-item><p>the user is happy with the service provided (Pleasantness)</p></list-item>
          <list-item><p>the user is interested in the information supplied (Attention)</p></list-item>
          <list-item><p>the user is comfortable with the interface (Sensitivity)</p></list-item>
          <list-item><p>the user is disposed to use the application (Aptitude)</p></list-item>
        </list>
        <p>
          Each affective dimension is characterized by six levels of activation, called
`sentic levels', which determine the intensity of the expressed/perceived emotion
as a float ∈ [-3, 3]. These levels are also labelled as a set of 24 basic emotions
(six for each of the affective dimensions), so that the model can specify the
affective information associated with text both in a dimensional and in a discrete
form. The dimensional form, in particular, is called `sentic vector' and it is a
four-dimensional vector that can potentially express any human emotion in terms of
Pleasantness, Attention, Sensitivity and Aptitude. Some particular sets of sentic
vectors have special names as they specify well-known compound emotions. For
example, the set of sentic vectors with a level of Pleasantness ∈ (1,2] (`joy'), a
null Attention, a null Sensitivity and a level of Aptitude ∈ (1,2] (`trust') are
called `love sentic vectors' since they specify the compound emotion of `love'.
        </p>
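        <p>The `love sentic vector' condition can be written as a simple membership test. The sketch below assumes a sentic vector is given as four floats in the order Pleasantness, Attention, Sensitivity, Aptitude and that `null' means exactly zero; the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
def in_half_open(value, low, high):
    """True when value lies in the half-open interval (low, high]."""
    return low < value <= high


def is_love_sentic_vector(pleasantness, attention, sensitivity, aptitude):
    """Pleasantness and Aptitude in (1, 2], null Attention and Sensitivity."""
    return (in_half_open(pleasantness, 1, 2)
            and attention == 0
            and sensitivity == 0
            and in_half_open(aptitude, 1, 2))
```
]]></preformat>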
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Troll Detector</title>
      <p>The main aim of the Troll Detector is to identify malicious content in natural
language text with a certain confidence level. To train the detector, we first
identify the concepts most commonly used by trolls (Section 5.1) and then expand
the resulting knowledge base with semantically related concepts (Section 5.2).
We finally define a method to calculate trollness, i.e. the probability that a post
was written by a troll (Section 5.3).</p>
      <sec id="sec-5-1">
        <title>CF-IOF Weighting</title>
        <p>
          The technique we use to identify the concepts commonly used by trolls is called
CF-IOF [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] (concept frequency - inverse opinion frequency) and it is an
approach similar to TF-IDF weighting which evaluates how important a concept
is to a set of opinions concerning the same topic.
        </p>
        <p>We first calculate the frequency of a concept c<sub>i</sub> for a given topic j by counting
the occurrences of the concept c<sub>i</sub> in the set of available j-tagged opinions and
dividing the result by the total number of concept occurrences in the whole set
of opinions concerning j. We then multiply this frequency by the logarithm of
the total number of opinions divided by the number of opinions containing the
concept c<sub>i</sub>, that is:
          <disp-formula><tex-math>(\mathit{CF\text{-}IOF})_i = \sum_j \frac{n_{i,j}}{\sum_k n_{k,j}} \log \frac{|O|}{|\{o : c_i \in o\}|}</tex-math></disp-formula>
where n<sub>i,j</sub> is the number of occurrences of the considered concept c<sub>i</sub> in the
opinions tagged with the topic j, |{o : c<sub>i</sub> ∈ o}| is the number of opinions in which c<sub>i</sub>
appears and |O| is the total number of opinions.</p>
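        <p>A minimal Python sketch of this weighting follows. It assumes each opinion is available as a pair of a topic tag and a list of extracted concepts; that data layout, the function names and the toy opinions in the usage note are assumptions made for illustration, not part of the original system.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter


def cf_iof_term(concept, topic, opinions):
    """One summand of CF-IOF: concept frequency within `topic` times the
    log of the inverse opinion frequency over the whole collection."""
    counts = Counter()
    for tag, concepts in opinions:
        if tag == topic:
            counts.update(concepts)
    total = sum(counts.values())                    # sum over k of n_{k,j}
    containing = sum(1 for _, concepts in opinions if concept in concepts)
    if total == 0 or containing == 0:
        return 0.0
    return (counts[concept] / total) * math.log(len(opinions) / containing)


def cf_iof(concept, opinions):
    """Sum of the per-topic terms over all topics found in the collection."""
    topics = {tag for tag, _ in opinions}
    return sum(cf_iof_term(concept, topic, opinions) for topic in topics)
```
]]></preformat>
        <p>With four toy opinions, two tagged `troll' containing (`idiot', `quit', `write') and (`idiot', `loser') and two tagged `other' containing (`write', `book') and (`read', `book'), the concept `idiot' accounts for 2 of the 5 concept occurrences in the `troll' opinions and appears in 2 of the 4 opinions overall, so its weight is 0.4 · log 2.</p>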
        <p>A high weight in CF-IOF is reached by a high concept frequency (in the
given opinions) and a low opinion frequency of the concept in the whole
collection of opinions. Therefore, thanks to CF-IOF weights, we manage to filter out
common concepts and detect relevant concepts that are usually used by trolls to
emotionally attack unaware users.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Spectral Association</title>
        <p>
          In order to expand the set of concepts previously obtained by applying CF-IOF,
we use a technique called spectral association [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] that involves assigning values,
or activations, to `seed concepts' and applying an operation that spreads their
values across the ConceptNet graph.
        </p>
        <p>This operation, an approximation of many steps of spreading activation,
transfers the most activation to concepts that are connected to the key
concepts by short paths or many different paths in common sense knowledge. In
particular, we build a matrix C that relates concepts to other concepts, instead
of their features, and add up the scores over all relations that relate one concept
to another, disregarding direction.</p>
        <p>Applying C to a vector containing a single concept spreads that concept's
value to its connected concepts. Applying C<sup>2</sup> spreads that value to concepts
connected by two links (including back to the concept itself). But what we would really
like is to spread the activation through any number of links, with diminishing
returns, so perhaps the operator we want is:
          <disp-formula><tex-math>1 + C + \frac{C^2}{2!} + \frac{C^3}{3!} + \dots = e^C</tex-math></disp-formula>
        </p>
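        <p>The series can be evaluated directly on a small matrix. The sketch below sums its first few terms in plain Python; it is only meant to show what the operator does to a vector of seed activations and is not the spectral-decomposition method, which is what makes the computation practical on a large graph. The function names and the three-concept chain in the usage note are assumptions.</p>
        <preformat><![CDATA[
```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def spread_operator(c, terms=20):
    """Approximate e^C = I + C + C^2/2! + ... using `terms` powers of C."""
    n = len(c)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]               # holds C^m / m!
    for m in range(1, terms + 1):
        power = [[x / m for x in row] for row in matmul(power, c)]
        result = [[r + p for r, p in zip(r_row, p_row)]
                  for r_row, p_row in zip(result, power)]
    return result


def spread(c, activations, terms=20):
    """Apply the approximate e^C to a vector of seed activations."""
    operator = spread_operator(c, terms)
    return [sum(w * a for w, a in zip(row, activations)) for row in operator]
```
]]></preformat>
        <p>For a chain of three concepts in which the first is linked to the second and the second to the third, seeding only the first concept leaves the most activation on the seed itself, less on its direct neighbour and least on the concept two links away, which is the `diminishing returns' behaviour described above.</p>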
        <p>We can calculate this odd operator, e<sup>C</sup>, because we can factor C. C is already
symmetric, so instead of applying Lanczos' method to CC<sup>T</sup> and getting the SVD,
we can apply it directly to C and get the spectral decomposition C = VΛV<sup>T</sup>.
As before, we can raise this expression to any power and cancel everything but
the power of Λ. Therefore, e<sup>C</sup> = Ve<sup>Λ</sup>V<sup>T</sup>. This simple twist on the SVD lets us
calculate spreading activation over the whole matrix instantly.</p>
        <p>As with the SVD, we can truncate these matrices to k axes and therefore save
space while generalizing from similar concepts. We can also rescale the matrix
so that activation values have a maximum of 1 and do not tend to collect in
highly-connected concepts such as `person', by normalizing the truncated rows
of Ve<sup>Λ/2</sup> to unit vectors, and multiplying that matrix by its transpose to get a
rescaled version of Ve<sup>Λ</sup>V<sup>T</sup>.</p>
      </sec>
      <sec id="sec-5-3">
        <title>Calculating Trollness</title>
        <p>In order to calculate the probability that a post was written by a troll, we exploit
both the semantics and the sentics associated with it.</p>
        <p>
          For each concept contained in the post, the Troll Detector checks whether it
belongs to the set of `troll concepts' calculated through spectral association and
exploits its sentic vector to check whether it carries a malicious affective charge.
By analyzing a set of 1000 offensive phrases extracted from Wordnik [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], in
fact, we found that, statistically, a post is likely to have been written by a troll when
its average sentic vector has a high absolute value of Sensitivity and a very low
polarity. Hence we defined the trollness t<sub>i</sub> associated with a concept c<sub>i</sub> as a float
∈ [0, 1] such that:
          <disp-formula><tex-math>t_i(c_i) = \frac{s_i(c_i) + |Snsit(c_i)| - p_i(c_i)}{5}</tex-math></disp-formula>
where s<sub>i</sub> (float ∈ [0, 1]) is the semantic similarity of c<sub>i</sub> with respect to any of the CF-IOF
seed concepts, p<sub>i</sub> (float ∈ [-1, 1]) is the polarity associated with the concept c<sub>i</sub> and
5 is the normalization factor (the maximum value of the numerator is in fact
given by a similarity of 1, a Sensitivity of 3 or -3 and a polarity equal to -1). In
particular, p<sub>i</sub> is defined [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] as:
          <disp-formula><tex-math>p_i(c_i) = \frac{Plsnt(c_i) + |Attnt(c_i)| - |Snsit(c_i)| + Aptit(c_i)}{9}</tex-math></disp-formula>
where 9 is the normalization factor (since the numerator's maximum value is
given by the sentic vector [3, 3, 0, 3] and the minimum by [-3, 0, 3, -3]).
        </p>
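        <p>The two formulas translate directly into code. The sketch below assumes a sentic vector is passed as four floats in the order Pleasantness, Attention, Sensitivity, Aptitude; the function names are hypothetical. The usage lines reuse the figures of the example post discussed in the Troll Filtering Process section.</p>
        <preformat><![CDATA[
```python
def polarity(plsnt, attnt, snsit, aptit):
    """Polarity in [-1, 1] of a sentic vector with components in [-3, 3]."""
    return (plsnt + abs(attnt) - abs(snsit) + aptit) / 9


def trollness(similarity, sentic_vector):
    """Per-concept trollness from semantic similarity and a sentic vector."""
    _, _, snsit, _ = sentic_vector
    return (similarity + abs(snsit) - polarity(*sentic_vector)) / 5


sentics = (0.0, 0.48, 2.7, -1.22)
print(round(polarity(*sentics), 2))         # -0.38
print(round(trollness(0.69, sentics), 2))   # 0.75
```
]]></preformat>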
        <p>In the formula, Attention and Sensitivity are taken in absolute value since,
from the point of view of polarity rather than affection, all of their sentic values
represent positive and negative values respectively (e.g. `anger' is positive in the
sense of level of activation of Sensitivity but negative in terms of polarity and
`surprise' is negative in the sense of lack of Attention but positive from a polarity
point of view).</p>
        <p>Hence, the total trollness of a post containing N concepts is defined as:
          <disp-formula><tex-math>t = \sum_{i=1}^{N} \frac{9\,s_i(c_i) + 10\,|Snsit(c_i)| - Plsnt(c_i) - |Attnt(c_i)| - Aptit(c_i)}{5 \cdot 9 \cdot N}</tex-math></disp-formula>
        </p>
        <p>This information is stored, together with post type and content plus sender
and receiver ID, in an interaction database that keeps track of all the messages
and comments exchanged between users within the same social network.</p>
        <p>Posts with a high level of trollness (the current threshold has been set, using a
trial and error approach, to 60%) are labelled as troll posts and, whenever a
specific user addresses more than two troll posts to the same person or community,
his/her sender ID is labelled as troll for that particular receiver ID.</p>
        <p>All the past troll posts sent to that particular receiver ID by that specific
sender ID are then automatically deleted from the website (but kept in the
database, so that the receiver can view them in a
dedicated troll folder and, if desired, restore them). Moreover, any new post with a
high level of trollness written by a user labelled as troll for that specific receiver is
automatically blocked, i.e. saved in the interaction database but never displayed
on the social networking website.</p>
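        <p>The labelling and blocking rules can be sketched as a small in-memory class. This is a hypothetical stand-in for the interaction database, not the actual implementation: the class, its fields and its method names are assumptions, and only the 60% threshold and the `more than two troll posts' rule come from the text.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict

TROLLNESS_THRESHOLD = 0.6   # threshold reported in the text
MAX_TROLL_POSTS = 2         # more than two troll posts triggers the troll label


class InteractionDatabase:
    """Hypothetical in-memory stand-in for the interaction database."""

    def __init__(self):
        # (sender, receiver) -> troll posts, kept even when hidden from the site
        self.troll_posts = defaultdict(list)
        self.visible = []   # posts currently displayed on the website

    def is_troll(self, sender, receiver):
        """True once the sender has sent more than two troll posts to the receiver."""
        return len(self.troll_posts[(sender, receiver)]) > MAX_TROLL_POSTS

    def submit(self, sender, receiver, content, trollness):
        """Store the post and return True if it is displayed on the website."""
        post = (sender, receiver, content)
        if trollness >= TROLLNESS_THRESHOLD:
            self.troll_posts[(sender, receiver)].append(post)
            if self.is_troll(sender, receiver):
                # Hide every troll post from this sender to this receiver.
                hidden = self.troll_posts[(sender, receiver)]
                self.visible = [p for p in self.visible if p not in hidden]
                return False
        self.visible.append(post)
        return True
```
]]></preformat>
        <p>In this sketch the third troll post from the same sender to the same receiver is never shown and the two earlier ones are hidden, while ordinary posts from that sender, and troll posts addressed to other receivers, are still displayed.</p>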
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Troll Filtering Process</title>
      <p>The process for filtering trolls (illustrated in Fig. 3) comprises four main
components: an NLP module, which performs a first skim of the document, a Semantic
Parser, whose aim is to extract concepts from the lemmatized text,
AffectiveSpace, for the extraction of sentics from the given concepts, and the Troll Detector,
whose aim is to detect and eventually block the troll.</p>
      <p>The NLP module interprets all the affective valence indicators usually
contained in text, such as special punctuation, complete upper-case words,
onomatopoeic repetitions, exclamation words, negations, degree adverbs and
emoticons, and finally lemmatizes the text.</p>
      <p>The Semantic Parser then deconstructs text into concepts and provides, for
each of them, the relative frequency, valence and status i.e. the concept's
occurrence in the text, its positive or negative connotation, and the degree of intensity
with which the concept is expressed.</p>
      <p>The AffectiveSpace module projects the retrieved concepts into the vector
space, clustered with respect to the Hourglass model, and infers their affective valence,
in terms of Pleasantness, Attention, Sensitivity and Aptitude, according
to the positions they occupy in the space.</p>
      <p>This information, encoded as a sentic vector, is given as input to the Troll
Detector which exploits it, together with the semantic information coming directly
from the Semantic Parser, to calculate the post's trollness and, eventually, to
detect and block the troll (according to the information stored in the interaction
database). As an example of Troll Filtering Process output, we can consider a
troll post recently addressed to the Indian author Chetan Bhagat: "You can't
write, you illiterate douchebag, so quit trying, I say!!!". In this case we have a
very high level of Sensitivity (corresponding sentic level `rage') and a negative
polarity, which give a high percentage of trollness, as shown below:</p>
      <preformat>&lt;Concept: !`write'&gt;
&lt;Concept: `illiterate'&gt;
&lt;Concept: `douchebag'&gt;
&lt;Concept: `quit try'&gt;
&lt;Concept: `say'&gt;
Semantics: 0.69
Sentics: [0.0, 0.48, 2.7, -1.22]
Polarity: -0.38
Trollness: 0.75</preformat>
    </sec>
    <sec id="sec-7">
      <title>Evaluation</title>
      <p>In order to perform a first evaluation of our system, we considered a set of 500
tweets (most of which fetched from Wordnik) manually annotated as troll and
non-troll posts. We considered as correctly classified those posts with both a positive
troll-flag and a trollness ∈ [0.6, 1] (true positives) and those with both a negative troll-flag and
a trollness ∈ [0, 0.6) (true negatives). The threshold has been set to 60% based on trial and error
over a separate dataset of 50 tweets.</p>
      <p>Results show that, by using the Troll Filtering Process, inflammatory and
outrageous messages can be identified with good precision (82%) and a decent
recall rate (75%). In particular, the F-measure value (78%) is significantly higher
than the corresponding F-measure rates of the baseline methods (53%
for keyword spotting, 59% for lexical affinity, 66% for statistical methods).</p>
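      <p>As a quick consistency check on the reported figures, and assuming the F-measure here is the balanced F1 score (the harmonic mean of precision and recall), the stated precision and recall do give the stated value. The function name is hypothetical.</p>
      <preformat><![CDATA[
```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


print(round(f_measure(0.82, 0.75), 2))  # 0.78
```
]]></preformat>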
      <p>However, we expect to obtain much better results by evaluating the process
at interaction-level rather than just at post-level. In the near future, in fact,
we plan to evaluate the Troll Filtering Process by monitoring not just single
posts but also users' holistic behaviour within the same social network (i.e.
contents and recipients of their interaction) and submit further results elsewhere
for publication.</p>
    </sec>
    <sec id="sec-8">
      <title>Conclusion and Future Efforts</title>
      <p>As the Web plays a more and more significant role in people's social lives, it
contains more and more information concerning their opinions and feelings. After
the explosion of Web 2.0, a lot of users have been exploiting this trend, together
with the anonymity of the Web, to attack specific people or communities with
inflammatory and outrageous messages and, hence, provoke them into a desired
emotional response.</p>
      <p>For their fiendish nature, these users have been labelled as trolls. Online
community masters have desperately tried to warn users against these mischievous
people with messages such as DNFTT (Do Not Feed The Trolls), but so far this
has not been enough to stop trolls trolling.</p>
      <p>Within this work we exploited Sentic Computing, a new paradigm for the
affective analysis of natural language text, to design a process capable of extracting
semantics and sentics from web posts and inferring from these the truthfulness of
user interaction.</p>
      <p>The main aim of the Troll Filtering Process, in fact, is to exploit the cognitive
and affective information associated with natural language text to define a level
of trollness for each post and, according to this, classify users and prevent the
malicious ones from emotionally hurting other people or communities within the
same social network.</p>
      <p>In the near future, we plan to improve the process by using a much bigger
dataset for training the Troll Detector and also to perform an evaluation of
the system at interaction-level rather than just at post-level, in order to better
understand, and hence prevent, trolls' behaviour.</p>
      <p>Finally, we plan to enhance the system by making most of its
functionalities available as web services, so that the Troll Filtering Process can be
easily embedded in any social networking website and, hence, change the
meaning of the popular acronym often displayed on these websites, DNFTT, from a
shadowy and often ineffective suggestion to a reassuring and deterrent slogan:
Do Not Feel The Trolls.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. http://en.wikipedia.org/wiki/Troll_(Internet), Wikipedia</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eckl</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Sentic Computing: Exploitation of Common Sense for the Development of Emotion-Sensitive Systems</article-title>
          . LNCS, vol.
          <volume>5967</volume>
          , pp.
          <fpage>153</fpage>
          –
          <lpage>161</lpage>
          . Springer-Verlag, Berlin Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. http://telegraph.co.uk/technology/facebook/7939721/Facebook-vows-new-securitymeasures-to-combat-alarming-trolling-abuse-trend.html, The Telegraph (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Rowe</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Butters</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Assessing Trust: Contextual Accountability</article-title>
          . In: SPOT at ESWC, Heraklion (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Dave</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lawrence</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennock</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews</article-title>
          . In: WWW, Budapest (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chandrasekaran</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karayanan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Upadhyaya</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Towards Phishing E-Mail Detection Based on Their Structural Properties</article-title>
          . In: SCSS, New York (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Mining and Summarizing Customer Reviews</article-title>
          . In: KDD, Seattle (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Jindal</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Analyzing and Detecting Review Spam</article-title>
          . In: ICDM, Omaha (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhong</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Combining Multiple Email Filters Based on Multivariate Statistical Analysis</article-title>
          . In: ISMIS, Bari (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jindal</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Opinion Spam and Analysis</article-title>
          . In: WSDM, Palo Alto (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eckl</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Common Sense Computing: From the Society of Mind to Digital Intuition and Beyond</article-title>
          .
          <source>LNCS</source>
          , vol.
          <volume>5707</volume>
          , pp.
          <fpage>252</fpage>
          –
          <lpage>259</lpage>
          . Springer-Verlag, Berlin Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grassi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Sentic Computing for Social Media Marketing</article-title>
          . To appear in:
          <source>Multimedia Tools and Applications</source>
          . Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eckl</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>AffectiveSpace: Blending Common Sense and Affective Knowledge to Perform Emotive Reasoning</article-title>
          . In: WOMSA at CAEPIA, Seville (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Speer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alonso</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>ConceptNet 3: a Flexible, Multilingual Semantic Network for Common Sense Knowledge</article-title>
          . In: RANLP, Borovets (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Strapparava</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valitutti</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>WordNet-Affect: an Affective Extension of WordNet</article-title>
          . In: LREC, Lisbon (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Wall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rechtsteiner</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rocha</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Singular Value Decomposition and Principal Component Analysis</article-title>
          . In:
          <string-name>
            <surname>Berrar</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          et al. (eds.)
          <source>A Practical Approach to Microarray Data Analysis</source>
          . pp.
          <fpage>91</fpage>
          –
          <lpage>109</lpage>
          . Kluwer, Norwell (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Eckart</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Young</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The Approximation of One Matrix by Another of Lower Rank</article-title>
          .
          <source>Psychometrika</source>
          <volume>1</volume>
          (
          <issue>3</issue>
          ),
          <fpage>211</fpage>
          –
          <lpage>218</lpage>
          (
          <year>1936</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Plutchik</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>The Nature of Emotions</article-title>
          .
          <source>American Scientist</source>
          <volume>89</volume>
          (
          <issue>4</issue>
          ),
          <fpage>344</fpage>
          –
          <lpage>350</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Minsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <source>The Emotion Machine</source>
          . Simon and Schuster, New York (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Speer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SenticNet: a Publicly Available Semantic Resource for Opinion Mining</article-title>
          .
          In:
          <source>AAAI CSK10</source>
          , Arlington (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Speer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmgren</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Automated Color Selection Using Semantic Knowledge</article-title>
          . In:
          <source>AAAI CSK10</source>
          , Arlington (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>22. http://wordnik.com – Wordnik</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Cambria</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Havasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eckl</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Munro</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Towards Crowd Validation of the UK National Health Service</article-title>
          . In: WebSci10, Raleigh (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>