<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshop,
Glasgow, Scotland</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Sweet FA: Sentiment, Swearing and Soccer</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Corney</string-name>
          <email>d.p.a.corney@rgu.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computing &amp; Digital Media, Robert Gordon University</institution>
          ,
          <addr-line>Aberdeen</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <volume>0</volume>
      <fpage>1</fpage>
      <lpage>04</lpage>
      <abstract>
        <p>The sentiments expressed by football fans in the stories that they tell are often intensified by the use of swear words. Football provides a useful test bed for sentiment analysis due to the symmetric nature of events in matches: what is good for one team is bad for the other. We can relate social media messages to the narrative that fans of a given team might be expected to construct. We use these features of football-related tweets to investigate some common assumptions about swearing as a sentiment marker on social networks. The results demonstrate that swearing and other sentiment markers depend heavily on context, and that understanding this context is essential if sentiment is to be detected faithfully. We also show that swearing is not always indicative of negative sentiment.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Football fans construct a shared narrative about their
team's performance. Fanzines, terrace talk and pub
conversations have always formed part of fan
identity and communication. Online social networks have
added another venue in which those shared identities
can be constructed.</p>
      <p>Football stories shared by fans are emotional, and
narratives shared via Twitter are freighted with
sentiment. The language used by fans is often scattered
with obscenities which provide colour and also
intensify expressions of sentiment. In this paper, we
describe an initial study exploring the way that
sentiment is intensified by swearing in Twitter messages
from football fans during games. By limiting our
analysis to messages containing swear words, we focus on
the more intense expressions of sentiment.</p>
      <p>Football provides a particularly interesting test bed
for sentiment analysis due to the symmetric nature of
events within matches: what is good for one team is
(equally) bad for the other. For example, a goal has
two opposing "valences", which makes it possible to
search for and analyze approximately equal volumes
of positive and negative sentiment per event. For
simplicity, we restrict our analysis to the FA English Premier
League, which is the most-watched football league in
the world.</p>
    </sec>
    <sec id="sec-2">
      <title>Related work</title>
      <p>
        Swearing is a feature of social networks and has been
for a long time (by internet standards). In his 2008
paper [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], Thelwall studied the occurrence of swearing
in MySpace profiles. Swearing is not only widespread
(67% of UK MySpace profiles of 16-19 year olds
contained some form of swearing) but varied. Some of the
swearing was self-directed ("tehe i am sorry.. i m such
a sleep deprived twat alot of the time! lol") or clearly
affectionate ("Chris you're slacking again !!! Get the
fuck off myspace lol !! you good anyway?"). It was
frequently used in an approbatory fashion ("Thanks
for the party last night it was fucking good and you
are great hosts." "That 50's rock and roll weekender
was fucking mint!"). This paper demonstrates that,
for young people at least, swearing is part of their
performed identity online. Not only that, swearing is
multipurpose, used to demonstrate amusement, affection
and self-deprecation as well as negative sentiment.
      </p>
      <p>
        Swearing is known to be a response to (and a
mediator of) emotions [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], as are other forms of language
that are prima facie abusive or insulting [
        <xref ref-type="bibr" rid="ref1 ref3 ref7">1, 3, 7</xref>
        ]. In
workplaces in the UK and New Zealand, for example,
"jocular abuse" is part of team bonding. Swearing,
piss-taking and other forms of abuse are not only
tolerated, but form an essential part of the workplace
group dynamic.
      </p>
      <p>Even in a setting as replete with strong language
as football, there is a strong imperative to keep the
reality of the discourse of swearing off the airwaves.
Consider this tweet, for example:<sup>1</sup></p>
      <p>Ivanovic over kicked the ball and a Chelsea fan angrily
swears "fucking cunt" and the commentator emotionally
apologises Lmao #CFC #Setanta1.</p>
      <p>However, as we demonstrate here, Twitter as a
social network provides a much less filtered view of fans'
actual language. Swearing is used regularly to
intensify sentiment and is, as such, a relevant marker in
sentiment analysis.</p>
      <p>
        Sentiment analysis is widely used by brand
managers to monitor public perception of their products
and services, including Amazon reviews and eBay
feedback. However, much of this work assumes that
swearing typically represents negative sentiment. Hu
and Liu have created lists of positive and negative
words associated with opinion or sentiment [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In
the current version of their list<sup>2</sup>, words such as "shit",
"fuck" and "damn" are included in the negative set
and not in the positive set. While they argue that
having such an opinion lexicon is not sufficient for
sentiment analysis, it is nonetheless a useful tool [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        A more sophisticated analysis of detecting
sentiment from swear words has been presented by
Maynard et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. They also use a gazetteer of opinion
words, including swear words. They recognise that
swear words may be used to intensify the expression
of a sentiment, be it positive or negative. However, if a
sentence contains swear words and no words recognised
as implying a positive sentiment, then they assume the
sentence is negative. While this may often be an
effective approach, we would add that the wider context
can often be used to interpret the sentiment even of
isolated swear words as we discuss below (Section 4.8).
      </p>
      <p>
        Twitter has been used to help automatically detect
events during football matches [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. That system
collects tweets based on hashtags and then detects spikes
in the volume of tweets collected, which its authors associate
with major game events. For each spike, they analyze
the words of the tweets and classify the event using
machine learning. They consider a fixed range of events
(goals, own-goals, red cards, yellow cards and
substitutions) and compare these classifications to the official
match data to evaluate their system. They also
classify individual tweeters as fans of one team or another
by counting the number of mentions of each team over
several matches, similar to our approach (Section 3.2).
      </p>
      <p>1 This, and all other tweets quoted and analysed in the paper,
are available on request from the authors.</p>
      <p>2 http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon, dated 12/3/2011</p>
      <p>
        Similar methods have been used to classify events
during American football (NFL) matches [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Tweets
were collected based on team names and NFL
terminology. Events were also detected by finding spikes in
the volume of tweets and each event was assigned to
one of a fixed number of classes, in this case using
lexicographic analysis. Their system was very effective
at detecting the most significant scoring events such
as touchdowns, but was less effective at finding less
significant events like interceptions and field goals.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Methods</title>
      <sec id="sec-3-1">
        <title>Collecting Tweets</title>
        <p>On a typical Saturday afternoon, 5 or 6 English
Premier League matches kick off at 3pm, at the same time
as many matches from lower divisions. For three
consecutive weeks, we collected public tweets discussing
the matches. For each match-day, we crawled Twitter
using their standard Streaming API<sup>3</sup> and filtered using
a total of 28 hashtags. These are the standard
abbreviation hashtags of the 20 teams (e.g. #CFC, #MCFC)
along with certain widely-used hashtags that
indicate team support (e.g. #KTBFFH for Chelsea and
#YNWA for Liverpool, both based on popular
supporters' chants). We collected tweets starting 30
minutes before kick-off and continued until 30 minutes
after the final whistles. On average, we collected 125,070
tweets per match-day. Our analysis focuses on the period
from 3pm to 5pm each Saturday. This includes 90
minutes of football, the half-time break (c. 15
minutes) and a few minutes of post-match response.</p>
        <p>Throughout this analysis, we have made frequent
use of the mainstream media accounts of matches, for
example to verify when goals were scored, players were
sent off and other noteworthy events occurred. In
particular, we used the live-blogging "minute by minute"
commentaries provided by the BBC for the three
Saturdays in question, namely 7/12/2013<sup>4</sup>, 14/12/2013<sup>5</sup>
and 21/12/2013<sup>6</sup>. Knowing these match events and
the team supported by each fan (see below), we can
derive an "emotional ground truth" which we expect to
then be reflected in the language used in fans' tweets.</p>
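        <p>The "emotional ground truth" described above can be sketched as a simple lookup from a match event and a fan's team to an expected valence. This is a minimal illustration under the naive symmetric assumption only (good for one team, bad for the other); the case studies in Section 4 show where that assumption breaks down. The names <monospace>MatchEvent</monospace> and <monospace>expected_valence</monospace>, the event labels and the example values are hypothetical and are not taken from the study.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchEvent:
    """One event from a match commentary (all field names are hypothetical)."""
    minute: int
    kind: str             # e.g. "goal" or "red_card"; labels are illustrative
    benefiting_team: str  # team for which the event is good news
    harmed_team: str      # team for which the event is bad news


def expected_valence(event, fan_team):
    """Return +1, -1 or 0: the sentiment naively expected from a fan of fan_team."""
    if fan_team == event.benefiting_team:
        return 1
    if fan_team == event.harmed_team:
        return -1
    return 0  # neutral users and fans of teams not involved in the event


# Illustrative values only, not real match data.
goal = MatchEvent(minute=47, kind="goal", benefiting_team="LFC", harmed_team="WHUFC")
valences = {team: expected_valence(goal, team) for team in ("LFC", "WHUFC", "CFC")}
```
        </preformat>
        <p>Comparing such expected valences with manually coded tweet sentiment is one way to surface the counter-intuitive cases, such as fans celebrating a ban for their own captain.</p>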
      </sec>
      <sec id="sec-3-2">
        <title>Linking Tweets to Teams</title>
        <p>We identified the team that each Twitter user
supports (if any). Initial manual inspection of a number of
tweets suggests that fans tend to use their team's
standard abbreviation hashtag far more often than any
other team's, irrespective of sentiment. This identification
was then confirmed by inspection of the text of
a sample of the tweets. We therefore define a fan's
degree of support for one team as how many more times
that team's abbreviation is mentioned by the user
compared to their second-most mentioned team. For each
user, we aggregated all their tweets and counted the
total number of times they mention each team. Here,
we include as "fans" any user with a degree of two or
more and treat everyone else as neutral. Having
assigned fans to teams, we can then associate specific
tweets with specific teams even when no team is
mentioned, if other tweets from the same person make their
allegiance clear.</p>
        <p>3 https://dev.twitter.com/docs/api/1.1/post/statuses/filter</p>
        <p>4 http://www.bbc.co.uk/sport/0/football/25264555</p>
        <p>5 http://www.bbc.co.uk/sport/0/football/25365181</p>
        <p>6 http://www.bbc.co.uk/sport/0/football/25463334</p>
        <p>To evaluate this method, we randomly selected 100
tweeters that our algorithm had predicted to be fans
of various teams. We then examined the tweets in our
collection from each person to determine which team
they expressed support for, if any. We manually
labelled them as supporters of a specific team, neutral
or unclear (e.g. due to non-English tweets). Of the
100 people thus analyzed, 93% were correctly assigned
to teams by the algorithm; 7% appeared to be neutral
commentators who showed no clear preference for any
team and one was a spam account unrelated to
football (but using a team hashtag to attract clicks). Our
algorithm mis-assigned them to whichever specific team
they happened to mention most often. In no case was
a clear fan of one team assigned to any other, giving
us strong confidence in the rest of our analysis.</p>
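        <p>The degree-of-support rule defined in this section can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the hashtag-to-team mapping is a partial, illustrative subset, and the function names <monospace>count_team_mentions</monospace> and <monospace>assign_team</monospace> are hypothetical. The threshold of two follows the text.</p>
        <preformat>
```python
import re
from collections import Counter

# Partial, illustrative mapping; the study tracked all 20 Premier League teams.
TEAM_HASHTAGS = {"#cfc": "Chelsea", "#lfc": "Liverpool", "#whufc": "West Ham"}
HASHTAG_PATTERN = re.compile(r"#\w+")


def count_team_mentions(tweets):
    """Count how often each team's abbreviation hashtag appears in a user's tweets."""
    counts = Counter()
    for text in tweets:
        for tag in HASHTAG_PATTERN.findall(text.lower()):
            team = TEAM_HASHTAGS.get(tag)
            if team is not None:
                counts[team] += 1
    return counts


def assign_team(tweets, min_degree=2):
    """Return the supported team, or None if the user is treated as neutral.

    Degree of support = mentions of the most-mentioned team minus mentions
    of the second-most mentioned team.
    """
    ranked = count_team_mentions(tweets).most_common(2)
    if not ranked:
        return None
    top_team, top_count = ranked[0]
    runner_up_count = ranked[1][1] if len(ranked) > 1 else 0
    degree = top_count - runner_up_count
    return top_team if degree >= min_degree else None
```
        </preformat>
        <p>Under this sketch, a user who mentions #LFC three times and #WHUFC once has a degree of two and is assigned to Liverpool, while a user who mentions each hashtag once is treated as neutral.</p>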
      </sec>
      <sec id="sec-3-3">
        <title>Filtering Tweets by Use of Swearing</title>
        <p>The use of Twitter by fans during matches results in
an average of over 125,000 tweets per match-day
containing a team's hashtag. However, we are particularly
interested in those tweets that contain not only
indications of sentiment, but indications of a high-intensity
sentiment. For this reason we filter the tweets that
we collected using the stems of the two most
common swearwords in the English language: "shit" and
"fuck".<sup>7</sup> After filtering, our corpus consisted of an
average of 6483 tweets per match-day, meaning that more
than 1 in 20 tweets (5.36%) from football fans contain
the words 'shit' or 'fuck' or their derivations.</p>
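        <p>A stem-based filter of the kind described above can be sketched with a case-insensitive substring match, which also catches derived forms such as "shite" or "fucking". This is a minimal sketch, not the filter used in the study; the names <monospace>SWEAR_STEMS</monospace>, <monospace>contains_swearing</monospace> and <monospace>filter_swearing</monospace> are hypothetical.</p>
        <preformat>
```python
import re

# The two stems named in the text; matching is by substring, so derived
# forms (e.g. "shite", "batshit", "fucking") are also caught.
SWEAR_STEMS = ("shit", "fuck")
STEM_PATTERN = re.compile("|".join(SWEAR_STEMS), re.IGNORECASE)


def contains_swearing(text):
    """Return True if the tweet text contains either stem anywhere."""
    return STEM_PATTERN.search(text) is not None


def filter_swearing(tweets):
    """Keep only the tweets that contain at least one of the stems."""
    return [text for text in tweets if contains_swearing(text)]


sample = [
    "Mignolet bails our shite defence out once again. #LFC",
    "NOLAN YOU FUCKING PRICK! #LFC",
    "Come on boys. Heads up.. #CFC",
]
kept = filter_swearing(sample)
```
        </preformat>
        <p>Substring matching is deliberately loose: it trades some false positives for coverage of creative spellings, which is why a manual pass over the filtered tweets remains useful.</p>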
        <p>We further filtered these tweets manually to remove
messages that were not full-on "fannishness". So, for
example, tweets asking about television coverage or
discussing betting on the match were not assessed for
sentiment. We then manually assessed the remaining
tweets, with both authors coding the sentiment(s) in
sets of tweets.</p>
        <p>7 Because we used these terms as stems, the filter also
matched "shitting, batshit, shite, fucking, fucker, fucked..."</p>
        <p>Once we had collected our corpus of tweets from
fans watching matches we could begin the process of
relating them to a narrative. That narrative is
multilevel and consists of discourse about events, games and
the English Premier League competition as a whole.
The competition and the game are easily defined: the
English Premier League runs from August to May each
year and each of the 20 teams plays each of the others twice,
once at home and once away. The winner is decided
on the number of points acquired from the matches
and ties for position are decided using the number of
goals a team has scored minus the number they have
conceded.</p>
        <p>Matches, too, are clearly bounded. We know which
teams are involved, the ground at which they are
playing, and the start and end time of the match.</p>
        <p>Events are more difficult to define. There are some
canonical events that are noted as part of the
statistical record: for example, goals, fouls, bookings, free
kicks and penalties are all recorded with an associated
time stamp. However, some events are not part of the
official record, despite being matters of significance to
the fans. Take the example, shown later, of the
Liverpool captain Steven Gerrard suffering a recurrence of
an old injury, which threatened to prevent him taking
part in the next few matches. We can assign a
timestamp to this event, but we need to mine the event
commentary in detail to infer what has taken place.</p>
        <p>Even more challenging to identify are the events
that occupy a timeline rather than a timepoint. For
the sake of clarity we will call states that persist
"fluents" and reserve the term "events" for those actions
that change the state of a match. Examples of fluents
include "the run of play", an informal definition of
which team is currently dominating in terms of
possession.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Complicating factors</title>
        <p>A number of factors make analyzing a typical Saturday
afternoon's football especially challenging.</p>
        <p>First, numerous games are played simultaneously.
Here, we've been considering the five or six
Premiership matches being played from 3pm, but at that same
moment, up to 36 other matches may be being played
in the three other professional football divisions in
England. There are also matches played in the
(separate) Scottish league, not to mention other matches in
other countries.</p>
        <p>Second, there is a great variety in the response to
different classes of event. The emotional impact of
being awarded a free kick expires more rapidly than the
emotional impact of scoring a goal from it; and the
emotional impact of the goal expires less rapidly if that
goal has changed who is winning. Furthermore, events
within a single match typically overlap to some degree
and have their own duration. They are not discrete
point sources, as may be assumed in theoretical
analysis.</p>
        <p>Third, assigning fans to teams is made harder by
the tendency of some fans to use the hashtags of their
opposing team in order to get the attention of fans of
that team, as opposed to being one of them. Insults
and banter are only effective if their target is aware of
them.</p>
        <p>Finally, it requires some judgement to assess the
degree of sincerity of messages, due to the use of sarcasm,
irony etc. Both authors are native English speakers
not unversed in such matters, but nonetheless, some
messages may have been misinterpreted.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Case studies</title>
      <p>The tweets that we examined for these case studies
came from a number of matches in the English Premier
League that took place in the first half of December.
This window of time can roughly be considered as
midseason: fans have had around 15 league games to assess
their team's performance this season, but with over 20
games still to play, there is still plenty to play for. This
is the time of year that matches begin to be referred
to as "real six-pointers"<sup>8</sup>.</p>
      <p>The matches that we draw examples from
are: Stoke City-Chelsea, Liverpool-West Ham (both
7/12/2013), Cardiff-West Bromwich Albion,
Chelsea-Crystal Palace (14/12/2013) and Manchester
United-West Ham United (21/12/2013). For context, the top
four positions in the league on December 9th were
occupied by Arsenal, Liverpool, Chelsea and Manchester
City respectively. Manchester United were in (an
unusually low) 9th place, just above Hull and Stoke. West
Ham and Crystal Palace were both in the bottom four,
just below West Brom and Cardiff. It is also worth
noting that different clubs have very different numbers of
fans, with average home attendances varying between
20,000 and 75,000. This is likely to be reflected in the
number of tweets in our collection related to each club.</p>
      <p>From the analysis of only those tweets containing
swearing we observed that sentiment is expressed in
a complex and sometimes counter-intuitive manner,
several aspects of which we now discuss.</p>
      <p>8 In the English football leagues, teams are awarded three
points for a win, a point for a draw and no points if they lose.
When teams are playing against opponents that are close to
them in the league table it is important not only to win all
three points, but also to deny the other team the opportunity
to score any points. Hence the fact that these games are referred
to (with more poetry than numeracy, perhaps) as "real
six-pointers."</p>
      <p>4.1 "We're Shit and We Know We Are"</p>
      <p>In the English Premier League it is not uncommon
for fans to be highly critical of their own team. This
criticism is levelled at individual players, the team as
a whole, the manager and (very occasionally) the
fans themselves.</p>
      <p>This is not an uncommon occurrence and (for some
sets of fans in particular) accounts for the majority
of swearing use in large segments of the match. For
example, in the opening 42 minutes of the Liverpool vs
West Ham United match (7th December), over half of
the tweets we collected from Liverpool fans were gripes
about their manager, Brendan Rodgers, and several
team members. For example, the following tweets all
arrived within a few minutes and all were sent by
Liverpool fans:
15:17:57 @lfc Joe allen<sup>9</sup>, easily beaten un the middle of
the park. Fuck Rodgers. #ynwa #lfc #LivWhu
15:21:41 Mignolet<sup>10</sup> bails our shite defence out once
again. #LFC
15:25:31 This is like the hull game, creating fuck all atm
and mignolet keeping us in the game #LFC
15:26:17 Please learn to pass sterling<sup>11</sup>! Or fuck off #lfc
15:27:17 This game is driving me mad. Just fucking
score already damnit. #LFC
15:30:07 Amazes me that people don't understand that
we are shite, one player being world class doesn't make
a quality team #LFC
15:31:30 The fatc people on here are praising defenders
and Allen means our attack is shite and nothing more.
#LFC
15:32:28 For a team that focuses so much on passing,
passings been shite today #lfc
15:33:28 Henderson<sup>12</sup>, sterling are shit #lfc
15:34:18 Raheem Sterling knows fuck all. He should join
Newcastle or Swansea. #lfc</p>
      <p>When criticism is sparked by a particular event fans
may single out an individual for praise while criticising
the team as a whole. For example, the tweet \Mignolet
bails our shite defence out once again," is
simultaneously a whinge about the team and praise for an
individual. Inversely, when Chelsea keeper Petr Cech let
in an equaliser in the Chelsea-Stoke game (7th
December), he was singled out for criticism by the Chelsea
fans, as this example illustrates:
15:43:53 What the fuck was you doing Cech. The team
has played brilliant and then you go and do that. FFS!!
Come on boys. Heads up.. #CFC</p>
      <p>Such examples demonstrate that:</p>
      <p>Fans of a given team are likely to use swearing
in disapprobation of their own team, players or
manager.</p>
      <p>We also note that fans in these English Premier
League games were much more likely to express a
negative sentiment that was intensified by
swearing about their own team than about an opposing
team.</p>
      <p>Therefore negative sentiment intensified by
swearing provides strong evidence of a tweeter's
affiliation but, perhaps counterintuitively to someone
not familiar with "terrace culture", they are most
likely to be affiliated with the team that they are
criticising.</p>
      <p>9 Liverpool midfielder.</p>
      <p>10 Liverpool goalkeeper.</p>
      <p>11 Liverpool winger.</p>
      <p>12 Liverpool midfielder.</p>
      <p>4.2 Off, Off, Off: on Bad Sportsmanship and Bad Players</p>
      <p>During the Liverpool-West Ham game of 7th
December, West Ham captain Kevin Nolan received a red
card for deliberately stamping on Liverpool's Jordan
Henderson. This meant that he missed the rest of the
match and, as it was his fifth red card of the season,
he also received a three-match ban.</p>
      <p>The tweets from both Liverpool and West Ham fans
indicated strong disapprobation. Liverpool fans
reacted predictably on Twitter:
16:39:29 Go on fuck off Nolan you prick #LFC
16:39:36 Kevin Nolan you fucking twat. #LFC
16:39:47 Fucking dirty bastard #LFC
16:40:02 NOLAN YOU FUCKING PRICK! #LFC
16:40:07 Fuck you nolan.#lfc
16:40:16 Nolan u dirty fuckin cunt! #LFC
16:40:42 Fuck off Nolan! Deserved red #LFC
#PremierLeague</p>
      <p>While we might usually expect a certain amount of
dismay at the loss of the captain for three matches
from a team's fans, Kevin Nolan's ban was greeted
with an unexpected amount of positive sentiment by
the West Ham fans.
16:39:18 FUCKING HAPPY DAYS KEVIN NOLAN IS
GOING TO BE BANNED FOR 3 GAMES!!!!! GET IN
THERE HAHAHA #thereisagod #whufc
16:39:34 thank fuck Nolan banned for a while #coyi
#WHUFC
16:40:08 Nolans last game for us? Let's fucking hope
so! #WHUFC
16:40:38 Yes come on now he is banned Wahoo!fuck off
Nolan u prick #coyi #whufc #whu</p>
      <p>As we will see when considering humour (Section
4.6), this may be more to do with Kevin Nolan's
performance as a player than his infraction of the rules.
From this we can conclude that:</p>
      <p>Fans rarely criticise players from the opposing
teams for poor performances but they will
criticise them for foul play.</p>
      <p>Fans may criticise their own players for foul play
and are apparently keen to do so if the player is
performing poorly.</p>
      <p>4.3 Fuck's Sake! v. Fuck, Yeah! Or Swearing Not Necessarily Considered Harmful</p>
      <p>
        As noted earlier, there is a tendency to consider
swearing in social media messages as a sign that the
author of that message is expressing a negative
sentiment [
        <xref ref-type="bibr" rid="ref4 ref6">4, 6</xref>
        ]. However, inspection of the content of these
tweets shows that this is sometimes wrong, as
these two tweets from Liverpool fans illustrate:
16:04:57 I NEVER get to see #LFC and the #Oilers win
on the same day but I'm feeling good about it today.
Don't let me down you fucks. #fucks
16:06:00 Fuckin get in reds, 2-0 #LFC
      </p>
      <p>The tweet from the Oilers fan at 16:04 contains
an affectionate mock threat, with the tongue-in-cheek
nature of the tweet being reinforced by the use of a
dafttag<sup>13</sup>. The more succinct "Fuckin get in reds, 2-0
#LFC" is a more straightforward expression of
celebration.</p>
      <p>However we can pair that second tweet from a
Liverpool fan with a tweet from a West Ham United fan
in response to the same event:
16:05:20 We are fucked... Hello championship!<sup>14</sup> Big
Sam<sup>15</sup> out!! #whufc</p>
      <p>Further analysis reveals the frequency of differing
phrases using the word 'fuck' both positively and
negatively. Table 1 summarises the total frequency with
which each phrase is used across all matches for the
three weeks shown. The counts include variant word forms
(e.g. "fuck sake", "fuck's sake", "fucks sake").
From this we can infer that:</p>
      <p>The same event, seen as positive by one tweeter
and negative by another, can prompt tweets that
contain swearing in both cases.</p>
      <p>While swearing may indicate that the sentiment
of the tweeter is intense, it does not
unambiguously demonstrate a positive or negative valence
to that sentiment, so we cannot universally
classify swearing as positive or negative.</p>
      <p>13 Here we define "dafttag" as a hashtag that is added,
usually to the end of a tweet, that communicates the tweeter's
sentiment in a manner that is self-parodying or an expression of the
tweeter's covert message content. These dafttags often indicate
sarcasm, humour or other modes of communication and indicate
to the reader that they should look beyond the tweet's surface
meaning.</p>
      <p>14 The division below the Premiership.</p>
      <p>15 Sam Allardyce, the West Ham manager.</p>
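      <p>Counting phrases while folding variant word forms together, as described for Table 1, can be sketched with one regular expression per phrase. This is an illustration only: the "fuck sake" variants come from the text, while the "fuck yes" pattern, the labels and the function name <monospace>count_phrases</monospace> are assumptions made for the example and do not reproduce the phrases in Table 1.</p>
      <preformat>
```python
import re
from collections import Counter

# One pattern per phrase; each pattern absorbs the variant spellings.
# Only the "fuck sake" variants are named in the text; the second entry
# is a hypothetical positive-valence phrase added for illustration.
PHRASE_PATTERNS = {
    "fuck's sake": re.compile(r"\bfuck'?s? sake\b", re.IGNORECASE),
    "fuck yes": re.compile(r"\bfuck,? (?:yes|yeah)\b", re.IGNORECASE),
}


def count_phrases(tweets):
    """Return total occurrences of each phrase across all tweets."""
    counts = Counter()
    for text in tweets:
        for label, pattern in PHRASE_PATTERNS.items():
            counts[label] += len(pattern.findall(text))
    return counts


examples = [
    "For fuck sake ref",
    "oh for fuck's sake",
    "Fucks sake, not again",
    "a chance to cut the lead to 2 points. Fuck yes. #CFC",
]
totals = count_phrases(examples)
```
      </preformat>
      <p>Grouping variants under a single label keeps the counts comparable across phrases, but the valence of each phrase still has to be judged from context, as the paired tweets above show.</p>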
      <p>4.4 And Another Thing... Multiple Sentiments in 140 Characters or Less</p>
      <p>For a communication act that is limited to 140
characters, football fans' tweets can display surprisingly rich
and complex sentiments. For example, in the case of
the Chelsea-Stoke City game (7th December), as Stoke
equalized, one Chelsea fan managed to express
disappointment, despair and hope in the same tweet:
15:45:00 Every fucking match something stupid gets
#cfc in trouble. I'm hoping for quick response.</p>
      <p>Likewise in the Liverpool-West Ham game,
Liverpool fans were pleased with a West Ham own-goal that
took Liverpool into the lead, but this didn't prevent
them from expressing their disappointment with their
own team's performance, particularly by the players
Sterling and Allen:
15:42:39 Luck as fuck goal but I'll take it after that
Sterling miss. #LFC
15:44:18 Even though its an own goal we deserve the
lead. Dominating the match but how shit is Joe Allen,
seriously #LFC</p>
      <p>Fans on Twitter also use simile, wordplay and
allusion to comment on games. For example, this Chelsea
fan is expressing displeasure at the scoreline against
Crystal Palace, a team not generally thought of as
strong opponents:
16:36:34 This is bollocks... we at home to Palace, not
away to Barcelona, fucking painful just waiting for them
to equalize. #CFC</p>
      <p>This user alludes to the relative strengths of
Chelsea, Palace and Barcelona<sup>16</sup>, the relative ease of
playing at home versus away, the scoreline (a one-goal
difference between Chelsea and Palace) and the run of
play, all within the space of a single tweet. The use of
allusion, which relies on the reader's understanding of
various elements of context, makes this a particularly
information-dense message.</p>
      <p>16 Barcelona FC is widely regarded as playing some of the most
beautiful football in the world, as well as being one of the most
objectively successful teams. Crystal Palace, based in a London
suburb, have been relegated from the English Premier League
more often than any other team.</p>
      <p>From these examples we can infer:</p>
      <p>Tweets, although brief, can contain multiple,
sometimes oppositely-valenced sentiments.</p>
      <p>Individual tweets may therefore be too broad
a unit of analysis if we wish to identify sentiment.</p>
      <p>4.5 Around the Grounds: The Story is not Restricted to the Match</p>
      <p>While in the main, fans tweet about their own team
and the match that they are currently engaged in,
occasionally they will tweet about other concurrent
games. This makes automatic event detection a
challenging task. A Liverpool fan watching the game
against West Ham may tweet about an event in the
Chelsea-Stoke City match; for example, when
Oussama Assaidi, on loan from Liverpool to Stoke, scored
the final goal in a match that ended Stoke City 3-2
Chelsea, the Liverpool tweeters reacted exuberantly:
16:49:04 YES STOKE I FUCKING LOVE YOU! And an
ex liverpool player does it ahahahhahaahahhaaha #LFC
16:49:33 Fucking Assiadi go on lad! Haha doing his club
a favour #lfc
16:49:55 Assaidi you absolute beaut!! Fuck you
Chelsea!!! #Assaidi #LFC
16:50:05 #Assaidi you fucking beauty!! #LFC #cheers
#ChelseaKiller
16:50:13 ASSAIDI YOU FUCKING BEAUTY!!!!!!!!!!
#LFC</p>
      <p>At the beginning of the Chelsea-Crystal Palace
game the following week (14th December), one Chelsea
fan tweeted the following:
15:03:48 Get drunk as fuck &amp; wake up to Arsenal losing
6-3 &amp; a chance to cut the lead to 2 points. Fuck yes.
#CFC</p>
      <p>For context, Chelsea and Arsenal are both London
teams with an historical rivalry. They also both
occupied places in the top three of the table throughout
December and were jockeying for the upper hand.</p>
      <p>From these examples it is possible to infer that:</p>
      <p>Fans tweet about games that their teams are not
playing in.</p>
      <p>There appears to be a higher likelihood of this
happening if there is either a positive link between
the teams (e.g. a player is on loan to another
team) or a negative link (e.g. a long-standing
rivalry or a close position in the league table).</p>
      <p>Thus the "story" from a particular tweeter's point
of view may not be restricted to a given game.
Rather it is anything that affects their team's
position in the table, or that has some relationship
to an affiliation or a rivalry with another club.</p>
      <p>4.6 Funny Old Game: Humour in Fan Tweets</p>
      <p>The British are proud of their ready wit, even (perhaps
especially) in times of emotional stress. It is
unsurprising, then, that we find a lot of humour in the tweets
sent by English Premier League football fans. Humour
is interesting from a storytelling point of view as it
often relies on ambiguity or wordplay for its effect; thus
we need to parse humorous twitterances with the same
care that we parse humorous utterances.</p>
      <p>Examples include the following from the Liverpool-West
Ham match, where the first two goals came from
West Ham own goals.
16:06:33 fuck sturridge<sup>17</sup>...this own goal is some player
#LFC</p>
      <p>And where West Ham captain Kevin Nolan was
expected to turn in a disappointing performance, this
tweet was very widely retweeted:
15:59:01 Kevin Nolan is very adaptable. He is equally
shit in a number of positions #rubbishplayer #lfcswhu
#whufc</p>
      <p>When West Ham pulled back a goal later in the
match to bring the scoreline to 2-1, Liverpool fans
offered the following:
16:35:58 #LFC in fuck this up "shock"
16:36:21 You know if Moses<sup>18</sup> was a horse he'd be a Pritt
Stick by now #fuckingawful #LFC</p>
      <p>Likewise in the Chelsea-Stoke City game, fans
reported some on-terrace humour. In the last decade,
Chelsea have won at least one major national or
international competition in all but one season. Stoke
City, however, have only managed to win the
Autoglass Trophy (a competition for teams in the bottom
two divisions of the English top-flight football league)
since 1972<sup>19</sup>.
15:26:39 #scfc fans: Ur gonna win fuck all
#cfc fans: U've never won fuck all
#scfc fans: Autoglass trophy we've won it 2 times
#classicBanter</p>
      <p>From these examples we can infer that:</p>
      <p>Humour is an important part of fans' storytelling
both at games and online. Effective attempts at
humour are picked up and passed around as either
retweets or reports of stadium banter.</p>
      <p><sup>17</sup> Daniel Sturridge, striker for Liverpool and the English
national team, seen in a generally positive light. Here, 'fuck'
can be interpreted as 'forget' or 'never mind'.</p>
      <p><sup>18</sup> Victor Moses, Liverpool winger.</p>
      <p><sup>19</sup> This is not the kind of trophy that Premier League teams
often boast of winning.</p>
      <p>Humour is not in and of itself a sign of
positive-valenced sentiment. For example, "LFC in fuck
this up 'shock'" and "If Moses was a horse he'd be
a Pritt Stick" are both jokes, but the originators
of these jokes are not happy, as evidenced by the
hashtag #fuckingawful.</p>
      <p>Jokes can rely on apparent or actual denigration
of a player or team, usually one's own ("Fuck
sturridge..this own goal is some player," "Autoglass
trophy we've won it 2 times."). These are often
examples of self-effacing humour from fans.</p>
      <p>Humorous tweets often carry a heavy freight of
context and can be challenging to mine for
sentiment without thorough understanding of that
context.</p>
      <p>4.7 Sick as a Parrot, Fluffy as a Kitten: the
use of Creative Language</p>
      <p>Football has a shared lexicon of formal and
informal terminology<sup>20</sup> that has reached the level of cliché
through widespread use on terraces and in newspaper
coverage.</p>
      <p>However, despite a fairly stable football cliché
lexicon, football fans on social networks are inventive in
their use of language. We see several examples of
tmesis, the insertion of a word into the middle of
another word or phrase.
16:48:46 Un-fucking-believable! #CFCLive
16:49:03 Assi-fucking-idi<sup>21</sup>!!!! what a strike son #scfc</p>
      <p>We also see relatively uncommon forms of
swearing, for example the imperative form
"motherfuck":
15:53:22 Mother FUCK this scoreline...</p>
      <p>And the noun "fuckery":
16:49:47 This is some major fuckery.... #cfc</p>
      <p>We also see deliberate misspellings such as "bollox",
possibly used to turn a swear word into a "minced"
variant<sup>22</sup>.</p>
      <p>This Liverpool fan uses an unusual but evocative
pair of similes to convey the change in emotional state
that they have experienced through the recent portions
of the game:
16:40:58 From being 2-1 and shitting our pants to being
4-1 and feeling as fluffy as kittens. #LFC</p>
      <p>We also see this example of ironic litotes from a West
Ham fan who is commenting on another Twitter user's
assessment of the performance of Manchester United
and comparing it with their own team's performance:
15:47:51 "@[USER REDACTED]: Haha man united lost
again! They are so shite" wish #WHUFC was as shit as
them</p>
      <p><sup>20</sup> "A real six-pointer," "A game of two halves," "Sick as a
parrot," etc.</p>
      <p><sup>21</sup> A misspelling of Oussama "Assa-fucking-idi", Stoke City
winger on loan from Liverpool.</p>
      <p><sup>22</sup> A minced oath is a deliberate misspelling, mispronunciation
or other mis-rendering of a word in order to render a euphemism.</p>
      <p>From these examples we can infer that:</p>
      <p>While tweets are short and ephemeral, they
nevertheless contain rhetorical figures such as
simile, litotes, irony, metaphor and allusion. They
also contain wordplay in forms that include
punning, euphemism and the use of uncommon parts
of speech.</p>
      <p>As a result, we must be aware that tweets are not
always simple utterances, and that they should be
assessed with care.
</p>
      <p>4.8 "I Fucking Hate Football Sometimes":
the Importance of Assumed Context in
Matchday Tweets</p>
      <p>Football inspires a devoted following partly because it
creates shared experiences on multiple timescales. For
example, an event in a match ("Did you see Assaidi's
goal?"), a match itself ("Can you believe we're 3-2
up against Chelsea?") and a competition as a whole
("I still think we'll be facing relegation come April")
provide many opportunities for bonding over shared
joys and sorrows.</p>
      <p>However, that intense sharing can baffle newcomers
because so much common context is taken for granted.
Both at the grounds themselves and on Twitter, fans
assume that fellow fans will be aware of a team's recent
performance history, a player's injury worries, or the
likelihood of winning a competition. This may in part
be deliberate obscurantism in order to highlight
in-group vs. out-group differences. By demanding deep
knowledge of a team's history before admitting
someone to an inner social group of 'true fans', that group
claims a stronger social identity.</p>
      <p>This can lead to tweets that are impossible to assess
for sentiment unless the assessor is aware of this same
context. For example, from the Liverpool vs West
Ham United game, these tweets come from Liverpool
fans:
16:12:27 ahh fuck Gerrard #lfc
16:37:51 luis suarez you lil shit #lfc
16:40:40 Luis. Fucking. Suarez. Again. #LFC
16:42:54 Suarez Is UnFuckingBelievable #LFC #YNWA</p>
      <p>In the absence of context it may appear that the
Liverpool fans are venting their displeasure at Steven
Gerrard and Luis Suarez. However, examining the
context shows us that at around 16:12, Steven Gerrard
picked up a hamstring injury (a recurrent problem
that has plagued his career) and that Suarez scored
twice, at 16:37 and 16:40.</p>
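      <p>The kind of context lookup applied to these tweets can be sketched in code. This is a minimal illustration under stated assumptions rather than the procedure used in this study: the function name nearest_event, the three-minute window and the placeholder calendar date are all hypothetical, and the event list simply restates the times given above.</p>
      <preformat><![CDATA[
```python
from datetime import datetime, timedelta

# Placeholder date (an assumption); only the clock times come from the text.
DAY = datetime(2000, 1, 1)

# Match events restated from the paragraph above, at minute resolution.
EVENTS = [
    (DAY.replace(hour=16, minute=12), "Gerrard hamstring injury"),
    (DAY.replace(hour=16, minute=37), "Suarez goal"),
    (DAY.replace(hour=16, minute=40), "Suarez goal"),
]


def nearest_event(tweet_time, events, window=timedelta(minutes=3)):
    """Return the latest event at most `window` before tweet_time, or None."""
    candidates = [
        (when, what)
        for when, what in events
        if timedelta(0) <= tweet_time - when <= window
    ]
    return max(candidates, key=lambda e: e[0]) if candidates else None


tweets = [
    (DAY.replace(hour=16, minute=12, second=27), "ahh fuck Gerrard #lfc"),
    (DAY.replace(hour=16, minute=37, second=51), "luis suarez you lil shit #lfc"),
    (DAY.replace(hour=16, minute=40, second=40), "Luis. Fucking. Suarez. Again. #LFC"),
    (DAY.replace(hour=16, minute=42, second=54), "Suarez Is UnFuckingBelievable #LFC #YNWA"),
]

for sent, text in tweets:
    event = nearest_event(sent, EVENTS)
    print(sent.time(), text, "->", event[1] if event else "no known event")
```
]]></preformat>
      <p>Pairing each tweet with the most recent match event gives an assessor the missing context, but it does not by itself decide the valence of the tweet; that still depends on which team the tweeter supports.</p>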
      <p>There are also many tweets where the sentiment
is not hard to infer, but where the context has been
stripped by the tweeter. For example, during the
Chelsea vs Stoke City game, shortly after Stoke scored
to take the game to 3-2 in their favour, Chelsea fans
responded:
16:49:59 I fucking hate football sometimes #CFC
16:50:34 Fuck Fuck Fuck #CFC #STKvCHE</p>
      <p>In the Crystal Palace game, it is possible to infer
that this fan is responding to some form of
commentary:
15:16:38 "Dominated possession"?? Fuck off. #CPFC</p>
      <p>But it is impossible to know who made the original
comment, or what they were commenting upon.</p>
      <p>From these examples we can infer that:</p>
      <p>Fans assume that their audiences share their
contextual information.</p>
      <p>It is not possible to provide a thorough analysis
of sentiment on Twitter if we do not also mine for
context.
</p>
      <p>4.9 Speed is of the essence: Urgency and
entropy</p>
      <p>When a significant event happens, such as a goal
being scored, fans tend to communicate their responses
with great urgency. This is shown by an increase in
the number of tweets sent in the following minutes
and by a simultaneous shortening of those tweets.
To investigate this, we examined the goals from three
randomly-selected matches from our collection. We
selected the tweets sent by both sets of fans for the five
minutes immediately before each goal, and for the five minutes
immediately after. Although there is considerable
variation between the matches and the goals, the pattern
is clear: in the relatively uneventful period before a
goal, few tweets are sent (mean = 128.58 per minute
in these cases) and they are relatively long (mean =
78.04 characters). In the minutes following a goal, fans
from both teams respond rapidly with a surge in the
number of tweets (mean = 448.1 per minute, an increase
of 248%) that are typically short (down to 62.8
characters, a drop of 19.5%). The details of these events
are shown in Table 2.</p>
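      <p>The windowed comparison described above can be sketched as follows. This is a minimal illustration, not the code used for the study: the function names window_stats and percent_change, the sample messages and the placeholder date are all assumptions introduced here. The last two lines recompute the percentage changes from the means reported above.</p>
      <preformat><![CDATA[
```python
from datetime import datetime, timedelta
from statistics import mean


def window_stats(tweets, goal_time, minutes=5):
    """Return ((rate, mean length) before, (rate, mean length) after) a goal.

    `tweets` is a list of (timestamp, text) pairs; rate is tweets per minute.
    """
    span = timedelta(minutes=minutes)
    before = [text for t, text in tweets if goal_time - span <= t < goal_time]
    after = [text for t, text in tweets if goal_time <= t < goal_time + span]

    def summarise(texts):
        rate = len(texts) / minutes
        length = mean(len(x) for x in texts) if texts else 0.0
        return rate, length

    return summarise(before), summarise(after)


def percent_change(old, new):
    """Signed percentage change from old to new."""
    return 100.0 * (new - old) / old


# Hypothetical sample data; the date is a placeholder.
goal = datetime(2000, 1, 1, 15, 36, 0)
sample = [
    (datetime(2000, 1, 1, 15, 32, 35), "a longer, reflective message sent before the goal"),
    (datetime(2000, 1, 1, 15, 36, 18), "short reaction"),
    (datetime(2000, 1, 1, 15, 37, 2), "another one"),
]
(before_rate, before_len), (after_rate, after_len) = window_stats(sample, goal)
print(before_rate, before_len, after_rate, after_len)

# Percentage changes implied by the means reported in the text.
print(percent_change(128.58, 448.1))
print(percent_change(78.04, 62.8))
```
]]></preformat>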
      <p>Typical examples of longer tweets sent when no goal
has been scored recently include (from West
Ham–Manchester Utd, 21st December):
15:32:35 Stoke, Southampton, West Brom etc all go to
Old Trafford and look decent. We go there and still look
shit! #WHUFC #GoingDown
16:10:00 This is the Manchester United we have been
watching all these years. Scaring shit out of their
opponents when they attack.#Mufc.#GGMU
16:13:57 Fuck sakes, Welbeck injured! What sort of
training is Moyes putting these boys to, getting injuries
like picking cherries? #MUFC
16:04:08 Why the fuck is Taylor not on the wing?! Fat
Sam aint got a bloody clue. Useless fat fuck #COYI
#whufc</p>
      <p>Typical examples of shorter tweets sent just after
a goal include:
15:36:18 FUCK YEAH ADNAN!! #MUFC
16:40:12 That was fucking offside..#MUFC
and the terse:
16:31:06 3-0 fuckers #MUFC</p>
      <p>From these results we can infer that:</p>
      <p>Fans respond rapidly to significant events by
sending short focussed messages.</p>
      <p>Fans send many messages when a goal is scored.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>Through a careful analysis of a collection of tweets
about football that contain swearing, we have shown
that bad language is not always negative; that the
wider context is often crucial to interpret meaning; and
that perhaps counter-intuitively, some of the strongest
sentiments expressed are self-critical.</p>
      <p>
        The examples in this paper demonstrate that swear
words are used to intensify both positive and
negative sentiments ('fucking beauty' vs. 'fucking painful').
However, even when the sentiment seems apparent,
widespread irony makes it necessary to consider
context before interpreting the valence of the sentiment.
As noted earlier, the tweet "Luis. Fucking. Suarez.
Again." is actually a positive sentiment by a
Liverpool fan in response to Suarez scoring, but the
automated sentiment analysis systems discussed
earlier [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] would see "fucking" as the only sentiment-carrying
word of the message and so denote the whole
message as negative.
      </p>
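      <p>The failure mode described above can be illustrated with a toy example. The lexicon, the swear-word list and both scoring functions below are hypothetical and deliberately simplistic; they do not reproduce any of the cited systems. The sketch only shows how a purely lexical score and a score that knows about the preceding match event can disagree on the same tweet.</p>
      <preformat><![CDATA[
```python
import re

# Toy polarity lexicon and swear list: illustrative assumptions only.
LEXICON = {"fucking": -1, "shit": -1, "hate": -1, "beauty": 1, "brilliant": 1}
SWEARS = {"fuck", "fucking", "shit"}


def tokens(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text):
    """Sum the polarity of every lexicon word found in the text."""
    return sum(LEXICON.get(word, 0) for word in tokens(text))


def contextual_score(text, event_valence):
    """Treat swear words as intensifiers of the valence of a known event.

    event_valence is +1 or -1 when the latest match event was good or bad
    for the tweeter's team, and 0 when no such event is known.
    """
    swears = sum(1 for word in tokens(text) if word in SWEARS)
    if event_valence == 0 or swears == 0:
        return lexicon_score(text)
    return event_valence * (1 + swears)


tweet = "Luis. Fucking. Suarez. Again."
print(lexicon_score(tweet))        # lexicon only
print(contextual_score(tweet, 1))  # Liverpool fan, just after a Liverpool goal
```
]]></preformat>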
      <p>This is one case where fans implicitly assume that
their audiences share their context, meaning that
apparently ambiguous expressions of sentiment will in
fact be correctly interpreted by their intended
audience. We suggest, therefore, that it is often impossible
to accurately analyze sentiment on Twitter if the
context of utterances is not also analyzed.</p>
      <p>The narratives that football fans tell are simpler to
interpret than many of the stories played out on social
media. For example, there is a very low incentive for
a fan to give a false signal, and we have access to a
ground truth (in the form of a match report) that
indicates the likely valence of the sentiments expressed
at different points in the narrative.</p>
      <p>However, the narrative itself is open-ended: even
in the close season fans talk about their hopes for next
year and their memories of previous triumphs and
disasters. The events are also complex and overlapping:
a goal within a match within a season, for example.</p>
      <p>We have seen that fans of English Premier League
teams often swear about or at their own team, and
relatively rarely about an opposing team or match
officials. It may be that swearing is being used as a means
of demonstrating affiliation to a particular group, or
to demonstrate greater passion about their own team
and an apparent indifference to all others. In either
case, intense expressions of negative sentiment
actually provide strong evidence of a tweeter's affiliation
towards the target of their criticism. Note, however,
that an event which is seen as positive by one tweeter
and negative by another can prompt tweets that
contain swearing from both. This means the storytellers
can be regarded as "unreliable narrators", likely to
disparage what outsiders may see as neutral or positive
in that team's performance.</p>
      <p>Humour is an important part of fans' storytelling
both at games and online. Effective attempts at
humour are picked up and passed around as either
retweets or reports of stadium banter. However,
humour is not in and of itself a sign of positive-valenced
sentiment. English fans demonstrate a striking ability
to joke even (perhaps especially) when things are
going against their wishes. Humorous tweets often carry
a heavy freight of context and can be challenging to
mine for sentiment without thorough understanding of
that context.</p>
      <p>Even very complex information, including mixed
sentiments, can be expressed in fewer than
140 characters. In response to significant events, such as
goals, fans tend to respond very rapidly with shorter,
more focused messages than usual; but at other times,
tweets can be packed densely with information and
contain rhetorical figures, deliberate ambiguities and
novel wordplay. It is important therefore to be aware
that tweets are not always simple utterances despite
their brevity, and that they should be analyzed and
assessed with care.</p>
      <p>
        In light of these observations, we would recommend
that any automated sentiment analysis system should
explicitly consider the nature of the evidence of
sentiment, including the wider context. If the only evidence
is from syntax or a lexicon, great care should be taken.
The simple assignment of fans to teams used here is
sufficient, but this process could be improved by
considering tweets from each account over a longer period
of time, as fans tend not to change loyalties. Match
analysis may be simpler if only one match is being
played. This is more often true for internationals and
cup final matches [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>[Table 2: event times 15:10:00, 15:42:00, 16:08:00, 16:11:00, 16:47:00, 15:27:00, 16:23:00, 15:25:00, 15:36:00, 16:29:00, 16:39:00; summary row "Mean"; column "Post length".]
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Baruch</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Jenkins</surname>
          </string-name>
          .
          <article-title>Swearing at work and permissive leadership culture: When antisocial becomes social and incivility is acceptable</article-title>
          .
          <source>Leadership &amp; Organization Development Journal</source>
          ,
          <volume>28</volume>
          (
          <issue>6</issue>
          ):
          <fpage>492</fpage>
          –
          <lpage>507</lpage>
          , Apr.
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Corney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Martin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Goker</surname>
          </string-name>
          .
          <article-title>Spot the ball: Detecting sports events on Twitter</article-title>
          .
          <source>In European Conference on Information Retrieval ECIR2014</source>
          , pages
          <fpage>449</fpage>
          –
          <lpage>454</lpage>
          , Amsterdam, Holland,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Daly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Holmes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Newton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Stubbe</surname>
          </string-name>
          .
          <article-title>Expletives as solidarity signals in FTAs on the factory floor</article-title>
          .
          <source>Journal of Pragmatics</source>
          ,
          <volume>36</volume>
          (
          <issue>5</issue>
          ):
          <fpage>945</fpage>
          –
          <lpage>964</lpage>
          , May
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hu</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Mining and summarizing customer reviews</article-title>
          .
          <source>In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
          , Seattle, Washington, USA, Aug.
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Sentiment analysis and subjectivity</article-title>
          . In N. Indurkhya and F. J. Damerau, editors,
          <source>Handbook of Natural Language Processing</source>
          .
          <publisher-name>Chapman &amp; Hall</publisher-name>
          , 2nd edition
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Maynard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Rout</surname>
          </string-name>
          .
          <article-title>Challenges in developing opinion mining tools for social media</article-title>
          .
          <source>In Proceedings of @NLP can u tag #usergeneratedcontent?! Workshop at LREC 2012</source>
          , Turkey,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Plester</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Sayers</surname>
          </string-name>
          .
          <article-title>"Taking the piss": Functions of banter in the IT industry</article-title>
          .
          <source>Humor: International Journal of Humor Research</source>
          ,
          <volume>20</volume>
          (
          <issue>2</issue>
          ), Jan.
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Stephens</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Allsop</surname>
          </string-name>
          .
          <article-title>Effect of manipulated state aggression on pain tolerance</article-title>
          .
          <source>Psychological Reports</source>
          ,
          <volume>111</volume>
          (
          <issue>1</issue>
          ):
          <fpage>311</fpage>
          –
          <lpage>321</lpage>
          , Aug.
          <year>2012</year>
          . PMID:
          <pub-id pub-id-type="pmid">23045874</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Thelwall</surname>
          </string-name>
          .
          <article-title>Fk yea i swear: Cursing and gender in MySpace</article-title>
          .
          <source>Corpora</source>
          ,
          <volume>3</volume>
          (
          <issue>1</issue>
          ):
          <fpage>83</fpage>
          –
          <lpage>107</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G.</given-names>
            <surname>van Oorschot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>van Erp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Dijkshoorn</surname>
          </string-name>
          .
          <article-title>Automatic extraction of soccer game events from Twitter</article-title>
          .
          <source>In Proc. of the Workshop on Detection, Representation, and Exploitation of Events in the Semantic Web</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wickramasuriya</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Vasudevan</surname>
          </string-name>
          .
          <article-title>Human as real-time sensors of social and physical events: A case study of Twitter and sports games</article-title>
          .
          <source>arXiv preprint arXiv:1106.4300</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>