=Paper=
{{Paper
|id=Vol-1198/byrne
|storemode=property
|title=Sweet FA: Sentiment, Swearing and Soccer
|pdfUrl=https://ceur-ws.org/Vol-1198/byrne.pdf
|volume=Vol-1198
|dblpUrl=https://dblp.org/rec/conf/mir/ByrneC14
}}
==Sweet FA: Sentiment, Swearing and Soccer==
Emma Byrne
ebyrne.uk@googlemail.com

David Corney
School of Computing & Digital Media
Robert Gordon University, Aberdeen
d.p.a.corney@rgu.ac.uk

Copyright © by the paper’s authors. Copying permitted only for private and academic purposes.
In: S. Papadopoulos, P. Cesar, D. A. Shamma, A. Kelliher, R. Jain (eds.): Proceedings of the SoMuS ICMR 2014 Workshop, Glasgow, Scotland, 01-04-2014, published at http://ceur-ws.org

===Abstract===

The sentiments expressed by football fans in the stories that they tell are often intensified by the use of swear words. Football provides a useful test bed for sentiment analysis due to the symmetric nature of events in matches: what is good for one team is bad for the other. We can relate social media messages to the narrative that fans of a given team might be expected to construct. We use these features of football-related tweets to investigate some common assumptions about swearing as a sentiment marker on social networks. The results demonstrate that swearing and other sentiment markers depend heavily on context, and that understanding this context is essential if sentiment is to be detected faithfully. We also show that swearing is not always indicative of negative sentiment.

Please note that strong language is used throughout this paper!

===1 Introduction===

Football fans construct a shared narrative about their team’s performance. Fanzines, terrace talk and pub conversations have always formed part of fan identity and communication. Online social networks have added another venue in which those shared identities can be constructed.

Football stories shared by fans are emotional, and narratives shared via Twitter are freighted with sentiment. The language used by fans is often scattered with obscenities which provide colour and also intensify expressions of sentiment. In this paper, we describe an initial study exploring the way that sentiment is intensified by swearing in Twitter messages from football fans during games. By limiting our analysis to messages containing swear words, we focus on the more intense expressions of sentiment.

Football provides a particularly interesting test bed for sentiment analysis due to the symmetric nature of events within matches: what is good for one team is (equally) bad for the other. For example, a goal has two opposing “valences”, which makes it possible to search for and analyze approximately equal volumes of positive and negative sentiment per event. For simplicity, we restrict our analysis to the FA English Premier League, which is the most-watched football league in the world.

===2 Related work===

Swearing is a feature of social networks and has been for a long time (by internet standards). In his 2008 paper [9], Thelwall studied the occurrence of swearing in MySpace profiles. Swearing is not only widespread (67% of UK MySpace profiles of 16-19 year olds contained some form of swearing) but varied. Some of the swearing was self-directed (“tehe i am sorry.. i m such a sleep deprived twat alot of the time! lol”) or clearly affectionate (“Chris you’re slacking again !!! Get the fuck off myspace lol !! you good anyway?”). It was frequently used in an approbatory fashion (“Thanks for the party last night it was fucking good and you are great hosts.” “That 50’s rock and roll weekender was fucking mint!”). That paper demonstrates that, for young people at least, swearing is part of their performed identity online. Not only that, swearing is multipurpose, used to demonstrate amusement, affection and self-deprecation as well as negative sentiment.

Swearing is known to be a response to – and a mediator of – emotions [8], as are other forms of language that are prima facie abusive or insulting [1, 3, 7]. In
workplaces in the UK and New Zealand, for example, “jocular abuse” is part of team bonding. Swearing, piss-taking and other forms of abuse are not only tolerated, but form an essential part of the workplace group dynamic.

Even in a setting as replete with strong language as football, there is a strong imperative to keep the reality of the discourse of swearing off the airwaves. Consider this tweet for example:¹
Ivanovic over kicked the ball and a Chelsea fan angrily swears “fucking cunt” and the commentator emotionally apologises Lmao #CFC #Setanta1.

¹ This, and all other tweets quoted and analysed in the paper are available on request from the authors.

However, as we demonstrate here, Twitter as a social network provides a much less filtered view of fans’ actual language. Swearing is used regularly to intensify sentiment and is, as such, a relevant marker in sentiment analysis.

Sentiment analysis is widely used by brand managers to monitor public perception of their products and services, including Amazon reviews and eBay feedback. However, much of this work assumes that swearing typically represents negative sentiment. Hu and Liu have created lists of positive and negative words associated with opinion or sentiment [4]. In the current version of the list², words such as “shit”, “fuck” and “damn” are included in the negative set and not in the positive set. While they argue that having such an opinion lexicon is not sufficient for sentiment analysis, it is nonetheless a useful tool [5].

² http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#lexicon, dated 12/3/2011

A more sophisticated approach to detecting sentiment from swear words has been presented by Maynard et al. [6]. They also use a gazetteer of opinion words, including swear words. They recognise that swear words may be used to intensify the expression of a sentiment, be it positive or negative. However, if a sentence contains swear words and no words recognised as implying a positive sentiment, then they assume the sentence is negative. While this may often be an effective approach, we would add that the wider context can often be used to interpret the sentiment even of isolated swear words, as we discuss below (Section 4.8).

Twitter has been used to help automatically detect events during football matches [10]. That system collects tweets based on hashtags and then detects spikes in the volume of tweets collected, which the authors associate with major game events. For each spike, they analyze the words of the tweets and classify the event using machine learning. They consider a fixed range of events (goals, own-goals, red cards, yellow cards and substitutions) and compare these classifications to the official match data to evaluate their system. They also classify individual tweeters as fans of one team or another by counting the number of mentions of each team over several matches, similar to our approach (Section 3.2).

Similar methods have been used to classify events during American football (NFL) matches [11]. Tweets were collected based on team names and NFL terminology. Events were also detected by finding spikes in the volume of tweets and each event was assigned to one of a fixed number of classes, in this case using lexicographic analysis. Their system was very effective at detecting the most significant scoring events such as touchdowns, but was less effective at finding less significant events like interceptions and field goals.

===3 Methods===

====3.1 Collecting Tweets====

On a typical Saturday afternoon, 5 or 6 English Premier League matches kick off at 3pm, at the same time as many matches from lower divisions. For three consecutive weeks, we collected public tweets discussing the matches. For each match-day, we crawled Twitter using their standard Streaming API³ and filtered using a total of 28 hashtags. These are the standard abbreviation hashtags of the 20 teams (e.g. #CFC, #MCFC) along with certain widely-used hashtags that indicate team support (e.g. #KTBFFH for Chelsea and #YNWA for Liverpool, both based on popular supporters’ chants). We collected tweets starting 30 minutes before kick-off and continued until 30 minutes after the final whistles. On average, we collected 125,070 tweets per match-day. Our analysis focuses on the period from 3pm to 5pm each Saturday. This includes 90 minutes of football, the half-time break (c. 15 minutes) and a few minutes of post-match response.

³ https://dev.twitter.com/docs/api/1.1/post/statuses/filter

Throughout this analysis, we have made frequent use of the mainstream media accounts of matches, for example to verify when goals were scored, players were sent off and other noteworthy events occurred. In particular, we used the live-blogging “minute by minute” commentaries provided by the BBC for the three Saturdays in question, namely 7/12/2013⁴, 14/12/2013⁵ and 21/12/2013⁶. Knowing these match events and the team supported by each fan (see below), we can derive an “emotional ground truth” which we expect to then be reflected in the language used in fans’ tweets.

⁴ http://www.bbc.co.uk/sport/0/football/25264555
⁵ http://www.bbc.co.uk/sport/0/football/25365181
⁶ http://www.bbc.co.uk/sport/0/football/25463334

====3.2 Linking Tweets to Teams====

We identified the team that each Twitter user supports (if any). Initial manual inspection of a number of
tweets suggests that fans tend to use their team’s standard abbreviation hashtag far more often than any other team’s, irrespective of sentiment. This identification was then confirmed by inspection of the text of a sample of the tweets. We therefore define a fan’s degree of support for one team as how many more times that team’s abbreviation is mentioned by the user compared to their second-most mentioned team. For each user, we aggregated all their tweets and counted the total number of times they mention each team. Here, we include as “fans” any user with a degree of two or more and treat everyone else as neutral. Having assigned fans to teams, we can then associate specific tweets with specific teams even when no team is mentioned, if other tweets from the same person make their allegiance clear.

To evaluate this method, we randomly selected 100 tweeters that our algorithm had predicted to be fans of various teams. We then examined the tweets in our collection from each person to determine which team they expressed support for, if any. We manually labelled them as supporters of a specific team, neutral or unclear (e.g. due to non-English tweets). Of the 100 people thus analyzed, 93% were correctly assigned to teams by the algorithm; 7% appeared to be neutral commentators who showed no clear preference for any team, and one was a spam account unrelated to football (but using a team hashtag to attract clicks). Our algorithm mis-assigned them to whichever team they happened to mention most often. In no case was a clear fan of one team assigned to any other, giving us strong confidence in the rest of our analysis.

====3.3 Filtering Tweets by Use of Swearing====

The use of Twitter by fans during matches results in an average of over 125,000 tweets per match-day containing a team’s hashtag. However, we are particularly interested in those tweets that contain not only indications of sentiment, but indications of a high-intensity sentiment. For this reason we filter the tweets that we collected using the stems of the two most common swearwords in the English language: “shit” and “fuck”.⁷ After filtering, our corpus consisted of an average of 6483 tweets per match-day, meaning that more than 1 in 20 tweets (5.36%) from football fans contain the words ‘shit’ or ‘fuck’ or their derivations.

⁷ Because we used these terms as stems, the filter also matched “shitting, batshit, shite, fucking, fucker, fucked...”

We further filtered these tweets manually to remove messages that were not full-on “fannishness.” So, for example, tweets asking about television coverage or discussing betting on the match were not assessed for sentiment. We then manually assessed the remaining tweets, with both authors coding the sentiment(s) in sets of tweets.

Once we had collected our corpus of tweets from fans watching matches we could begin the process of relating them to a narrative. That narrative is multi-level and consists of discourse about events, games and the English Premier League competition as a whole. The competition and the game are easily defined: the English Premier League runs from August to May each year and each of the 20 teams plays each of the others twice, once at home and once away. The winner is decided on the number of points acquired from the matches, and ties for position are decided using the number of goals a team has scored minus the number they have conceded.

Matches, too, are clearly bounded. We know which teams are involved, the ground at which they are playing, and the start and end time of the match.

Events are more difficult to define. There are some canonical events that are noted as part of the statistical record: for example, goals, fouls, bookings, free kicks and penalties are all recorded with an associated time stamp. However, some events are not part of the official record, despite being matters of significance to the fans. Take the example, shown later, of the Liverpool captain Steven Gerrard suffering a recurrence of an old injury, which threatened to prevent him taking part in the next few matches. We can assign a timestamp to this event, but we need to mine the event commentary in detail to infer what has taken place.

Even more challenging to identify are the events that occupy a timeline rather than a timepoint. For the sake of clarity we will call states that persist “fluents” and reserve the term “events” for those actions that change the state of a match. Examples of fluents include “the run of play”, an informal definition of which team is currently dominating in terms of possession.

====3.4 Complicating factors====

A number of factors make analyzing a typical Saturday afternoon’s football especially challenging.

First, numerous games are played simultaneously. Here, we’ve been considering the five or six Premiership matches being played from 3pm, but at that same moment, up to 36 other matches may be being played in the three other professional football divisions in England. There are also matches played in the (separate) Scottish league, not to mention other matches in other countries.

Second, there is a great variety in the response to different classes of event. The emotional impact of being awarded a free kick expires more rapidly than the emotional impact of scoring a goal from that; and the emotional impact of the goal expires less rapidly if that
goal has changed who is winning. Furthermore, events within a single match typically overlap to some degree and have their own duration. They are not discrete point-sources, as may be assumed in theoretical analysis.

Third, assigning fans to teams is made harder by the tendency of some fans to use the hashtags of their opposing team in order to get the attention of fans of that team, as opposed to being one of them. Insults and banter are only effective if their target is aware of them.

Finally, it requires some judgement to assess the degree of sincerity of messages, due to the use of sarcasm, irony etc. Both authors are native English speakers not unversed in such matters, but nonetheless, some messages may have been misinterpreted.

===4 Case studies===

The tweets that we examined for these case studies came from a number of matches in the English Premier League that took place in the first half of December. This window of time can roughly be considered as mid-season: fans have had around 15 league games to assess their team’s performance this season, but with over 20 games still to play, there is still plenty to play for. This is the time of year that matches begin to be referred to as “real six-pointers”⁸.

⁸ In the English football leagues, teams are awarded three points for a win, a point for a draw and no points if they lose. When teams are playing against opponents that are close to them in the league table it is important not only to win all three points, but also to deny the other team the opportunity to score any points. Hence the fact that these games are referred to – with more poetry than numeracy, perhaps – as “real six-pointers.”

The matches that we draw examples from are: Stoke City-Chelsea, Liverpool-West Ham (both 7/12/2013), Cardiff-West Bromwich Albion, Chelsea-Crystal Palace (14/12/2013) and Manchester United-West Ham United (21/12/2013). For context, the top four positions in the league on December 9th were occupied by Arsenal, Liverpool, Chelsea and Manchester City respectively. Manchester United were in (an unusually low) 9th place, just above Hull and Stoke. West Ham and Crystal Palace were both in the bottom four, just below West Brom and Cardiff. It is also worth noting that different clubs have very different numbers of fans, with average home attendances varying between 20,000 and 75,000. This is likely to be reflected in the number of tweets in our collection related to each club.

From the analysis of only those tweets containing swearing we observed that sentiment is expressed in a complex and sometimes counter-intuitive manner, several aspects of which we now discuss.

====4.1 “We’re Shit and We Know We Are”====

In the English Premier League it is not uncommon for fans to be highly critical of their own team. This criticism is levelled at individual players, the team as a whole, the manager and – very occasionally – the fans themselves.

Such criticism is common and – for some sets of fans in particular – accounts for the majority of swearing use in large segments of the match. For example, in the opening 42 minutes of the Liverpool vs West Ham United match (7th December), over half of the tweets we collected from Liverpool fans were gripes about their manager, Brendan Rodgers, and several team members. For example, the following tweets all arrived within a few minutes and all were sent by Liverpool fans:
15:17:57 @lfc Joe allen⁹, easily beaten un the middle of the park. Fuck Rodgers. #ynwa #lfc #LivWhu
15:21:41 Mignolet¹⁰ bails our shite defence out once again. #LFC
15:25:31 This is like the hull game, creating fuck all atm and mignolet keeping us in the game #LFC
15:26:17 Please learn to pass sterling¹¹! Or fuck off #lfc
15:27:17 This game is driving me mad. Just fucking score already damnit. #LFC
15:30:07 Amazes me that people don’t understand that we are shite, one player being world class doesn’t make a quality team #LFC
15:31:30 The fatc people on here are praising defenders and Allen means our attack is shite and nothing more. #LFC
15:32:28 For a team that focuses so much on passing, passings been shite today #lfc
15:33:28 Henderson¹², sterling are shit #lfc
15:34:18 Raheem Sterling knows fuck all. He should join Newcastle or Swansea. #lfc

⁹ Liverpool midfielder.
¹⁰ Liverpool goalkeeper.
¹¹ Liverpool winger.
¹² Liverpool midfielder.

When criticism is sparked by a particular event, fans may single out an individual for praise while criticising the team as a whole. For example, the tweet “Mignolet bails our shite defence out once again,” is simultaneously a whinge about the team and praise for an individual. Inversely, when Chelsea keeper Petr Čech let in an equaliser in the Chelsea-Stoke game (7th December), he was singled out for criticism by the Chelsea fans, as this example illustrates:
15:43:53 What the fuck was you doing Cech. The team has played brilliant and then you go and do that. FFS!! Come on boys. Heads up.. #CFC

Such examples demonstrate that:
• Fans of a given team are likely to use swearing in disapprobation of their own team, players or manager.
• We also note that fans in these English Premier League games were much more likely to express a negative sentiment that was intensified by swearing about their own team than about an opposing team.
• Therefore negative sentiment intensified by swearing provides strong evidence of a tweeter’s affiliation but, perhaps counterintuitively to someone not familiar with “terrace culture”, they are most likely to be affiliated with the team that they are criticising.

====4.2 Off, Off, Off: on Bad Sportsmanship and Bad Players====

During the Liverpool-West Ham game of 7th December, West Ham captain Kevin Nolan received a red card for deliberately stamping on Liverpool’s Jordan Henderson. This meant that he missed the rest of the match and – as it was his fifth red card of the season – he also received a three-match ban.

The tweets from both Liverpool and West Ham fans indicated strong disapprobation. Liverpool fans reacted predictably on Twitter:
16:39:29 Go on fuck off Nolan you prick #LFC
16:39:36 Kevin Nolan you fucking twat. #LFC
16:39:47 Fucking dirty bastard #LFC
16:40:02 NOLAN YOU FUCKING PRICK! #LFC
16:40:07 Fuck you nolan.#lfc
16:40:16 Nolan u dirty fuckin cunt! #LFC
16:40:42 Fuck off Nolan! Deserved red #LFC #PremierLeague

While we might usually expect a certain amount of dismay at the loss of the captain for three matches from a team’s fans, Kevin Nolan’s ban was greeted with an unexpected amount of positive sentiment by the West Ham fans.
16:39:18 FUCKING HAPPY DAYS KEVIN NOLAN IS GOING TO BE BANNED FOR 3 GAMES!!!!! GET IN THERE HAHAHA #thereisagod #whufc
16:39:34 thank fuck Nolan banned for a while #coyi #WHUFC
16:40:08 Nolans last game for us? Let’s fucking hope so! #WHUFC
16:40:38 Yes come on now he is banned Wahoo!fuck off Nolan u prick #coyi #whufc #whu

As we will see when considering humour (Section 4.6), this may be more to do with Kevin Nolan’s performance as a player than his infraction of the rules.

From this we can conclude that:
• Fans rarely criticise players from the opposing teams for poor performances but they will criticise them for foul play.
• Fans may criticise their own players for foul play and are apparently keen to do so if the player is performing poorly.

====4.3 Fuck’s Sake! v. Fuck, Yeah! Or Swearing Not Necessarily Considered Harmful====

As noted earlier, there is a tendency to consider swearing in social media messages as a sign that the author of that message is expressing a negative sentiment [4, 6]. However, inspection of the content of these tweets demonstrates that this is sometimes wrong, as these two tweets from Liverpool fans demonstrate:
16:04:57 I NEVER get to see #LFC and the #Oilers win on the same day but I’m feeling good about it today. Don’t let me down you fucks. #fucks
16:06:00 Fuckin get in reds, 2-0 #LFC

The tweet from the Oilers fan at 16:04 contains an affectionate mock threat, with the tongue-in-cheek nature of the tweet being reinforced by the use of a dafttag¹³. The more succinct “Fuckin get in reds, 2-0 #LFC” is a more straightforward expression of celebration.

¹³ Here we define “dafttag” as a hashtag that is added, usually to the end of a tweet, that communicates the tweeter’s sentiment in a manner that is self-parodying or an expression of the tweeter’s covert message content. These dafttags often indicate sarcasm, humour or other modes of communication and indicate to the reader that they should look beyond the tweet’s surface meaning.

However, we can pair that second tweet from a Liverpool fan with a tweet from a West Ham United fan in response to the same event:
16:05:20 We are fucked... Hello championship!¹⁴ Big Sam¹⁵ out!! #whufc

¹⁴ The division below the Premiership.
¹⁵ Sam Allardyce, the West Ham manager.

Further analysis reveals the frequency of differing phrases using the word ‘fuck’ both positively and negatively. Table 1 summarises the total frequency with which each phrase is used across all matches for the three weeks shown. The counts include variant word forms (e.g. “fuck sake”, “fuck’s sake”, “fucks sake”).

From this we can infer that:
• The same event, seen as positive by one tweeter and negative by another, can prompt tweets that contain swearing in both cases.
• While swearing may indicate that the sentiment of the tweeter is intense, it does not unambiguously demonstrate a positive or negative valence
to that sentiment, so we cannot universally classify swearing as positive or negative.

{| class="wikitable"
|+ Table 1: Frequency and usual sentiment of various phrases collected during three Saturdays in Dec 2013.
! Phrase !! 7/12 !! 14/12 !! 21/12 !! Usual sentiment !! Total
|-
| Fuck sake || 80 || 43 || 54 || rowspan="2" | Negative || rowspan="2" | 238
|-
| Fuck this/fuck that || 30 || 17 || 14
|-
| Fuck yeah || 11 || 3 || 5 || rowspan="3" | Positive || rowspan="3" | 244
|-
| Fucking get in || 28 || 8 || 11
|-
| thank fuck || 42 || 6 || 8
|}

====4.4 And Another Thing... Multiple Sentiments in 140 Characters or Less====

For a communication act that is limited to 140 characters, football fans’ tweets can display surprisingly rich and complex sentiments. For example, in the case of the Chelsea-Stoke City game (7th December), as Stoke equalized, one Chelsea fan managed to express disappointment, despair and hope in the same tweet:
15:45:00 Every fucking match something stupid gets #cfc in trouble. I’m hoping for quick response.

Likewise in the Liverpool–West Ham game, Liverpool fans were pleased with a West Ham own-goal that took Liverpool into the lead, but this didn’t prevent them from expressing their disappointment with their own team’s performance, particularly by the players Sterling and Allen:
15:42:39 Luck as fuck goal but I’ll take it after that Sterling miss. #LFC
15:44:18 Even though its an own goal we deserve the lead. Dominating the match but how shit is Joe Allen, seriously #LFC

Fans on Twitter also use simile, wordplay and allusion to comment on games. For example, this Chelsea fan is expressing displeasure at the scoreline against Crystal Palace, a team not generally thought of as strong opponents:
16:36:34 This is bollocks... we at home to Palace, not away to Barcelona, fucking painful just waiting for them to equalize. #CFC

This user alludes to the relative strengths of Chelsea, Palace and Barcelona¹⁶, the relative ease of playing at home versus away, the scoreline (a one goal difference between Chelsea and Palace) and the run of play, all within the space of a single tweet. The use of allusion – which relies on the reader’s understanding of various elements of context – makes this a particularly information-dense message.

¹⁶ Barcelona FC is widely regarded as playing some of the most beautiful football in the world, as well as being one of the most objectively successful teams. Crystal Palace, based in a London suburb, have been relegated from the English Premier League more often than any other team.

From these examples we can infer:
• Tweets, although brief, can contain multiple, sometimes oppositely-valenced sentiments.
• Individual tweets may therefore be too broad a unit of analysis if we wish to identify sentiment.

====4.5 Around the Grounds: The Story is not Restricted to the Match====

While in the main, fans tweet about their own team and the match that they are currently engaged in, occasionally they will tweet about other concurrent games. This makes automatic event detection a challenging task. A Liverpool fan watching the game against West Ham may tweet about an event in the Chelsea-Stoke City match; for example, when Oussama Assaidi, on loan from Liverpool to Stoke, scored the final goal in a match that ended Stoke City 3-2 Chelsea, the Liverpool tweeters reacted exuberantly:
16:49:04 YES STOKE I FUCKING LOVE YOU! And an ex liverpool player does it ahahahhahaahahhaaha #LFC
16:49:33 Fucking Assiadi go on lad! Haha doing his club a favour #lfc
16:49:55 Assaidi you absolute beaut!! Fuck you Chelsea!!! #Assaidi #LFC
16:50:05 #Assaidi you fucking beauty!! #LFC #cheers #ChelseaKiller
16:50:13 ASSAIDI YOU FUCKING BEAUTY!!!!!!!!!! #LFC

At the beginning of the Chelsea-Crystal Palace game the following week (14th December), one Chelsea fan tweeted the following:
15:03:48 Get drunk as fuck & wake up to Arsenal losing 6-3 & a chance to cut the lead to 2 points. Fuck yes. #CFC

For context, Chelsea and Arsenal are both London teams with an historical rivalry. They also both occupied places in the top three of the table throughout December and were jockeying for the upper hand.

From these examples it is possible to infer that:
• Fans tweet about games that their teams are not playing in.
• There appears to be a higher likelihood of this happening if there is either a positive link between the teams (e.g. a player is on loan to another team) or a negative link (e.g. a long-standing rivalry or a close position in the league table).
• Thus the “story” from a particular tweeter’s point of view may not be restricted to a given game.
Rather it is anything that affects their team’s position in the table, or that has some relationship to an affiliation or a rivalry with another club.

====4.6 Funny Old Game: Humour in Fan Tweets====

The British are proud of their ready wit, even (perhaps especially) in times of emotional stress. It is unsurprising, then, that we find a lot of humour in the tweets sent by English Premier League football fans. Humour is interesting from a storytelling point of view as it often relies on ambiguity or wordplay for its effect; thus we need to parse humorous twitterances with the same care that we parse humorous utterances.

Examples include the following from the Liverpool–West Ham match, where the first two goals came from West Ham own goals.
16:06:33 fuck sturridge¹⁷ ...this own goal is some player #LFC

¹⁷ Daniel Sturridge, striker for Liverpool and the English national team and seen in a generally positive light. Here, ‘fuck’ can be interpreted as ‘forget’ or ‘never mind’.

And where West Ham captain Kevin Nolan was expected to turn in a disappointing performance, this tweet was very widely retweeted:
15:59:01 Kevin Nolan is very adaptable. He is equally shit in a number of positions #rubbishplayer #lfcswhu #whufc

When West Ham pulled back a goal later in the match to bring the scoreline to 2-1, Liverpool fans offered the following:
16:35:58 #LFC in fuck this up "shock"
16:36:21 You know if Moses¹⁸ was a horse he’d be a Pritt Stick by now #fuckingawful #LFC

¹⁸ Victor Moses, Liverpool winger.

Likewise in the Chelsea–Stoke City game, fans reported some on-terrace humour. In the last decade, Chelsea have won at least one major national or international competition in all but one season. Stoke City, however, have only managed to win the Autoglass Trophy – a competition for teams in the bottom two divisions of the English professional football league – since 1972¹⁹.
15:26:39 #scfc fans: Ur gonna win fuck all
#cfc fans: U’ve never won fuck all
#scfc fans: Autoglass trophy we’ve won it 2 times #classicBanter

¹⁹ This is not the kind of trophy that Premier League teams often boast of winning.

From these examples we can infer that:
• Humour is an important part of fans’ storytelling both at games and online. Effective attempts at humour are picked up and passed around as either retweets or reports of stadium banter.
• Humour is not in and of itself a sign of positive-valenced sentiment. For example “LFC in fuck this up ‘shock’” and “If Moses was a horse he’d be a Pritt Stick” are both jokes, but the originators of these jokes are not happy – as evidenced by the hashtag #fuckingawful.
• Jokes can rely on apparent or actual denigration of a player or team, usually one’s own (“Fuck sturridge..this own goal is some player,” “Autoglass trophy we’ve won it 2 times.”). These are often examples of self-effacing humour from fans.
• Humorous tweets often carry a heavy freight of context and can be challenging to mine for sentiment without thorough understanding of that context.

====4.7 Sick as a Parrot, Fluffy as a Kitten: the Use of Creative Language====

Football has a shared lexicon of formal and informal terminology²⁰ that has reached the level of cliché through widespread use on terraces and in newspaper coverage.

²⁰ “A real six-pointer,” “A game of two halves,” “Sick as a parrot,” etc.

However, despite a fairly stable football cliché lexicon, football fans on social networks are inventive in their use of language. We see several examples of tmesis – the insertion of a word into the middle of another word or phrase.
16:48:46 Un-fucking-believable! #CFCLive
16:49:03 Assi-fucking-idi²¹ !!!! what a strike son #scfc

²¹ A misspelling of “Assa-fucking-idi”: Oussama Assaidi, Stoke City winger on loan from Liverpool.

We also see relatively uncommon forms of swearing, for example the imperative form “motherfuck”:
15:53:22 Mother FUCK this scoreline...
And the noun “fuckery”:
16:49:47 This is some major fuckery.... #cfc
As well as deliberate misspellings such as “bollox”, possibly used to turn a swear word into a “minced” variant²².

²² A minced oath is a deliberate misspelling, mispronunciation or other mis-rendering of a word in order to render a euphemism.

This Liverpool fan uses an unusual but evocative pair of similes to convey the change in emotional state that they have experienced through the recent portions of the game:
16:40:58 From being 2-1 and shitting our pants to being 4-1 and feeling as fluffy as kittens. #LFC

We also see this example of ironic litotes from a West Ham fan who is commenting on another Twitter user’s assessment of the performance of Manchester United and comparing it with their own team’s performance:
15:47:51 “@[USER REDACTED]: Haha man united lost again! They are so shite” wish #WHUFC was as shit as them
From these examples we must infer that:
• While tweets are short and ephemeral, they nevertheless contain rhetorical figures such as simile, litotes, irony, metaphor and allusion. They also contain wordplay in forms that include punning, euphemism and the use of uncommon parts of speech.
• As a result, we must be aware that tweets are not always simple utterances, and that they should be assessed with care.
4.8 “I Fucking Hate Football Sometimes”: the Importance of Assumed Context in Matchday Tweets
Football inspires a devoted following partly because it creates shared experiences on multiple timescales. For example, an event in a match (“Did you see Assaidi’s goal?”), a match itself (“Can you believe we’re 3-2 up against Chelsea?”) and a competition as a whole (“I still think we’ll be facing relegation come April”) provide many opportunities for bonding over shared joys and sorrows.
However, that intense sharing can baffle newcomers because so much common context is taken for granted. Both at the grounds themselves and on Twitter, fans assume that fellow fans will be aware of a team’s recent performance history, a player’s injury worries, or the likelihood of winning a competition. This may in part be deliberate obscurantism in order to highlight in-group vs. out-group differences. By demanding deep knowledge of a team’s history before admitting someone to an inner social group of ‘true fans’, that group claims a stronger social identity.
This can lead to tweets that are impossible to assess for sentiment unless the assessor is aware of this same context. For example, from the Liverpool vs West Ham United game, these tweets come from Liverpool fans:
16:12:27 ahh fuck Gerrard #lfc
16:37:51 luis suarez you lil shit #lfc
16:40:40 Luis. Fucking. Suarez. Again. #LFC
16:42:54 Suarez Is UnFuckingBelievable #LFC #YNWA
In the absence of context it may appear that the Liverpool fans are venting their displeasure at Steven Gerrard and Luis Suárez. However, examining the context shows us that at around 16:12, Steven Gerrard picked up a hamstring injury – a recurrent problem that has plagued his career – and that Suárez scored twice, at 16:37 and 16:40.
There are also many tweets where the sentiment is not hard to infer, but where the context has been stripped by the tweeter. For example, during the Chelsea vs Stoke City game, shortly after Stoke scored to take the game to 3-2 in their favour, Chelsea fans responded:
16:49:59 I fucking hate football sometimes #CFC
16:50:34 Fuck Fuck Fuck #CFC #STKvCHE
In the Crystal Palace game, it is possible to infer that this fan is responding to some form of commentary:
15:16:38 “Dominated possession”?? Fuck off. #CPFC
But it is impossible to know who made the original comment, or what they were commenting upon.
From these examples we can infer that:
• Fans assume that their audiences share their contextual information.
• It is not possible to provide a thorough analysis of sentiment on Twitter if we do not also mine for context.
4.9 Speed is of the essence: Urgency and entropy
When a significant event happens, such as a goal being scored, fans tend to communicate their responses with great urgency. This is shown by an increase in the number of tweets sent in the following minutes and by a simultaneous shortening of those tweets. To investigate this, we examined the goals from three randomly-selected matches from our collection. We selected the tweets sent by both sets of fans for five minutes immediately before each goal, and for five minutes immediately after. Although there is considerable variation between the matches and the goals, the pattern is clear: in the relatively uneventful period before a goal, few tweets are sent (mean = 128.58 per minute in these cases) and they are relatively long (mean = 78.04 characters). In the minutes following a goal, fans from both teams respond rapidly with a surge in the number of tweets (mean = 448.1 per minute, an increase of 248%) that are typically short (down to 62.8 characters, a drop of 19.5%). The details of these events are shown in Table 2.
Typical examples of longer tweets sent when no goal has been scored recently include (from West Ham–Manchester Utd, 21st December):
15:32:35 Stoke, Southampton, West Brom etc all go to Old Trafford and look decent. We go there and still look shit! #WHUFC #GoingDown
16:10:00 This is the Manchester United we have been watching all these years. Scaring shit out of their opponents when they attack.#Mufc.#GGMU
16:13:57 Fuck sakes, Welbeck injured! What sort of
training is Moyes putting these boys to, getting injuries like picking cherries? #MUFC
16:04:08 Why the fuck is Taylor not on the wing?! Fat Sam aint got a bloody clue. Useless fat fuck #COYI #whufc
Typical examples of shorter tweets sent just after a goal include:
15:36:18 FUCK YEAH ADNAN!! #MUFC
16:40:12 That was fucking offside..#MUFC
and the terse:
16:31:06 3-0 fuckers #MUFC
From these results we can infer that:
• Fans respond rapidly to significant events by sending short focussed messages.
• Fans send many messages when a goal is scored.
5 Conclusions
Through a careful analysis of a collection of tweets about football that contain swearing, we have shown that bad language is not always negative; that the wider context is often crucial to interpret meaning; and that, perhaps counter-intuitively, some of the strongest sentiments expressed are self-critical.
The examples in this paper demonstrate that swear words are used to intensify both positive and negative sentiments (‘fucking beauty’ vs ‘fucking painful’). However, even when the sentiment seems apparent, widespread irony makes it necessary to consider context before interpreting the valence of the sentiment. As noted earlier, the tweet “Luis. Fucking. Suarez. Again.” actually expresses positive sentiment from a Liverpool fan in response to Suárez scoring, but the automated sentiment analysis systems discussed earlier [5, 6] would see “fucking” as the only sentiment-carrying word of the message and so denote the whole message as negative.
This is one case where fans implicitly assume that their audiences share their context, meaning that apparently ambiguous expressions of sentiment will in fact be correctly interpreted by their intended audience. We therefore suggest that it is often impossible to accurately analyze sentiment on Twitter if the context of utterances is not also analyzed.
The narratives that football fans tell are simpler to interpret than many of the stories played out on social media. For example, there is a very low incentive for a fan to give a false signal, and we have access to a ground truth – in the form of a match report – that indicates the likely valence of the sentiments expressed at different points in the narrative.
However, the narrative itself is open-ended – even in the close season fans talk about their hopes for next year and their memories of previous triumphs and disasters. The events are also complex and overlapping – a goal within a match within a season, for example.
We have seen that fans of English Premier League teams often swear about or at their own team, and relatively rarely about an opposing team or match officials. It may be that swearing is being used as a means of demonstrating affiliation to a particular group, or to demonstrate greater passion about their own team and an apparent indifference to all others. In either case, intense expressions of negative sentiment actually provide strong evidence of a tweeter’s affiliation towards the target of their criticism. Note, however, that an event which is seen as positive by one tweeter and negative by another can prompt tweets that contain swearing from both. This means the storytellers can be regarded as “unreliable narrators” – likely to disparage what outsiders may see as neutral or positive in that team’s performance.
Humour is an important part of fans’ storytelling both at games and online. Effective attempts at humour are picked up and passed around as either retweets or reports of stadium banter. However, humour is not in and of itself a sign of positive-valenced sentiment. English fans demonstrate a striking ability to joke even (perhaps especially) when things are going against their wishes. Humorous tweets often carry a heavy freight of context and can be challenging to mine for sentiment without thorough understanding of that context.
Even very complex information, including mixed sentiments, can be expressed in fewer than 140 characters. In response to significant events, such as goals, fans tend to respond very rapidly with shorter, more focused messages than usual; but at other times, tweets can be packed densely with information and contain rhetorical figures, deliberate ambiguities and novel wordplay. It is important therefore to be aware that tweets are not always simple utterances despite their brevity, and that they should be analyzed and assessed with care.
In light of these observations, we would recommend that any automated sentiment analysis system should explicitly consider the nature of the evidence of sentiment, including the wider context. If the only evidence is from syntax or a lexicon, great care should be taken. The simple assignment of fans to teams used here is sufficient, but this process could be improved by considering tweets from each account over a longer period of time, as fans tend not to change loyalties. Match analysis may be simpler if only one match is being played. This is more often true for internationals and cup final matches [2].
Time      Event             Prior tweets  Prior mean length  Post tweets  Post mean length
Chelsea vs Stoke City, 7/12/2013
15:10:00  Chelsea goal               898             82.346         3752            52.215
15:42:00  Stoke goal                 392             81.719         1402            73.932
16:08:00  Stoke goal                 646             80.517         2962            54.520
16:11:00  Chelsea goal              1155             66.762         3342            53.621
16:47:00  Stoke goal                 655             79.988         2033            64.483
Southampton v Newcastle, 14/12/2013
15:27:00  Newcastle goal              70             94.900          601            64.057
16:23:00  Southampton goal           214             61.243          458            67.590
Manchester United v West Ham United, 21/12/2013
15:25:00  Manchester goal            400             83.933         2198            55.720
15:36:00  Manchester goal            709             79.609         3640            63.106
16:29:00  Manchester goal            546             70.495         3136            65.466
16:39:00  West Ham goal             1387             76.958         1122            76.647
Mean                               642.9              78.04       2240.5             62.85
Table 2: Number and lengths of tweets collected for five minutes before and after goals in three matches. Counts are totals over each five-minute window; lengths are mean characters per tweet.
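The summary figures quoted in Section 4.9 can be re-derived from the rows of Table 2. The sketch below is an illustrative check rather than the original analysis code: it assumes that each count is a total over a five-minute window and that the Mean row is an unweighted average over the eleven goals, and the names `ROWS`, `WINDOW_MINUTES` and `summarise` are hypothetical.

```python
# One tuple per goal in Table 2:
# (prior tweet count, prior mean length, post tweet count, post mean length)
ROWS = [
    (898, 82.346, 3752, 52.215),
    (392, 81.719, 1402, 73.932),
    (646, 80.517, 2962, 54.520),
    (1155, 66.762, 3342, 53.621),
    (655, 79.988, 2033, 64.483),
    (70, 94.900, 601, 64.057),
    (214, 61.243, 458, 67.590),
    (400, 83.933, 2198, 55.720),
    (709, 79.609, 3640, 63.106),
    (546, 70.495, 3136, 65.466),
    (1387, 76.958, 1122, 76.647),
]
WINDOW_MINUTES = 5  # assumed length of each before/after window


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def summarise(rows, window_minutes=WINDOW_MINUTES):
    """Per-minute tweet rates, mean lengths and percentage changes."""
    prior_rate = mean(r[0] for r in rows) / window_minutes
    post_rate = mean(r[2] for r in rows) / window_minutes
    prior_len = mean(r[1] for r in rows)  # unweighted mean of per-goal means
    post_len = mean(r[3] for r in rows)
    return {
        "prior_rate": prior_rate,
        "post_rate": post_rate,
        "rate_change_pct": 100 * (post_rate - prior_rate) / prior_rate,
        "prior_len": prior_len,
        "post_len": post_len,
        "len_change_pct": 100 * (post_len - prior_len) / prior_len,
    }


for name, value in summarise(ROWS).items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the rates, mean lengths and percentage changes agree with the figures quoted in the text to within rounding.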
References
[1] Y. Baruch and S. Jenkins. Swearing at work and permissive leadership culture: When anti-social becomes social and incivility is acceptable. Leadership & Organization Development Journal, 28(6):492–507, Apr. 2007.
[2] D. Corney, C. Martin, and A. Goker. Spot the ball: Detecting sports events on Twitter. In European Conference on Information Retrieval ECIR2014, pages 449–454, Amsterdam, Holland, 2014.
[3] N. Daly, J. Holmes, J. Newton, and M. Stubbe. Expletives as solidarity signals in FTAs on the factory floor. Journal of Pragmatics, 36(5):945–964, May 2004.
[4] M. Hu and B. Liu. Mining and summarizing customer reviews. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Seattle, Washington, USA, Aug. 2004.
[5] B. Liu. Sentiment analysis and subjectivity. In N. Indurkhya and F. J. Damerau, editors, Handbook of Natural Language Processing. Chapman & Hall, 2nd edition, 2010.
[6] D. Maynard, K. Bontcheva, and D. Rout. Challenges in developing opinion mining tools for social media. In Proceedings of @NLP can u tag #usergeneratedcontent?! Workshop at LREC 2012, Turkey, 2012.
[7] B. A. Plester and J. Sayers. “Taking the piss”: Functions of banter in the IT industry. Humor: International Journal of Humor Research, 20(2), Jan. 2007.
[8] R. Stephens and C. Allsop. Effect of manipulated state aggression on pain tolerance. Psychological Reports, 111(1):311–321, Aug. 2012. PMID: 23045874.
[9] M. Thelwall. Fk yea i swear: Cursing and gender in MySpace. Corpora, 3(1):83–107, 2008.
[10] G. van Oorschot, M. van Erp, and C. Dijkshoorn. Automatic extraction of soccer game events from Twitter. In Proc. of the Workshop on Detection, Representation, and Exploitation of Events in the Semantic Web, 2012.
[11] S. Zhao, L. Zhong, J. Wickramasuriya, and V. Vasudevan. Human as real-time sensors of social and physical events: A case study of Twitter and sports games. arXiv preprint arXiv:1106.4300, 2011.