<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Predicting the Subscription Status of Twitch.tv Users</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Konstantin Kobs</string-name>
          <email>kobs@informatik.uni-wuerzburg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Potthast</string-name>
          <email>martin.potthast@uni-leipzig.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matti Wiegmann</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Albin Zehe</string-name>
          <email>zehe@informatik.uni-wuerzburg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benno Stein</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Hotho</string-name>
          <email>hotho@informatik.uni-wuerzburg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Julius-Maximilians Universität Würzburg</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Leipzig University</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>We investigate whether the subscription status of active users of Twitch can be inferred from their activity patterns in the chats of streamers. To enable a diversity of solutions to this problem, this task was advertised as an ECML-PKDD discovery challenge 2020, called Chat Analytics for Twitch (ChAT). Four participants submitted their working prediction models, which were evaluated at our site. The winning approach achieved an F1 score of 0.343, outperforming the baseline by a significant margin. The most salient conclusion that can be drawn at this time is that interaction behavior plays a crucial role in solving this task, meriting further analysis into this direction.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
      </p>
      <p>
        The popularity of game streaming and corresponding platforms, such as Twitch (https://www.twitch.tv), for entertainment and professional e-sports is on the rise. The basic function of such platforms is to enable a screencast along with commentary to a live audience. While most streamers focus on streaming and commenting on the gameplay of games they are currently playing, other video and audio content is also streamed, though at a much smaller volume. The audience, in turn, can interact with the streamers in a channel’s chat and by other means. This enables streamers to engage with their audience in real time in order to build a followership and, eventually, to monetize their channel. The audience can donate to streamers and subscribe to the channel by paying a monthly fee, which is typically between five and twenty-five dollars per channel subscription at the time of writing. This comes with exclusive chat and channel features such as channel-specific emotes, which are still or moving images approximately the size of a standard emoticon. A channel’s earnings are split 50/50 between the streamer and Twitch, incentivizing both to convert watching users to subscribed users [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Targeted advertising can be a useful tool to accomplish this, but depends on identifying users open to subscribing to a channel. If a classifier can be developed that predicts the subscription status of a user-channel combination (based on the chat comments and activity patterns of the user in the channel), then applying this model to currently unsubscribed user-channel combinations can yield potential targets for advertisement. At the ECML-PKDD Discovery Challenge “Chat Analytics for Twitch” (ChAT; https://events.professor-x.de/dc-ecmlpkdd-2020/), the task was to build such a binary classification system.
      </p>
      <p>
        In order to enable the task, we provide a large training dataset consisting of
over 400 million public Twitch comments, published along with this novel task.
Additionally, we constructed a test dataset with certain characteristics that can be
used as both a benchmark for comparison and as a basis for future research and
analysis. The training dataset was provided to participants, who developed their
prediction models at their own site and then deployed their trained model to our
online evaluation platform TIRA [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Each team was assigned a virtual machine to facilitate the installation of required dependencies; at no point were participants given direct access to the test data. TIRA enables blind evaluation by preventing outside access while a piece of software is executed. A participant’s software was executed remotely on the test data and its outputs were recorded and evaluated. The virtual machines on which the participants deployed their software were archived for reproducibility purposes. After the submission deadline, all data were made publicly available.
      </p>
      <p>This paper describes the task and the evaluation data for this challenge,
summarizes the submissions, and presents an analysis of their performance. Our
contributions are threefold: (1) An original task to predict the subscription status
of Twitch users at given channels based on their interactions and chat messages.
(2) A training dataset with Twitch chat messages, and a test dataset with
characteristics useful for further analysis. (3) An overview and evaluation of the
approaches that were submitted as part of the challenge. The remainder of the
paper is organized as follows: Section 2 gives an overview of related work.
Section 3 introduces the task and the datasets as well as the evaluation metric and
the baseline. Section 4 describes the approaches developed by the participants,
including the employed features and models. Section 5 analyzes the results of
the submitted approaches.
</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        As Twitch is one of the most popular streaming platforms for games,
numerous studies have been conducted to understand the use, the impact, and the
challenges of this new form of media. While many publications examine social,
cultural, and economic dynamics of the platform [
        <xref ref-type="bibr" rid="ref1 ref14 ref24 ref5 ref8">1, 5, 8, 14, 24</xref>
        ], implications for
the media and community landscape [
        <xref ref-type="bibr" rid="ref12 ref13 ref22 ref23 ref7">7, 12, 13, 22, 23</xref>
        ], and its language [
        <xref ref-type="bibr" rid="ref15 ref18">15, 18</xref>
        ],
only a few also utilize machine learning methods to automatically process the chat
and interaction data [
        <xref ref-type="bibr" rid="ref15 ref2">2, 15</xref>
        ].
      </p>
      <p>
        Kobs et al. investigate the usefulness of emotes as indicators for the sentiment
of chat messages [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. They build an emote sentiment lexicon via crowdsourcing
to improve sentiment analysis models based on dictionaries and convolutional
neural networks, showing that common word-based dictionaries cannot capture
the sentiment in the setting of chat messages on Twitch due to the platform-specific
slang. Barbieri et al. define two Twitch-specific tasks: predicting emotes
that are likely to be used in a message, and detecting troll messages, for which
they also utilize emotes to generate ground truth labels [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Both tasks focus
on the text of chat messages. Multiple experiments show that a Bidirectional
LSTM [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] performs best for both tasks.
      </p>
      <p>
        In this work, we introduce a novel task: predicting the subscription
status of users based on their channel interaction. Such predictions are valuable
knowledge that can be used in a subscription recommendation setting. Many
recommendation algorithms, such as collaborative filtering, try to predict whether
a user is interested in an item (in our case a channel) by correlating her
preferences with similar users [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. In our setting, users who have subscribed to similar channels might be recommended the channels to which other similar users are subscribed. However, it is not trivial to include personal interactions in such algorithms. For example, Twitter relies on a graph-based recommendation system that models which users a given user follows [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. YouTube recommends
individual videos instead of entire channels. Twitch subscriptions cost a monthly fee,
while other social media platforms such as Facebook or Instagram allow users to
follow or subscribe to channels free of charge. Note that approaches outside of
the domain of social media are also related: E.g., in a digital newspaper setting,
potential subscribers are identified using different user engagement features, such
as the number of articles read, or the average time spent on an article [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
Instead of a single newspaper, Twitch hosts hundreds of channels that a user can
subscribe to. The direct interaction of users with their channels’ hosts based on
the chats and comments hence plays an important role.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Subscription Prediction for Twitch</title>
      <p>We define the task of subscription prediction for Twitch as follows: Given the
chat messages of a user in a channel on Twitch (including metadata such as
timestamps and the currently streamed game), predict whether or not the user
is subscribed to the channel. We instructed the participants not to augment their
models using data other than the training data we supplied. The only exception
to this rule is the use of pretrained models or dictionaries of emotes and their
text representations. We ensured that the training and test datasets are disjoint
in terms of user-channel combinations in order to prevent leakage of ground
truth.<sup>3</sup></p>
      <sec id="sec-3-1">
        <title>Twitch Crawl</title>
        <p>A large dataset of nearly all publicly available Twitch comments in January 2020
was crawled using Twitch’s official API. Only channels labeled as English were
considered. All user-channel combinations for which the subscription status
changed during the recorded time period were omitted, i.e., if a user subscribed
to or unsubscribed from the channel during January 2020. For each user-channel
combination for which at least one comment was recorded, the following
metadata was stored:</p>
        <list list-type="bullet">
          <list-item><p>Name of the channel (anonymized).</p></list-item>
          <list-item><p>Name of the commenting user (anonymized).</p></list-item>
          <list-item><p>Whether or not the user is subscribed to the channel.</p></list-item>
          <list-item><p>All public chat messages of the user in the channel in the recorded time
period, each containing the timestamp when the user commented, the game
that was played in the channel when the user commented in the form of a
string label, and the chat comment/message itself.</p></list-item>
        </list>
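        <p>As an illustration of the record layout listed above, the following minimal sketch models one user-channel combination in Python. The class and field names (ChatMessage, UserChannelRecord, and so on) and the example values are hypothetical; the actual file format of the released dataset may differ.</p>
        <preformat>
```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class ChatMessage:
    timestamp: datetime  # when the user commented
    game: str            # string label of the game streamed at that time
    text: str            # raw comment, including emote text representations


@dataclass
class UserChannelRecord:
    channel: str         # anonymized channel name
    user: str            # anonymized user name
    subscribed: bool     # ground-truth label for this user-channel combination
    messages: list[ChatMessage] = field(default_factory=list)


# Hypothetical example record (all values are made up for illustration).
record = UserChannelRecord(
    channel="channel_0001",
    user="user_0042",
    subscribed=False,
    messages=[ChatMessage(datetime(2020, 1, 15, 20, 30), "Some Game", "awesome LUL")],
)
```
        </preformat>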
        <p>
          Many messages contain so-called emotes, i.e., still or moving images
approximately the size of a standard emoticon. Emotes are very popular on Twitch
and are used to express the emotional state of a user while watching [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Every emote has a text representation that is present in the text field. For example, in
the message “awesome LUL”, ‘LUL’ is the text representation of an emote
that indicates general laughter and amusement.
        </p>
        <p>The crawled dataset was split into training and test datasets. The training
dataset has a size of approximately 37 GB. In what follows, we describe the
sampling strategy for the test dataset and report on a brief corpus analysis.
</p>
      </sec>
      <sec id="sec-3-2">
        <title>Test Dataset</title>
        <p>For the test dataset, we sampled 90,000 user-channel combinations with
corresponding metadata, resulting in 636,452 messages. To ensure that different user
and channel activities are covered in the test dataset, we employed the following
sampling procedure. Given the number of messages in the dataset per user and
channel, we categorized each user and channel into three activity classes based
on the number of comments: the 25 % of users/channels with the lowest message
counts are considered to be of low activity, and the 25 % with the largest message
counts are considered to be of high activity. All other users and channels are
considered to be of normal activity. We sampled 10,000 user-channel combinations
for each combination of user and channel activity, yielding 90,000 user-channel
combinations in total for the test dataset.</p>
        <p><sup>3</sup> By mistake, emotes accessible to already subscribed users were not removed; one
of the participants exploited this “feature” (without notifying us), rendering their
approach infeasible in practice. Nevertheless, it provides an interesting baseline.</p>
        <p>[Table 1: Key figures of the training dataset. Rows: channels a user comments in; channels a user is subscribed to; comments per user in channel; comments per user in channel (subscribed); comments per user in channel (not subscribed); comments in a channel. Column: Mean.]</p>
        <p>These user-channel combinations were removed from the training dataset.
Additionally, for a randomly sampled 50 % of users in the test set, we removed
their comments in other channels as well, such that half of the users are not
present in the training dataset at all. This allowed us to analyze whether models
perform better for already known users, even though no messages in the desired
channel were present.
</p>
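        <p>The activity-based sampling of the test set can be sketched as follows. This is a minimal illustration and not the organizers’ code: the function names (activity_classes, sample_test_pairs), the input format (a dict mapping (user, channel) pairs to message counts), the handling of ties at the quartile boundaries, and the fixed random seed are all assumptions.</p>
        <preformat>
```python
import random
from collections import Counter, defaultdict


def activity_classes(counts):
    """Label each key 'low', 'normal', or 'high' by message-count quartile."""
    ranked = sorted(counts, key=counts.get)  # ascending by message count
    n = len(ranked)
    low_end, high_start = n // 4, n - n // 4
    classes = {}
    for rank, key in enumerate(ranked):
        if rank >= high_start:
            classes[key] = "high"      # top 25 %
        elif rank >= low_end:
            classes[key] = "normal"    # middle 50 %
        else:
            classes[key] = "low"       # bottom 25 %
    return classes


def sample_test_pairs(pair_counts, per_cell, seed=0):
    """Draw up to `per_cell` (user, channel) pairs for each activity cell."""
    user_counts, channel_counts = Counter(), Counter()
    for (user, channel), n_messages in pair_counts.items():
        user_counts[user] += n_messages
        channel_counts[channel] += n_messages
    user_cls = activity_classes(user_counts)
    channel_cls = activity_classes(channel_counts)

    # Group pairs into the 3 x 3 grid of (user activity, channel activity).
    cells = defaultdict(list)
    for user, channel in pair_counts:
        cells[(user_cls[user], channel_cls[channel])].append((user, channel))

    rng = random.Random(seed)
    return {
        cell: rng.sample(pairs, min(per_cell, len(pairs)))
        for cell, pairs in cells.items()
    }
```
        </preformat>
        <p>With per_cell set to 10,000, the nine cells would yield the 90,000 combinations described above, provided every cell contains enough pairs.</p>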
      </sec>
      <sec id="sec-3-3">
        <title>Training Dataset</title>
        <p>After removing the test data and the messages of 50 % of the test users in all
other channels, the training dataset contained 29,539,420 user-channel
combinations and a total of 410,686,442 public Twitch comments. Table 1 gives an overview of
key figures of the data. The training dataset has records of 146,537 channels
and 7,923,774 users. On average, each user has written comments in fewer than
four channels, while being subscribed to an average of 1.5 channels. Among the
29 million user-channel combinations in the training data, the user was subscribed in 2,368,323 (8.02 %)
and not subscribed in 27,171,097 (91.98 %). Subscribed users have
a higher mean comment count than non-subscribed users. The large difference
in standard deviation between subscribed and non-subscribed users can be
explained by the use of bots that are not subscribed to channels, but comment
very often in order to engage viewers, or to notify users and the streamer about
certain events.</p>
        <p>Figure 1 depicts the number of messages that a user has sent or a
channel has received. Both histograms show exceedingly active users and channels
having sent and received an extensive number of comments. The user with the
most comments, “streamelements”, is a bot used by many streamers to send
notifications about the channel to the chat. The channel that received the most
comments, “xqcow”, belongs to a professional gamer and streamer.</p>
        <p>
          Regarding the content, the word cloud in Figure 2 gives an impression of the
words used in the comments. Emotes play an important role in Twitch comments,
e.g., LUL and PogChamp are heavily used. The Twitch chat is
case-sensitive and only displays emotes if they are typed correctly; emotes mostly
contain capital letters. Therefore, most words in Figure 2 containing capital
letters depict emotes; “normal” words are mostly written in lower case. Besides
emotes, ASCII-style emoticons such as :) or :D are popular on Twitch, too. It is
not surprising that gaming and online slang words, such as “u” as a short form
for “you”, “stream”, “play”, and “lol”, are often used. Kobs et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] provide a
more detailed analysis of the usage and activity patterns of Twitch comments.
        </p>
        <p>
          Given the 90,000 user-channel combinations including their metadata from the
test dataset, the participants’ models were supposed to predict whether or not
the user is subscribed to the channel. Submissions were evaluated using the F1
measure, which is the harmonic mean of precision and recall with respect to
subscribed user-channel combinations. Owing to the high class imbalance between
subscribed and unsubscribed users, a majority baseline yields an F1 score of 0.0.
We further provide a random baseline which assigns class labels according to
their distribution found in the training dataset (8.02 % subscribed, 91.98 %
not subscribed). Finally, the submission ItsBoshyTime provides a baseline
based on the usage of subscriber-only emotes.
        </p>
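        <p>To make the evaluation measure and the two simple baselines concrete, the following sketch computes F1 = 2PR / (P + R) with respect to the subscribed class, where P is precision and R is recall. It is an illustration only: the function names, the boolean label encoding (True for subscribed), the convention of returning 0.0 when no true positive exists, and the random seed are assumptions rather than details of the official evaluation script.</p>
        <preformat>
```python
import random


def f1_subscribed(y_true, y_pred):
    """F1 with respect to the positive class (True = subscribed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # assumed convention: no true positives gives F1 = 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_baseline(n):
    """Always predict the majority class, 'not subscribed'."""
    return [False] * n


def random_baseline(n, p_subscribed=0.0802, seed=0):
    """Predict 'subscribed' with the training-set prior of 8.02 %."""
    rng = random.Random(seed)
    return [p_subscribed > rng.random() for _ in range(n)]


# Toy example: one true positive, one false positive, one false negative.
y_true = [True, False, False, True, False]
y_pred = [True, True, False, False, False]
toy_f1 = f1_subscribed(y_true, y_pred)
```
        </preformat>
        <p>Because the majority baseline never predicts the positive class, it has no true positives, which is why its F1 score is 0.0.</p>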
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Survey of Submitted Approaches</title>
      <p>Of the 23 registered teams, only four submitted their approaches, three of
which also submitted a notebook paper describing their approach. Given the
importance of emotes on Twitch, and for a bit of fun, we asked participants
to choose one of the most common Twitch emotes as their team name. All
approaches rely on machine learning methods (see Table 2 for an overview),
but model the input data in different ways. Table 3 gives an overview of the
features used, categorized into four groups: (1) stylometric features describing the
writing style of the users, (2) user activity features modeling the behavior of the
users, (3) channel activity features modeling the behavior of the channels, and
(4) interaction features modeling the relationship between a user and a channel.
In the following, the approaches are reviewed in greater detail.</p>
      <p>
        VoyTECH by Bayer and Zouzias [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is the winning approach. It is based
on gradient boosting trees (CatBoost [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]) with hand-engineered features that
model the user and channel behavior without considering the content of the chat
text. Since the approach does not use the textual content, the chat messages are
not preprocessed. Instead, the input is modeled completely by the 26 features
shown in Table 3. Some features represent superficial stylometric information
that encodes the message length, some interaction features encode interaction
duration, but most features model the activity of users and channels. It is
noteworthy that VoyTECH uses the game as an anchor to assess the relationship
between unseen users and channels, where eleven of the 26 features indicate the
relationship between games, channels, and users. The authors subsample a
validation dataset from the training data that is structurally similar to the test
data, having a balanced distribution over user and channel activity levels as well
as a balanced number of known users. To find the optimal configuration of their
model, Bayer and Zouzias carry out several experiments with varying features,
differently-sized subsets of the training data, and diverse hyperparameters. They
conclude that the best model on their validation dataset uses as many features
and as much data as possible, as opposed to using a specific subset of the data
or using only a selection of features.
      </p>
      <p>
        CoolStoryBob by Gärtner et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] uses a feature-based gradient
boosting model (XGBoost [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]), but focuses on representing the users’ texts rather
than their activity or their interaction with the channels. Preprocessing of the chat
messages comprises lower-casing; removing the most and the least
frequent words, common colloquial terms, stop words (NLTK’s stopword list [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]),
and single-character tokens; replacing emojis and emoticons with corresponding
text tokens; lemmatization (WordNet); and collapsing repetitions. The approach
combines three sets of features: count vectors of the game titles, TF-IDF vectors
computed from the chat messages with regard to subscription status, and
handcrafted numerical features primarily describing stylometric information as shown
in Table 3. The authors subsample the original training dataset to balance the
subscription status of the user-channel combinations. To find the optimal
configuration of their model, the authors carry out a feature-value analysis and
compare different model configurations with five-fold cross-validation.
      </p>
      <p>ItsBoshyTime exploits some shortcomings of our dataset. Since subscribers
of channels can use channel-specific emotes, the usage of such emotes from the
target channel reveals the ground truth about the user in question. While this
approach is impractical, it provides an interesting baseline since not all
subscribed users make use of the channel-specific subscriber-only emotes available
to them. To extract subscriber emotes, a dictionary of
channels and their emotes was constructed via a heuristic applied to
the messages in the training dataset: If a word begins with a lowercase letter
and contains either a capital letter or a number, it is assumed to be an emote.
While most globally available Twitch emotes begin with a capital letter (e.g.,
LUL or PogChamp), subscriber emotes have a lower-case prefix based on
the username, which is usually automatically generated by Twitch.<sup>5</sup> Based on
this heuristic, an emote list is built for each channel in the training data.
If a new user-channel combination is to be predicted, it is checked whether the
channel has already been seen. If the channel is unknown, the approach defaults
to predicting “not subscribed”; if an emote list for the channel is available, it
is matched with the user’s list of used emotes. In case of a match, the user is
probably subscribed to the respective channel.</p>
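      <p>The emote heuristic and the resulting decision rule can be sketched as follows. This is a reconstruction from the description above, not the participants’ code. Several details are assumptions: whitespace tokenization, building the emote lists from all training messages of a channel (the description does not say whether only subscribers’ messages were used), the function names, and the emote name chanHype in the usage note, which is made up.</p>
      <preformat>
```python
from collections import defaultdict


def looks_like_subscriber_emote(word):
    """Lowercase first letter, followed somewhere by a capital letter or digit."""
    return word[:1].islower() and any(
        ch.isupper() or ch.isdigit() for ch in word[1:]
    )


def build_emote_lists(training_messages):
    """training_messages: iterable of (channel, message_text) pairs."""
    emotes = defaultdict(set)
    for channel, text in training_messages:
        for word in text.split():
            if looks_like_subscriber_emote(word):
                emotes[channel].add(word)
    return emotes


def predict_subscribed(channel, user_messages, emotes):
    """Predict True if the user used an emote from the channel's list."""
    if channel not in emotes:
        return False  # unknown channel: default to "not subscribed"
    used = {word for text in user_messages for word in text.split()}
    return bool(used.intersection(emotes[channel]))
```
      </preformat>
      <p>For example, after building the lists from the two training messages “nice play chanHype” and “LUL PogChamp” in one channel, only chanHype is stored for that channel, so a new user who writes “gg chanHype” there is predicted as subscribed, while a user who writes “gg LUL” is not.</p>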
      <p>
        StinkyCheese by Loures et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is based on a neural network, combining
an LSTM [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] with hand-crafted features to model the verbosity, the
participation, and the attendance of users towards channels. The chat messages of
user-channel combinations are not preprocessed, but concatenated and fed to an
LSTM layer for encoding. The resulting textual encoding is concatenated with
hand-crafted features covering all four of our categories, but each less extensively
than in the other submissions. The concatenated feature vector is fed through a
fully connected layer for classification. In order to handle the large dataset, the
training dataset is split into chunks of 100,000 user-channel combinations, and
the model is trained on one of these chunks. To improve the model, the authors
optimize the hyperparameters on a second chunk of 100,000 user-channel combinations.
      </p>
      <p><sup>5</sup> https://help.twitch.tv/s/article/subscriber-emote-guide</p>
      <table-wrap>
        <label>Table 4</label>
        <table>
          <thead>
            <tr><th>Rank</th><th>Team</th></tr>
          </thead>
          <tbody>
            <tr><td>1</td><td>VoyTECH</td></tr>
            <tr><td>2</td><td>CoolStoryBob</td></tr>
            <tr><td>3</td><td>ItsBoshyTime</td></tr>
            <tr><td>4</td><td>StinkyCheese</td></tr>
            <tr><td></td><td>Random Baseline</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-5">
      <title>Results and Discussion</title>
      <p>
        The performance achieved by the participants and the random baseline is shown
in Table 4. VoyTECH outperforms the competition by a fair margin.
      </p>
      <p>
        Relevant Features. A general trend we identify is that activity and interaction
between games, channels, and users are more important than textual features.
Gärtner et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] report that there is little difference in word usage between
subscribed and not subscribed users. They also find that content features are of
little significance in their model. In addition, the winning approach VoyTECH
does not use content features at all, but only models interaction and stylometrics
when it represents activity. StinkyCheese, which relies most on content by
using an LSTM to directly incorporate the chat message contents, achieves the
weakest performance. The top two approaches, VoyTECH and
CoolStoryBob, explore the influence of activity groups on their performance while using
very similar models. The authors of VoyTECH additionally resampled their
validation dataset based on activity.
      </p>
      <p>Generalization to Unseen Users. As described in Section 3.2, 50 % of the users in
the test set do not appear in the training dataset, enabling an analysis of whether
there are any differences in prediction performance between known and new
users. Table 5 shows the F1 scores for all submissions on the test set, dependent
on whether users are or are not part of the training data (Known Users and New
Users, respectively). For each approach, except StinkyCheese, users already
present in the training data were more often classified correctly than new users.
The drop in performance is the largest for VoyTECH, as it relies on many
user-centered features. Given only the messages of a user in the target channel
and thus missing additional information from the user’s interactions with other
channels, the extracted features are less representative. Still, VoyTECH
achieves better performance than the other approaches.</p>
      <p>Results by User and Channel Activity. Table 5 also shows the performance of the
submitted approaches based on different channel and user activities, respectively,
as defined in Section 3.2. For the most part, the higher the
activity of a user or a channel, the better the models can predict the subscription
status of a user-channel combination.</p>
      <p>A more fine-grained activity analysis can be found in Table 6, considering all
combinations of activity classes of users and channels. Again, the performance is
mostly best for highly active users and channels and worst for users and channels
with low activity. Most extracted features are based on the interaction of users
and channels as well as their content. Having few interactions leads to less data
and thus less robust features for a given user-channel combination.</p>
      <p>Ensemble Approaches. Given that the three approaches (excluding ItsBoshyTime) rely on different
features and classifiers, it is interesting to explore
ensemble classification. We evaluated four different ensembles:</p>
      <list list-type="order">
        <list-item><p>Majority vote, where users were classified as subscribed to a channel if at
least two approaches say so,</p></list-item>
        <list-item><p>An “any” ensemble, which classifies users as subscribed to a channel if at
least one approach says so,</p></list-item>
        <list-item><p>An “all” ensemble, which classifies users as subscribed to a channel if all
approaches say so, and</p></list-item>
        <list-item><p>A “VoyTECH or else” ensemble, which follows the classification of the
best-performing approach, VoyTECH, unless both other approaches
disagree with it.</p></list-item>
      </list>
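      <p>The four ensemble rules listed above can be written down in a few lines. The following sketch is a literal reading of the list, not the evaluation code used for Table 7; the function names and the dict-based vote format (team name mapped to True for “subscribed”) are assumptions.</p>
      <preformat>
```python
def majority_vote(votes):
    """Subscribed if at least two of the three approaches say so."""
    return sum(votes.values()) >= 2


def any_vote(votes):
    """Subscribed if at least one approach says so."""
    return any(votes.values())


def all_vote(votes):
    """Subscribed only if all approaches say so."""
    return all(votes.values())


def best_or_else(votes, best="VoyTECH"):
    """Follow `best` unless every other approach disagrees with it."""
    others = [vote for name, vote in votes.items() if name != best]
    if all(vote != votes[best] for vote in others):
        return others[0]  # the others agree with each other against `best`
    return votes[best]


# Example: VoyTECH says "subscribed", the other two say "not subscribed".
votes = {"VoyTECH": True, "CoolStoryBob": False, "StinkyCheese": False}
decisions = {
    "majority": majority_vote(votes),
    "any": any_vote(votes),
    "all": all_vote(votes),
    "best_or_else": best_or_else(votes),
}
```
      </preformat>
      <p>Read literally for three binary classifiers, the “VoyTECH or else” rule returns the same label as the majority vote; the sketch keeps it as a separate function only to mirror the list, and the original ensemble may have been defined differently.</p>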
      <p>All ensembles lead to overall worse F1 scores than the VoyTECH approach
by itself. However, as can be expected, the “any” ensemble has a notably higher
recall at a lower precision, while the “all” ensemble has higher precision at lower
recall. Thus, these ensembles may still be relevant when optimizing for one of
these metrics. The full results for all ensembles are given in Table 7.
</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>
        This paper presents the results of the ECML-PKDD ChAT Discovery
Challenge 2020. It outlines the task, the datasets, the approaches, as well as the results
achieved by the submissions. Our analysis of the models covers different user and
channel activity groups, as well as the generalizability towards new users. We are
convinced that there is potential to further improve the predictions, for example
by adding message contents to the winning submission VoyTECH to raise
the model fidelity, or by using ideas from StinkyCheese to better predict
new users. While most approaches work best with highly active users and
channels, CoolStoryBob seems to work particularly well with normally active
users. Combining their features and ideas into future models may further
improve the prediction quality. In addition, adding Twitch-specific features such as
the sentiment of Twitch comments (e.g., extracted using the technique described
by Kobs et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]) appears promising. In this challenge, the channel and user
names were anonymized for privacy. However, the name of a user may give hints
about the subscription status, e.g., for users who include their favorite game in
their screen name. Altogether, our challenge takes a first step towards solving
the task of predicting the subscription status of users at channels, giving rise to
new opportunities for marketing on game streaming platforms.
      </p>
      <sec id="sec-6-1">
        <title>Acknowledgments</title>
        <p>We thank all participating teams for submitting their models and papers, and the
ECML-PKDD organizers for hosting our shared task as a discovery challenge.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. L.</given-names>
            <surname>Anderson</surname>
          </string-name>
          .
          <article-title>Watching People Is Not a Game: Interactive Online Corporeality, Twitch.tv and Videogame Streams</article-title>
          .
          <source>Game Studies</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ),
          July
          <year>2017</year>
          .
          ISSN
          <issn>1604-7982</issn>
          . URL http://gamestudies.org/1701/articles/anderson.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Espinosa</given-names>
            <surname>Anke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ballesteros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Soler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Saggion</surname>
          </string-name>
          .
          <article-title>Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes</article-title>
          .
          <source>In Proceedings of the 3rd Workshop on Noisy User-generated Text</source>
          , pages
          <fpage>11</fpage>
          -
          <lpage>20</lpage>
          , Copenhagen, Denmark,
          <year>2017</year>
          .
          Association for Computational Linguistics. doi: 10.18653/v1/W17-4402. URL http://aclweb.org/anthology/W17-4402.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Bayer</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zouzias</surname>
          </string-name>
          .
          <article-title>Team voyTECH: User Activity Modeling with Boosting Trees</article-title>
          . In
          <string-name>
            <given-names>K.</given-names>
            <surname>Kobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zehe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          , editors,
          <source>Proceedings of the ECML-PKDD Discovery Challenge: Chat Analytics for Twitch (ChAT 2020)</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          .
          <article-title>XGBoost: A Scalable Tree Boosting System</article-title>
          .
          <source>In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , pages
          <fpage>785</fpage>
          -
          <lpage>794</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Churchill</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Xu</surname>
          </string-name>
          .
          <article-title>The Modem Nation: A First Study on Twitch.TV Social Structure and Player/Game Relationships</article-title>
          .
          <source>In 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom)</source>
          , pages
          <fpage>223</fpage>
          -
          <lpage>228</lpage>
          , Oct.
          <year>2016</year>
          . doi: 10.1109/BDCloud-SocialCom-SustainCom.2016.43.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Davoudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zihayat</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>An</surname>
          </string-name>
          .
          <article-title>Time-aware subscription prediction model for user acquisition in digital news media</article-title>
          .
          <source>In Proceedings of the 2017 SIAM International Conference on Data Mining</source>
          , pages
          <fpage>135</fpage>
          -
          <lpage>143</lpage>
          . SIAM,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Faas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dombrowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Young</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Watch Me Code: Programming Mentorship Communities on Twitch.tv</article-title>
          .
          <source>Proceedings of the ACM on Human-Computer Interaction</source>
          ,
          <volume>2</volume>
          (CSCW):
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          , Nov.
          <year>2018</year>
          . ISSN 2573-0142. doi: 10.1145/3274319. URL https://dl.acm.org/doi/10.1145/3274319.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Gandolfi</surname>
          </string-name>
          .
          <article-title>To watch or to play, it is in the game: The game culture on Twitch.tv among performers, plays and audiences</article-title>
          .
          <source>Journal of Gaming &amp; Virtual Worlds</source>
          ,
          <volume>8</volume>
          (
          <issue>1</issue>
          ):
          <fpage>63</fpage>
          -
          <lpage>82</lpage>
          , Mar.
          <year>2016</year>
          . ISSN 1757-191X, 1757-1928. doi: 10.1386/jgvw.8.1.63_1. URL http://openurl.ingenta.com/content/xref?genre=article&amp;issn=1757-191X&amp;volume=8&amp;issue=1&amp;spage=63.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gärtner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Theissler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          .
          <article-title>Detecting Potential Subscribers on Twitch: A Text Mining Approach with XGBoost - Discovery challenge ChAT: CoolStoryBob</article-title>
          . In
          <string-name>
            <given-names>K.</given-names>
            <surname>Kobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zehe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          , editors,
          <source>Proceedings of the ECML-PKDD Discovery Challenge: Chat Analytics for Twitch (ChAT 2020)</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Zadeh</surname>
          </string-name>
          .
          <article-title>WTF: The Who to Follow Service at Twitter</article-title>
          .
          <source>In Proceedings of the 22nd International Conference on World Wide Web</source>
          , pages
          <fpage>505</fpage>
          -
          <lpage>514</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          .
          <article-title>Long short-term memory</article-title>
          .
          <source>Neural Computation</source>
          ,
          <volume>9</volume>
          (
          <issue>8</issue>
          ):
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Woodcock</surname>
          </string-name>
          .
          <article-title>“It's like the Gold Rush”: The Lives and Careers of Professional Video Game Streamers on Twitch.tv</article-title>
          .
          <source>Information, Communication &amp; Society</source>
          ,
          <volume>22</volume>
          (
          <issue>3</issue>
          ):
          <fpage>336</fpage>
          -
          <lpage>351</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Woodcock</surname>
          </string-name>
          .
          <article-title>The impacts of live streaming and Twitch.tv on the video game industry</article-title>
          .
          <source>Media, Culture &amp; Society</source>
          ,
          <volume>41</volume>
          (
          <issue>5</issue>
          ):
          <fpage>670</fpage>
          -
          <lpage>688</lpage>
          ,
          July
          <year>2019</year>
          . ISSN 0163-4437. doi: 10.1177/0163443718818363. URL https://doi.org/10.1177/0163443718818363. Publisher: SAGE Publications Ltd.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Woodcock</surname>
          </string-name>
          .
          <article-title>“And Today's Top Donator is”: How Live Streamers on Twitch.tv Monetize and Gamify Their Broadcasts</article-title>
          .
          <source>Social Media + Society</source>
          ,
          <volume>5</volume>
          (
          <issue>4</issue>
          ):
          <fpage>2056305119881694</fpage>
          , Oct.
          <year>2019</year>
          . ISSN 2056-3051. doi: 10.1177/2056305119881694. URL https://doi.org/10.1177/2056305119881694. Publisher: SAGE Publications Ltd.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zehe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernstetter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chibane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pfister</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tritscher</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          .
          <article-title>Emote-Controlled: Obtaining Implicit Viewer Feedback Through Emote-Based Sentiment Analysis on Comments of Popular Twitch.tv Channels</article-title>
          .
          <source>ACM Transactions on Social Computing</source>
          ,
          <volume>3</volume>
          (
          <issue>2</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>34</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>E.</given-names>
            <surname>Loper</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Bird</surname>
          </string-name>
          .
          <article-title>NLTK: The Natural Language Toolkit</article-title>
          .
          <source>arXiv preprint cs/0205028</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Loures</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Araújo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Martins</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Vaz de Melo</surname>
          </string-name>
          .
          <article-title>StinkyCheese: Chat-Based Model for Subscription Classification</article-title>
          . In
          <string-name>
            <given-names>K.</given-names>
            <surname>Kobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zehe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          , editors,
          <source>Proceedings of the ECML-PKDD Discovery Challenge: Chat Analytics for Twitch (ChAT 2020)</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Olejniczak</surname>
          </string-name>
          .
          <article-title>A Linguistic Study of Language Variety Used on Twitch.tv: Descriptive and Corpus-Based Approaches</article-title>
          . page 6.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gollub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          .
          <article-title>TIRA Integrated Research Architecture</article-title>
          . In N. Ferro and C. Peters, editors,
          <source>Information Retrieval Evaluation in a Changing World, The Information Retrieval Series</source>
          . Springer, Sept.
          <year>2019</year>
          . ISBN 978-3-030-22948-1. doi: 10.1007/978-3-030-22948-1_5.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>L.</given-names>
            <surname>Prokhorenkova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gusev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vorobev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Dorogush</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Gulin</surname>
          </string-name>
          .
          <article-title>CatBoost: Unbiased Boosting with Categorical Features</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          , pages
          <fpage>6638</fpage>
          -
          <lpage>6648</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Schafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Frankowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Herlocker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Sen</surname>
          </string-name>
          .
          <article-title>Collaborative filtering recommender systems</article-title>
          .
          <source>In The adaptive web</source>
          , pages
          <fpage>291</fpage>
          -
          <lpage>324</lpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Woodcock</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          .
          <article-title>The Affective Labor and Performance of Live Streaming on Twitch.tv</article-title>
          .
          <source>Television &amp; New Media</source>
          ,
          <volume>20</volume>
          (
          <issue>8</issue>
          ):
          <fpage>813</fpage>
          -
          <lpage>823</lpage>
          , Dec.
          <year>2019</year>
          . ISSN 1527-4764. doi: 10.1177/1527476419851077. URL https://doi.org/10.1177/1527476419851077. Publisher: SAGE Publications.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Woodcock</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          .
          <article-title>Live Streamers on Twitch.tv as Social Media Influencers: Chances and Challenges for Strategic Communication</article-title>
          .
          <source>International Journal of Strategic Communication</source>
          ,
          <volume>13</volume>
          (
          <issue>4</issue>
          ):
          <fpage>321</fpage>
          -
          <lpage>335</lpage>
          , Aug.
          <year>2019</year>
          . ISSN 1553-118X. doi: 10.1080/1553118X.2019.1630412. URL https://doi.org/10.1080/1553118X.2019.1630412. Publisher: Routledge.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>On crowdsourced interactive live streaming: a Twitch.tv-based measurement study</article-title>
          .
          <source>In Proceedings of the 25th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video - NOSSDAV '15</source>
          , pages
          <fpage>55</fpage>
          -
          <lpage>60</lpage>
          , Portland, Oregon,
          <year>2015</year>
          . ACM Press. ISBN 978-1-4503-3352-8. doi: 10.1145/2736084.2736091. URL http://dl.acm.org/citation.cfm?doid=2736084.2736091.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>