An empirical biometric-based study for user identification
   from different roles in the online game League of Legends
                 Valmiro Ribeiro da Silva, Márjory Da Costa-Abreu¹
                  ¹Departamento de Informática e Matemática Aplicada
                 Universidade Federal do Rio Grande do Norte (UFRN)
                                 Natal – RN – Brazil
                                 marjory@dimap.ufrn.br

    Abstract. The popularity of computer games has grown exponentially in the
    last few years. In some games, players can choose to play with different
    characters from a pre-defined list, exercising distinct roles in each match.
    Although such games were created to promote competition and self-improvement,
    there are several recurrent issues. One that has so far received little
    attention is the problem of “account sharing”, in which a player pays more
    experienced players to progress in the game on his/her account. The companies
    running those games tend to punish this behaviour, but this specific case is
    hard to identify. The aim of this study is to use a database of mouse and
    keystroke dynamics biometric data of League of Legends players as a case study
    to understand which specific characteristics a player will keep (or not) when
    playing different roles and distinct characters.

1. Introduction
Online games have become very popular and diverse since their beginnings in the 1980s, and
each gaming device offers several biometric modalities that can be exploited, such as gait,
keystroke dynamics, mouse dynamics and touch-screen dynamics. The diversity of input data
is vast, and it makes the game a unique experience for the player.
         Even though all the previously listed biometrics are used for the same purpose
in the gaming universe, they behave very differently from how they do in traditional
security and authentication applications. Thus, if we intend to investigate the identity
predictability of game users using these modalities, it is important to understand how
they differ from the traditional approaches [da Silva Beserra 2017, Camara 2017].
        As a simple example of how a traditional authentication task differs from game
authentication using biometric data, take the keystroke dynamics modality in a continuous
verification scenario:
     • In a traditional verification problem, the user’s behaviour is expected to vary very
       little while typing an e-mail.
     • In a game verification problem, on the other hand, the user’s behaviour is expected
       to change, and that change depends on the configuration of the game he/she is
       playing, e.g. the role in the game, the abilities he/she chose to use, the character
       he/she is playing with and so on.
        The second case cannot be considered the same as the first because, although
both are verification problems and use the same base data, the user’s behaviour is
different, which requires the security system to model it in a different way. To
the best of our knowledge, no other work has tried to investigate this specific problem.
       Thus, this paper aims to investigate what the real differences are (if there are
any) in biometric behaviour when the security problem changes from traditional
systems to the gaming universe. We have chosen to investigate a group of users of the
online game League of Legends, analysing keystroke dynamics and mouse dynamics data from
the same users playing with different characters and roles in the game.

2. Biometric modalities used in desktop based game playing
The market for e-games is huge and, with the recent advancement of virtual reality, the
range of consoles (the hardware needed to play games) has increased greatly. The
devices used to play range from a simple keyboard and mouse to very expensive
virtual reality glasses. However, the computer-based setup is still the most popular, as it
is indisputably the cheapest [da Silva Beserra et al. 2016].
        From a security point of view, each kind of console has different
vulnerabilities, but the “black-box” type, which is bought and used without installing
any software, is, to a certain extent, more secure. In private computer-based games,
the devices that can be used to play are limited, but the possibility of a user playing
with another user’s account is more evident.
       Since we are using League of Legends as our case study, the modalities
chosen for investigation are mouse and keystroke dynamics, because both peripherals
must be used together during matches.
        Keystroke dynamics refers to the unique timing patterns embedded in an individual’s
typing, which usually develop in a personal way, hence their use as a biometrics-based
identification modality. Processing of such data includes extracting keystroke timing
features such as the duration of a key press and the time elapsed between
successive key presses [Bergadano et al. 2002, Banerjee and Woodard 2012].
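        As an illustration only, the two timing features mentioned above can be computed
from a list of key events. The event format (key, press time, release time, in seconds)
and the function name are assumptions made for this sketch; they are not the format used
in the cited works.

```python
from typing import List, Tuple

# Hypothetical event format: (key, press_time_s, release_time_s)
KeyEvent = Tuple[str, float, float]


def keystroke_timing_features(events: List[KeyEvent]):
    """Return (hold_times, flight_times) for a sequence of key events.

    hold time   = release - press of the same key (duration of a key press)
    flight time = press of the next key - press of the current key
                  (time elapsed between successive key presses)
    """
    ordered = sorted(events, key=lambda e: e[1])  # order by press time
    hold_times = [release - press for _, press, release in ordered]
    flight_times = [
        ordered[i + 1][1] - ordered[i][1] for i in range(len(ordered) - 1)
    ]
    return hold_times, flight_times


# Toy data, not taken from the database used in this paper.
events = [("Q", 0.00, 0.08), ("W", 0.30, 0.42), ("R", 0.90, 1.00)]
holds, flights = keystroke_timing_features(events)
```

Here `holds` has one value per key press and `flights` one value per pair of consecutive
presses, so a match with n key presses yields n hold times and n - 1 flight times.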
        Mouse dynamics refers to the characteristic movement speeds and click frequencies
produced by a user operating the mouse. The move speed is how fast the user moves the mouse
in each of the 8 mapped directions, and the click frequency is the number of clicks the user
performs in a time interval [Bours and Fullu 2009].
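        A minimal sketch of how such features could be derived from raw pointer samples
is given below. The sample format, the 45-degree sector boundaries and the use of screen
coordinates (y growing downwards) are assumptions of this example, not a description of
how the cited works mapped the 8 directions.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical sample format: (time_s, x_px, y_px), with y growing downwards
# as in typical screen coordinates.
MouseSample = Tuple[float, float, float]

# Sector labels in counter-clockwise order starting at "Right" (angle 0).
SECTORS = ["Right", "Up + Right", "Up", "Up + Left",
           "Left", "Down + Left", "Down", "Down + Right"]


def direction_of(dx: float, dy: float) -> str:
    """Map a displacement to one of 8 directions (45-degree sectors)."""
    angle = math.degrees(math.atan2(-dy, dx)) % 360.0  # flip y: up is positive
    return SECTORS[int(((angle + 22.5) % 360.0) // 45.0)]


def mean_speed_per_direction(samples: List[MouseSample]) -> Dict[str, float]:
    """Average speed (pixels/second) of the movements in each direction."""
    speeds = defaultdict(list)
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dx, dy, dt = x1 - x0, y1 - y0, t1 - t0
        if dt <= 0 or (dx == 0 and dy == 0):
            continue  # skip duplicate timestamps and stationary samples
        speeds[direction_of(dx, dy)].append(math.hypot(dx, dy) / dt)
    return {d: sum(v) / len(v) for d, v in speeds.items()}


def click_frequency(click_times: List[float], duration_s: float) -> float:
    """Clicks per minute over an interval of the given duration."""
    return 60.0 * len(click_times) / duration_s
```

Directions in which the user never moved are simply absent from the returned dictionary;
a real feature extractor would have to decide how to encode such missing values.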
       Since we are using League of Legends as our case study, it is important to
understand how keystroke and mouse dynamics are used in the context of the game. The
next subsection introduces the basics of the game and how our modalities are used
in that context, followed by a subsection reviewing work related to mouse and
keystroke dynamics.

2.1. League of Legends
League of Legends is a Multiplayer Online Battle Arena (MOBA) game. The game is
based around matches between two teams of (normally) five players each, where each team
tries to destroy the main base of the other. Before each match starts, each player chooses
a champion to play, which is an avatar that already exists in the game, with
predetermined statistics (stats) and skills. Two players on the same team cannot choose
the same champion.
         Each champion has four unique skills: three common skills and one
ultimate skill. Skills can grant passive or active abilities, and are activated
by the keys ’Q’, ’W’, ’E’ and ’R’, the last one being used for the ultimate ability.
        Each team member follows one of the defined roles during a match:
     • Top Laner: also known as “top”, this player starts at the top of the map, and is
       usually a melee attacker;
     • Jungler: this player spends most of his time defeating jungle monsters in order
       to gain bonus statistics for the team. Champions with high mobility usually take
       this role;
     • Mid Laner: also known as “mid”, this player starts in the middle of the map, and
       uses his skill set to create combos to deal damage. Champions with synergistic
       skill sets usually take this role;
     • Carry: also known as ADC (attack damage carry), this player is responsible for
       taking down buildings and clearing minion waves. Ranged champions with high
       damage usually take this role;
     • Support: this player starts in the bottom lane and is responsible for supporting
       the ADC and, later, the whole team. Champions with good supporting abilities or
       tanks usually take this role.
        The main League of Legends inputs selected for the analysis in this
paper, from both keyboard and mouse dynamics, can be described as follows:
     • Keystroke dynamics: ’Q’, ’W’, ’E’, ’R’ (for the unique skills) and ’SPACE’ (used
       to make the camera follow the player’s champion);
     • Mouse dynamics: moving the character (pointing and clicking on an empty space
       using the right mouse button), basic attacks (clicking on an enemy using the right
       mouse button) and target skills (using the left mouse button).
        As stated previously, to the best of our knowledge, no other work has
investigated the individual variations of keystroke dynamics and mouse dynamics
(biometrics) in the context of online games. Section 2.2 presents the main works
that use mouse and keystroke dynamics.

2.2. Keystroke dynamics, mouse dynamics and game-related work
Keystroke dynamics is a much older biometric modality than mouse dynamics, and thus
the number of databases available is larger. This is expected because the use of the mouse
is closely associated with the personal computer, whereas keystrokes have existed since
the use of Morse code.
        In [Idrus et al. 2013] a database containing soft biometrics and keystroke
dynamics from 110 volunteers from France and Norway is presented, where users were
classified using a Support Vector Machine (SVM) with an EER (Equal Error Rate) of 21% to
4% when the soft biometrics were added. The work was expanded in [Idrus et al. 2015]
using a fusion approach, where the SVM algorithm was used to classify the fused data,
with an EER of 10%.
      In [Lv et al. 2008] a new approach to emotion recognition using pressure sensor
keyboards was described. Fear, happiness, anger, sadness, surprise and neutral emotions
were tested, with an EER of 12.02% when using only traditional keystroke methods to
classify the subjects with the KNN algorithm.
       In [Thanganayagam and Thangadurai 2015], a database was collected and various
fusion approaches to keystroke dynamics were applied. Each user was allowed to choose
their preferred username and password during the enrolment process and was asked to
type one fixed text fifteen consecutive times. An EER of 9% was obtained using SVM and
combined features.
         A login method for accessing computer systems using mouse dynamics was
described by [Bours and Fullu 2009]. A total of 28 users performed a fixed task of moving
the mouse between two lines. They were classified using the Levenshtein distance, also
known as edit distance, to calculate similarities, with an EER of 26.8%.
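        For readers unfamiliar with the edit distance, the standard dynamic-programming
formulation is sketched below. How the mouse movements were encoded as sequences in the
cited work is not reproduced here; the direction strings in the usage note are purely
hypothetical.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

For example, if two mouse traces were encoded as the direction strings "RRUL" and "RUUL",
`levenshtein("RRUL", "RUUL")` would return 1, since a single substitution separates them.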
       Pattern-growth-based mining was used in [Shen et al. 2012] to extract frequent
behaviour segments and obtain stable mouse characteristics, which were then fed to
classification algorithms to perform continuous user authentication. A total of 22 users
performed Internet surfing, word processing, online chatting and programming for 30
minutes. The best result was an EER of 1.49% using a One-Class SVM detector.
       The literature contains few multimodal systems using mouse
and keystroke biometric data. Additionally, work related to online games is very limited.
        An approach using game-play activities was proposed in [Chen and Hong 2007]
to address the account sharing problem. The idle time distribution of a player in-game
was shown to be a representative feature, and the RET scheme, which is based on the
Kullback-Leibler divergence between idle time distributions, was proposed for user
identification. The results showed that the RET scheme achieves higher
than 90% accuracy with a 20-minute detection time given a 200-minute history size.
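        The divergence itself can be illustrated as follows. This is not the RET scheme,
only the Kullback-Leibler divergence between two histograms of idle times; the bin edges,
the add-one smoothing and the toy values are assumptions of this sketch.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Normalised histogram with add-one smoothing so that no bin is empty."""
    counts = [1.0] * (len(edges) - 1)
    for v in values:
        for k in range(len(edges) - 1):
            if edges[k] <= v < edges[k + 1]:
                counts[k] += 1.0
                break  # values outside all bins are ignored
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(p || q) in nats; q must be non-zero wherever p is non-zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy idle times in seconds (hypothetical, for illustration only).
edges = [0.0, 1.0, 5.0, 30.0]
history = histogram([0.2, 0.5, 0.7, 2.0, 3.5, 12.0], edges)
session = histogram([0.3, 0.4, 2.5, 4.0, 20.0, 25.0], edges)
score = kl_divergence(session, history)
```

Note that the divergence is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)`
generally differ, so the order of the history and the observed session matters.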
        According to [Yampolskiy and Govindaraju 2006], the behaviour of a player in a
match can be used as a metric for identification in some cases. The authors used poker
as a case study, calculating the percentage of folds, calls, checks, raises, re-raises and
all-ins, and using the Euclidean distance to calculate similarity in order to verify the
identities of 30 players, with an EER of 22.67%.
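        The similarity measure in this kind of approach can be sketched in a few lines.
The three profiles below are invented for illustration and are not taken from the cited
work; each one lists the share of folds, calls, checks, raises, re-raises and all-ins.

```python
import math
from typing import Sequence


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two equally long feature vectors."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical profiles: folds, calls, checks, raises, re-raises, all-ins
enrolled = [0.40, 0.25, 0.15, 0.12, 0.05, 0.03]
observed = [0.38, 0.27, 0.15, 0.12, 0.05, 0.03]
impostor = [0.10, 0.20, 0.10, 0.30, 0.20, 0.10]

genuine_score = euclidean_distance(enrolled, observed)
impostor_score = euclidean_distance(enrolled, impostor)
```

A verification system would accept the claimed identity when the distance falls below a
threshold; the choice of that threshold determines the trade-off summarised by the EER.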
        For this work we have used the biometrics database collected in
[da Silva Beserra et al. 2016] using League of Legends as a case study. Data from
56 different users were collected, using the same type of keyboard and mouse for all
volunteers. Of these, 18 users played more than once, sometimes using different
characters and/or positions. Our analysis will focus on this group.
        The goal in [da Silva Beserra et al. 2016], and later in [da Silva Beserra 2017] and
[Camara 2017], was to use the database for identification. For this purpose, the
WEKA software was used to run machine learning algorithms that try to correctly
identify each user. The best result combining keystroke and mouse dynamics in
[da Silva Beserra 2017] and [Camara 2017] was 90.77%, using the Random Forest
algorithm, as shown in both works.


       Each sample collected in these works contains data for 33 different features:
     • 13 keystroke features:
           – Three combinations of keys, grouped by the distance between keys: C1 (Q W,
             W E, E R), C2 (Q E, W R) and C3 (Q R), also called combos;
           – Frequency (per minute of match) of each key pressed (FQ, FW, FE, FR
             and FSPACE);
           – Latency of each key pressed (Q, W, E, R and SPACE).
     • 20 mouse features:
           – Move speed in the 8 directions (’Down’, ’Down + Left’, ’Left’, ’Up +
             Left’, ’Up’, ’Up + Right’, ’Right’ and ’Down + Right’), represented by D1,
             D2, D3, D4, D5, D6, D7 and D8, respectively;
           – Acceleration in each direction, represented by AD1, AD2, AD3, AD4,
             AD5, AD6, AD7 and AD8, respectively;
           – Frequency and latency of right and left clicks, represented by CFR and CFL
             (for frequency) and CTR and CTL (for latency).

3. Experimental and statistical analysis
Numerous online games are marketed around the idea of “different characters for different
people”, and these games lead players to do one of two things:
     • Always pick the same character for every match, or;
     • Pick a different character for each match.
       The first point may suggest that using the same character for every match leads
players to stay in the same role, but that is not always true. Similarly, for the second
point, inferring that changing character always changes the player’s role is not
correct. Both cases can lead to the idea that characters define how the player behaves,
but this idea may not hold when we are dealing with biometrics. As suggested in
[Leavitt et al. 2016], we cannot assume that a player can be represented by the
characters he/she uses.
         Table 1 shows the 18 users who played more than one match in our database, the
champions they used and the roles they played. Users 20 and 48 are the most
representative, because they played at least four of the five roles and played every match
with a different character. The other users in this group are also valuable because some
of them changed roles between matches, while the others remained in the same role, even
when they changed characters. User 55, for example, played both matches in the top lane,
first using a melee tank champion (Gnar) and then using a ranged marksman (Kennen),
while user 16 also played his first match using a melee tank (Darius) and then a ranged
marksman (Lucian), but played in different roles.
        In order to examine whether a sample x is statistically similar to another sample y,
we used the Mann-Whitney test, which tests the null hypothesis that the data in x and y are
samples from continuous distributions with equal medians, against the alternative that
they are not. The test assumes that the two samples are independent, and x and y can have
different lengths [Hart 2001]. This test can be particularly useful when behavioural effects
are being studied [Tallarida and Murray 1987]. The Mann-Whitney test is equivalent to the
Wilcoxon rank-sum test. Other statistical tests were discarded because the conditions
required to perform them were not always satisfied.
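        To make the procedure concrete, a self-contained sketch of the two-sided test is
given below. It uses the normal approximation with a tie correction and no continuity
correction, which is an assumption of this example and not necessarily the variant used
to produce the results in this paper; established implementations such as
scipy.stats.mannwhitneyu should be preferred in practice, especially for small samples.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic of x, approximate p-value). Tied values receive
    average ranks and the variance is corrected for ties. Both samples
    must be non-empty.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1  # pooled[i:j] is a group of tied values
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all pooled values are identical
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p


def rejects_null(x: Sequence[float], y: Sequence[float], alpha: float = 0.05) -> bool:
    """True when the null hypothesis is rejected at the given significance level."""
    return mann_whitney_u(x, y)[1] < alpha
```

With two clearly separated toy samples such as [1, 2, 3, 4, 5] and [6, 7, 8, 9, 10],
`rejects_null` returns True, while comparing a sample with a copy of itself returns False.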
       For our experiments, we use the Mann-Whitney test to check whether the data
for attribute i collected from a user A is similar to the data for i from user B. Ideally,
if A = B the test would always accept the null hypothesis; however, this is not possible
in every case, because different characters can play differently, even when both
characters have the same role.

           Table 1. Users who played multiple matches of League of Legends
 User       Character                                         Role
 user3      Braum, Leona                                      sup, sup
 user6      Karma, Zyra                                       sup, sup
 user10     Fizz, Zilean                                      mid, mid
 user14     Braum, LeeSin                                     sup, jng
 user16     Darius, Lucian                                    top, adc
 user20     AurelionSol, Caitlyn, Illaoi, Jinx, LeeSin,       jng, adc, top, adc, jng,
            Leona, Malphite, Thresh, XinZhao                  sup, jng, jng, sup, jng
 user23     Sejuani, Sejuani                                  jng, jng
 user24     Hecarim, Kindred                                  jng, jng
 user36     ChoGath, Tristana                                 top, adc
 user42     Rammus, Shyvana                                   jng, jng
 user43     Azir, Orianna                                     mid, mid
 user45     Irelia, Kennen                                    top, top
 user48     Fizz, Jinx, Lucian, Morgana, Taric,               mid, adc, adc, sup, sup,
            Thresh, Tryndamere, Twitch, Vi                    sup, top, adc, jng
 user49     Ashe, Jinx                                        adc, adc
 user51     Sivir, Vayne                                      adc, adc
 user52     Caitlyn, Sivir                                    adc, adc
 user53     Corki, Yasuo                                      adc, mid
 user55     Gnar, Kennen                                      top, top

        All possible pairs [α, β] of samples from the selected users were tested,
where α is a sample from user A and β is a sample from user B, for all 33 attributes in
the database. As samples can have different sizes and are independent (each sample
comes from a different match), the Mann-Whitney test is well suited to our analysis. All
tests were conducted at a 5% significance level, using the two most representative users
(user 20 and user 48).

3.1. Results when the samples are from the same user
For this experiment, each sample α from user 20 was compared to his 9 other samples, and
the number of times the null hypothesis was rejected (test outcome H = 1) when comparing α
to the others was counted, for every feature. After comparing all the samples, the median
and mean of the number of times H = 1 were computed for user 20.
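        The counting step for a single feature can be sketched as follows. The statistical
test is passed in as a function; the `toy_rejects` predicate used in the example is only a
placeholder so that the block runs on its own, and it is NOT the Mann-Whitney test. All
names and values are hypothetical.

```python
from statistics import mean, median
from typing import Callable, List, Sequence

Sample = Sequence[float]  # values of one feature collected in one match


def rejection_counts(samples: List[Sample],
                     rejects: Callable[[Sample, Sample], bool]) -> List[int]:
    """For each sample, count how many of the other samples give H = 1."""
    return [
        sum(1 for j, other in enumerate(samples) if j != i and rejects(s, other))
        for i, s in enumerate(samples)
    ]


def toy_rejects(x: Sample, y: Sample) -> bool:
    """Placeholder decision rule, NOT the Mann-Whitney test."""
    return abs(median(x) - median(y)) > 1.0


# Toy data: one feature observed in three matches of the same user.
matches = [[1.0, 1.2, 0.9], [1.1, 1.0, 1.3], [3.0, 3.2, 2.9]]
counts = rejection_counts(matches, toy_rejects)
summary = (median(counts), mean(counts))
```

With n samples per user, each count lies between 0 and n - 1, which is the maximum value
of H against which the medians are read.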
       Figure 1 A) shows the results of comparing user 20 with himself. The vertical
bars represent the number of times the null hypothesis was rejected for a given
feature; in other words, the smaller the bar, the better the result.


Figure 1. Median and average of A) samples from user 20 against himself, B) samples
from user 48 against himself, C) samples from user 20 against user 48 and D) samples
from user 48 against user 20.


        We can see that the null hypothesis was accepted on average at least half of the
time for almost all the characteristics, with a high standard deviation caused by
comparisons between two very different characters (for example, a melee jungler with high
mobility and a ranged ADC with low mobility). The median gives a more reliable result:
the Mann-Whitney test accepted the null hypothesis at least half of the time for all but
two features. These two features, CTR and FW, can be explained by the player’s role in a
match. Players taking roles like support and jungler need to be constantly moving all over
the map, while the other roles usually stay in their positions for longer periods. Junglers
and supports are always changing their paths to fit the match, whether by attacking enemies
in different lanes or conquering neutral objectives, like map visibility, which affects
mouse usage. The high disparity in ’FW’ is explained by the difference between characters:
the ’W’ key is often associated with passive skills that do not require the constant
pressing of a button to be activated, or with an ability that does not fit every situation,
which explains why the null hypothesis was not accepted.
        Figure 1 B) shows the results of comparing user 48 with himself. Much like the
previous experiment, each sample β from user 48 was compared with his other 8 samples
at a level of significance of 5%.

       As in the previous comparisons, the medians in Figure 1 B) show that the
null hypothesis was accepted at least half of the time for almost every feature tested,
with some features being more consistent within the samples of user 48, like C2 and CFR,
and others being more divergent, like Q, FQ and SPACE.
       From Figure 1 A) and B), we can see that user 20 obtains better results than
user 48 for SPACE, implying that user 20 uses the ’SPACE’ key more
consistently. It would be hard to infer information about the other features without
proper tests, because they are strongly related to roles and characters. The next section
presents tests between users, where more information can be gathered.

3.2. Results when the samples are from distinct users

For these tests, each sample α from user 20 was tested against all samples β
from user 48 using the Mann-Whitney test, at a significance level of 5%. Then the
opposite tests were made, comparing user 48 against user 20.
        Figure 1 C) and D) show the samples from one user against the other. We can see
that the median here is greater than half of the maximum H for more than two features,
unlike in the previous experiment. This indicates that the features with a larger H sum
have a greater impact when comparing user 20 and user 48. We can also see that the ’SPACE’
features have a high value, reinforcing the claim that the two players use the key
differently, no matter the match.
        The opposite comparison, user 48 against user 20, gives similar results.
The set of characteristics with a median H greater than half of the maximum
H is almost the same as in the previous comparison, reinforcing the value of these features
when comparing these two users. One could argue that other features have H results
similar to those shown previously; however, the characteristics with a high
discrepancy have more value, because they expose the differences between the behaviour
of the users, which increases their biometric value.
        This first analysis comparing users against each other exposed what may be a
pattern for every two distinct players: some characteristics have a high variability
between them, revealing the most significant features of an individual. It is
important to note that comparing every sample from one user with every sample from
another to calculate the median and mean is not commutative, although a simple comparison
between two given samples is. For example, both medians of ’Q’ are high in Figure 1 C) and
D), but comparing a sample of user 48 with all the samples of user 20 tends
to reject the null hypothesis of the Mann-Whitney test more often than the opposite,
indicating that the ’Q’ feature of user 20 is closer to that of user 48 than the reverse. If
the general comparison were commutative this would not be the case.
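        A small numerical illustration of this asymmetry is given below. The rejection
matrix is invented for the example: each pairwise outcome is shared by both directions,
yet the per-sample counts, and therefore their median and mean, depend on which user's
samples are taken as the reference.

```python
from statistics import mean, median

# Hypothetical rejection matrix: R[i][j] = 1 if sample i of user A and
# sample j of user B were found to differ (H = 1), else 0.
R = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]

a_vs_b = [sum(row) for row in R]        # one count per sample of user A
b_vs_a = [sum(col) for col in zip(*R)]  # one count per sample of user B

# The total number of rejections is identical in both directions...
total_a, total_b = sum(a_vs_b), sum(b_vs_a)
# ...but the aggregated statistics are not.
stats_a = (median(a_vs_b), mean(a_vs_b))
stats_b = (median(b_vs_a), mean(b_vs_a))
```

In this toy case user A has four samples and user B has three, so the two directions also
have different maximum counts, just as user 20 and user 48 have different numbers of
matches in Table 1.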

4. Final Remarks
The results of this work showed that distinct users can be statistically compared
using the Mann-Whitney test to verify whether their characteristics resemble one another, or
even whether the same player using different characters in different roles resembles
himself. The comparisons made here demonstrate how to identify distinctions between users,
revealing the value of their behaviour. The case study using user 20 and user
48 can be expanded to other users, identifying which features have more biometric value
for a given user. With a larger number of samples from other users, this study could be
expanded to gain a better understanding of how these important features can be identified.
       This work points in a direction where players can be identified no matter which
characters or roles they play, dismissing the idea that a player is defined only by the
characters they play and thus reinforcing the idea that biometrics can be used to combat
the problem of “account sharing”.

References
Banerjee, S. P. and Woodard, D. L. (2012). Biometric authentication and identification us-
  ing keystroke dynamics: A survey. Journal of Pattern Recognition Research, 7(1):116–
  139.
Bergadano, F., Gunetti, D., and Picardi, C. (2002). User authentication through
  keystroke dynamics. ACM Transactions on Information and System Security (TIS-
  SEC), 5(4):367–397.
Bours, P. and Fullu, C. J. (2009). A login system using mouse dynamics. In Intelli-
  gent Information Hiding and Multimedia Signal Processing, 2009. IIH-MSP’09. Fifth
  International Conference on, pages 1072–1077. IEEE.
Camara, L. (2017). Acquisition and analysis of the first mouse dynamics biomet-
  rics database for user identification in the online collaborative game League of Leg-
  ends. Master’s Thesis (Systems and Computing), UFRN (Universidade Federal do Rio
  Grande do Norte), Natal, Brazil.
Chen, K.-T. and Hong, L.-W. (2007). User identification based on game-play activity
  patterns. In Proceedings of the 6th ACM SIGCOMM workshop on Network and system
  support for games, pages 7–12. ACM.
da Silva Beserra, I. (2017). Using keystroke dynamics for user identification in the online
   collaborative game League of Legends. Master’s Thesis (Systems and Computing),
   UFRN (Universidade Federal do Rio Grande do Norte), Natal, Brazil.
da Silva Beserra, I., Camara, L., and Da Costa-Abreu, M. (2016). Using keystroke and
   mouse dynamics for user identification in the online collaborative game League of
   Legends.
Hart, A. (2001). Mann-Whitney test is not just a test of medians: differences in spread can
  be important. BMJ: British Medical Journal, 323(7309):391.
Idrus, S. Z. S., Cherrier, E., Rosenberger, C., and Bours, P. (2013). Soft biometrics
   database: a benchmark for keystroke dynamics biometric systems. In Biometrics Spe-
   cial Interest Group (BIOSIG), 2013 international conference of the, pages 1–8. IEEE.
Idrus, S. Z. S., Cherrier, E., Rosenberger, C., Mondal, S., and Bours, P. (2015). Keystroke
   dynamics performance enhancement with soft biometrics. In Identity, Security and
   Behavior Analysis (ISBA), 2015 IEEE International Conference on, pages 1–7. IEEE.
Leavitt, A., Clark, J., and Wixon, D. (2016). Uses of multiple characters in online games
  and their implications for social network methods. In Proceedings of the 19th ACM
  Conference on Computer-Supported Cooperative Work & Social Computing, pages
  648–663. ACM.
Lv, H.-R., Lin, Z.-L., Yin, W.-J., and Dong, J. (2008). Emotion recognition based on pres-
  sure sensor keyboards. In Multimedia and Expo, 2008 IEEE International Conference
  on, pages 1089–1092. IEEE.
Shen, C., Cai, Z., and Guan, X. (2012). Continuous authentication for mouse dynamics:
  A pattern-growth approach. In Dependable Systems and Networks (DSN), 2012 42nd
  Annual IEEE/IFIP International Conference on, pages 1–12. IEEE.
Tallarida, R. J. and Murray, R. B. (1987). Mann-Whitney test. In Manual of Pharmacologic
  Calculations, pages 149–153. Springer.
Thanganayagam, R. and Thangadurai, A. (2015). Fusion approach on keystroke dynam-
  ics to enhance the performance of password authentication. In Electrical, Computer
  and Communication Technologies (ICECCT), 2015 IEEE International Conference on,
  pages 1–6. IEEE.
Yampolskiy, R. V. and Govindaraju, V. (2006). Use of behavioral biometrics in intrusion
  detection and online gaming. In Proc. of SPIE Vol, volume 6202, pages 62020U–1.

