<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Applications of Machine Learning in Dota 2: Literature Review and Practical Knowledge Sharing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aleksandr Semenov</string-name>
          <email>avsemenov@hse.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Romov</string-name>
          <email>peter@romov.ru</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kirill Neklyudov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniil Yashkov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniil Kireev</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>International Laboratory for Applied Network Research, National Research University Higher School of Economics</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Moscow Institute of Physics and Technology</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Moscow State University</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Yandex Data Factory</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present a review of recent applications of Machine Learning in Dota 2. It covers prediction of the winning team from the drafting stage of the game, calculation of optimal jungling paths, prediction of team fight results, recommendation engines for the draft, and detection of in-game roles, with emphasis on win prediction from team composition data. In addition, we discuss our own experience organizing a Dota 2 Machine Learning hackathon and Kaggle competitions.</p>
      </abstract>
      <kwd-group>
        <kwd>Dota 2</kwd>
        <kwd>MOBA</kwd>
        <kwd>eSports</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>Previous Research on Dota 2</title>
      <p>
        The first article mentioning Dota 2 was qualitative and analyzed how players'
leadership styles correlate with the roles they choose to play in the game [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In
the first quantitative study of Dota 2, the authors analyzed cooperation within
teams, the national composition of players, the role distribution of heroes, and other
statistics based on information from Dota 2 web forums [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Rioult et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] analyzed topological patterns of DotA teams based on area,
inertia, diameter, distance, and other features derived from the positions and
movements of the players around the map, to identify which of them are related
to winning or losing the game. Drachen et al. used Neural Networks and
Genetic Algorithms to analyze and optimize patterns of heroes' movements on
the map in DotA [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Another direction of research is encounter detection and prediction of fight
results. Yang et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] applied graph theory to identify patterns in combat, used them to
analyze teams' tactics, and predicted fight results with 80% accuracy
on test data. Schubert et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] built on this approach and took into account the
range of attacks and spells for each hero to obtain a better algorithm for encounter
detection and team performance evaluation.
      </p>
      <p>
        Finally, another branch of research on Dota 2 is the detection and classification
of heroes' roles and positions in the game. Gao et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] used Logistic Regression
and Random Forest for that purpose and detected hero roles with 75%
accuracy for hero IDs in both public and professional games, and with 85% and 90%
accuracy, respectively, for hero positions. Eggert et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] continued this work and
obtained even better results, with 96.15% test accuracy using Logistic Regression.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Game Outcome Prediction</title>
      <p>
        The most popular topic in applications of Machine Learning to Dota 2 is win
probability prediction from team drafts. Conley &amp; Perry were the first to demonstrate
the importance of information from the draft stage of the game, using Logistic
Regression and k-Nearest Neighbors (kNN) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. They trained a Logistic
Regression classifier on 18,000 examples and obtained 69.8% test accuracy. kNN with
custom neighbor weights and distance metrics reached 67.43% accuracy in 2-fold
cross-validation on 20,000 matches and 70% accuracy
on a test set of 50,000 matches.
      </p>
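      <p>A minimal sketch of draft-based win prediction is given here for illustration. It assumes a common encoding in which each hero contributes one binary indicator per side; the toy hero pool size, the function names, and the weights are hypothetical and are not taken from the reviewed papers.</p>
      <preformat>
```python
import math

N_HEROES = 5  # toy pool size (assumption); the real game has far more heroes


def encode_draft(radiant, dire, n_heroes=N_HEROES):
    """Return a binary vector of length 2 * n_heroes.

    The first half marks Radiant picks, the second half marks Dire picks.
    """
    x = [0] * (2 * n_heroes)
    for hero in radiant:
        x[hero] = 1
    for hero in dire:
        x[n_heroes + hero] = 1
    return x


def predict_radiant_win(x, weights, bias=0.0):
    """Logistic model: sigmoid of the weighted sum of draft indicators."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Toy draft with two heroes per side; weights are made-up illustration values.
features = encode_draft(radiant=[0, 2], dire=[1, 4])
weights = [0.3, -0.1, 0.2, 0.0, 0.1, -0.3, 0.1, -0.2, 0.0, -0.1]
probability = predict_radiant_win(features, weights)
```
      </preformat>
      <p>In practice the weights would be fitted on historical matches rather than set by hand; the sketch only shows how a draft becomes a feature vector and how a fitted linear model would score it.</p>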
      <p>
        Although their work was the first to show the importance of the draft alone, the
interactions among heroes within and between teams were hard to capture with
such a simple approach. Agarwala &amp; Pearce tried to take them into account
by including interactions among heroes in the logistic regression model [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
To define the role of each hero and model their interactions, they applied PCA
to the heroes' statistics (kills, deaths, gold per minute, etc.). However, their
results showed the inefficiency of this approach: it reached only 57%
accuracy, while the model without interactions reached 62%. It is worth noting
that although the PCA-based models could not match the predictive accuracy of
logistic regression, the team compositions they suggested looked more balanced
and reasonable from the game's point of view. In addition, they tried to find
meaningful strategies with K-Means clustering on end-game statistics but could
not find any clusters, which suggests that no gameplay patterns could be detected
in their data.
      </p>
      <p>
        Another approach to modeling heroes' interactions was
proposed by Kuangyan Song et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. They took 6,000 matches, manually
added 50 combinations of 2 heroes to the feature set, and used forward stepwise
regression for feature selection. With 10-fold CV for Logistic Regression on 3,000
matches they obtained 54% accuracy. They concluded that only the addition of particular
heroes improves the model, while the others might cause the prediction to go wrong.
      </p>
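      <p>The idea of hand-picked hero-pair features can be sketched as follows. This is an illustrative assumption about how such indicators might be built, not the authors' code; the pair list and hero indices are hypothetical.</p>
      <preformat>
```python
def pair_features(team, pairs):
    """Return one 0/1 indicator per listed pair.

    An indicator is 1 only when both heroes of the pair are on the team.
    """
    picked = set(team)
    return [1 if a in picked and b in picked else 0 for a, b in pairs]


PAIRS = [(0, 2), (1, 3), (2, 4)]  # hypothetical hand-picked hero pairs
radiant = [0, 2, 4]
radiant_pairs = pair_features(radiant, PAIRS)
```
      </preformat>
      <p>These indicators would be appended to the per-hero draft features before feature selection, so that a linear model can assign a separate weight to each combination.</p>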
      <p>
        Kalyanaraman was the first to implicitly introduce the roles of the heroes
as a feature in a win prediction model [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The author took 30,426 matches,
filtered by Matchmaking Rating (MMR) to select only skilled players, and
used an ensemble of Genetic Algorithms and Logistic Regression on 220 matches.
Logistic Regression alone obtained an accuracy of 69.42%, and the ensemble of a
Genetic Algorithm and Logistic Regression reached 74.1% accuracy on the
test set. Although this is the highest result among all the articles in the review,
the lack of AUROC information and the small sample of matches chosen for the
Genetic Algorithm undermine its reliability.
      </p>
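      <p>To illustrate why AUROC is a useful complement to accuracy, the sketch below computes both for a toy set of predictions. The labels and scores are made up, and the pairwise formulation is one standard way to compute AUROC, chosen here for brevity rather than efficiency.</p>
      <preformat>
```python
def accuracy(labels, scores, threshold=0.5):
    """Share of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos) * len(neg))


labels = [1, 1, 0, 0]              # 1 means Radiant won (toy data)
scores = [0.60, 0.55, 0.58, 0.52]  # predicted Radiant win probabilities
acc = accuracy(labels, scores)
auc = auroc(labels, scores)
```
      </preformat>
      <p>Here every score lies above the 0.5 threshold, so accuracy depends entirely on the class balance, while AUROC still reflects how well the scores rank wins above losses.</p>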
      <p>
        Another attempt to include interactions among heroes was made by Kinkade
&amp; Lim, who took 62,000 matches with "very high" skill level, without leavers, and with a game
duration of at least 10 minutes [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and divided them into 52,000 matches for training,
5,000 for testing, and 5,000 for validation. On this data they tried Logistic
Regression and Random Forest with features such as the pairwise win rate for Radiant
and Dire. In theory, such features could capture relationships such as matchup,
synergy, and countering, and each of them increased the quality of the model, up
to 72.9%. On picks data only, Logistic Regression reached 72.9%
test accuracy, while Random Forest overfitted and reached
only 67% test accuracy after parameter tuning. It is worth mentioning
that their baseline, based on the highest combined individual win rate of the
heroes, had 63% accuracy.
      </p>
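      <p>A pairwise win rate feature of the kind described above could be computed roughly as follows. This is a sketch under stated assumptions: the match tuple layout, the function names, the averaging over pairs, and the 0.5 default for unseen pairs are all hypothetical choices, not details from the cited report.</p>
      <preformat>
```python
from collections import defaultdict
from itertools import combinations


def pair_win_rates(matches):
    """Estimate same-team win rates for hero pairs.

    matches: iterable of (radiant_heroes, dire_heroes, radiant_won).
    Returns a dict mapping a sorted hero pair to its empirical win rate.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for radiant, dire, radiant_won in matches:
        for team, won in ((radiant, radiant_won), (dire, not radiant_won)):
            for pair in combinations(sorted(team), 2):
                games[pair] += 1
                wins[pair] += int(won)
    return {pair: wins[pair] / games[pair] for pair in games}


def synergy_feature(team, rates, default=0.5):
    """Average pairwise win rate over all hero pairs on one team."""
    pairs = list(combinations(sorted(team), 2))
    return sum(rates.get(p, default) for p in pairs) / len(pairs)


# Toy history with two heroes per side (real teams have five).
history = [
    ([0, 1], [2, 3], True),
    ([0, 1], [2, 4], False),
    ([0, 2], [1, 3], True),
]
rates = pair_win_rates(history)
feature = synergy_feature([0, 1], rates)
```
      </preformat>
      <p>To avoid leaking outcomes into the features, the win rates would need to be estimated on training matches only and then applied to validation and test drafts.</p>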
      <p>
        Several authors expanded the scope of win prediction from draft information
to other sources of data from the game. Johansson &amp; Wikstrom trained a Random
Forest on in-game information (such as each hero's gold,
kills, deaths, and assists at each minute), which reached 82.23% accuracy at the
five-minute point [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Although this accuracy seems very high, the fact
that it is based on in-game events limits its use, because it
requires real-time data to be practically useful.
      </p>
      <p>The key results from the papers described in this section are summarized in the
following table.</p>
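      <table-wrap>
        <table>
          <thead>
            <tr>
              <th>Training (+ Validation) set size</th>
              <th>Validation set size</th>
              <th>Test Accuracy</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>56691</td><td>5669</td><td>69.80%</td></tr>
            <tr><td>50000</td><td>6691</td><td>67.43%</td></tr>
            <tr><td>40000</td><td>4000</td><td>62.00%</td></tr>
            <tr><td>40000</td><td>4000</td><td>57.00%</td></tr>
            <tr><td>6000</td><td>600</td><td>58.00%</td></tr>
            <tr><td>18500</td><td>1500</td><td>69.42%</td></tr>
            <tr><td colspan="2">220</td><td>74.10%</td></tr>
            <tr><td>62000</td><td>5000</td><td>72.90%</td></tr>
            <tr><td>62000</td><td>5000</td><td>67.00%</td></tr>
          </tbody>
        </table>
      </table-wrap>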
    </sec>
    <sec id="sec-4">
      <title>Our Projects on Machine Learning in Dota 2</title>
      <p>Based on the previous research, we have conducted several projects in this field,
which we would like to describe and discuss at the workshop:</p>
      <list list-type="bullet">
        <list-item>
          <p>mining and preparation of large and consistent datasets of Dota 2 matches for
creating, testing, and comparing Machine Learning algorithms<sup>1</sup>;</p>
        </list-item>
        <list-item>
          <p>a paper introducing Factorization Machines for the task of game outcome
prediction, which was presented at the 5th conference on Analysis of Images,
Social Networks, and Texts (AIST 2016)<sup>2</sup>;</p>
        </list-item>
        <list-item>
          <p>an in-class Kaggle competition for a Machine Learning course at Coursera<sup>3</sup>;</p>
        </list-item>
        <list-item>
          <p>a hackathon for real-time prediction of the winner during the Dota 2 Shanghai Major<sup>4</sup>.</p>
        </list-item>
      </list>
      <p>We are eagerly looking forward to sharing our experience from these projects with
other participants of the workshop.</p>
      <p><sup>1</sup> http://dotascience.com/papers/aist2016</p>
      <p><sup>2</sup> In print. Draft available at http://dotascience.com/papers/aist2016/aist2016-ml-dota2-drafts_preprint.pdf</p>
      <p><sup>3</sup> https://inclass.kaggle.com/c/dota-2-win-probability-prediction</p>
      <p><sup>4</sup> http://dotascience.com</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>T.</given-names>
            <surname>Nuangjumnonga</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Mitomo</surname>
          </string-name>
          , "
          <article-title>Leadership development through online gaming</article-title>
          ," in
          <source>19th ITS Biennial Conference: Moving Forward with Future Technologies: Opening a Platform for All</source>
          , (Bangkok), pp.
          <fpage>1</fpage>
          –
          <lpage>24</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>N.</given-names>
            <surname>Pobiedina</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Neidhardt</surname>
          </string-name>
          , "
          <article-title>On successful team formation</article-title>
          ," tech. rep.,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>F.</given-names>
            <surname>Rioult</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Metivier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Helleu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Scelles</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Durand</surname>
          </string-name>
          , "
          <article-title>Mining Tracks of Competitive Video Games</article-title>
          ,"
          <source>AASRI Procedia</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>Secs</issue>
          , pp.
          <fpage>82</fpage>
          –
          <lpage>87</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>A.</given-names>
            <surname>Drachen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yancey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Maguire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mahlmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schubert</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Klabjan</surname>
          </string-name>
          , "
          <article-title>Skill-based differences in spatio-temporal team behaviour in Defence of the Ancients 2 (DotA 2)</article-title>
          ,"
          <source>Games Media Entertainment (GEM), 2014 IEEE</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>DotA 2</issue>
          , pp.
          <fpage>1</fpage>
          –
          <lpage>8</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>P.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Harrison</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Roberts</surname>
          </string-name>
          , "
          <article-title>Identifying Patterns in Combat that are Predictive of Success in MOBA Games</article-title>
          ," in
          <source>Proceedings of Foundations of Digital Games</source>
          , (Miami, Florida), pp.
          <fpage>1</fpage>
          –
          <lpage>8</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M.</given-names>
            <surname>Schubert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Drachen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Mahlmann</surname>
          </string-name>
          , "
          <article-title>Esports Analytics Through Encounter Detection</article-title>
          ," in
          <source>MIT SLOAN Sports Analytics Conference</source>
          , pp.
          <fpage>1</fpage>
          –
          <lpage>18</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>L.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Judd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Lowder</surname>
          </string-name>
          , "
          <article-title>Classifying Dota 2 Hero Characters Based on Play Style and Performance</article-title>
          ,"
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>C.</given-names>
            <surname>Eggert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Herrlich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Smeddinck</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Malaka</surname>
          </string-name>
          , "
          <article-title>Classification of Player Roles in the Team-Based Multi-player Game Dota 2</article-title>
          ," in
          <source>Entertainment Computing - ICEC 2015</source>
          (K. Chorianopoulos, M. Divitini, J. Baalsrud Hauge, L. Jaccheri, and R. Malaka, eds.), vol.
          <volume>9353</volume>
          of
          <series>Lecture Notes in Computer Science</series>
          , (Cham), pp.
          <fpage>112</fpage>
          –
          <lpage>125</lpage>
          , Springer International Publishing,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>K.</given-names>
            <surname>Conley</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Perry</surname>
          </string-name>
          , "
          <article-title>How Does He Saw Me? A Recommendation Engine for Picking Heroes in Dota 2</article-title>
          ," tech. rep.,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwala</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Pearce</surname>
          </string-name>
          , "
          <article-title>Learning Dota 2 Team Compositions</article-title>
          ," tech. rep., Stanford University,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>K.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          , "
          <article-title>Predicting the winning side of DotA2</article-title>
          ," tech. rep., Stanford University,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalyanaraman</surname>
          </string-name>
          , "
          <article-title>To win or not to win? A prediction model to determine the outcome of a DotA2 match</article-title>
          ," tech. rep., University of California San Diego,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>N.</given-names>
            <surname>Kinkade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jolla</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Lim</surname>
          </string-name>
          , "
          <article-title>DOTA 2 Win Prediction</article-title>
          ," tech. rep., University of California San Diego,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>F.</given-names>
            <surname>Johansson</surname>
          </string-name>
          , J. Wikstrom, and
          <string-name>
            <given-names>F.</given-names>
            <surname>Johansson</surname>
          </string-name>
          , \
          <article-title>Result Prediction by Mining Replays in Dota 2," Master's thesis</article-title>
          , Blekinge Institute of Technology,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>