<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Fig.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Predicting video game properties with deep convolutional neural networks using screenshots</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Przemyslaw Buczkowski</string-name>
          <email>pbuczkowski@opi.org.pl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antoni Sobkowicz</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Electronics and Information Technology, Warsaw University of Technology</institution>
          ,
          <addr-line>Warsaw</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Information Processing Institute</institution>
          ,
          <addr-line>Warsaw</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <volume>2</volume>
      <issue>2015</issue>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The amount of visual data is enormous and keeps growing, yet the
information contained in images remains largely inaccessible to computer
algorithms. In most image classification problems humans still outperform
computers. Sometimes, however, it is not obvious who will perform better,
because little is known about the classification problem. A good example is
predicting various video game properties based only on in-game screenshots.</p>
      <p>The authors crawled the Steam platform and stored information about
genre, release date, PEGI rating, etc. for thousands of video games. It is
difficult to pinpoint a model that can infer those properties by looking only at
images of gameplay. The research has focused on standard deep convolutional
neural networks, with convolutional layers in the first part of the network and
fully connected layers at the end. Alongside the work on the model, we are
preparing a baseline benchmark based on human annotators.</p>
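      <p>As a rough illustration of the architecture described above
(convolutional layers first, fully connected layers at the end), the following
pure-Python sketch runs one forward pass on a tiny single-channel image. It is
not the authors' model: the image size, the number and size of kernels, the
random weights, the three output scores and all function names (conv2d, relu,
max_pool2, dense, forward) are assumptions chosen only to keep the example
small and free of dependencies.</p>
      <preformat>
```python
import random


def conv2d(image, kernel):
    """Valid 2D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Element-wise rectifier: negative activations become zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2(fmap):
    """Non-overlapping 2x2 max pooling (an odd trailing row or column is dropped)."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


def dense(vector, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def forward(image, kernels, fc_weights, fc_biases):
    """Convolutional stage per kernel, then flatten, then a fully connected layer."""
    features = []
    for kernel in kernels:
        fmap = max_pool2(relu(conv2d(image, kernel)))
        features.extend(v for row in fmap for v in row)
    return dense(features, fc_weights, fc_biases)


# Hypothetical toy setup: 8x8 image, two 3x3 kernels, three output scores.
rng = random.Random(0)  # fixed seed so the run is repeatable
image = [[rng.random() for _ in range(8)] for _ in range(8)]
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(2)]
n_features = 2 * 3 * 3  # 8x8 input, 6x6 after conv, 3x3 after pooling, 2 kernels
fc_weights = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(3)]
fc_biases = [0.0] * 3

scores = forward(image, kernels, fc_weights, fc_biases)
print(scores)  # three untrained scores, e.g. one per hypothetical class
```
      </preformat>
      <p>A practical model would stack several such convolutional stages and
learn the weights from data; the sketch only shows how feature maps are
flattened and handed to the fully connected part.</p>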
      <p>Interesting regularities have been found. For example, approximating
the release year is a very difficult task on small images, because many details
are lost. Downscaling an image has an effect similar to anti-aliasing: in-game
models look much less jagged. Both humans and CNNs struggle with this task. For
now it is hard to tell who performs better, because the amount of
human-annotated data is small. One observation made during the research was
that people who do not describe themselves as gamers perform poorly on this
task, whereas experienced players recognise the game from the image and then
recall its release date from memory. In the near future the authors plan to use
bigger images, or even 1:1 crops, for the task of release year approximation
and to extend the database of human-annotated images.</p>
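      <p>The difference between downscaling and a 1:1 crop can be seen on a
synthetic example. The sketch below is only an illustration under stated
assumptions: the 8x8 test image, the box-filter downscaling and the helper
names (downscale, center_crop) are hypothetical and are not taken from the
authors' pipeline. Averaging pixel blocks softens a hard edge, much like
anti-aliasing, whereas a crop keeps the original pixel values.</p>
      <preformat>
```python
def downscale(image, factor):
    """Box-filter downscale: each output pixel is the mean of a factor x factor block."""
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    area = factor * factor
    return [
        [
            sum(image[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / area
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def center_crop(image, size):
    """Cut a size x size window from the middle without resampling (a 1:1 crop)."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


# Synthetic 8x8 "screenshot" with a hard vertical edge:
# columns 0-2 are black (0.0), columns 3-7 are white (1.0).
image = [[1.0 if c >= 3 else 0.0 for c in range(8)] for _ in range(8)]

small = downscale(image, 2)   # 4x4, the edge falls inside a block and is blurred
crop = center_crop(image, 4)  # 4x4, the edge stays sharp

print(small[0])  # [0.0, 0.5, 1.0, 1.0]
print(crop[0])   # [0.0, 1.0, 1.0, 1.0]
```
      </preformat>
      <p>Both outputs have the same size, but only the crop preserves the
sharp transition; the price is that it shows a smaller part of the scene.</p>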
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>