<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Predicting Tags for Stack Overflow Questions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sayan Pal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ammar Shaker</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eyke Hullermeier</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Intelligent Systems Group Paderborn University</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>Stack Overflow (https://stackoverflow.com/) is one of the major
community-driven Question and Answer (Q&amp;A) websites, focusing on topics related to
computer programming. It has nearly 7 million users, who ask more than 6700
questions every day. Each question can be associated with up to five different
tags, which serve as metadata to facilitate information retrieval.</p>
      <p>In this paper, we consider the problem of supporting this process through
automatic tagging. From a machine learning point of view, we are facing a problem
of Extreme Multi-Label Classification (XMLC), as Stack Overflow allows for
choosing from several thousands of tags. Besides, instead of learning in a standard
batch mode, it is desirable to learn incrementally on the continuous stream of
questions entering the system, with the capability to capture changes and drifts
in the data; for example, many tags (such as 'facebook') have a lifetime, first
gaining popularity, then reaching a peak, and eventually diminishing.</p>
      <p>
        Thus, we end up with an extremely challenging problem of XMLC on data
streams with a non-stationary set of labels. To tackle this problem, we build on
an XMLC method based on probabilistic label trees (PLT), which has recently
been proposed in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We extend this approach in two directions. First, instead of
specifying the entire PLT beforehand, we develop an adaptive version that
starts with only a single node and expands the tree whenever a new label is
observed in a training example. As a second contribution, we further improve
adaptive PLTs through stream-based boosting [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. More specifically, we apply the online boosting method by Oza and
Russell, which we tailor to maximizing the F-measure as a performance metric
(instead of minimizing the 0/1 loss, which is not appropriate in the context of
XMLC).
      </p>
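      <p>To make these two extensions concrete, the following Python sketch (our simplified illustration, not the implementation used in the paper) shows (i) a label tree that starts from a single node and grows a leaf whenever an unseen label arrives, and (ii) an Oza-Russell style per-example weight update in which each base learner is trained k ~ Poisson(lambda) times; the names AdaptiveLabelTree, online_boost_example, and the sc/sw counters are hypothetical.</p>

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw k ~ Poisson(lam) with Knuth's algorithm (stdlib only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class AdaptiveLabelTree:
    """Toy binary label tree that grows as new labels are observed.

    The tree starts empty; every unseen label becomes a leaf, and the
    root is split to accommodate it.  The paper's adaptive PLT
    additionally keeps a probabilistic classifier in every node, which
    is omitted here.
    """

    def __init__(self):
        self.root = None
        self.leaf_of = {}  # label -> leaf node

    def add_label(self, label):
        if label in self.leaf_of:
            return  # label already present in the tree
        leaf = {"label": label, "children": []}
        if self.root is None:
            self.root = leaf
        else:
            # Simplest growth policy: a new internal root whose two
            # children are the old tree and the fresh leaf.
            self.root = {"label": None, "children": [self.root, leaf]}
        self.leaf_of[label] = leaf


def online_boost_example(models, x, y, rng):
    """One Oza-Russell online-boosting pass for a single example.

    Each base model is trained k ~ Poisson(lam) times; lam shrinks when
    the model is correct and grows when it errs, so later models focus
    on examples the earlier ones got wrong.  Models are assumed to
    expose train(x, y) / predict(x) and carry sc/sw weight counters.
    """
    lam = 1.0
    for m in models:
        for _ in range(sample_poisson(lam, rng)):
            m.train(x, y)
        if m.predict(x) == y:
            m.sc += lam
            lam *= (m.sc + m.sw) / (2.0 * m.sc)
        else:
            m.sw += lam
            lam *= (m.sc + m.sw) / (2.0 * m.sw)
```

      <p>In the actual PLT, each node holds a probabilistic classifier and the probability of a label is the product of node probabilities along its root-to-leaf path; the sketch above only captures the tree-growth and example-weighting mechanics.</p>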
      <p>In addition to these methodological contributions, we present empirical results
based on extensive experiments with real data from Stack Overflow. Our
experimental setting focuses on evaluating the usefulness of the extensions we
propose for PLTs, i.e., the adaptive handling of labels and online boosting.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Kalina</given-names>
            <surname>Jasinska</surname>
          </string-name>
          , Krzysztof Dembczynski, Robert Busa-Fekete, Karlson Pfannschmidt, Timo Klerx, and Eyke Hullermeier.
          <article-title>Extreme F-measure maximization using sparse probability estimates</article-title>
          .
          <source>In Proc. ICML</source>
          , New York City, NY, USA, pages
          <fpage>1435</fpage>
          –
          <lpage>1444</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Nikunj C.</given-names>
            <surname>Oza</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stuart</given-names>
            <surname>Russell</surname>
          </string-name>
          .
          <article-title>Online bagging and boosting</article-title>
          .
          <source>In Artificial Intelligence and Statistics</source>
          , pages
          <fpage>105</fpage>
          –
          <lpage>112</lpage>
          . Morgan Kaufmann,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Gene</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <article-title>Tagging: People-Powered Metadata for the Social Web</article-title>
          . New Riders, Berkeley, CA,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>