<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Muhammad Atif Qureshi</string-name>
          <email>muhammad.qureshi@nuigalway.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arjumand Younus</string-name>
          <email>arjumand.younus@nuigalway.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Colm O'Riordan</string-name>
          <email>colm.oriordan@nuigalway.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gabriella Pasi</string-name>
          <email>pasi@disco.unimib.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Intelligence Research Group, National University of Ireland Galway</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Information Retrieval Lab</institution>
          ,
          <addr-line>Informatics, Systems and Communication</addr-line>
          ,
          <institution>University of Milan Bicocca</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>1512</fpage>
      <lpage>1518</lpage>
      <abstract>
        <p>Social media repositories serve as a significant source of evidence when extracting information related to the reputation of a particular entity (e.g., a particular politician, singer, or company). Reputation management experts manually mine social media repositories (in particular Twitter) to monitor the reputation of an entity. Recently, the online reputation management evaluation campaign known as RepLab at CLEF has turned attention to devising computational methods that support reputation management experts. A significant research challenge in this context is classifying the reputation dimension of tweets with respect to entity names. More specifically, identifying the various aspects of a brand's reputation is an important task that can help companies monitor their areas of strength and weakness in an effective manner. To address this issue, in this paper we use dominant Wikipedia categories related to a reputation dimension in a random forest classifier. Additionally, we use tweet-specific features, language-specific features, and similarity-based features. The experimental evaluations show a significant improvement in accuracy over the baseline.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Over the past few years, social media has emerged as an effective marketing platform, with brands using it to broaden their reach and enhance their marketing. At the same time, social media users extensively voice their opinions about various entities (e.g., musicians, movies, companies) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This has given birth to a new area within the marketing domain known as “online reputation management,” in which automated methods for monitoring the reputation of entities are essential, requiring novel computational algorithms to facilitate the work of reputation management experts [
        <xref ref-type="bibr" rid="ref1 ref2 ref4">1, 2, 4</xref>
        ]. This paper describes our experience in devising an algorithm for the “reputation dimension classification” challenge in the context of RepLab 2014, in which we are given a set of entities within two domains (i.e., automotive and banking) and, for each entity, a set of tweets labelled along eight different dimensions.
      </p>
      <p>We utilize the Wikipedia category graph structure of an entity to observe the amount of discussion related to a reputation dimension within a tweet. The experimental results show an improved accuracy over the baseline, and our system stands at position 5 among the overall submissions.</p>
    </sec>
    <sec id="sec-2">
      <title>Task Overview and Dataset</title>
      <p>
        The main aim of the RepLab activity within CLEF is the online reputation of companies on Twitter. The RepLab activity for 2014 comprises reputation dimension classification, which is a fairly new task and differs from the past two years’ tasks (i.e., RepLab 2012 and RepLab 2013) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The task is the classification of tweets according to the dimension of reputation they address, thereby identifying the various aspects significant to a company’s reputation. The standard categories, as provided by the Reputation Institute, are the following:
– Products &amp; Services
– Innovation
– Workplace
– Citizenship
– Governance
– Leadership
– Performance
– Undefined
      </p>
      <p>As an example, a tweet containing information about employees’ resignations
within a company would fall under the dimension “Workplace” whereas a tweet
containing information about net profits earned by the company in a financial
period is likely to fall under the dimension “Performance.”</p>
      <p>The corpus is a multilingual collection of tweets referring to a set of 31 entities spread over two domains: automotive and banking. Note that the reputation dimension classification task within RepLab uses the same dataset as RepLab 2013; however, it utilizes the tweets of only two of that dataset’s four domains. Table 1 shows a summary of the tweets within each dimension for the automotive and banking domains; we report the numbers for the tweets we were able to crawl, and these numbers may differ from the ones available at the time of evaluation. For each entity, at least 2,200 tweets have been collected: the first 700 tweets compose the training set, and the remaining ones the test set.</p>
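The per-entity split described above can be sketched directly (a minimal illustration; tweets are assumed to be in collection order, and the tweet identifiers are made up):

```python
def split_entity_tweets(tweets, train_size=700):
    """Per the RepLab 2014 setup: the first 700 tweets of an entity
    form the training set, the remainder the test set."""
    return tweets[:train_size], tweets[train_size:]

# At least 2,200 tweets were collected per entity.
tweets = [f"tweet_{i}" for i in range(2200)]
train, test = split_entity_tweets(tweets)
print(len(train), len(test))  # 700 1500
```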
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>Our algorithm uses the following sets of features:
– Wikipedia-based features
– Statistical features, which we further categorize into tweet-specific features, language-specific features, and word-occurrence features.</p>
      <p>
        In this section we first present background on the Wikipedia category-article structure, followed by a description of the feature set we have utilized. Our algorithm makes use of the encyclopedic structure of Wikipedia; more specifically, the knowledge encoded in Wikipedia’s graph structure is utilized for the classification of the reputation dimension of tweets. Wikipedia is organized into categories in a taxonomical structure (see Figure 1). Each Wikipedia category can have an arbitrary number of subcategories as well as an arbitrary number of supercategories (e.g., category C4 in Figure 1 is a subcategory of C2 and C3, and a supercategory of C5, C6 and C7). Furthermore, each Wikipedia article can belong to an arbitrary number of categories, where each category acts as a kind of semantic tag for that article [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. As an example, in Figure 1, article A1 belongs to categories C1 and C10, article A2 belongs to categories C3 and C4, while article A3 belongs to categories C4 and C7.
      </p>
      <sec id="sec-3-1">
        <title>Wikipedia-Based Feature Set: Relatedness Score Based on Wikipedia Category-Article Structure</title>
        <p>
          In this section we describe the Wikipedia-based feature set we utilize. The extracted phrases (i.e., n-grams) from a tweet are matched against Wikipedia categories, and a voting mechanism is used to score the frequently matched Wikipedia categories. We use a technique similar to the one proposed in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], with the only difference being that we directly utilize the Wikipedia categories corresponding to a reputation dimension. We maintain a voting count for each Wikipedia category, from which the probability of a Wikipedia category belonging to a particular reputation dimension is calculated. The final phase involves a manual analysis to fetch the categories most representative of a particular dimension. For this manual analysis, we plot the obtained categories using Gephi, with the probabilities plotted so as to select the Wikipedia categories most closely related to a given reputation dimension. As an example, Figure 2 illustrates the graph of Wikipedia categories corresponding to the reputation dimension “Innovation” for the automotive domain. The red-colored nodes in Figure 2 represent the Wikipedia categories that occur in a particular dimension with a probability of 1.0, the white-colored nodes represent a probability of 0.0, and the various green-colored nodes represent probabilities around 0.5. Using these graphs, we select the Wikipedia categories with high probabilities for a dimension, and these categories are then used in our relatedness scoring framework.
        </p>
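The voting step can be sketched as follows (a minimal illustration with made-up category and dimension counts, not the authors' exact procedure): each n-gram match casts a vote for a (category, dimension) pair, and the probability of a category belonging to a dimension is its share of that category's votes.

```python
from collections import Counter, defaultdict

# votes[category][dimension] = how often n-grams matching that category
# occurred in training tweets labelled with that dimension
votes = defaultdict(Counter)

def record_match(category, dimension):
    votes[category][dimension] += 1

def dimension_probability(category, dimension):
    """Share of a category's votes that fall in the given dimension."""
    total = sum(votes[category].values())
    return votes[category][dimension] / total if total else 0.0

# Hypothetical matches: a category that votes almost exclusively for
# "Innovation" would appear near-red in the Gephi plot of Figure 2.
for _ in range(9):
    record_match("Emerging technologies", "Innovation")
record_match("Emerging technologies", "Products & Services")

print(dimension_probability("Emerging technologies", "Innovation"))  # 0.9
```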
        <p>After the selection of Wikipedia categories, we extract the possible n-grams from a tweet and score the relatedness of those n-grams to a given dimension. The extracted phrases of a tweet that are contained in the selected Wikipedia categories are used to calculate the relatedness score. The following factors contribute to the relatedness score of a tweet:
– Depth significance denotes the significance of the category depth at which a matched phrase occurs; the deeper the match occurs in the taxonomy, the less its significance to the dimension under consideration. This implies that phrases matched in the parent category of the dimension under investigation are more likely to be relevant to the dimension than those matched at a greater depth.
– Category significance denotes the significance of a matched phrase’s categories with respect to the dimension under investigation.</p>
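A minimal sketch of how the two factors could combine is shown below. The decay constant and the multiplicative combination are our assumptions; the paper only states that deeper matches count less and that a matched phrase's category significance matters.

```python
def depth_significance(depth, decay=0.5):
    """Deeper matches in the category taxonomy contribute less;
    depth 0 is the dimension's root category. The exponential decay
    is an illustrative choice, not the paper's stated formula."""
    return decay ** depth

def relatedness_score(matches):
    """matches: list of (depth, cat_significance) pairs, one per phrase
    of the tweet that matched a selected Wikipedia category."""
    return sum(depth_significance(d) * cat_sig for d, cat_sig in matches)

# A phrase matching at the root (depth 0, significance 1.0) outweighs
# one matching two levels down (depth 2, significance 0.8).
print(relatedness_score([(0, 1.0), (2, 0.8)]))  # 1.2
```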
        <p>
          The relatedness scores constitute our Wikipedia-based feature set for the reputation dimension classification task. We also use three additional sets of statistical features which, similar to the ones proposed by Kothari et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], fall under the following categorization:
– Tweet-specific features: We used four tweet-specific features that relate to how a tweet is written: (1) presence of a hashtag (#tag); (2) presence of a user mention (@user); (3) presence of a URL in the tweet; (4) language of the tweet (i.e., English or Spanish).
– Language-specific features: We used three language-specific features that relate to various aspects of the reputation dimension of a brand: (1) occurrence of a percentage symbol in the tweet; (2) occurrence of a currency symbol in the tweet; (3) proportions of common-noun, proper-noun, adjective, and verb POS tags in the tweet.
– Word-occurrence features: We used two word-occurrence features. The first checks for the presence of the names of other entities of the same domain; note that the Products &amp; Services dimension contains many tweets in which other entities are mentioned. The second feature first counts the number of times a word occurs in a given dimension across different entities (i.e., it checks for word occurrence across the 20 entities of the automotive domain and the 11 entities of the banking domain); if the number of occurrences is above an empirically set threshold, we add that word to our dictionary of dimension terms. The number of dimension terms present in a tweet is then used as a feature.
        </p>
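The tweet-specific and language-specific features above can be sketched with simple pattern checks (the regular expressions and feature names are illustrative; the original system's exact tokenisation is not specified, and the POS-proportion features would additionally require a tagger):

```python
import re

def tweet_specific_features(text, language):
    """Binary features describing how a tweet is written."""
    return {
        "has_hashtag": int(bool(re.search(r"#\w+", text))),
        "has_mention": int(bool(re.search(r"@\w+", text))),
        "has_url": int(bool(re.search(r"https?://\S+", text))),
        "is_english": int(language == "en"),  # vs. Spanish
    }

def language_specific_features(text):
    """Percentage and currency markers, often present in
    Performance-dimension tweets about profits and growth."""
    return {
        "has_percent": int("%" in text),
        "has_currency": int(bool(re.search(r"[$€£]|\bUSD\b|\bEUR\b", text))),
    }

tweet = "Net profit up 12% this quarter! $2.3bn http://example.com #earnings"
feats = {**tweet_specific_features(tweet, "en"), **language_specific_features(tweet)}
print(feats)
```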
      </sec>
      <sec id="sec-3-3">
        <title>Machine Learning and Experimental Runs</title>
        <p>Using the feature sets described in Sections 3.2 and 3.3, we train a random forest classifier over the training data and then use it to predict labels for the test data. We perform three machine learning runs as follows:
1. For the first run, we use only the Wikipedia-based features of Section 3.2, training one random forest classifier per domain (i.e., combining all tweets related to a particular domain into one training set and one test set).
2. For the second run, we use only the additional statistical features of Section 3.3, again training one random forest classifier per domain.
3. For the third run, we use all features, i.e., both the Wikipedia-based features of Section 3.2 and the additional features of Section 3.3, again training one random forest classifier per domain.</p>
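The three runs differ only in which feature columns are fed to the per-domain classifier, which can be sketched as below. The feature names are illustrative, and `MajorityClassifier` is a trivial stand-in so the sketch stays self-contained; the actual system trains a random forest (e.g., scikit-learn's `RandomForestClassifier`) in its place.

```python
from collections import Counter

# Feature subsets per run (names are made up for illustration).
WIKI = ["relatedness_innovation", "relatedness_workplace"]
EXTRA = ["has_hashtag", "has_url", "has_percent"]
RUNS = {"run1": WIKI, "run2": EXTRA, "run3": WIKI + EXTRA}

class MajorityClassifier:
    """Stand-in for the random forest: predicts the most frequent label."""
    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self
    def predict(self, X):
        return [self.label] * len(X)

def run(name, train, test):
    """train: (feature_dict, label) pairs for one domain; test: feature dicts."""
    cols = RUNS[name]
    X_tr = [[feats[c] for c in cols] for feats, _ in train]
    y_tr = [label for _, label in train]
    X_te = [[feats[c] for c in cols] for feats in test]
    return MajorityClassifier().fit(X_tr, y_tr).predict(X_te)

train = [({f: 0 for f in WIKI + EXTRA}, "Performance"),
         ({f: 1 for f in WIKI + EXTRA}, "Performance"),
         ({f: 0 for f in WIKI + EXTRA}, "Innovation")]
test = [{f: 1 for f in WIKI + EXTRA}]
print(run("run3", train, test))  # ['Performance']
```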
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Results</title>
      <sec id="sec-4-1">
        <title>Dataset</title>
        <p>
          We performed our experiments using the dataset provided by the organizers of RepLab 2014 [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. In this dataset 31 entities were provided, and for each entity at least 2,200 tweets were collected: the first 700 constituted the training set, and the rest served as the test set. The measures used for evaluation are accuracy, precision, and recall.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Results</title>
        <p>The obtained values of precision and recall show an average performance, and we believe these errors come from some noisy Wikipedia categories. As future work, we aim to remove the noisy Wikipedia categories and to automate the category selection process of Section 3.2.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. E. Amigó, J. Artiles, J. Gonzalo, D. Spina, B. Liu, and A. Corujo. WePS-3 Evaluation Campaign: Overview of the On-line Reputation Management Task. In 2nd Web People Search Evaluation Workshop (WePS 2010), CLEF 2010 Conference, Padova, Italy, 2010.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. E. Amigó, J. Carrillo de Albornoz, I. Chugur, A. Corujo, J. Gonzalo, T. Martin, E. Meij, M. de Rijke, and D. Spina. Overview of RepLab 2013: Evaluating online reputation monitoring systems. In Fourth International Conference of the CLEF Initiative, CLEF 2013, Valencia, Spain. Proceedings, Springer LNCS, 2013.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. E. Amigó, J. Carrillo-de-Albornoz, I. Chugur, A. Corujo, J. Gonzalo, E. Meij, M. de Rijke, and D. Spina. Overview of RepLab 2014: Author profiling and reputation dimensions for Online Reputation Management. In Proceedings of the Fifth International Conference of the CLEF Initiative, Sept. 2014.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. E. Amigó, A. Corujo, J. Gonzalo, E. Meij, and M. de Rijke. Overview of RepLab 2012: Evaluating online reputation management systems. In CLEF 2012 Labs and Workshop Notebook Papers, 2012.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. C. Dellarocas, N. F. Awad, and X. M. Zhang. Exploring the value of online reviews to organizations: Implications for revenue forecasting and planning. In Management Science, pages 1407-1424, 2003.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. A. Kothari, W. Magdy, K. Darwish, and A. Taei. Detecting comments on news articles in microblogs. In ICWSM 2013, 2013.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>7. M. A. Qureshi, C. O'Riordan, and G. Pasi. Exploiting Wikipedia for entity name disambiguation in tweets. In 15th International Conference on Applications of Natural Language to Information Systems, NLDB 2014, 2014.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. T. Zesch and I. Gurevych. Analysis of the Wikipedia Category Graph for NLP Applications. In Proceedings of the TextGraphs-2 Workshop (NAACL-HLT), 2007.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>