<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>QASM: a Q&amp;A Social Media System Based on Social Semantic</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zide Meng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabien Gandon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catherine Faron-Zucker</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INRIA Sophia Antipolis Méditerranée</institution>
          ,
          <addr-line>06900 Sophia Antipolis</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Univ. Nice Sophia Antipolis</institution>
          ,
          <addr-line>CNRS, I3S, UMR 7271, 06900 Sophia Antipolis</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we describe the QASM (Question &amp; Answer Social Media) system based on social network analysis to manage the two main resources in CQA sites: users and contents. We first present the QASM vocabulary used to formalize both the level of interest and the expertise of users on topics. Then we present our method to extract this knowledge from CQA sites. Finally we show how this knowledge is used both to find relevant experts for a question and to search for similar questions. We tested QASM on a dataset extracted from the popular CQA site StackOverflow.</p>
      </abstract>
      <kwd-group>
        <kwd>Community Question Answering</kwd>
        <kwd>Social Media Mining</kwd>
        <kwd>Semantic Web</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Community Question Answering (CQA) services provide a platform where users
can ask experts for help. Since questions and answers can be viewed and searched
afterwards, people with similar questions can also find solutions directly by
browsing this content. Therefore, effectively managing this content is a key
issue. Previous research on this topic mainly focuses on expert detection [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and
similar question retrieval [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In this paper, we describe QASM (Question &amp;
Answer Social Media), a system based on social network analysis (SNA) to manage
the two main resources in CQA sites: users and contents. We first present the
QASM vocabulary used to formalize both the level of interest and the expertise
of users on topics. Then we present our method to extract this knowledge from
CQA sites. Our knowledge model and knowledge extraction method are an
extension of our work presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] on social media mining for detecting topics from
question tags in CQA sites. Finally we show how this knowledge is used both to
find relevant experts for routing questions (users who are interested and expert in the
question topics) and to find answers to questions by browsing CQA content and
by identifying relevant answers to similar questions previously posted. We tested
QASM on a dataset extracted from the popular CQA site StackOverflow.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Overview</title>
      <sec id="sec-2-1">
        <title>QASM System Description</title>
        <p>The QASM vocabulary (available online at http://ns.inria.fr/qasm/qasm.html) makes it possible to model
users' levels of interest and expertise as well as the topics of questions and answers from Q&amp;A sites. Figure 2 provides an
overview of it. It reuses both the SIOC ontology (http://sioc-project.org/ontology) and the Weighting ontology
(http://smiy.sourceforge.net/wo/spec/weightingontology.html).
- qasm:Topic represents a set of tags related to a specified topic. In our
model, tags belong to instances of qasm:Topic, and we consider that different tags
have different weights for each topic.</p>
        <sec id="sec-2-1-1">
          <p>- qasm:WeightedObject is used to describe the weight that a specified subject
has with regard to a specified object. This class has four subclasses which
represent question topics, users' interests, users' expertise and tag topics
respectively. This class models the distributions we extracted
from the original data, for example the topic-tag distribution and the user-interest
distribution.
- qasm:interestIn is used to describe the user-interest distribution. This
property differs from foaf:interest in its range. In FOAF people are
interested in documents, while in QASM a user is interested in a topic to a
certain degree (a weight).
- qasm:expertiseIn is used to describe the user-expertise distribution. A user
has different weights for different topics.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Knowledge Extraction by Social Media Mining</title>
      <p>
        Topics, interests and levels of expertise are implicit information in the available
raw CQA data. We use social media mining techniques to extract this knowledge.
- Topics &amp; User Interests: In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we proposed a lightweight model to extract
topics from question tags. The output of this model is a topic-tag distribution
where each tag belonging to a topic is given a weight (probability) indicating
to what extent the tag is related to the topic. A user answering a question
acquires the tags attached to this question and can therefore be represented
by a list of tags. We then use the topic-tag distribution to compute a
user-topic distribution indicating to what extent each user is related to a topic.
- User Expertise: The users interested in a question may provide answers to it
or comment on other answers. Each question or answer may get votes from
other users, and an answer may be chosen as the best answer. By exploiting
the tags attached to a question and the topic-tag distribution, users who
provide questions or answers with a high number of votes, or best
answers, can be considered experts in the topics to which their questions
belong. Equation 1 defines how we use the vote information to compute
users' levels of expertise. E<sub>u,k</sub> denotes the expertise of user u on topic k, m
denotes the number of answers provided by user u, P<sub>t,k</sub> denotes the weight
of tag t for topic k, and Q<sub>i</sub> and A<sub>i,j</sub> denote the votes on question i and on its j-th
answer, where the j-th answer is the one provided by user u to question i.
(1)
      </p>
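      <p>As an illustration of the interest computation described above, the following Python sketch derives a user-topic distribution from a topic-tag distribution and the tags a user acquired by answering questions. The aggregation rule (summing the weights of the user's tags per topic, then normalizing over topics) is an assumption made for this example, since the text does not give the exact formula; the function name, topics, tags and weights are likewise hypothetical.</p>
      <preformat>
```python
from collections import Counter


def user_topic_distribution(user_tags, topic_tag):
    """Aggregate topic-tag weights over a user's tags and normalize.

    user_tags: list of tags acquired from the questions the user answered
               (a tag seen several times counts several times).
    topic_tag: dict mapping each topic to a {tag: weight} dict,
               i.e. a topic-tag distribution.
    Assumption: the score of a user for a topic is the sum of the weights
    of the user's tags in that topic, normalized over all topics.
    """
    tag_counts = Counter(user_tags)
    scores = {
        topic: sum(weights.get(tag, 0.0) * n for tag, n in tag_counts.items())
        for topic, weights in topic_tag.items()
    }
    total = sum(scores.values())
    if total == 0:
        # The user has no tag known to any topic.
        return {topic: 0.0 for topic in scores}
    return {topic: score / total for topic, score in scores.items()}


# Toy data: hypothetical topics, tags and weights.
TOPIC_TAG = {
    "web": {"html": 0.5, "css": 0.3, "javascript": 0.2},
    "data": {"sql": 0.6, "python": 0.4},
}
USER_TAGS = ["html", "css", "python", "html"]

if __name__ == "__main__":
    print(user_topic_distribution(USER_TAGS, TOPIC_TAG))
```
      </preformat>
      <p>In this toy setting the user is mostly related to the hypothetical "web" topic, because three of the four acquired tags carry weight in that topic.</p>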
    </sec>
    <sec id="sec-4">
      <title>Experimental Evaluation</title>
      <p>
        We first built an RDF dataset from StackOverflow raw data, which comprises
15,327,727 triples (available online at https://wimmics.inria.fr/data). Then we randomly chose several questions and for each
question we recorded 10 or 20 users returned by our system. For each question,
we then computed the proportion of the recorded users who actually answered it.
Our precision is higher than that reported in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] at both cutoffs: on average 0.0195 versus 0.0167 for precision@10, and 0.0143 versus 0.0118 for precision@20.
      </p>
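      <table-wrap>
        <table>
          <thead>
            <tr>
              <th />
              <th>100</th>
              <th>500</th>
              <th>1000</th>
              <th>average</th>
              <th>[<xref ref-type="bibr" rid="ref4">4</xref>]</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>precision@10</td>
              <td>0.021</td>
              <td>0.0188</td>
              <td>0.0187</td>
              <td>0.0195</td>
              <td>0.0167</td>
            </tr>
            <tr>
              <td>precision@20</td>
              <td>0.016</td>
              <td>0.0134</td>
              <td>0.0134</td>
              <td>0.0143</td>
              <td>0.0118</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>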
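      <p>The evaluation measure described in this section, the proportion of recorded users who actually answered a question, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the code used in the experiments: the function names, the user identifiers and the toy data are hypothetical, and the per-question scores are assumed to be averaged with equal weight.</p>
      <preformat>
```python
def precision_at_n(recommended, answerers, n):
    """Proportion of the first n recommended users who actually answered.

    recommended: ranked list of user ids returned by the system.
    answerers:   set of user ids who actually answered the question.
    """
    top = recommended[:n]
    if not top:
        return 0.0
    hits = sum(1 for user in top if user in answerers)
    return hits / len(top)


def mean_precision_at_n(results, n):
    """Average precision@n over (recommended, answerers) pairs.

    Assumption: every question contributes equally to the average.
    """
    if not results:
        return 0.0
    return sum(precision_at_n(rec, ans, n) for rec, ans in results) / len(results)


# Toy data: two hypothetical questions with four recommended users each.
RESULTS = [
    (["u1", "u2", "u3", "u4"], {"u2"}),
    (["u5", "u6", "u7", "u8"], {"u9"}),
]

if __name__ == "__main__":
    print(mean_precision_at_n(RESULTS, 4))
```
      </preformat>
      <p>Because a question usually has only a few answerers, a measure of this kind stays small in absolute terms even when a recommended user does answer, which is consistent with the magnitude of the values in the table.</p>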
      <sec id="sec-4-1">
        <title>Conclusion and Future Work</title>
        <p>We presented QASM, a Q&amp;A system combining social media mining and
semantic web models and technologies to manage Q&amp;A users and content in CQA
sites. There are many potential future directions for this work. We are currently
considering constructing a benchmark for Q&amp;A systems based on our
StackOverflow dataset. In the near future we will also enrich the linking of QASM with the
Linked Open Data (LOD) cloud, which may help to improve question routing and similar question search.</p>
        <sec id="sec-4-1-1">
          <title>4 It is available online at https://wimmics.inria.fr/data</title>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huttenlocher</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kleinberg</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leskovec</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Discovering value from community activity on focused question answering sites: a case study of stack overflow</article-title>
          .
          <source>In Proc. of the 18th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qiu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottipati</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>CQArank: jointly model topics and expertise in community question answering</article-title>
          .
          <source>In Proc. of the 22nd ACM Int. Conf. on Information &amp; Knowledge Management</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Meng</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gandon</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faron-Zucker</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Empirical Study on Overlapping Community Detection in Question and Answer Sites</article-title>
          .
          <source>In Proc. of the Int. Conf. on Advances in Social Networks Analysis and Mining</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Routing questions for collaborative answering in community question answering</article-title>
          .
          <source>In Proc. of Int. Conf. on Advances in Social Networks Analysis and Mining (ASONAM)</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>