<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Opinion Mining Through Tracing Discussions on the Web</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Selver Softic</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Hausenblas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Information Systems &amp; Information Management, JOANNEUM RESEARCH</institution>
          ,
          <addr-line>Steyrergasse 17, 8010 Graz</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper reports on our ongoing work on opinion mining from Web-based discussion forums in the context of the Understanding Advertising (UAd) project. Our approach to opinion mining is first to RDFise discussion forums using SIOC and, in a second phase, to interlink the data thus created with linked datasets such as DBpedia. We are confident that this should allow a market researcher to formulate queries using domain semantics and hence understand what people think about a certain product or service. The system's architecture, preliminary results, and the currently available demonstrator are discussed in this work.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The analysis performed in UAd is twofold: (i) visual interpretation of
advertisements (from print media, Web and TV), and (ii) use of information
available on the Web. Fig. 1 depicts the overall UAd system architecture,
consisting of (i) the UAd Analyser (the front-end for the end-user), (ii) the “Public
Knowledge Interface”, and (iii) the Visual Analysis module. Information about
products and services is gathered from the Web through the so-called UAd
“Public Knowledge Interface” (PKI). We have developed three methods for
converting plain (HTML) Web content into structured data represented in RDF,
allowing us to be both flexible and comprehensive:</p>
      <list list-type="order">
        <list-item><p>Plain old screen scraping (in the so-called UAd Harvester/Mapper module);</p></list-item>
        <list-item><p>Pattern-based RDFising and interlinking for online discussions (the UAd Discussion Tracer);</p></list-item>
        <list-item><p>Schema-based a-priori RDFising and interlinking (for statistical data from Eurostat; described elsewhere [<xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>]).</p></list-item>
      </list>
      <p>In this paper we focus on tracing discussions on the Web, hence the two
components involved in this task (Discussion Tracer and Analyser) are highlighted in
Fig. 1.</p>
      <p>This paper is structured as follows: First, we review related work in Section 2.
Then, in Section 3 we discuss our approach to representing discussions and opinions.
In Section 4 we present the system’s architecture and discuss the data acquisition
and the market researcher’s interface. We present preliminary results in Section 5.
Finally, we discuss our findings and highlight future work in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Recent research on opinion mining has focused on sentiment analysis, simple
“pro” and “cons” classification [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and determination of semantic orientation in
opinion models using feature-based opinion summarisation on word, sentence
or document level. Typically, Natural Language Processing (NLP) [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5–7</xref>
        ] and
machine learning techniques [
        <xref ref-type="bibr" rid="ref4 ref8">4, 8</xref>
        ] have been utilised in supervised or
unsupervised modes [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ] allowing the extraction and classification of sentiment and
opinion polarisation. The workflow usually comprises three major phases:
extraction, structuring and summarisation of results. In general we subscribe to
this pattern, but differ in a number of details, mostly regarding the explicit
representation of the information.
      </p>
      <p>
        Motivated by earlier experiences [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] our approach is based on Semantic
Web technologies (RDF, SPARQL, etc.). Further, in contrast to existing work, we
use widely deployed vocabularies—e.g. Semantically-Interlinked Online
Communities (SIOC)—along with existing APIs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] for the extraction and structuring
phase. Regarding the formal representation of products and their characteristics,
it is worth noting that the W3C has recently launched the “Product Modelling
Incubator Group”<sup>2</sup>, which aims at creating a product modelling ontology.
      </p>
      <p>
        We aim at orienting the opinion holder context on domain semantics [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] along
with exploiting linked datasets (such as DBpedia [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]) and domain delimited
query expansion [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Furthermore, the creation of opinion rankings primarily for
sentiment classification [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ] will be considered in greater detail in our future
work.
      </p>
      <p>To the best of our knowledge there exists no other work in the area of opinion
mining that deals with explicitly modelled opinions along with linked data sets
for its domain knowledge.</p>
      <p>
        The basic idea of linked data was outlined by Sir Tim Berners-Lee [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] in
2006. The Linking Open Data (LOD) community project<sup>3</sup> is an open,
collaborative effort applying the linked data principles. It aims at bootstrapping the Web
of Data by publishing datasets in RDF on the Web and creating large numbers
of links between these datasets [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The datasets included in the project are
diverse in both nature and size. Currently, the project includes some 30 different
datasets, ranging from rather centralized ones (such as DBpedia [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]) to those
that are very distributed (for example the FOAF-o-sphere). While some of the
datasets focus on certain domains (for example the Eurostat data [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]), others
are more of a generic type, such as Revyu.com [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Representing Discussions and Opinions</title>
      <p>To support a market researcher in analysing a certain market, Web-based
discussion forums are one of the sources used in the UAd PKI.</p>
      <sec id="sec-3-1">
        <title>2 http://www.w3.org/2005/Incubator/w3pm/</title>
      </sec>
      <sec id="sec-3-2">
        <title>3 http://linkeddata.org/</title>
        <p>To enable structured queries and browsing, it is necessary to represent the discussions in a
machine-interpretable way and to enhance them with domain semantics. Web-based discussion
forums offer a well-structured source for this purpose, hence the idea to exploit
them along with linked datasets.</p>
        <p>Our goal is to explicitly model the opinions in a discussion in a way that is compliant
with the Web of Data. We decided to reuse an existing vocabulary to represent
the discussions rather than reinventing the wheel. Due to its popularity and
widespread use, the Semantically-Interlinked Online Communities (SIOC)
vocabulary<sup>4</sup> has been selected to represent discussion threads and posts.</p>
        <p>However, for explicitly representing opinions we did not manage to find
an appropriate vocabulary. Although one could, for example, use a review
vocabulary<sup>5</sup> as a base and extend it, we found it more suitable to define a dedicated
vocabulary for this task.</p>
        <p>Our “Opinion Mining Core Ontology”<sup>6</sup> (cf. Fig. 2) essentially defines the
following classes and properties:</p>
        <list list-type="bullet">
          <list-item><p>opm:DiscussionOpinion, the central hub that connects discussion threads with opinions about a certain entity;</p></list-item>
          <list-item><p>opm:Opinion, an abstract representation of an opinion;</p></list-item>
          <list-item><p>opm:Topic, a proxy concept to trigger aspects of a certain topic.</p></list-item>
        </list>
      </sec>
      <sec id="sec-3-3">
        <title>4 http://www.sioc-project.org/ontology</title>
      </sec>
      <sec id="sec-3-4">
        <title>5 Such as http://danja.talis.com/xmlns/rev_2007-11-09/index.html</title>
      </sec>
      <sec id="sec-3-5">
        <title>6 http://sw.joanneum.at/uad/u-opm/schema/core-u-opm.rdf</title>
        <p>
          We use skos:Concept of SKOS [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] to represent what a discussion is about,
for example, a certain car such as the Alfa Romeo 156; we note that this design
decision also supports the straightforward utilisation of data from DBpedia.
Further, we use sioc:Thread from the SIOC vocabulary to indicate where
the discussion has taken place.
        </p>
        <p>
          It has to be noted that opm:Opinion is currently deliberately underspecified.
We intend to extend and refine this part of the ontology based on our experiences
with the system and on earlier work [
          <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
          ]. Further, we want to
point out that the opm:Topic concept is used to represent a certain aspect of
a discussion; that is, it might indicate that users discuss pricing or
problems with a certain product, or simply express their satisfaction. The
semantics of this concept are such that if one of the assigned trigger words is
found in a discussion, the topic is considered to match (hence the name of
the datatype property opm:hasTrigger).
        </p>
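        <p>The trigger semantics described above can be illustrated with a minimal Python sketch. It assumes that a trigger matches when it is a prefix of a token in the post title or content; the function names and the “popular” trigger are hypothetical and are not taken from the UAd implementation, while the triggers for “performance and problems” follow Listing 1.1.</p>
        <preformat>
```python
import re

# Hypothetical topic configuration: topic name mapped to its trigger words or stems.
# The triggers for "performance_and_problems" follow Listing 1.1; "popular" is assumed.
TOPIC_TRIGGERS = {
    "performance_and_problems": ["damage", "performance", "problem"],
    "popularity": ["popular"],
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def matching_topics(title, content, topic_triggers=TOPIC_TRIGGERS):
    """Return the sorted names of all topics with a trigger that prefixes some token."""
    tokens = tokenize(title) + tokenize(content)
    matched = set()
    for topic, triggers in topic_triggers.items():
        # A trigger is treated as a word stem, so "problem" also matches "problems".
        if any(tok.startswith(trig) for tok in tokens for trig in triggers):
            matched.add(topic)
    return sorted(matched)


if __name__ == "__main__":
    print(matching_topics("Alfa 156 gearbox", "I had two problems after the service."))
```
        </preformat>
        <p>Prefix matching is only one possible reading of “words or word stems”; a stemmer or exact matching could be substituted without changing the overall flow.</p>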
        <p>The lightweight ontology introduced above plays a decisive role in our opinion
mining process. In order to achieve better scalability and reusability, it acts as a
nexus between the domain of concern and the RDFised data. This is why it makes
no difference for our opinion mining model whether the DBpedia categorisation
or some other domain-specific ontology lies behind it. Our approach therefore offers
flexibility in the choice of domain and yields a generic way of creating opinions.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion Tracing</title>
      <p>In the process of discussion tracing in UAd, two major components are involved
(Fig. 3), namely (i) the UAd Data Acquisition (highlighted), where Web-based
discussions are harvested, RDFised and interlinked, and (ii) the UAd Analyser,
which allows the data to be queried and accessed.</p>
      <sec id="sec-4-1">
        <title>Data Acquisition</title>
        <p>The data acquisition in UAd is performed in three phases. In the first phase, the
most common data in a Web-based discussion forum, such as title, author and
creation date, is RDFised using SIOC. In the second phase, the entities occurring
in the discussion posts are identified and interpreted with regard to a certain
domain (in our demonstrator this domain is “cars”). This second phase involves
interlinking with linked datasets such as DBpedia or with instances of some other
domain-specific ontology. For our purposes, DBpedia offers enough adequate
instances and a well-formed domain model for the area of interest. However,
as mentioned in Section 3, this is not mandatory, and DBpedia can easily be
replaced by any other domain ontology. Currently, interlinking with DBpedia is
done manually; in the final version we aim to automate this task.
In the third phase, the (subjective) statements of participants are analysed and
added to the knowledge base. This is mainly achieved by creating
opm:DiscussionOpinion instances and their respective properties. To this
end, we use a manually pre-configured list of possible topics, that is, instances of
opm:Topic, to trigger the creation of opinions.</p>
        <p>We have implemented a client/server system (Fig. 3, left and bottom) to
perform the data acquisition in UAd. Within the scope of our research we support
RDFising popular discussion forum types<sup>7</sup> such as vBulletin or phpBB. Data
extraction occurs automatically using extraction profiles that are manually defined for
several forum types; a single acquisition task represents a single job on the server.
The server has been implemented using a Java application server (Tomcat) along
with a Jena 2/PostgreSQL RDF store taking care of the scheduling and execution
of the acquisition tasks.</p>
        <p>On the client side, a Firefox plug-in (Fig. 4) allows a user to define, control
and monitor the tracing tasks. The plug-in has been developed in JavaScript and
XUL<sup>8</sup>. A user typically adds the link of a discussion forum and selects the forum
type. Currently, only entire forums can be extracted. We plan to support the
selection of individual sub-forums for the extraction task.
The user can also specify time parameters for the acquisition tasks, for example
how often per week a job should be triggered to update the store.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Analyser</title>
        <p>The UAd Analyser is a Web application allowing a market researcher to examine
the data gathered by the UAd Acquisition Server. Fig. 5 depicts the current state of
the implementation (built with the Google Web Toolkit<sup>9</sup>).
The user can limit the data by selecting certain car classifications and issues
as well as by restricting the time period.</p>
        <sec id="sec-4-2-1">
          <title>7 http://www.big-boards.com/statistics</title>
        </sec>
        <sec id="sec-4-2-2">
          <title>8 http://developer.mozilla.org/en/docs/XUL</title>
        </sec>
        <sec id="sec-4-2-3">
          <title>9 http://code.google.com/webtoolkit/</title>
          <p>The queried data is visualised with a
Simile Timeplot<sup>10</sup> module, displaying the time on the X-axis and the number of
discussion posts on the Y-axis. Discussion threads are illustrated as red vertical
lines; users may retrieve detailed information by clicking on a line and
browse to the discussion thread where the matching post is located.</p>
          <p>The post count for a single time unit on the X-axis reflects the
occurrence frequency of a topic. Additionally, information about the diversity of authors
who posted that day can be explored. Knowledge about author diversity
can be used, for instance, to underline how reliable or unreliable the sentiment
in the chosen posts is. The most important contribution of this visualisation is to offer
an overview of diverse discussion forums regarding a topic of interest.</p>
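          <p>The two quantities shown per time unit, post count and author diversity, could be derived roughly as in the following Python sketch. The post records, author names and the function name are hypothetical and serve only as an illustration; the sketch does not reproduce the Analyser's actual implementation.</p>
          <preformat>
```python
from collections import defaultdict
from datetime import date

# Hypothetical post records: (posting date, author, matched topic).
POSTS = [
    (date(2008, 3, 1), "alice", "popularity"),
    (date(2008, 3, 1), "bob", "popularity"),
    (date(2008, 3, 1), "alice", "popularity"),
    (date(2008, 3, 2), "carol", "popularity"),
    (date(2008, 3, 2), "dave", "performance_and_problems"),
]


def daily_summary(posts, topic):
    """Map each day to (post count, number of distinct authors) for one topic."""
    counts = defaultdict(int)
    authors = defaultdict(set)
    for day, author, post_topic in posts:
        if post_topic == topic:
            counts[day] += 1
            authors[day].add(author)
    # Days are returned in chronological order, as they would appear on a time axis.
    return {day: (counts[day], len(authors[day])) for day in sorted(counts)}


if __name__ == "__main__":
    for day, (n_posts, n_authors) in daily_summary(POSTS, "popularity").items():
        print(day.isoformat(), n_posts, n_authors)
```
          </preformat>
          <p>A day with many posts but few distinct authors would suggest that the observed sentiment rests on a small number of voices.</p>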
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Preliminary Results</title>
      <p>In order to assess our opinion mining system, a baseline evaluation using two
standard information retrieval measures (precision and recall) has been
performed. We have compared our approach to a full-text index (Lucene<sup>11</sup>). The
domain is currently limited to “cars” (as mostly advertisements from this domain are
available to us for the visual analysis), although we note that the methodology is expected to
yield similar results for other domains. The flexibility of our approach is mainly
determined by the availability of appropriate instances from DBpedia.
        <fn><label>10</label><p>http://simile.mit.edu/timeplot/</p></fn>
        <fn><label>11</label><p>http://lucene.apache.org/java/docs/</p></fn>
      </p>
      <p>Our reference data set contains approximately 1000 posts that have been
extracted from a single discussion forum<sup>12</sup>, focusing on the content of three
sub-forums including all threads and posts about certain car types. Two of the
extracted sub-forums contain discussions about cars belonging to the mid-sized
car class according to the categorisation from DBpedia<sup>13</sup>. The working data set
includes 60 representative posts (20 per car type). We have manually selected
posts containing discussion on topics such as “performance and problems” and
“popularity”.</p>
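      <p>For reference, the two measures can be computed as in the following minimal Python sketch, where precision is the fraction of retrieved posts that are relevant and recall is the fraction of relevant posts that are retrieved. The post identifiers are hypothetical and are not taken from the evaluation data.</p>
      <preformat>
```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for two collections of post identifiers."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved.intersection(relevant))
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical identifiers, not taken from the evaluation data.
    retrieved = ["p1", "p2", "p3", "p4"]
    relevant = ["p2", "p4", "p5"]
    print(precision_recall(retrieved, relevant))
```
      </preformat>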
      <p>The extracted posts were used first to generate opinions on the discussion
topics, and second to initialise the index over the reference test
data (for Lucene). We converted each post into a single file containing
the posting date, author, post URI, content and title
of the thread the post belongs to, which allowed us to create an index searchable by
Lucene. The Lucene index contains the fields author, title, summary, content
and link to post, corresponding to the properties in the RDFised data, with the
intention of providing a starting point as similar as possible to the RDFised data for the
comparison and measurement of results.</p>
      <p>
        Prior to the manual creation of the triggers for discussion topics we
analysed the initialised fields of the Lucene index for the occurrence frequency of
specific keywords and the “Zipf” distributions [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. As depicted in Listing 1.1,
topic triggers contain words or word stems that serve as annotation events.
Opinion generation is initiated when trigger words match words from
the content or title of posts. An example discussion opinion generated in this
way is shown in Listing 1.2.
        <fn><label>12</label><p>http://www.automotiveforums.com</p></fn>
        <fn><label>13</label><p>http://dbpedia.org/resource/Category:Mid-size_cars</p></fn>
      </p>
      <preformat>@prefix : &lt;http://sw.joanneum.at/uad/cars/topics#&gt; .
@prefix dc: &lt;http://purl.org/dc/elements/1.1/&gt; .
@prefix opm: &lt;http://sw.joanneum.at/uad/u-opm/schema/core-u-opm.rdf#&gt; .

:performance_and_problems a opm:Topic ;
    dc:subject "performance and problems" ;
    opm:hasTrigger "damage" ,
                   "performance" ,
                   ...
                   "problem" .</preformat>
      <p>Listing 1.1. Sample discussion topic snippet.</p>
      <preformat>@prefix : &lt;http://sw.joanneum.at/uad/cars/opinions#&gt; .
@prefix utop: &lt;http://sw.joanneum.at/uad/cars/topics#&gt; .
@prefix opm: &lt;http://sw.joanneum.at/uad/u-opm/schema/core-u-opm.rdf#&gt; .

:do11 a opm:DiscussionOpinion ;
    opm:about &lt;http://dbpedia.org/resource/Alfa_Romeo_156&gt; ;
    opm:in &lt;http://www.automotiveforums.com/vbul...php?t=173469&gt; ;
    opm:onTopic utop:performance_and_problems .</preformat>
      <p>Listing 1.2. Sample generated discussion opinion.</p>
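      <p>The shape of a generated discussion opinion, and the kind of selection later performed over such data, can be sketched in plain Python using tuples as triples. This is an illustration under stated assumptions rather than the UAd implementation: the function names, the prefixed names “dbpedia:” and “thread:”, the second opinion “do12” and its thread identifier are all hypothetical.</p>
      <preformat>
```python
# Prefixed names stand in for full URIs; "dbpedia:" and "thread:" are
# abbreviations introduced here for illustration only.
def opinion_triples(opinion_id, entity, thread, topic):
    """Return the triples describing one opm:DiscussionOpinion instance."""
    subject = ":" + opinion_id
    return [
        (subject, "rdf:type", "opm:DiscussionOpinion"),
        (subject, "opm:about", entity),
        (subject, "opm:in", thread),
        (subject, "opm:onTopic", "utop:" + topic),
    ]


def opinions_on(triples, entity, topic):
    """Select opinion subjects that are about `entity` and on `topic`."""
    about = {s for s, p, o in triples if p == "opm:about" and o == entity}
    on_topic = {s for s, p, o in triples
                if p == "opm:onTopic" and o == "utop:" + topic}
    return sorted(about.intersection(on_topic))


if __name__ == "__main__":
    store = opinion_triples("do11", "dbpedia:Alfa_Romeo_156", "thread:173469",
                            "performance_and_problems")
    # Hypothetical second opinion, added only to make the selection non-trivial.
    store += opinion_triples("do12", "dbpedia:Alfa_Romeo_156", "thread:173470",
                             "popularity")
    print(opinions_on(store, "dbpedia:Alfa_Romeo_156", "performance_and_problems"))
```
      </preformat>
      <p>In a real deployment the triples would live in an RDF store and the selection would be expressed in SPARQL; the tuple representation merely keeps the sketch free of external dependencies.</p>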
      <sec id="sec-5-1">
        <title>Results</title>
        <p>For the evaluation we have compared our method with the standard Lucene
retrieval results of simple queries. Additionally, we looked at extended
Lucene queries; these extended queries have been used to decrease the influence of a
single trigger. Listing 1.3 shows a sample SPARQL query we have used for our
approach.</p>
        <preformat>PREFIX owl: &lt;http://www.w3.org/2002/07/owl#&gt;
PREFIX utop: &lt;http://sw.joanneum.at/uad/cars/topics#&gt;
PREFIX opm: &lt;http://sw.joanneum.at/uad/u-opm/schema/core-u-opm.rdf#&gt;

# Select all discussion opinions on the topic "performance and problems"
# whose subject is identified with the DBpedia resource for the Alfa Romeo 156.
SELECT * FROM &lt;http://sw.joanneum.at/uad&gt;
WHERE {
  ?do a opm:DiscussionOpinion ;
      opm:about ?about ;
      opm:in ?in ;
      opm:onTopic utop:performance_and_problems .
  ?about owl:sameAs &lt;http://dbpedia.org/resource/Alfa_Romeo_156&gt; .
}</preformat>
        <p>Listing 1.3. Sample opinion mining SPARQL query.</p>
        <p>From Table 1 we learn that, regarding recall, our method unsurprisingly
seems to outperform simple full-text indexing. Even in the extended mode, Lucene’s
precision and recall values are below those of our approach.</p>
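        <table-wrap>
          <label>Table 1.</label>
          <caption><p>Results from the evaluation (Lucene vs. UAd).</p></caption>
          <table>
            <thead>
              <tr><th rowspan="2">Measure</th><th rowspan="2">Query mode</th><th colspan="2">Lucene</th><th colspan="2">UAd Analyser</th></tr>
              <tr><th>“performance and problems”</th><th>“popularity”</th><th>“performance and problems”</th><th>“popularity”</th></tr>
            </thead>
            <tbody>
              <tr><td rowspan="2">Precision</td><td>simple</td><td>0.4</td><td>1</td><td>0.76</td><td>0.86</td></tr>
              <tr><td>extended</td><td>0.2–0.62</td><td>0.56–0.86</td><td/><td/></tr>
              <tr><td rowspan="2">Recall</td><td>simple</td><td>0.1</td><td>0.05</td><td colspan="2">0.95</td></tr>
              <tr><td>extended</td><td>0.05–0.8</td><td>0.3–0.7</td><td/><td/></tr>
            </tbody>
          </table>
        </table-wrap>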
        <p>Although we have used a rather limited working set in this evaluation, we
are optimistic that the results scale well with respect to both data size and other domains;
further evaluations are within the scope of our current research.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper we have proposed a novel approach to opinion mining on the Web
using Web of Data technologies and linked datasets. Our goal is to explicitly
model opinions found in discussions on the Web; we have developed a corresponding
vocabulary to represent these opinions formally (in RDF) and have reported on
an implementation of this approach.</p>
      <p>We are considering using GoodRelations<sup>14</sup>, an ontology for linking
product descriptions and business entities on the Web, in order to describe
the target of a discussion in our domain more accurately.</p>
      <p>
        To increase the precision we are considering extending our opinion mining core
mechanism with Natural Language Processing techniques and/or using neural
networks to categorise topics automatically. As part of the sentiment classification
we aim to use SentiWordNet [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] or other similar approaches for the creation of
opinion rankings based on trigger occurrences and the so-called PN-polarity<sup>15</sup>
of the content.
      </p>
      <p>Currently, we summarise results visually with respect to topics, identities, time
and occurrence frequency in order to mirror the sentiment intention in the environment
of the opinions. However, we do not currently dive into the sentiment interpretation of
opinions. Considering the visual analysis, it is important to mention that
sentiment interpretation is subject to the judgement of the end user and their
point of view. Nevertheless, objective parameters such as time period, identities and
number of posts can be evaluated independently of the particular matter of interest.
For further evaluations, user-annotated content such as reviews will
additionally be used.
        <fn><label>14</label><p>http://www.heppnetz.de/projects/goodrelations/</p></fn>
        <fn><label>15</label><p>P stands for “Positive” and N for “Negative” in this context.</p></fn>
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The research reported in this paper has been carried out in the “Understanding
Advertising” (UAd) project, funded by the Austrian FIT-IT Programme. The
authors would like to thank their colleagues Magdalena Lauber, Wolfgang Weiss,
and Werner Bailer for their support and valuable comments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>P.</given-names>
            <surname>Mika</surname>
          </string-name>
          .
          <article-title>Microsearch: An Interface for Semantic Search</article-title>
          .
          <source>In Proc. of the Workshop on Semantic Search (SemSearch 2008) at the 5th European Semantic Web Conference (ESWC 2008)</source>
          , Tenerife, Spain, volume
          <volume>334</volume>
          of CEUR Workshop Proceedings. CEUR-WS.org,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>W.</given-names>
            <surname>Halb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Raimond</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausenblas</surname>
          </string-name>
          .
          <article-title>Building Linked Data For Both Humans and Machines</article-title>
          .
          <source>In WWW 2008 Workshop: Linked Data on the Web (LDOW2008)</source>
          , Beijing, China,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausenblas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Halb</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Raimond</surname>
          </string-name>
          .
          <article-title>Scripting User Contributed Interlinking</article-title>
          .
          <source>In 4th Workshop on Scripting for the Semantic Web (SFSW08)</source>
          , Tenerife, Spain,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <article-title>Automatic identification of pro and con reasons in online reviews</article-title>
          .
          <source>In Proceedings of the COLING/ACL on Main conference poster sessions</source>
          , pages
          <fpage>483</fpage>
          -
          <lpage>490</lpage>
          , Morristown, NJ, USA,
          <year>2006</year>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hu</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Mining opinion features in customer reviews</article-title>
          .
          <source>In American Association for Artificial Intelligence at AAAI-04</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hu</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Mining and summarizing customer reviews</article-title>
          .
          <source>In Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining at KDD-2004</source>
          , pages
          <fpage>168</fpage>
          -
          <lpage>177</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>K.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lawrence</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Pennock</surname>
          </string-name>
          .
          <article-title>Mining the peanut gallery: Opinion extraction and semantic classification of product reviews</article-title>
          .
          <source>In WWW2003 - The Twelfth International World Wide Web Conference</source>
          , Budapest, HUNGARY,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>N.</given-names>
            <surname>Kobayashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Inui</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Matsumoto</surname>
          </string-name>
          .
          <article-title>Opinion Mining from Web Documents: Extraction and Structurization</article-title>
          .
          <source>Informational and Media Technologies</source>
          <volume>2</volume>
          (
          <issue>1</issue>
          ),
          <volume>12</volume>
          (
          <issue>1</issue>
          ):
          <fpage>326</fpage>
          -
          <lpage>337</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ipeirotis</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Sundararajan</surname>
          </string-name>
          .
          <article-title>Opinion Mining using Econometrics: A Case Study on Reputation Systems</article-title>
          .
          <source>In Proceedings of the Association for Computational Linguistics (ACL)</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>M.</given-names>
            <surname>Gamon</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Aue</surname>
          </string-name>
          .
          <article-title>Automatic identification of sentiment vocabulary: Exploiting low association with known sentiment terms</article-title>
          .
          <source>In Proceedings of the ACL05 Workshop on Feature Engineering for Machine Learning in Natural Language Processing</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausenblas</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Rehatschek</surname>
          </string-name>
          .
          <article-title>mle: Enhancing the Exploration of Mailing List Archives Through Making Semantics Explicit</article-title>
          .
          <source>In Semantic Web Challenge 2007 at the 6th International Semantic Web Conference (ISWC07)</source>
          , Busan, South Korea,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>S.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Berrueta</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.E.</given-names>
            <surname>Labra</surname>
          </string-name>
          .
          <article-title>Mailing Lists Meet The Semantic Web</article-title>
          .
          <source>In Proc. of the BIS 2007 Workshop on Social Aspects of the Web</source>
          , Poznan, Poland,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>S.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giasson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Idehen</surname>
          </string-name>
          .
          <article-title>SIOC Ontology: Applications and Implementation Status</article-title>
          . http://www.sioc-project.org/applications#creating-api,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kobilarov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z. G.</given-names>
            <surname>Ives</surname>
          </string-name>
          .
          <article-title>DBpedia: A Nucleus for a Web of Open Data</article-title>
          .
          <source>In The Semantic Web, 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC 2007</source>
          , pages
          <fpage>722</fpage>
          -
          <lpage>735</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>J.</given-names>
            <surname>Bhogal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Macfarlane</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <article-title>A review of ontology based query expansion</article-title>
          .
          <source>Inf. Process. Manage.</source>
          ,
          <volume>43</volume>
          (
          <issue>4</issue>
          ):
          <fpage>866</fpage>
          -
          <lpage>886</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>A.</given-names>
            <surname>Esuli</surname>
          </string-name>
          .
          <article-title>Opinion Mining</article-title>
          . Presentation slides, Language and Intelligence Reading Group, June 14, 2006, Pisa, Italy, Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Opinion Mining and Summarization, Sentiment Analysis</article-title>
          .
          <source>Presentation slides, tutorial given at WWW-2008</source>
          , April 21, 2008, Beijing, China, Department of Computer Science, University of Illinois at Chicago,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <article-title>Linked Data</article-title>
          . http://www.w3.org/DesignIssues/LinkedData.html,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ayers</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Raimond</surname>
          </string-name>
          .
          <article-title>Interlinking Open Data on the Web (Poster)</article-title>
          .
          <source>In 4th European Semantic Web Conference (ESWC2007)</source>
          , pages
          <fpage>802</fpage>
          -
          <lpage>815</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          .
          <article-title>Revyu.com: a Reviewing and Rating Site for the Web of Data</article-title>
          .
          <source>In The Semantic Web, 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC 2007</source>
          , pages
          <fpage>895</fpage>
          -
          <lpage>902</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21. Semantic Web Deployment Working Group.
          <article-title>SKOS Simple Knowledge Organization System Reference</article-title>
          . W3C Working Draft, Semantic Web Deployment Working Group,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Zipf</surname>
          </string-name>
          .
          <article-title>Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology</article-title>
          . Addison-Wesley,
          <year>1949</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>A.</given-names>
            <surname>Esuli</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Sebastiani</surname>
          </string-name>
          .
          <article-title>SentiWordnet: A Publicly Available Lexical Resource for Opinion Mining</article-title>
          .
          <source>In 5th Conference on Language Resources and Evaluation</source>
          (May 22-28, 2006), Genova, Italy,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>