<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Identification of Opinions in Arabic Texts using Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Farek Lazhar</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tlili-Guiassa Yamina</string-name>
        </contrib>
      </contrib-group>
      <fpage>61</fpage>
      <lpage>64</lpage>
      <abstract>
        <p>A powerful tool for tracking opinions in forums, blogs, e-business sites, and similar sources has become essential for companies, politicians, and customers, because the huge amount of available text makes manual exploration increasingly difficult and impractical. In this paper, we present our approach to opinion identification, based on an ontological exploration of texts. The approach aims to study the role of domain ontologies and their contribution to the identification phase. It requires a domain ontology and a sentiment lexicon as prerequisites.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        The opinions available on the Internet have a significant impact on
users. For example, users who have already researched opinions on
a product are willing to pay more for a product that is reviewed
more favorably than another, and that product will sell better
than one whose reviews are less favorable
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        Companies, politicians, and customers need a powerful tool to
track the opinions, sentiments, judgments, and beliefs that people
express in blogs, comments, or other texts about a
product, a service, a person, an organization, etc. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
In opinion mining, using expressions as a “bag of
sentiment words” to detect the semantic orientation of the
overall content of a text requires labeling those
expressions as positive, negative or neutral towards a given
topic [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>Generally, research in this area can be grouped into three
main categories:</p>
      <list list-type="bullet">
        <list-item>
          <p>
            Development of linguistic and cognitive models for
opinion mining, in which dictionary-based or corpus-based
approaches are applied automatically or
semi-automatically to extract opinions based on the semantic
orientations of words and phrases [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ];
          </p>
        </list-item>
        <list-item>
          <p>
            Opinion extraction from texts, where all the local
opinions are aggregated to determine the overall
orientation of a text [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ], [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ], [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ];
          </p>
        </list-item>
        <list-item>
          <p>
            Feature-based opinion mining, where all the opinions
expressed towards the characteristics of a product or an
object are extracted and summarized [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ], [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ], [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ].
          </p>
        </list-item>
      </list>
      <p>This article focuses on identification and classification of
opinions in Arabic texts, which aims to calculate the semantic</p>
      <p>How can this set of features be obtained? Which features are related to each other? Which knowledge representation model should be used to produce an understandable summary of the studied domain?</p>
      <p>To answer these questions, we propose in this paper to
study the role of ontologies in opinion mining. More
specifically, our goal is to study how a domain
ontology can be used to:</p>
      <list list-type="bullet">
        <list-item>
          <p>Structure the features;</p>
        </list-item>
        <list-item>
          <p>Extract explicit and implicit features from the texts;</p>
        </list-item>
        <list-item>
          <p>Produce summaries based on reviews and user comments.</p>
        </list-item>
      </list>
      <p>The paper is organized as follows. Section 2 presents the state
of the art of the main approaches used in the field and the
motivations of our work. The next section presents our
approach and the general architecture of the opinion identification
process.</p>
    </sec>
    <sec id="sec-2">
      <title>STATE OF THE ART</title>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>Overall, two main types of work can be distinguished: those that are
based on simple extraction of features from the texts, and those
that organize features into a hierarchy using taxonomies or
ontologies. The extraction process mainly concerns explicit
features. We can distinguish two main families:</p>
    </sec>
    <sec id="sec-4">
      <title>Opinion Mining Representation Models without Knowledge</title>
      <p>
        Approaches that do not use knowledge representation
models rely on algorithms to discover the
different characteristics of a product or an object. Only the
opinion expressions (adjectival and adverbial) are
extracted; a summary is then produced that shows, for each
characteristic, the positive and negative opinions and the
total number in each category [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        The main limitation of these approaches is the large
number of extracted features and their lack of organization. In
addition, similar concepts are not grouped (for example, in
some domains, the words “موعد” and “لقاء”, which have the
same meaning, “appointment”), and possible relationships
between the features of an object are not recognized
(for example, “قهوة”, “coffee”, is a more specific term than “شرب”,
“drink”). Thus, the polarity (positive, negative or
neutral) of the text is determined by assigning the dominant polarity
of the opinion words, regardless of the polarities associated with
each feature individually [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
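      <p>The limitation described above can be illustrated with a minimal Python sketch. It contrasts a text-level label obtained from the dominant polarity of the opinion words with a per-feature summary of positive and negative counts. The function names and the features “screen” and “battery” are hypothetical and are not taken from the cited systems.</p>
      <preformat>
```python
from collections import Counter

def dominant_polarity(opinion_polarities):
    """Text-level label: the most frequent polarity among all opinion words."""
    counts = Counter(opinion_polarities)
    return counts.most_common(1)[0][0]

def per_feature_summary(pairs):
    """Count opinions per feature from (feature, polarity) pairs."""
    summary = {}
    for feature, polarity in pairs:
        summary.setdefault(feature, Counter())[polarity] += 1
    return summary

# Hypothetical (feature, polarity) pairs extracted from one review.
pairs = [("screen", "positive"), ("screen", "positive"), ("battery", "negative")]

overall = dominant_polarity([polarity for _, polarity in pairs])
by_feature = per_feature_summary(pairs)
```
      </preformat>
      <p>In this sketch the review is labeled positive as a whole, while the per-feature summary still records the negative opinion on the battery, which the dominant-polarity label hides.</p>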
    </sec>
    <sec id="sec-6">
      <title>Opinion Mining Representation Models with Knowledge</title>
      <p>This family can itself be divided into two subfamilies:</p>
      <p>(a) Use of Taxonomies</p>
      <p>
        Approaches of this kind do not seek a simple list of features, but
rather a hierarchically organized list built with taxonomies.
Recall that a taxonomy is a list of terms organized
hierarchically through an “is a kind of” relation. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the
authors use predefined taxonomies and semantic similarity
measures to automatically extract the features and calculate
the distances between concepts.
      </p>
      <p>Generally, the use of taxonomies is coupled with a
classification technique: the sentences corresponding to the
leaves of the taxonomy are extracted. At the end of the
process, a summary that can be more or less detailed is
produced.</p>
      <p>(b) Use of Ontologies</p>
      <p>These approaches aim to organize the features using
elaborate representation models. Unlike a taxonomy, an
ontology is not restricted to a hierarchical relationship
between concepts, but can describe other types of
paradigmatic relations such as synonymy, or more complex
relationships such as composition or spatial
relationships.</p>
      <p>
        Generally, the extracted features correspond exclusively to
terms contained in the ontology. The feature extraction phase
is guided by a domain ontology, built manually [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], or
semiautomatically [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which is then enriched by an
automatic term extraction process that identifies new
features.
      </p>
      <p>Similar features are grouped together using semantic
similarity measures.</p>
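      <p>As an illustration of the difference between a taxonomy and an ontology, the following Python sketch stores labelled concepts, “is a kind of” links, and named properties in one structure. The class Ontology and its fields are assumptions made for this example, not the representation used in the cited works. The concepts reuse the English glosses of examples given in this paper (coffee and drink, the two words for appointment, and a drive property between driver and car).</p>
      <preformat>
```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    """Toy domain ontology: labelled concepts, is-a links, named properties."""
    labels: dict = field(default_factory=dict)      # concept -> set of surface labels
    is_a: dict = field(default_factory=dict)        # concept -> parent concept
    properties: list = field(default_factory=list)  # (name, source concept, target concept)

    def ancestors(self, concept):
        """Follow is-a links upward; a plain taxonomy offers only this relation."""
        chain = []
        while concept in self.is_a:
            concept = self.is_a[concept]
            chain.append(concept)
        return chain

    def related(self, concept):
        """Concepts linked to the given one through non-hierarchical properties."""
        out = set()
        for name, src, dst in self.properties:
            if src == concept:
                out.add((name, dst))
            elif dst == concept:
                out.add((name, src))
        return out

# Hypothetical content, using English glosses instead of Arabic labels.
onto = Ontology(
    labels={"coffee": {"coffee"}, "drink": {"drink"},
            "appointment": {"appointment", "meeting"}},
    is_a={"coffee": "drink"},
    properties=[("drive", "driver", "car")],
)
```
      </preformat>
      <p>Here the synonym set attached to a concept plays the role of the lexical component, the is_a mapping alone corresponds to a taxonomy, and the properties list carries the additional relations that only an ontology describes.</p>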
      <p>
        We rely on the conceptual model described in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and on the
definition given in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which defines an elementary discourse
unit (EDU) as a clause containing at least one elementary opinion
unit (EOU), or a sequence of clauses that together bear a rhetorical
relation to a segment expressing an opinion. An EOU is
an explicit opinion expression composed of a noun, an
adjective or a verb with its possible modifiers (negation and
adverbs).
      </p>
      <p>In a review, the opinion holder comments on a set of features of an
object or a product using opinion expressions. Each feature
corresponds to a concept or a property in the ontology O.</p>
      <p>
        Ontologies have also been used to support polarity mining.
For example, in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the authors manually built an ontology
for movie reviews and incorporated it into the polarity
classification task, which substantially improved the
performance of their approach.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Ontology Based Opinion Mining</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], the use of a hierarchy of features improves the
performance of feature-based identification systems.
However, works that use domain ontologies exploit the ontology
as a taxonomy, using only “is a” relations between concepts.
They do not really use all the data stored in an ontology, such as
the lexical components and other types of relationships. We
believe that several advantages can be gained in
opinion mining by making full use of the capabilities of a domain
ontology:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>Structuring of features: ontologies are tools that
provide a lot of semantic information. They help to
define the concepts, relationships, and entities that
describe a domain with an unlimited number of terms;</p>
        </list-item>
        <list-item>
          <p>Extraction of features: the relationships between concepts
and the lexical information can be used to extract explicit
and implicit features.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-8">
      <title>OUR APPROACH</title>
    </sec>
    <sec id="sec-9">
      <title>Description</title>
      <p>For each studied domain, our approach requires three basic
elements:</p>
      <list list-type="bullet">
        <list-item>
          <p>A domain ontology O, where each concept and each
property is associated with a set of labels that correspond
to their semantics;</p>
        </list-item>
        <list-item>
          <p>A lexical resource L of opinion expressions;</p>
        </list-item>
        <list-item>
          <p>A set of texts T as comments and views.</p>
        </list-item>
      </list>
      <sec id="sec-9-2">
        <title>For each extracted EDU, the system:</title>
        <p>For example, in the following comment, the EDUs are between
square brackets, the EOUs are underlined, and the characteristics
of the object are in bold. There is an inverse relationship
between EDU a and EDU b, representing the review
expressed in EDU d.</p>
        <p>a[يوم أمس، اشتريت جهاز هاتف]
b[حتى إذا كان الهاتف ممتازا]
c[فإن التصميم بسيط جدا]
d[الشيء المخيب للآمال في هذه العلامة]</p>
        <p>[Yesterday, I purchased a phone]a. [Even if the phone is
excellent]b, [the design is very basic]c, [which is disappointing
in this brand]d.</p>
        <p>This step aims to extract from the comment all the labels of the
ontology. As each concept is an explicit feature, we simply
project the lexical components of the ontology onto the text to
obtain, for each EDU, all of its features. To extract the implicit
features, ontology properties are used. Recall that these
properties define the relationships between the concepts of the
ontology. For example, the property “يسوق”, “drive”, links the
concepts “سائق”, “driver”, and “سيارة”, “car”.</p>
        <p>[...] the opinion expressions used. For example, if the domain
ontology contains the concept “طبيعة”, “nature”, and
the sentiment lexicon contains the word “خلاب”,
“amazing”, then from the EDU “طبيعة خلابة”, “amazing
nature”, it is easy to extract the pair (طبيعة, خلابة),
(nature, amazing) from the text.</p>
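        <p>The projection step described above can be sketched in Python as follows. Explicit features are the concepts that have at least one label present in the EDU. The rule used here for implicit features (a property label found in the EDU evokes the concepts that the property links) is an assumption, since the text does not specify how properties are matched. All names and the English label sets are hypothetical, and whitespace tokenization stands in for proper Arabic tokenization.</p>
        <preformat>
```python
def extract_features(edu_tokens, concept_labels, property_links):
    """Return (explicit, implicit) feature sets for one tokenized EDU."""
    tokens = set(edu_tokens)
    # Explicit features: concepts with at least one label present in the EDU.
    explicit = {concept for concept, labels in concept_labels.items()
                if tokens.intersection(labels)}
    # Implicit features (assumed rule): a property label in the EDU
    # evokes the concepts it links, unless they are already explicit.
    implicit = set()
    for prop_labels, linked_concepts in property_links:
        if tokens.intersection(prop_labels):
            implicit.update(set(linked_concepts) - explicit)
    return explicit, implicit

# Hypothetical ontology fragment, with English glosses as labels.
concept_labels = {
    "car": {"car", "vehicle"},
    "driver": {"driver"},
    "design": {"design"},
}
property_links = [({"drive", "drives", "drove"}, ("driver", "car"))]

explicit, implicit = extract_features(
    "the driver drives well".split(), concept_labels, property_links
)
```
        </preformat>
        <p>For this EDU the sketch finds the explicit feature “driver” and, through the drive property, the implicit feature “car”, which is never named in the text.</p>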
        <sec id="sec-9-2-1">
          <title>Known Opinionated Features and Unknown Opinion Expressions</title>
          <p>As in the EDU “نتائج مقبولة”,
“acceptable results”, where the opinion word “مقبول”,
“acceptable”, was not extracted in step (a) (see Section
3.1). In this case, the opinion lexicon can be
automatically updated with the recovered opinion word.</p>
        </sec>
        <sec id="sec-9-2-2">
          <title>Unknown Opinionated Features and Unknown Opinion Expressions</title>
          <p>As in the EDU “غابة مطرية رائعة”,
“wonderful rainforest”, where the feature
“مطرية”, “rainforest”, was not extracted in step (b)
(see Section 3.1). In this case, the domain ontology can
be updated by adding a new concept or a new property
in the right place.</p>
        </sec>
        <sec id="sec-9-2-3">
          <title>Opinion Expressions Only</title>
          <p>As in the EDU “بطيء”, “It's slow”. This kind of EDU expresses an implicit
feature. In this case, we use the ontology properties to
retrieve the associated concept in the ontology.</p>
          <p>Features Only: an EDU with features alone can also
indicate the presence of an implicit opinion
expression towards the feature, as in “الحديقة أصبحت ملجأ
للمنحرفين”, “the park became a haven for perverts”, which
expresses a negative opinion towards “الحديقة”, “the park”.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>Architecture of our Approach</title>
      <p>In this section, we present the general architecture of our
approach and the different modules constituting our system.</p>
      <p>[Figure: general architecture of the system. Components shown: Texts, EDUs Segmentation, EDUs, Sentiments Lexicon, EOUs Extracting, EOUs, Domain Ontology, Features Extracting, Features, Features and EOUs Associating, Classification Techniques, Classification, Classification Result.]</p>
      <p>As indicated in the preceding figure, our system contains the
following modules:</p>
      <list list-type="bullet">
        <list-item>
          <p>EDU Segmentation: generally, the extraction of
elementary discourse units (EDUs) from the texts depends on the
use of delimiters such as “.”, “,”, “?” and “!”;</p>
        </list-item>
        <list-item>
          <p>EOU Extraction: elementary opinion units (EOUs)
and their semantic orientations are usually extracted using a
lexicon of emotions specific to the domain of study;</p>
        </list-item>
        <list-item>
          <p>Feature Extraction: features can be extracted by a
simple projection of the ontology onto the elementary
discourse units (EDUs);</p>
        </list-item>
        <list-item>
          <p>Associating EOUs with Features: each extracted
feature should be associated with one or more elementary
opinion units in order to extract its semantic
orientation;</p>
        </list-item>
        <list-item>
          <p>Classification: the last phase of our work is to
classify the identified opinions into positive or
negative classes using supervised classification
techniques.</p>
        </list-item>
      </list>
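      <p>The first four modules can be sketched end to end in Python, as a rough illustration under stated assumptions rather than the implementation of our system. The lexicon, the ontology labels, the function names, and the rule that pairs every feature with every EOU of the same EDU are hypothetical. An English comment is used for readability, and the supervised classification step is left out because the text does not name a specific classifier.</p>
      <preformat>
```python
import re

# Delimiters listed in the module description above.
DELIMITERS = r"[.,?!]"

def segment_edus(text):
    """Split a comment into EDUs on punctuation delimiters."""
    return [part.strip() for part in re.split(DELIMITERS, text) if part.strip()]

def extract_eous(edu, lexicon):
    """Return (word, orientation) pairs for lexicon words found in the EDU."""
    return [(w, lexicon[w]) for w in edu.lower().split() if w in lexicon]

def extract_features(edu, concept_labels):
    """Project ontology labels onto the EDU to find explicit features."""
    tokens = set(edu.lower().split())
    return sorted(concept for concept, labels in concept_labels.items()
                  if tokens.intersection(labels))

def associate(text, lexicon, concept_labels):
    """Assumed rule: pair every feature with every EOU of the same EDU."""
    triples = []
    for edu in segment_edus(text):
        eous = extract_eous(edu, lexicon)
        for feature in extract_features(edu, concept_labels):
            for word, orientation in eous:
                triples.append((feature, word, orientation))
    return triples

# Hypothetical resources for a phone-review domain.
lexicon = {"excellent": "positive", "basic": "negative"}
concept_labels = {"phone": {"phone"}, "design": {"design"}}

comment = "Even if the phone is excellent, the design is very basic."
triples = associate(comment, lexicon, concept_labels)
```
      </preformat>
      <p>On this comment the sketch yields one triple per EDU: (phone, excellent, positive) and (design, basic, negative). These feature and orientation pairs are the kind of input that the classification module would then receive.</p>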
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION</title>
      <p>In this paper, we presented our approach, based on an ontological
exploration of Arabic texts. Our method is promising because
the use of ontologies improves the extraction of features and
facilitates the association between opinion expressions and
the opinionated features of the object. On the one hand, a domain
ontology is useful for its list of concepts, which bring a great deal of
semantic data into the system. The labels of the ontology concepts
make it possible to recognize terms that refer to the same concept, and the
ontology provides a hierarchy between these concepts. On the other hand,
the ontology is useful for its list of properties between concepts, which
make it possible to recognize opinions expressed on implicit features.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><surname>Pang</surname>, <given-names>Bo</given-names></string-name>
          ,
          <string-name><given-names>Lillian</given-names> <surname>Lee</surname></string-name>
          , and
          <string-name><given-names>Shivakumar</given-names> <surname>Vaithyanathan</surname></string-name>
          , “
          <article-title>Thumbs up? Sentiment Classification using Machine Learning Techniques</article-title>
          ”.
          <source>Proceedings of EMNLP</source>
          , (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><surname>Turney</surname>, <given-names>Peter D.</given-names></string-name>
          , and
          <string-name><given-names>Michael L.</given-names> <surname>Littman</surname></string-name>
          , “
          <article-title>Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus</article-title>
          ”. National Research Council, Institute for Information Technology,
          <source>Technical Report ERB-1094 (NRC #44929)</source>
          , (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><surname>Asher</surname>, <given-names>Nicholas</given-names></string-name>
          , and
          <string-name><given-names>Alex</given-names> <surname>Lascarides</surname></string-name>
          , “
          <source>Logics of Conversation</source>
          ”. Cambridge University Press, (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>Pimwadee</given-names> <surname>Chaovalit</surname></string-name>
          , and
          <string-name><given-names>Lina</given-names> <surname>Zhou</surname></string-name>
          , “
          <article-title>Movie Review Mining: a Comparison between Supervised and Unsupervised Classification Approaches</article-title>
          ”,
          <source>HICSS</source>
          , (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
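          [5]
          <string-name><surname>Carenini</surname>, <given-names>Giuseppe</given-names></string-name>
          ,
          <string-name><given-names>Raymond T.</given-names> <surname>Ng</surname></string-name>
          , and
          <string-name><given-names>Ed</given-names> <surname>Zwart</surname></string-name>
          , “
          <article-title>Extracting Knowledge from Evaluative Text</article-title>
          ”,
          <source>In Proceedings of the 3rd international conference on Knowledge capture</source>
          , (
          <year>2005</year>
          )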
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><surname>Kim</surname>, <given-names>Soo-Min</given-names></string-name>
          , and
          <string-name><given-names>Eduard</given-names> <surname>Hovy</surname></string-name>
          , “
          <article-title>Extracting Opinions, Opinion Holders, and Topics Expressed in Online News Media Text</article-title>
          ”,
          <source>In Proceedings of ACL/COLING Workshop on Sentiment and Subjectivity in Text</source>
          , Sydney, Australia, (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><surname>Feiguina</surname>, <given-names>Olga</given-names></string-name>
          , “
          <article-title>Résumé automatique des commentaires de Consommateurs</article-title>
          ”. Mémoire présenté à la Faculté des études supérieures en vue de l'obtention du grade de M.Sc. en informatique, Département d'informatique et de recherche opérationnelle, Université de Montréal, (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Hu</surname>
          </string-name>
          et al., “
          <article-title>Mining and Summarizing Customer Reviews</article-title>
          ”,
          <source>In Proceedings of the 10th ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          , (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
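          [9]
          <string-name><surname>Cheng</surname>, <given-names>Xiwen</given-names></string-name>
          , and
          <string-name><given-names>Feiyu</given-names> <surname>Xu</surname></string-name>
          , “
          <article-title>Fine-grained Opinion Topic and Polarity Identification</article-title>
          ”,
          <source>In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)</source>
          , Marrakech, Morocco, (
          <year>2008</year>
          )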
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Farek</given-names>
            <surname>Lazhar</surname>
          </string-name>
          et al., “
          <article-title>Identification d'opinions dans les textes arabes</article-title>
          ”, IC, (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><surname>Zhao</surname>, <given-names>Lili</given-names></string-name>
          , and
          <string-name><given-names>Chunping</given-names> <surname>Li</surname></string-name>
          , “
          <article-title>Ontology Based Opinion Mining for Movie Reviews</article-title>
          ”,
          <source>In Proceedings of the 3rd International Conference on Knowledge Science, Engineering and Management</source>
          , (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><surname>Asher</surname>, <given-names>Nicholas</given-names></string-name>
          ,
          <string-name><given-names>Farah</given-names> <surname>Benamara</surname></string-name>
          , and
          <string-name><given-names>Yvette Y.</given-names> <surname>Mathieu</surname></string-name>
          , “
          <article-title>Appraisal of Opinion Expressions in Discourse</article-title>
          ”,
          <source>Lingvisticae Investigationes</source>
          , John Benjamins Publishing Company, Amsterdam, Vol.
          <volume>32</volume>
          :
          <issue>2</issue>
          , (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Anaïs</given-names>
            <surname>Cadilhac</surname>
          </string-name>
          et al., “
          <article-title>Ontolexical resources for feature based opinion mining: a case study</article-title>
          ”, Beijing, (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Gillot</surname>
            <given-names>Sébastien</given-names>
          </string-name>
          , “Fouille d'opinions, Rapport de stage”, (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Pak</surname>
          </string-name>
          et al., “
          <article-title>Classification en polarité de sentiments avec une représentation textuelle à base de sous-graphes d'arbres de dépendances</article-title>
          ”,
          <source>TALN 2011</source>
          , Montpellier, 27 juin - 1er juillet, (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>