<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Rich Lexical Knowledge based Q&amp;A System for Ubiquitous Knowledge Service</article-title>
      </title-group>
      <contrib-group>
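        <contrib contrib-type="author">
          <string-name>Asanee Kawtrakul</string-name>
          <email>asanee_naist@yahoo.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mukda Suktarachan</string-name>
          <email>naist_da_da@yahoo.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aree Thunkijjanukij</string-name>
          <email>thunkijja@yahoo.com</email>
          <aff>
            <institution>ThaiAGRIS Center, Kasetsart University</institution>
            ,
            <addr-line>Bangkok</addr-line>
            ,
            <country country="TH">Thailand</country>
          </aff>
        </contrib>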
        <contrib contrib-type="author">
          <string-name>Navapat Khantonthong</string-name>
          <email>navapatk@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Patrick Saint-Dizier</string-name>
          <email>stdizier@irit.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Engineering, Kasetsart University Bangkok</institution>
          ,
          <country country="TH">Thailand</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IRIT-CNRS 118 route de Narbonne</institution>
          ,
          <addr-line>Toulouse</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>U-Know Center and Department of Computer Engineering, Kasetsart University Bangkok</institution>
          ,
          <country country="TH">Thailand</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present the concept of a Question-Answering System for providing knowledge services. The system is based on a textual database on rice production and rice diseases, which has been structured according to a number of ontological conceptual functions and associated annotations. In this paper, rich lexical knowledge is used to identify semantic roles in a question and to connect them with the domain knowledge base, in both ontology and text formats, in order to answer the questions.</p>
      </abstract>
      <kwd-group>
        <kwd>Question-Answering Knowledge</kwd>
        <kwd>Ontology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        To get a better grasp of the problem and to characterize it in
depth, we collected 1000 real-life questions raised by farmers. We
annotated these 1000 questions together with the text(s) identified
as responses to each query, which allowed us to understand how
questions can be answered. Since the agricultural knowledge base we
are using (derived from the specifications of Thai AGRIS, the
Agricultural Research Information System) has a rich conceptual
structure [7], about 60% of the questions can be answered directly
by transforming queries into a conjunction of conceptual functions
of this schema via lexical descriptions and interpretation
functions. For the remaining 40% or so, however, this is not
possible, in particular for evaluative questions (such as “what is
the largest …”) and How-to questions, which relate to procedures. To
deal with this latter set of questions, we developed a model based
on response annotation in order to induce inference rules that
match a question with its answer. This is particularly crucial when
there is no straightforward response, for example when some form of
lexical inference is required, or when the response is not a simple
item but a well-formed fragment of text, a chain of events leading
to a consequence event, or a procedure
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        The project we present here emerged from a need of real end
users, the Agricultural Land Reform Office (ALRO), Ministry of
Agriculture and Cooperatives, Thailand, within the ALRO Cyber Brain
project [
        <xref ref-type="bibr" rid="ref1 ref2">1,2</xref>
        ], a social network framework that combines approaches based on
knowledge engineering with language engineering. Conceptual
knowledge is represented in an ontology, built with an ontology
workbench [8], for answering factoid questions. New knowledge in
textual format is extracted for maintaining the ontological
knowledge and for answering non-factoid questions. We present below
a brief outline of the main problems we have encountered.
      </p>
    </sec>
    <sec id="sec-1a">
      <title>2. TOWARDS A ‘REAL’ QA SYSTEM: CHALLENGES</title>
      <p>First, at the level of QA analysis, several problems arise in
identifying the facets of the question: the type of the question,
its focus and the constraints that hold on the focus.</p>
      <p>For complex questions, another challenge is to identify their
content. Our approach is to use dependency parsing to tag NPs (noun
phrases) and PPs (prepositional phrases) with semantic tags that
correspond to the categories of the AGRIS database. In natural
language, the same question can be asked with different words and
syntactic forms. This is particularly the case in Thai, which allows
many optional terms and considerable freedom in constituent
order.</p>
      <p>
        Next, in most cases, questions and answers do not match directly
because the clue words or focus words in the question never appear
in the answers. This obviously makes it difficult to find the
expected answer. For this kind of Q&amp;A matching problem, lexical
semantics devices or more elaborate reasoning schemas based on
domain knowledge are needed to allow appropriate question-response
matching [
        <xref ref-type="bibr" rid="ref5 ref6">5,6</xref>
        ]. This is realized in our project via
text annotation and learning.
      </p>
      <p>In some cases, the information needed to elaborate a real
diagnosis is missing; the user is then asked to provide more
details. We prefer to avoid setting up a dialogue, since this may
lead to unexpected data or directions. Users want a relatively fast
response, so simply asking for more free input is the best
compromise. The second aspect of this problem is to be able to
extract the complete portion of a text that responds to the
question. For that purpose we have developed an annotation
methodology whose goal is to identify the different processes at
stake and the resources needed. This method allows us to identify
relevant text portions and then to delimit them appropriately.</p>
      <p>The ontology comprises 2322 concepts, 5603 terms and 57
associative relations. 60% of questions can be answered directly by
transforming queries into one or more conceptual functions via
lexical inference rules and interpretation functions.</p>
      <p>Our question answering system is based on three interacting
sources of knowledge:</p>
      <list list-type="bullet">
        <list-item><p>lexical data, in particular lexical semantics and lexical inference,</p></list-item>
        <list-item><p>the domain data as represented by the rich conceptual functions, i.e. the Rice Ontology,</p></list-item>
        <list-item><p>some general-purpose knowledge, useful for answering questions.</p></list-item>
      </list>
      <p>Lexical representations of verbs are based on conceptual functions
from FrameNet. The general form is: verb + argument selectional
restrictions: conjunction of conceptual functions (with variables
corresponding to argument positions). For example: resist: verb,
[X:NP, Y:NP], [X:plant, Y:insect ∨ disease], X ‘isResistantTo’ Y.
Nouns are associated with their types as defined in the domain
ontology.</p>
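      <p>As an illustration only, the lexical entry for resist given above
can be sketched as a small Python data structure. The names
VerbEntry, ISA and interpret, as well as the toy type assignments,
are assumptions made for this sketch and are not part of the system
described here; only binary verbs are handled.</p>
      <preformat>
```python
from dataclasses import dataclass

# Toy type assignments standing in for the domain ontology (hypothetical data).
ISA = {
    "rice": "plant",
    "brown planthopper": "insect",
    "rice blast": "disease",
}


@dataclass(frozen=True)
class VerbEntry:
    """Lexical entry: verb + selectional restrictions + conceptual function."""
    lemma: str
    restrictions: tuple  # one set of admissible types per argument position
    function: str        # name of the conceptual function


RESIST = VerbEntry(
    lemma="resist",
    restrictions=({"plant"}, {"insect", "disease"}),  # Y: insect OR disease
    function="isResistantTo",
)


def interpret(entry, args):
    """Return (X, function, Y) if both arguments satisfy the entry's
    selectional restrictions, otherwise None (binary verbs only)."""
    if len(args) != 2 or len(entry.restrictions) != 2:
        return None
    for arg, allowed in zip(args, entry.restrictions):
        if ISA.get(arg) not in allowed:
            return None
    x, y = args
    return (x, entry.function, y)


print(interpret(RESIST, ("rice", "rice blast")))
```
      </preformat>
      <p>Representing the disjunction "insect ∨ disease" as a set of
admissible types keeps the restriction check to a simple membership
test; a real system would presumably consult the ontology's type
hierarchy instead of a flat dictionary.</p>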
      <p>While the semantics of verbs can be represented on the basis of
conceptual functions, more complex situations, e.g. the adjunction
of constraints, often expressed by syntactic adjuncts, need further
development. The first difficulty is to develop a compositional
framework that can integrate various modifiers. For that purpose,
we reuse the semantic representations we developed on the basis of
Lexical Conceptual Structure principles and integrated into the
PrepNet lexical base. PrepNet proposes semantic representations for
a large number of forms of adjuncts based on a notion of
prepositional modification, which is what is encountered in complex
questions. For the ontology-based question answering system, we use
the Thai Rice Ontology [7] as a source of rich knowledge. The
corresponding answer can be extracted with a simple algorithm that
matches the query against ontological relations and then retrieves
the associated concepts through inference rules, as in
Figure 2.</p>
      <p>Q: What are the diseases of rice caused by fungi?</p>
      <p>With the inference rule X isDiseaseOf Y ∧ X isa Fungi, we
obtain the following answer.</p>
      <p>A: Magnaporthe grisea</p>
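      <p>A minimal sketch of how such a conjunctive rule could be
evaluated, assuming the ontology is available as a set of (subject,
relation, object) triples and that Y is bound to Rice by the
question. The fact base and the function names subjects and answer
are hypothetical; only the Magnaporthe grisea facts are taken from
the example above.</p>
      <preformat>
```python
# Toy fact base of (subject, relation, object) triples. Illustrative data:
# only the Magnaporthe grisea entries come from the example in the text.
TRIPLES = {
    ("Magnaporthe grisea", "isDiseaseOf", "Rice"),
    ("Magnaporthe grisea", "isa", "Fungi"),
    ("Xanthomonas oryzae", "isDiseaseOf", "Rice"),
    ("Xanthomonas oryzae", "isa", "Bacteria"),
}


def subjects(relation, obj):
    """All X such that (X, relation, obj) is in the fact base."""
    return {s for (s, r, o) in TRIPLES if r == relation and o == obj}


def answer(conjuncts):
    """Evaluate a conjunction of (relation, object) constraints on X by
    intersecting the candidate sets of the individual conjuncts."""
    candidate_sets = [subjects(r, o) for (r, o) in conjuncts]
    if not candidate_sets:
        return []
    return sorted(set.intersection(*candidate_sets))


# X isDiseaseOf Rice AND X isa Fungi
query = [("isDiseaseOf", "Rice"), ("isa", "Fungi")]
print(answer(query))  # ['Magnaporthe grisea']
```
      </preformat>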
      <p>Further answers are then collected from full text by matching
the question against the extracted full text (see Figure 3).</p>
    </sec>
    <sec id="sec-2">
      <title>3. CONCLUSION</title>
      <p>This short paper presents some ideas on how to use rich lexical
knowledge for annotating questions and answers at both the indexing
and the text level. The methodology presented has been implemented
in knowledge services for Thai farmers in the rice domain and is
currently under testing. Moreover, with the Rice …</p>
      <preformat>Function Matching (Question Q, Answer A) {
    Match = false;
    // Relevant document
    If (Q.focus = A.index) then
        // Relevant answer
        If (Q.type = A.task_type) then
            // Detect Answer for the Question
            If (Q.focus = A.title) then
                Match = true;
            Else if (Q.action = A.action and
                     Q.theme = A.theme or
                     Q.agent = A.agent) then
                Match = true;
            End If
        End If
    End If
    Return Match;
}</preformat>
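      <p>The matching function above could be written in Python roughly
as follows. This is a sketch under assumptions: the Question and
Answer records and their field names (qtype, task_type) are
hypothetical renderings of the attributes used in the pseudocode,
and the final condition is read with "and" binding more tightly
than "or", which the pseudocode does not state explicitly.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass
class Question:
    focus: str
    qtype: str
    action: str
    theme: str
    agent: str


@dataclass
class Answer:
    index: str
    task_type: str
    title: str
    action: str
    theme: str
    agent: str


def matching(q, a):
    """Return True if answer a matches question q, following the pseudocode."""
    if q.focus != a.index:       # not a relevant document
        return False
    if q.qtype != a.task_type:   # not a relevant answer type
        return False
    if q.focus == a.title:       # the answer is about the question focus
        return True
    # Assumption: "and" takes precedence over "or" in the original condition.
    return (q.action == a.action and q.theme == a.theme) or q.agent == a.agent
```
      </preformat>
      <p>Early returns replace the nested If blocks of the pseudocode
but produce the same result for every combination of fields.</p>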
    </sec>
    <sec id="sec-3">
      <title>4. ACKNOWLEDGMENTS</title>
      <p>The work described in this paper has been supported by NECTEC,
No. NT-B-22-KE-12-50-19, within the project I-Know II: CAT, EAT,
RATs, and Agricultural Question &amp; Answering Service System,
granted by KURDI, Kasetsart University. We would also like to thank
the French CNRS PICs programme.</p>
      <p>[8] Kawtrakul, A. et al. 2008. Ontology based Knowledge Map
Construction for a Smart Knowledge Service. IAALD AFITA WCCA 2008,
Tokyo, Japan, 24-27 August.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Kawtrakul</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          et al.
          <year>2009</year>
          .
          <article-title>Problems-Solving Map Extraction with Collective Intelligence Analysis and Language Engineering</article-title>
          .
          <source>Book Chapter 18, Medical Information Science Reference in Information Retrieval in Biomedicine. ISBN: 978-1-60566-274-9</source>
          ; pp 460
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Kawtrakul</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          et al.
          <year>2009</year>
          .
          <article-title>From CyberBrain to Q&amp;A Services: A Development of Question-Answering Services System for the Farmer through the SMS</article-title>
          .
          <source>In Proceedings of WCCA2009</source>
          , Grand Sierra Resort, Reno, Nevada, USA.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Moldovan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          et al.
          <year>2000</year>
          .
          <article-title>The Structure and Performance of an Open-Domain Question Answering System</article-title>
          .
          <source>Proceedings of the 38th Meeting of the Association for Computational Linguistics (ACL)</source>
          , Hong Kong.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Estelle</given-names>
            <surname>Delpech</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Saint-Dizier</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Investigating the Structure of Procedural Texts for Answering How-to Questions</article-title>
          .
          <source>LREC2008</source>
          , Marrakech.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Talmy</surname>
          </string-name>
          .
          <year>1985</year>
          .
          <article-title>Lexicalization Patterns: Semantic Structure in Lexical Forms</article-title>
          , in
          <source>Language Typology and Syntactic Description 3: Grammatical Categories and the Lexicon</source>
          , T. Shopen (ed.),
          <fpage>57</fpage>
          -
          <lpage>149</lpage>
          , Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Takechi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          et al.
          <year>2003</year>
          .
          <article-title>Feature Selection in Categorizing Procedural Expressions</article-title>
          ,
          <source>The 6th International Workshop on Information Retrieval with Asian Languages (IRAL2003)</source>
          :
          <fpage>49</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>