=Paper=
{{Paper
|id=Vol-448/paper-1
|storemode=property
|title=Discourse-Based Reasoning for Controlled Natural Languages
|pdfUrl=https://ceur-ws.org/Vol-448/paper1.pdf
|volume=Vol-448
|dblpUrl=https://dblp.org/rec/conf/cnl/Potter09a
}}
==Discourse-Based Reasoning for Controlled Natural Languages==
<pdf width="1500px">https://ceur-ws.org/Vol-448/paper1.pdf</pdf>
<pre>
                     Discourse-Based Reasoning for
                     Controlled Natural Languages

                                        Andrew Potter

                         Sentar, Incorporated, 315 Wynn Drive, Suite 1
                             Huntsville, Alabama, USA 35805
                                   andrew.potter@sentar.com


       Abstract. Logic-based controlled natural languages usually provide some
       facility for compositional representation, minimally including sentence level
       coordination and sometimes subordination. Although these compositional forms
       suffice for representing short passages, they can become unwieldy for
       expressing entire paragraphs and documents. This paper describes an approach
       to representing larger composite texts in a controlled natural language. This
       approach, called discourse-based reasoning, integrates rhetorical structure
       theory with argumentation theory to provide a model for defining composite
       structures and argument strategies in an ontological representation. Rhetorical
       structures are used to represent controlled texts, and argument strategies are
       defined for reasoning about interactions between structures. This provides the
       basis for expressing, summarizing, and interacting with explanatory and
       argumentative discourse. This would expand the scope of problems that may be
       addressed using controlled natural languages.
      Keywords: Controlled Natural Language, Rhetorical Structure Theory,
      Argumentation, Discourse-Based Reasoning


1    Introduction

Logic-based controlled natural languages usually provide some facility for
compositional representation. Most well known among these, ACE and PENG define
discourse representation structures that support both sentence level coordination and
subordination [1, 2], and CLCE, CPL, and E2V support sentence level coordination
[3-5]. Although these forms of compositional representation are useful for expressing
short passages of a few sentences, they can become unwieldy for expressing entire
paragraphs or documents. Techniques are needed for representing longer
compositions in a way that is both rhetorically expressive and logically reducible. In
response to this need, we are developing a discourse-based representation technology
that will support high level rhetorical structures, argumentation strategies, and
intertextual synthesis.
   Our approach, called Discourse-Based Reasoning (DBR), is based on underlying
structures of natural discourse and argumentation theory. DBR draws on Mann and
Thompson's Rhetorical Structure Theory (RST) [6], Toulmin's model of
argumentation [7], and Perelman and Olbrechts-Tyteca's strategic argumentative
processes [8]. The Toulmin model provides a framework for argumentation. RST
provides schemas, constraints, and rhetorical relations used in generating discourse
structures. The concept of strategic argumentative processes leads to a definition of
structural interactions which may be discovered and synthesized within one or more
ontologically normalized texts. While DBR has been introduced in some earlier
papers [9-11], it has become clear that implementation will require use of a controlled
natural language. Conversely, controlled natural languages could benefit from DBR as well.

   [Figure 1: class diagram of the DBR reasoning model. Recoverable content:]
   Warrant      nucleus Instance Situation; satellite Instance* Situation
   Span         relation Instance Relation; statement Instance Statement
   Relation     identifier Symbol: Antithesis, Background, Circumstance, Concession, …
   Statement
   Interaction  interactant Instance* Warrant;
                type Symbol: Substantiation, Rebuttal, Backing, Undercut, …
   Edge labels  nucleus, satellite*, interactant*, relation, statement

                                 Fig. 1. DBR Reasoning Model


2 Reasoning Model

The reasoning model defines a mapping between RST and the Toulmin model. This
makes it possible to represent argumentative reasoning using RST discourse
structures. As shown in Fig. 1, the elements of the model are warrants, spans,
statements, relations, and interactions. A warrant establishes a set of links between a
nucleus and zero or more satellites. The nucleus and its satellites are represented as
spans. A span consists of a CNL statement, and in the case of satellites, the satellite’s
RST relation to its nucleus. In argumentative terms, the nucleus corresponds to a
claim, and the satellites correspond to grounds. Each satellite (or ground) links to the
nucleus (or claim) by means of a rhetorical relation. That said, it should be noted that
while some rhetorical relations are clearly argumentative, or at least inferential, others
are merely synthetic, and the reasoning model must take this into account. Examples
of inferential relations include Condition, Evidence, Means, Otherwise, and the causal
relations. Examples of synthetic relations include Background, Circumstance,
Elaboration, Restatement, and Summary. In a synthetic relation, the satellite and
nucleus are logically conjunctive, but the nucleus is more salient than the satellite.
This distinction between synthetic and inferential relations supports application of the
reasoning process, facilitating both explanatory and argumentative discourse.
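
   As a minimal sketch, and not the representation used in any DBR implementation, the
elements described above might be encoded as follows. The class and method names are
hypothetical, the relation sets contain only the examples named in the text, and the
causal relations are omitted because they are not enumerated here.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Relation groupings copied from the examples in the text (assumed incomplete).
INFERENTIAL = {"Condition", "Evidence", "Means", "Otherwise"}
SYNTHETIC = {"Background", "Circumstance", "Elaboration", "Restatement", "Summary"}


@dataclass
class Span:
    statement: str                   # a CNL statement
    relation: Optional[str] = None   # RST relation to the nucleus (satellites only)


@dataclass
class Warrant:
    nucleus: Span                                          # the claim
    satellites: List[Span] = field(default_factory=list)   # zero or more satellites

    def grounds(self) -> List[Span]:
        """Satellites linked to the nucleus by an inferential relation."""
        return [s for s in self.satellites if s.relation in INFERENTIAL]

    def conjuncts(self) -> List[Span]:
        """Satellites linked by a synthetic relation (conjoined with the nucleus)."""
        return [s for s in self.satellites if s.relation in SYNTHETIC]
```

   Separating grounds() from conjuncts() is one way to let a reasoner treat inferential
satellites as support for the claim while treating synthetic satellites as less
salient conjuncts.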
   Interactions define the rules for synthesizing complex structures to create an
explanation network. An interaction occurs when a nucleus, satellite, or warrant of
one structure can be unified with a nucleus, satellite, or warrant of another structure.
Interactions are defined in terms of the possible relationships between warrants,
satellites, and nuclei. For example, if the claim of one argument unifies with the
ground of another, a substantiation interaction is said to occur. Fig. 2 shows examples
of substantiation and concomitance, and Table 1 defines the full set of interactions.


          [Figure 2 panels: Substantiation (left), Concomitance (right)]


                         Fig. 2. Examples of Interaction Strategies

   With this reasoning model it is possible to represent highly expressive explanation
networks that may be queried at varying levels of depth. In natural language
processing, Marcu [12] and others have shown that salience-based discourse structure
may be useful in distilling textual summaries. Further, Marcu integrated a set of
metrics that could be used to improve these summaries, such as rhetorical clustering,
explicit markers, and structure shape. If techniques such as these are promising for
summarizing natural language, they would seem likely to be useful for controlled languages
as well. Our preliminary experimentation supports this claim. We developed a utility
that distills raw summaries with specifiable depth from RST analyses stored in
RSTtool markup format, and our results thus far have been encouraging.
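
   The idea of a summary with specifiable depth can be illustrated with a small sketch.
This is not the utility mentioned above and it does not read RSTtool markup; the Node
class and summarize function are hypothetical, and the sketch assumes that each
structure has already been reduced to a nucleus statement with nested satellites.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    text: str                                                # nucleus statement
    satellites: List["Node"] = field(default_factory=list)   # nested satellite structures


def summarize(node: Node, depth: int) -> List[str]:
    """Return the nucleus, then satellites down to `depth` levels, depth-first."""
    lines = [node.text]
    if depth > 0:
        for sat in node.satellites:
            lines.extend(summarize(sat, depth - 1))
    return lines
```

   With depth 0 only the most salient nucleus is returned, and each additional level
admits one more layer of satellites. The sketch always emits the nucleus before its
satellites, which simplifies the ordering found in actual texts.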

3 Generating Discourse Structures

For the value of these discourse structures to be realized, it will be necessary to
provide an efficient means for their generation. Although parsing discourse relations
in CNL may be less difficult than in natural language, it is not a trivial problem. The
difficulties lie not merely in the complexities of the language, but in the subtleties of
the RST relation definitions themselves. Consider for example the distinction between
Antithesis and Concession. Antithesis prescribes that the writer has positive regard for
the nucleus and that the satellite and nucleus are mutually incompatible, e.g. “Rather
than waste time teaching at the university, Charles pursued a lucrative career in the
publishing industry.” The Concession relation, on the other hand, prescribes that the
writer has positive regard for the nucleus but that there is not necessarily an
incompatibility between it and the satellite, e.g. “Although his mother would have
preferred that he teach, Charles pursued a lucrative career in the publishing industry.”
Following these definitions it might seem that any instance of the Antithesis relation
could also be coded as Concession [13]. Similar difficulties arise when distinguishing
Elaboration from Evidence.
   For CNL, the answer, it seems to us, is that DBR structures would be created the
same way as other CNL discourse structures—namely they would be created as part
of the authoring process. For example, Attempto Controlled English supports several
discourse representation structures for representing composite sentences, such as
conditions, coordinates, and subordinates [1], and the ACE parser is able to recognize
these. In a study of automated parsing of natural language texts, Marcu and Echihabi
[14] were able to achieve 93% accuracy in recognizing discourse relations for a small
subset of relation types. While this success rate is not adequate for CNL, it does
suggest that through a combination of refining the RST relation set and extending the
set of cue phrases available to CNL authors, it may be possible to develop a
hypotactic style that would support automatic DBR structure generation. For example,
if we wish to preserve the distinction between Antithesis and Concession, we could
specify this through the use of cue words such as but, not, and although:

1      An administrator can not verify every system, but it is necessary that if a system is a
       compromised system then the administrator must verify it.
2      Although it is possible that an administrator believes that a system is up-to-date, it is not
       provable that the system is invulnerable.
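
   A cue-based recognizer for sentences such as these might be sketched as follows. The
cue table is hypothetical: mapping "although" to Concession follows the Concession
example given earlier, whereas mapping ", but" to Antithesis is an assumption made only
for illustration, and a usable cue set would have to be defined as part of the CNL.

```python
import re
from typing import Optional

# Hypothetical cue-to-relation table; the patterns are tried in order.
CUES = [
    (re.compile(r"^\s*although\b", re.IGNORECASE), "Concession"),
    (re.compile(r",\s*but\b", re.IGNORECASE), "Antithesis"),  # assumed mapping
]


def relation_for(sentence: str) -> Optional[str]:
    """Return the relation signalled by the first matching cue, or None."""
    for pattern, relation in CUES:
        if pattern.search(sentence):
            return relation
    return None
```

   Because the cues are fixed by the controlled language rather than inferred from free
text, a lookup of this kind could in principle be deterministic.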

     Table 1. Interaction Definitions

     Interaction      Description, followed by definition
     ---------------  ------------------------------------------------------------------
     Substantiation   The claim of one argument is used as the ground of another.
                      substantiation(arg(G1,C1,W1) & arg(C1,C2,W2))
     Rebuttal         The claims of two arguments are incompatible.
                      rebuttal(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(C1,C2))
     Backing          An argument substantiates the warrant of another.
                      backing(arg(G1,C1,W1) & arg(G2,C2,C1))
     Undercut         The claim of one argument is incompatible with the ground of another.
                      undercut(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(C1,G2))
     Dissociation     The claim of one argument disputes the warrant of another.
                      dissociation(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(C1,W2))
     Convergence      Two arguments lead to the same claim, with possible accrual.
                      accrual(arg(G1,C1,W1) & arg(G2,C1,W2))
     Concomitance     Two arguments use the same ground to establish distinct claims.
                      concomitance(arg(G1,C1,W1) & arg(G1,C2,W2))
     Confusion        The grounds of two arguments are incompatible.
                      confusion(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(G1,G2))
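
   The definitions in Table 1 can be read as tests on an ordered pair of arguments. The
sketch below is one hedged reading, not a DBR implementation: unification is simplified
to string equality, the incompatible predicate is supplied by the caller, and the names
Arg and interactions are hypothetical. Convergence is additionally assumed to require
distinct grounds, so that an argument does not converge with a copy of itself.

```python
from typing import Callable, List, NamedTuple


class Arg(NamedTuple):
    ground: str
    claim: str
    warrant: str


def interactions(a: Arg, b: Arg, incompatible: Callable[[str, str], bool]) -> List[str]:
    """Name every Table 1 interaction that holds for the ordered pair (a, b)."""
    found = []
    if a.claim == b.ground:
        found.append("substantiation")
    if incompatible(a.claim, b.claim):
        found.append("rebuttal")
    if a.claim == b.warrant:
        found.append("backing")
    if incompatible(a.claim, b.ground):
        found.append("undercut")
    if incompatible(a.claim, b.warrant):
        found.append("dissociation")
    if a.claim == b.claim and a.ground != b.ground:
        found.append("convergence")  # the table's predicate for this row is accrual
    if a.ground == b.ground and a.claim != b.claim:
        found.append("concomitance")
    if incompatible(a.ground, b.ground):
        found.append("confusion")
    return found
```

   Several of the definitions are asymmetric (undercut, for example, relates the claim
of the first argument to the ground of the second), so a caller would normally test
both orderings of each pair.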

   In addition, we may be able to build on this through recognition of syntactically
recognizable rhetorical forms, such as sorites, hypothetical syllogism, and dilemma.
Paragraph breaks and punctuation cues could also be used to support recognition of
larger composite structures [15]. Through a combination of cue phrases, syntactical
forms, and layout features, it may be possible to arrive at a composition style that is
easy enough for writers to write, readers to read, and automated reasoning systems to
process.

4 Conclusion

   This paper has defined an approach to representing and reasoning about complex
composite structures in controlled natural languages. This is accomplished through
definition of a reasoning model that synthesizes rhetorical structure theory with
Toulmin’s argumentative model and Perelman’s theory of argument strategy. By
defining rules for managing interactions among inferential and synthetic structures,
DBR provides the basis for representing, summarizing, and interacting with
explanatory and argumentative discourse, and it expands the scope of problems that
may be addressed using controlled natural languages. Some anticipated future work
includes identification of an experimental RST relation set for CNL, developing a
prototype for encoding composite texts, and further definition of the reasoning model.

References

1. Fuchs, N.E., Kaljurand, K., Kuhn, T.: Discourse representation structures for ACE 6.0.
    Department of Informatics, University of Zurich, Zurich (2008)
2. Schwitter, R.: English as a formal specification language. Proceedings of the 13th
    International Workshop on Database and Expert Systems Applications (2002) 228-232
3. Pratt-Hartmann, I.: A two-variable fragment of English. Journal of Logic, Language and
    Information 12 (2003) 13-45
4. Sowa, J.: Common logic controlled English. (2007)
5. Clark, P., Harrison, P., Jenkins, T., Thompson, J., Wojcik, R.: Acquiring and using world
    knowledge using a restricted subset of English. FLAIRS (2005) 506–511
6. Mann, W.C., Thompson, S.A.: Rhetorical structure theory: Towards a functional theory of
    text organization. Text 8 (1988) 243-281
7. Toulmin, S.E.: The uses of argument. Cambridge University Press, Cambridge, UK (1958)
8. Perelman, C., Olbrechts-Tyteca, L.: The new rhetoric: A treatise on argumentation.
    University of Notre Dame, Notre Dame, IN (1969 [1958])
9. Potter, A.: A discourse approach to explanation aware knowledge representation. In:
    Roth-Berghofer, T., Schulz, S., Leake, D.B., Bahls, D. (eds.): Explanation-aware
    computing: Papers from the 2007 AAAI Workshop. AAAI Press, Menlo Park, CA (2007) 56-63
10. Potter, A.: Generating discourse-based explanations. Künstliche Intelligenz 22 (2008)
   28-31
11. Potter, A.: Linked and convergent structures in discourse-based reasoning. In:
   Roth-Berghofer, T., Schulz, S., Bahls, D., Leake, D.B. (eds.): Proceedings of the 3rd
   International Explanation Aware Computing Workshop (ExaCt 2008), Patras, Greece
   (2008) 72-83
12. Marcu, D.: The theory and practice of discourse parsing and summarization. MIT Press,
   Cambridge, MA (2000)
13. Stede, M.: Disambiguating rhetorical structure. Research on Language & Computation 6
   (2006) 311-332
14. Marcu, D., Echihabi, A.: An unsupervised approach to recognizing discourse relations.
   Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
   (ACL), Philadelphia (2002) 368-375
15. Campbell, K.S.: Coherence, continuity, and cohesion: Theoretical foundations for
   document design. Lawrence Erlbaum, Hillsdale, NJ (1995)

</pre>