=Paper=
{{Paper
|id=Vol-448/paper-1
|storemode=property
|title=Discourse-Based Reasoning for Controlled Natural Languages
|pdfUrl=https://ceur-ws.org/Vol-448/paper1.pdf
|volume=Vol-448
|dblpUrl=https://dblp.org/rec/conf/cnl/Potter09a
}}
==Discourse-Based Reasoning for Controlled Natural Languages==
Andrew Potter
Sentar, Incorporated, 315 Wynn Drive, Suite 1
Huntsville, Alabama, USA 35805
andrew.potter@sentar.com
Abstract. Logic-based controlled natural languages usually provide some
facility for compositional representation, minimally including sentence level
coordination and sometimes subordination. Although these compositional forms
suffice for representing short passages, they can become unwieldy for
expressing entire paragraphs and documents. This paper describes an approach
to representing larger composite texts in a controlled natural language. This
approach, called discourse-based reasoning, integrates rhetorical structure
theory with argumentation theory to provide a model for defining composite
structures and argument strategies in an ontological representation. Rhetorical
structures are used to represent controlled texts, and argument strategies are
defined for reasoning about interactions between structures. This provides the
basis for expressing, summarizing, and interacting with explanatory and
argumentative discourse. This would expand the scope of problems that may be
addressed using controlled natural languages.
Keywords: Controlled Natural Language, Rhetorical Structure Theory,
Argumentation, Discourse-Based Reasoning
1 Introduction
Logic-based controlled natural languages usually provide some facility for
compositional representation. The best known among these, ACE and PENG, define
discourse representation structures that support both sentence level coordination and
subordination [1, 2], and CLCE, CPL, and E2V support sentence level coordination
[3-5]. Although these forms of compositional representation are useful for expressing
short passages of a few sentences, they can become unwieldy for expressing entire
paragraphs or documents. Techniques are needed for representing longer
compositions in a way that is both rhetorically expressive and logically reducible. In
response to this need, we are developing a discourse-based representation technology
that will support high level rhetorical structures, argumentation strategies, and
intertextual synthesis.
Our approach, called Discourse-Based Reasoning (DBR), is based on underlying
structures of natural discourse and argumentation theory. DBR draws on Mann and
Thompson's Rhetorical Structure Theory (RST) [6], Toulmin's model of
argumentation [7], and Perelman and Olbrechts-Tyteca's strategic argumentative
processes [8]. The Toulmin model provides a framework for argumentation. RST
provides schemas, constraints, and rhetorical relations used in generating discourse
structures. The concept of strategic argumentative processes leads to a definition of
structural interactions which may be discovered and synthesized within one or more
ontologically normalized texts. While DBR has been introduced in some earlier
papers [9-11], it has become clear that implementation will require use of a controlled
natural language. Conversely, controlled natural languages stand to benefit from DBR.
[Figure 1 (diagram not reproduced): the classes Warrant, Span, Interaction, Relation, and Statement. Warrant slots: nucleus, satellite*. Interaction slots: interactant*, type (Substantiation, Rebuttal, Backing, Undercut, …). Span slots: relation, statement. Relation slot: identifier (Antithesis, Background, Circumstance, Concession, …).]
Fig. 1. DBR Reasoning Model
2 Reasoning Model
The reasoning model defines a mapping between RST and the Toulmin model. This
makes it possible to represent argumentative reasoning using RST discourse
structures. As shown in Fig. 1, the elements of the model are warrants, spans,
statements, relations, and interactions. A warrant establishes a set of links between a
nucleus and zero or more satellites. The nucleus and its satellites are represented as
spans. A span consists of a CNL statement, and in the case of satellites, the satellite’s
RST relation to its nucleus. In argumentative terms, the nucleus corresponds to a
claim, and the satellites correspond to grounds. Each satellite (or ground) links to the
nucleus (or claim) by means of a rhetorical relation. That said, it should be noted that
while some rhetorical relations are clearly argumentative, or at least inferential, others
are merely synthetic, and the reasoning model must take this into account. Examples
of inferential relations include Condition, Evidence, Means, Otherwise, and the causal
relations. Examples of synthetic relations include Background, Circumstance,
Elaboration, Restatement, and Summary. In a synthetic relation, the satellite and
nucleus are logically conjunctive, but the nucleus is more salient than the satellite.
This distinction between synthetic and inferential relations supports application of the
reasoning process, facilitating both explanatory and argumentative discourse.
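The elements described above can be sketched as a small data model. This is only an illustrative sketch, not the DBR implementation: the class and method names (Span, Warrant, grounds, conjuncts) are assumptions chosen to mirror the terms in the text, the two relation sets contain only the examples named above (the causal relations are omitted), and the sample statements are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Relation names come from the examples in the text; neither set is exhaustive.
INFERENTIAL = {"Condition", "Evidence", "Means", "Otherwise"}
SYNTHETIC = {"Background", "Circumstance", "Elaboration", "Restatement", "Summary"}


@dataclass
class Span:
    statement: str                  # a CNL statement
    relation: Optional[str] = None  # RST relation to the nucleus; None for a nucleus


@dataclass
class Warrant:
    nucleus: Span                                         # corresponds to the claim
    satellites: List[Span] = field(default_factory=list)  # zero or more satellites

    def grounds(self) -> List[Span]:
        """Satellites linked to the nucleus by an inferential relation."""
        return [s for s in self.satellites if s.relation in INFERENTIAL]

    def conjuncts(self) -> List[Span]:
        """Satellites linked by a synthetic relation (logically conjunctive)."""
        return [s for s in self.satellites if s.relation in SYNTHETIC]


# Hypothetical example: one inferential and one synthetic satellite.
w = Warrant(
    nucleus=Span("The administrator must verify the system."),
    satellites=[
        Span("The system is a compromised system.", "Condition"),
        Span("The system is a mail server.", "Background"),
    ],
)
```

Keeping the inferential and synthetic satellites separable in this way is one possible reading of how the reasoning model could take the distinction into account.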
Interactions define the rules for synthesizing complex structures to create an
explanation network. An interaction occurs when a nucleus, satellite, or warrant of
one structure can be unified with a nucleus, satellite, or warrant of another structure.
Interactions are defined in terms of the possible relationships between warrants,
satellites, and nuclei. For example, if the claim of one argument unifies with the
ground of another, a substantiation interaction is said to occur. Fig. 2 shows examples
of substantiation and concomitance, and Table 1 defines the full set of interactions.
[Figure 2 (diagrams not reproduced): two panels, Substantiation and Concomitance.]
Fig. 2. Examples of Interaction Strategies
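As a rough illustration, the interaction definitions of Table 1 can be checked mechanically over pairs of arguments. The sketch below rests on several assumptions that are not part of the model as published: unification is approximated by string equality, incompatibility is supplied by the caller as a set of unordered pairs, the names Arg and interactions are hypothetical, and convergence is required to involve distinct grounds so that an argument does not converge with itself.

```python
from typing import FrozenSet, List, NamedTuple


class Arg(NamedTuple):
    ground: str
    claim: str
    warrant: str


def interactions(a: Arg, b: Arg,
                 incompatible: FrozenSet[FrozenSet[str]] = frozenset()) -> List[str]:
    """Names of the Table 1 interactions that hold from argument a to argument b."""

    def clash(x: str, y: str) -> bool:
        # Incompatibility is symmetric, so pairs are stored unordered.
        return frozenset((x, y)) in incompatible

    found = []
    if a.claim == b.ground:
        found.append("substantiation")
    if clash(a.claim, b.claim):
        found.append("rebuttal")
    if a.claim == b.warrant:
        found.append("backing")
    if clash(a.claim, b.ground):
        found.append("undercut")
    if clash(a.claim, b.warrant):
        found.append("dissociation")
    if a.claim == b.claim and a.ground != b.ground:  # distinct grounds: an assumption
        found.append("convergence")
    if a.ground == b.ground and a.claim != b.claim:
        found.append("concomitance")
    if clash(a.ground, b.ground):
        found.append("confusion")
    return found


# Placeholder symbols stand in for CNL statements.
a1 = Arg("g1", "c1", "w1")
a2 = Arg("c1", "c2", "w2")   # its ground is the claim of a1
a3 = Arg("g1", "c3", "w3")   # shares its ground with a1

sub = interactions(a1, a2)
con = interactions(a1, a3)
both = interactions(a1, a3, frozenset({frozenset(("c1", "c3"))}))
```

Here sub contains only substantiation and con only concomitance, matching the two cases shown in Fig. 2. A real implementation would replace string equality with unification over ontologically normalized statements.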
With this reasoning model it is possible to represent highly expressive explanation
networks that may be queried at varying levels of depth. In natural language
processing, Marcu [12] and others have shown that salience-based discourse structure
may be useful in distilling textual summaries. Further, Marcu integrated a set of
metrics that could be used to improve these summaries, such as rhetorical clustering,
explicit markers, and structure shape. If techniques such as these are promising for
summarizing natural language, they would seem likely to be useful for controlled languages
as well. Our preliminary experimentation supports this claim. We developed a utility
that distills raw summaries with specifiable depth from RST analyses stored in
RSTtool markup format, and our results thus far have been encouraging.
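To make the idea of a summary with specifiable depth concrete, the following sketch walks a nucleus-satellite tree and keeps only the statements within a given number of satellite levels. It is an assumption-laden illustration rather than the utility mentioned above: it does not read RSTtool markup, it ignores the metrics Marcu describes, and the names RstNode and summarize and the placeholder statements are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RstNode:
    statement: str                                   # statement of this span
    satellites: List["RstNode"] = field(default_factory=list)


def summarize(node: RstNode, depth: int) -> List[str]:
    """Keep the nucleus plus satellites nested at most `depth` levels below it."""
    summary = [node.statement]
    if depth > 0:
        for satellite in node.satellites:
            summary.extend(summarize(satellite, depth - 1))
    return summary


# Placeholder tree: a claim, one ground for it, and one detail under that ground.
tree = RstNode("claim", [RstNode("ground", [RstNode("detail")])])

shallow = summarize(tree, 0)
deeper = summarize(tree, 1)
```

With depth 0 only the most salient statement survives; each additional level admits the satellites one step further from the nucleus.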
3 Generating Discourse Structures
For the value of these discourse structures to be realized, it will be necessary to
provide an efficient means for their generation. Although parsing discourse relations
in CNL may be less difficult than in natural language, it is not a trivial problem. The
difficulties lie not merely in the complexities of the language, but in the subtleties of
the RST relation definitions themselves. Consider for example the distinction between
Antithesis and Concession. Antithesis prescribes that the writer has positive regard for
the nucleus and that the satellite and nucleus are mutually incompatible, e.g. “Rather
than waste time teaching at the university, Charles pursued a lucrative career in the
publishing industry.” The Concession relation, on the other hand, prescribes that the
writer has positive regard for the nucleus but that there is not necessarily an
incompatibility between it and the satellite, e.g. “Although his mother would have
preferred that he teach, Charles pursued a lucrative career in the publishing industry.”
Following these definitions it might seem that any instance of the Antithesis relation
could also be coded as Concession [13]. Similar difficulties arise when distinguishing
Elaboration from Evidence.
For CNL, it seems to us that DBR structures would be created the
same way as other CNL discourse structures: as part
of the authoring process. For example, Attempto Controlled English supports several
discourse representation structures for representing composite sentences, such as
conditions, coordinates, and subordinates [1], and the ACE parser is able to recognize
these. In a study of automated parsing of natural language texts, Marcu and Echihabi
[14] were able to achieve 93% accuracy in recognizing discourse relations for a small
subset of relation types. While this success rate is not adequate for CNL, it does
suggest that through a combination of refining the RST relation set and extending the
set of cue phrases available to CNL authors, it may be possible to develop a
hypotactic style that would support automatic DBR structure generation. For example,
if we wish to preserve the distinction between Antithesis and Concession, we could
specify this through the use of cue words such as but, not, and although:
1 An administrator can not verify every system, but it is necessary that if a system is a
compromised system then the administrator must verify it.
2 Although it is possible that an administrator believes that a system is up-to-date, it is not
provable that the system is invulnerable.
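A cue-phrase convention of this kind could be recognized with very simple pattern rules. The sketch below is hypothetical throughout: it assumes that example 1 is intended as Antithesis and example 2 as Concession, which the text does not state outright, and the rule table and the function name relation_from_cues are illustrative choices rather than part of any existing CNL parser.

```python
import re
from typing import Optional

# Hypothetical cue rules. The pairing of cue to relation is an assumption:
# a sentence-initial "although" signals Concession, and a negated clause
# followed by ", but" signals Antithesis.
CUE_RULES = [
    (re.compile(r"^\s*although\b", re.IGNORECASE), "Concession"),
    (re.compile(r"\bnot\b.*,\s*but\b", re.IGNORECASE), "Antithesis"),
]


def relation_from_cues(sentence: str) -> Optional[str]:
    """Return the first relation whose cue pattern matches, or None."""
    for pattern, relation in CUE_RULES:
        if pattern.search(sentence):
            return relation
    return None


example_1 = ("An administrator can not verify every system, but it is necessary "
             "that if a system is a compromised system then the administrator "
             "must verify it.")
example_2 = ("Although it is possible that an administrator believes that a "
             "system is up-to-date, it is not provable that the system is "
             "invulnerable.")

r1 = relation_from_cues(example_1)
r2 = relation_from_cues(example_2)
```

Because the rules are tried in order, a sentence that opens with "although" is classified as Concession even if it also contains a negation, as in example 2.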
Table 1. Interaction Definitions
Interaction | Definition
Substantiation. The claim of one argument is used as the ground of another | substantiation(arg(G1,C1,W1) & arg(C1,C2,W2))
Rebuttal. The claims of two arguments are incompatible | rebuttal(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(C1,C2))
Backing. An argument substantiates the warrant of another | backing(arg(G1,C1,W1) & arg(G2,C2,C1))
Undercut. The claim of one argument is incompatible with the ground of another | undercut(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(C1,G2))
Dissociation. The claim of one argument disputes the warrant of another | dissociation(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(C1,W2))
Convergence. Two arguments lead to the same claim, with possible accrual | accrual(arg(G1,C1,W1) & arg(G2,C1,W2))
Concomitance. Two arguments use the same ground to establish distinct claims | concomitance(arg(G1,C1,W1) & arg(G1,C2,W2))
Confusion. The grounds of two arguments are incompatible | confusion(arg(G1,C1,W1) & arg(G2,C2,W2) & incompatible(G1,G2))
In addition, we may be able to build on this by detecting syntactically
recognizable rhetorical forms, such as sorites, hypothetical syllogism, and dilemma.
Paragraph breaks and punctuation cues could also be used to support recognition of
larger composite structures [15]. Through a combination of cue phrases, syntactical
forms, and layout features, it may be possible to arrive at a composition style that is
easy enough for writers to write, readers to read, and automated reasoning systems to
process.
4 Conclusion
This paper has defined an approach to representing and reasoning about complex
composite structures in controlled natural languages. This is accomplished through
definition of a reasoning model that synthesizes rhetorical structure theory with
Toulmin’s argumentative model and Perelman’s theory of argument strategy. By
defining rules for managing interactions among inferential and synthetic structures,
DBR provides the basis for representing, summarizing, and interacting with
explanatory and argumentative discourse, and it expands the scope of problems that
may be addressed using controlled natural languages. Some anticipated future work
includes identification of an experimental RST relation set for CNL, developing a
prototype for encoding composite texts, and further definition of the reasoning model.
References
1. Fuchs, N.E., Kaljurand, K., Kuhn, T.: Discourse representation structures for ACE 6.0.
Department of Informatics, University of Zurich, Zurich (2008)
2. Schwitter, R.: English as a formal specification language. Proceedings of the 13th
International Workshop on Database and Expert Systems Applications (2002) 228-232
3. Pratt-Hartmann, I.: A two-variable fragment of English. Journal of Logic, Language and
Information 12 (2003) 13-45
4. Sowa, J.: Common logic controlled English. (2007)
5. Clark, P., Harrison, P., Jenkins, T., Thompson, J., Wojcik, R.: Acquiring and using world
knowledge using a restricted subset of English. FLAIRS (2005) 506–511
6. Mann, W.C., Thompson, S.A.: Rhetorical structure theory: Towards a functional theory of
text organization. Text 8 (1988) 243-281
7. Toulmin, S.E.: The uses of argument. Cambridge University Press, Cambridge, UK (1958)
8. Perelman, C., Olbrechts-Tyteca, L.: The new rhetoric: A treatise on argumentation.
University of Notre Dame, Notre Dame, IN (1969 [1958])
9. Potter, A.: A discourse approach to explanation aware knowledge representation. In: Roth-
Berghofer, T., Schulz, S., Leake, D.B., Bahls, D. (eds.): Explanation-aware computing:
Papers from the 2007 AAAI Workshop. AAAI Press, Menlo Park, CA (2007) 56-63
10. Potter, A.: Generating discourse-based explanations. Künstliche Intelligenz 22 (2008) 28-31
11. Potter, A.: Linked and convergent structures in discourse-based reasoning. In: Roth-
Berghofer, T., Schulz, S., Bahls, D., Leake, D.B. (eds.): Proceedings of the 3rd
International Explanation Aware Computing Workshop (ExaCt 2008), Patras, Greece
(2008) 72-83
12. Marcu, D.: The theory and practice of discourse parsing and summarization. MIT Press,
Cambridge, MA (2000)
13. Stede, M.: Disambiguating rhetorical structure. Research on Language & Computation 6
(2006) 311-332
14. Marcu, D., Echihabi, A.: An unsupervised approach to recognizing discourse relations.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
(ACL), Philadelphia (2002) 368-375
15. Campbell, K.S.: Coherence, continuity, and cohesion: Theoretical foundations for
document design. Lawrence Erlbaum, Hillsdale, NJ (1995)