<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Three Stories on Automated Reasoning for Natural Language Understanding</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name><given-names>Johan</given-names> <surname>Bos</surname></string-name>
          <xref ref-type="aff" rid="aff0"/>
        </contrib>
        <aff id="aff0">
          <institution>University of Rome “La Sapienza”</institution>
        </aff>
      </contrib-group>
      <fpage>81</fpage>
      <lpage>91</lpage>
      <abstract>
        <p>Three recent applications of computerised reasoning in natural language processing are presented. The first is a text understanding system developed in the late 1990s, the second is a spoken-dialogue interface to a mobile robot and automated home, and the third is a system that determines textual entailment. In all of these applications, off-the-shelf tools for reasoning with first-order logic (theorem provers as well as model builders) are employed to assist in natural language understanding. This overview is not only an attempt to identify the added value of computerised reasoning in natural language understanding, but also to point out the limitations of the first-order inference techniques currently used in natural language processing.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Since the mid 1990s I’ve been using tools from automated deduction (mainly theorem
provers and model builders) to solve problems in natural language understanding. I’ve
done this both from the perspective of computational linguistics (testing and improving
a linguistic theory by means of a computational implementation) and from that of
natural language processing (using inference in applications such as question answering
and textual entailment). In this paper I will write about some recent projects that I was
involved in, with the aim of convincing researchers — both from the natural language
engineering and the automated deduction communities — that computerised reasoning
can be successfully applied to natural language understanding.</p>
      <p>Right — what’s new about this claim? Isn’t it obvious that one needs some form
of computerised reasoning to model natural language understanding? Indeed, in my
opinion it is. However, it is astonishing to see how few tools from automated reasoning
have made it into real applications. Why is that? I think there are a couple of reasons for
this.</p>
      <p>First of all, a rather high level of interdisciplinarity is required. One needs not only
to know about linguistics (and in particular (formal) semantics), but also about natural
language processing, knowledge representation, and automated inference. As a matter
of fact, not many researchers match this profile, and looking back to the 1980s and
1990s, there seem to be fewer and fewer researchers interested in taking on the pursuit. Or,
as Steve Pulman put it at the recent ICoS (Inference in Computational Semantics) conference:
“I feel like an endangered species.”</p>
      <p>Secondly, there is an enormous gap between formal linguistic theory and practical
implementation. There is a vast amount of formal linguistics theory on the semantics
of natural language. However interesting most of it is, it doesn’t always lead directly
to computational implementation. Many of the phenomena that are accounted for in
formal theory are quite rare in natural data, so a natural language engineer won’t lose
much by ignoring them. But more crucially, from an automated reasoning point of view,
almost all semantic phenomena are formalised in higher-order logic (a trend set by none other
than Richard Montague), which is, as is well known, a computational nightmare.</p>
      <p>Third, it is not trendy to use theorem proving in natural language processing. In
contrast, stochastic approaches dominate the field, and after having been successfully
applied to speech recognition and syntactic processing, it won’t be long before statistics
plays a major role in semantic processing. Having said that, time and time again I am
surprised to see the use of theorem proving in natural language understanding
classified as the “traditional approach.” What tradition? OK – it was tried in the 1970s
and 1980s, but it never got to work really well, for reasons that should not really surprise
us: theorem proving had not yet matured to the state we know today, and moreover,
trivially, computers lacked the memory and speed to perform the computations required
by inference. Yet, we have only started to understand the limitations and opportunities
of computerised reasoning in natural language understanding.</p>
      <p>After this introduction I will present three applications demonstrating successful use
of first-order inference tools. The first is a text-understanding system that calculates
presuppositions of sentences and performs consistency and informativeness checks on
texts. The second is a spoken dialogue system, interfaced to a mobile robot and an
automated home environment, that uses theorem proving and model building for planning
its linguistic and non-linguistic actions. The third is a system for recognising textual
entailment. In all of these applications I will discuss the reasoning tools used, how they
are used, what added value they had, and what their limitations were.</p>
    </sec>
    <sec id="sec-2">
      <title>Presupposition Projection</title>
      <p>In the mid 1990s I started to implement tools for the semantic analysis of English
texts, as part of my thesis work at Saarland University in Saarbrücken,
Germany. One of the aims was to follow linguistic theory as closely as possible and see
how much of it could be implemented straight away. I adopted Discourse Representation
Theory (DRT), initially developed by Hans Kamp, because it accounted for a wide range
of linguistic phenomena in a unified framework, and, crucially, had a model-theoretic
semantics [KR93]. In particular I was interested in modelling presupposition projection,
and I followed a recent proposal by Rob van der Sandt, whose theory of presupposition
was cast in DRT [VdS92].</p>
      <p>My implementation efforts resulted in the doris system (Figure 1). It had
reasonable grammar coverage (substantially more than a toy grammar, but certainly not
reaching the level of today’s wide-coverage grammars). It parsed English sentences and
computed an underspecified discourse representation of the input sentence, followed by
resolving ambiguities. The linguistic phenomena covered included pronouns, quantifier
and negation scope, and presuppositions [Bos01].</p>
      <p>Presuppositions are propositions taken for granted by the speaker, and “triggered”
by words or phrases. Rather informally, p presupposes q if both p and ¬p entail q.
For instance, the phrase “Mia’s husband” presupposes that Mia is married, and that
Mia is a woman, because “Jody likes Mia’s husband” and the negation of this sentence
“Jody doesn’t like Mia’s husband” both entail that Mia is married and is a woman. The
problem with presuppositions is that they are sometimes neutralised by the linguistic
context, and that it is quite hard to pin down exactly when they are and when they
are not, especially in sentences that contain some form of implication, negation, or
disjunction. Consider, for instance, the following three sentences (with the relevant
presupposition trigger typeset in bold face):
(a) If Mia has a date with Vincent, then her husband is out of town.
(b) If Mia has a husband, then her husband is out of town.</p>
      <p>(c) If Mia is married, then her husband is out of town.</p>
      <p>Here (a) presupposes that Mia is married, but (b) and (c) do not. Van der Sandt’s
theory explained this by constructing various possibilities of positioning the
presupposition, and then checking whether these were acceptable by imposing acceptability
constraints upon them: consistency as well as informativeness of the resulting text, both
on the global and local level of discourse. For (c), there are two positions where the
presupposition can “land” (which are underlined):
(c-1) Mia has a husband. If Mia is married, then her husband is out of town.
(c-2) If Mia is married/has a husband, then her husband is out of town.</p>
      <p>However, in the first paraphrase (c-1) the antecedent of the conditional violates
the constraint of (local) informativeness: if Mia has a husband, then the fact that she
is married is not new information. In the second paraphrase (c-2) all acceptability
constraints are satisfied. As a consequence, (c-1) is rejected, and (c-2) is accepted, as
a possible interpretation.</p>
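      <p>A minimal sketch of how such acceptability checks can be implemented. The <monospace>consistent</monospace>, <monospace>entails</monospace>, and <monospace>acceptable</monospace> functions here are illustrative, brute-force propositional stand-ins for what doris delegated to a first-order prover:</p>
      <preformat><![CDATA[
```python
from itertools import product

def consistent(clauses, atoms):
    """Brute-force satisfiability over a finite set of propositional atoms.
    A clause is a list of (atom, polarity) literals."""
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(any(v[a] if pos else not v[a] for (a, pos) in c) for c in clauses):
            return True
    return False

def entails(clauses, atom, atoms):
    """clauses |= atom  iff  clauses plus {not atom} is unsatisfiable."""
    return not consistent(clauses + [[(atom, False)]], atoms)

def acceptable(context, presupposition, atoms):
    """Van der Sandt-style checks: the resolved reading must be
    consistent with the context and must add new information."""
    cons = consistent(context + [[(presupposition, True)]], atoms)
    informative = not entails(context, presupposition, atoms)
    return cons and informative

# Reading (c-1): "Mia has a husband" is accommodated globally, but then
# the antecedent "Mia is married" is locally uninformative.
atoms = ["has_husband", "married"]
context = [[("has_husband", True)],                       # Mia has a husband
           [("has_husband", False), ("married", True)]]   # has_husband -> married
print(acceptable(context, "married", atoms))   # False: reading (c-1) is rejected
```
]]></preformat>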
      <p>Despite the adequacy of the theory’s predictions, there was still a problem:
the acceptability constraints required logical inference. But how could you implement
this? Even though DRT was an established semantic theory backed up with a
model theory, there were no (efficient) theorem provers available that could reason with the
semantic representations employed by DRT. Discussions with Patrick Blackburn and
Michael Kohlhase (both in Saarbrücken at the time) developed the idea of using
first-order theorem provers to implement Van der Sandt’s acceptability constraints. As the
core of DRT is a first-order language, it turned out to be pretty straightforward to
translate the DRT representations into ordinary first-order formula syntax, something
that first-order theorem provers could digest. Soon after, contacts were made with Hans
de Nivelle, whose theorem prover bliksem was among the first that was put to the test
[DN98]. And it looked promising: bliksem could handle most of the problems given to
it in reasonable time.</p>
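      <p>The core of this translation can be sketched as follows. The toy <monospace>DRS</monospace> type and the string-based formula syntax are invented for exposition; they are not the actual doris code, but the recursive structure mirrors the standard DRS-to-first-order translation:</p>
      <preformat><![CDATA[
```python
class DRS:
    def __init__(self, refs, conds):
        self.refs, self.conds = refs, conds   # discourse referents, conditions

def drs_to_fol(drs):
    """A DRS becomes an existential closure of the conjoined conditions."""
    body = " & ".join(cond_to_fol(c) for c in drs.conds) or "true"
    for r in reversed(drs.refs):
        body = f"exists {r}.({body})"
    return body

def cond_to_fol(cond):
    if isinstance(cond, tuple) and cond[0] == "imp":   # DRS => DRS
        ant, cons = cond[1], cond[2]
        body = " & ".join(cond_to_fol(c) for c in ant.conds) or "true"
        out = f"({body} -> {drs_to_fol(cons)})"
        for r in reversed(ant.refs):                   # referents of the
            out = f"forall {r}.({out})"                # antecedent go universal
        return out
    if isinstance(cond, tuple) and cond[0] == "not":
        return f"~{drs_to_fol(cond[1])}"
    return cond                                        # atomic condition

# "If Mia is married, then her husband is out of town."
drs = DRS([], [("imp",
                DRS(["x"], ["married(mia)", "husband(x,mia)"]),
                DRS([], ["out_of_town(x)"]))])
print(drs_to_fol(drs))
# forall x.((married(mia) & husband(x,mia) -> out_of_town(x)))
```
]]></preformat>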
      <p>However, as some natural language examples could cause hundreds of consistency
checking tasks (due to a combinatorial explosion of linguistic ambiguities), it took a
long time before bliksem had dealt with them all. Michael Kohlhase and Andreas
Franke came to the rescue, by offering MathWeb, a web-based inference service [FK99].
MathWeb farmed out a set of inference problems to different machines using a common
software bus. Using the internet and many machines around the world (I recall that there
were machines running in Edinburgh, Budapest, Saarbrücken, and Sydney, among other
sites), MathWeb could basically be viewed as a parallel supercomputer. (This sounds
perhaps quite ordinary right now, but at the time it was a sensation.) To cut a long story
short, doris was interfaced directly to MathWeb, and many different theorem provers
for first-order logic were added: spass [WAB+99], fdpll [Bau00], otter [MP96], and
vampire [RV02].</p>
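      <p>The farming-out idea can be sketched with a thread pool standing in for MathWeb’s software bus. The <monospace>solve</monospace> function below is a hypothetical placeholder for a real network dispatch to a remote prover, not MathWeb’s actual protocol:</p>
      <preformat><![CDATA[
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def solve(prover, problem):
    """Placeholder for shipping `problem` to a remote prover; a real
    MathWeb client would dispatch over the network here."""
    return prover, ("theorem" if "valid" in problem else "unknown")

def farm_out(problems, provers):
    """Submit every (prover, problem) pair concurrently and keep the
    first verdict that comes back for each problem."""
    results = {}
    with ThreadPoolExecutor(max_workers=8) as pool:
        futs = {pool.submit(solve, pr, pb): pb
                for pb in problems for pr in provers}
        for fut in as_completed(futs):
            problem = futs[fut]
            prover, verdict = fut.result()
            results.setdefault(problem, (prover, verdict))  # first answer wins
    return results

print(farm_out(["valid-1", "open-2"], ["spass", "otter", "vampire"]))
```
]]></preformat>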
      <p>In sum, the doris system demonstrated that first-order inference could play an
interesting role in natural language processing, albeit with limitations [BBKdN01]. It
generated a new application area for automated deduction (in fact, some of the problems
generated by doris made it to the TPTP collection, thanks to Geoff Sutcliffe), and it
opened a whole new vista of research in computational semantics. (Incidentally, it also
helped to precisely formulate the acceptability constraints of Van der Sandt’s theory of
presupposition projection.)</p>
      <p>So, what were these limitations? Scalability was one of them. A theorem prover
would do well on a couple of sentences, but — not surprisingly given the computational
properties of first-order logic — it would just choke on larger texts. Linguistic coverage
was another. Some linguistic phenomena require richer semantic representations and
therefore yield harder inference problems (for instance, tense and aspect require a richer sortal
hierarchy, cardinal expressions require counting, and plurals require elements of set theory).</p>
      <p>The last version of doris was released in 2001 [Bos01]. Although it was an important
step in the development of computational semantics, its limited grammatical coverage
and unfocussed application domain left it without a future. At the time I thought that
it would take at least twenty years to develop a parser that achieved both wide-coverage
and syntactic representations of enough detail to construct meaningful semantic
representations (I was, fortunately, very wrong! See Section 3). In order to reach a new level
of sophistication in computational semantics, I instead focussed on small domains, in
which the grammatical coverage and necessary background knowledge could be specified
a priori. Human-computer dialogue systems turned out to be the killer application.</p>
    </sec>
    <sec id="sec-3">
      <title>Spoken Dialogue Systems</title>
      <p>At the University of Edinburgh I was involved in developing a spoken dialogue system
which was interfaced with a (real) robot. The robot was a RWI Magellan Pro robot, with
sonars, infrared sensors, bumpers, and a video camera. It had an on-board computer
connected to the local network via a wireless LAN interface. The robot moved about in
the basement of Buccleuch Place, and people could direct it, ask it questions, or provide
it with new information, all via speech. A typical conversation could be:
Human: Robot?
Robot: Yes?
Human: Where are you?
Robot: I am in the hallway.</p>
      <p>Human: OK. Go to the rest room!</p>
      <p>Such dialogues were relatively straightforward to model with the then
state-of-the-art in human-machine dialogue. Yet the robot was still “semantically
challenged”: it had no means to draw inferences. What I aimed to do was to use components of
the doris system to give the robot the means to do consistency checking, answer questions,
and calculate its next actions. In particular, I was interested in letting the robot react
to inconsistent or obvious information, envisioning dialogues such as:
Human: Where are you?
Robot: I am in the hallway.</p>
      <p>Human: You are in my office.</p>
      <p>Robot: No, that’s not true!
Human: You are in the hallway.</p>
      <p>Robot: Yes, I know.</p>
      <p>I knew this was feasible from the doris experience: theorem provers can easily
handle this amount of information, and the amount of background knowledge, given the
limited domain and environment, was easy to compute and maintain. And so it turned
out to be: using spass as theorem prover, the robot checked whether each assertion
of the user was consistent with its current state and knowledge of the dialogue.</p>
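      <p>The envisioned reaction loop can be sketched as follows. Here <monospace>toy_entails</monospace> is a crude literal-level stand-in for the calls the doris components made to spass; any real entailment function could be plugged in instead:</p>
      <preformat><![CDATA[
```python
def reply(knowledge, assertion, entails):
    """React to a user assertion, mirroring the dialogue above.
    `entails(premises, goal)` stands in for a first-order prover call."""
    if entails(knowledge, "-" + assertion):   # contradicts current state
        return "No, that's not true!"
    if entails(knowledge, assertion):         # already known
        return "Yes, I know."
    knowledge.append(assertion)               # accept as new information
    return "OK."

def toy_entails(premises, goal):
    """Literal-level stand-in: '-' negates, and being in one room
    rules out being in any other room."""
    if goal.startswith("-"):
        atom = goal[1:]
        return any(p.startswith("in(") and atom.startswith("in(") and p != atom
                   for p in premises)
    return goal in premises

knowledge = ["in(hallway)"]
print(reply(knowledge, "in(office)", toy_entails))   # No, that's not true!
print(reply(knowledge, "in(hallway)", toy_entails))  # Yes, I know.
```
]]></preformat>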
      <p>After having implemented this, I was interested in using first-order theorem proving
for planning the actions required by a directive, primarily from a semantic point of view. I
considered commands that involved negation, disjunction, or quantification. Examples
of the utterances I had in mind include:
(a) Turn on a light.
(b) Switch on every light.
(c) Switch on every light in the hallway.
(d) Turn off every light except the light in the office.</p>
      <p>(e) Go to the kitchen or the rest room.</p>
      <p>In (a), the robot had to turn on a light that was currently switched off (and of course
it had to complain when all the lights were already on). In (b), it had to turn on all
lights that were currently off (some of the lights could be on already). In (c), it should
only consider lights in the hallway (restrictive quantification). In (d), it should consider
all lights minus those in the office (negation). In (e), it should have either gone to the
kitchen or to the rest room (disjunction).</p>
      <p>This was hard to do with a theorem prover. It seemed a natural task for a finite model
builder, though. Through the doris system I had already come into contact with Karsten Konrad’s
model builder kimba [KW99], but I had never used model builders for anything other than checking
satisfiability. I started using Bill McCune’s mace because it searched for models by
iterating over domain sizes, generally generating models that were both
domain-minimal and minimal in the extensions of the predicates [McC98]. Model building was
successfully integrated such that the minimal models produced by mace were used
to determine the actions that the robot had to perform [BO02]. The robot in action
was regularly demonstrated to visitors at Buccleuch Place (Figure 2) as well as to the
general public at the Scottish museum in Edinburgh (Figure 3).</p>
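      <p>The role of minimal models in action planning can be illustrated with a brute-force model searcher over a toy domain of lights. This is illustrative only: mace of course worked on first-order clauses, not Python sets, but the principle is the same — the minimal extension of the <monospace>on</monospace> predicate satisfying the command tells the robot what to switch:</p>
      <preformat><![CDATA[
```python
from itertools import chain, combinations

# Toy domain: which room each light is in, and which lights are on now.
lights = {"l1": "hallway", "l2": "hallway", "l3": "office"}
on_now = {"l3"}

def minimal_models(goal):
    """All cardinality-minimal extensions of `on` satisfying the goal,
    in the spirit of mace's iteration over small models."""
    best, models = None, []
    for subset in chain.from_iterable(
            combinations(lights, k) for k in range(len(lights) + 1)):
        on = set(subset)
        if goal(on):
            if best is None or len(on) < best:
                best, models = len(on), [on]
            elif len(on) == best:
                models.append(on)
    return models

# "Switch on every light in the hallway" as a constraint on `on`:
goal = lambda on: all(l in on for l, room in lights.items()
                      if room == "hallway")

model = minimal_models(goal)[0]
print(sorted(model - on_now))   # actions: switch on ['l1', 'l2']
```
]]></preformat>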
      <p>Of course there were the usual limitations. As I was using a first-order language with
possible worlds to model the semantics of actions, I was forced to erase the dialogue
memory after every second utterance in order to keep the response time acceptable. Also,
the number of different objects in the domain was very limited (given the background
axioms of the robot, mace produced models in reasonable time up to domain
size 20). Nevertheless, the overall system was impressive, and showed what one could
do with general purpose, off-the-shelf, first-order inference tools in a practical system.</p>
    </sec>
    <sec id="sec-4">
      <title>Recognising Textual Entailment</title>
      <p>The rapid developments in statistical parsing were of course of interest to the semanticist.
Yet most of these parsers produced syntactic derivations that were unsuitable for deriving
semantic representations in a systematic, principled way. It is not an exaggeration to say
that the release of a statistical wide-coverage parser for CCG (Combinatory Categorial
Grammar) in 2004 amounted to a breakthrough in computational semantics. This
CCG parser, implemented by Stephen Clark and James Curran [CC04], and trained on
an annotated treebank developed by Julia Hockenmaier and Mark Steedman [HS02], had
the best of both worlds: it achieved wide coverage on texts, and produced very detailed
syntactic derivations. Because of the correspondence between syntax and semantic rules
in CCG, this framework was the ideal setting for doing semantics.</p>
      <p>Because CCG is a heavily lexicalised theory, it has a large number of lexical
categories, and very few rules. In order to translate the output of the parser (a CCG
derivation) into a DRT representation (which was my main aim, in order to reuse the
existing tools that I had developed for DRT and inference), I coded a lambda-expression
for each of the ca. 400 syntactic categories known to the parser. Using the
lambda-calculus to produce DRT representations, we simply had a parser that
translated newspaper texts into semantic representations, with a coverage of around 95%
[BCS+04, Bos05]. This was a great starting point for doing computerised reasoning for
natural language on a wider scale.</p>
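      <p>To illustrate the compositional idea (the lambda-expressions below are invented for exposition, not taken from the actual lexicon), function application over per-category lambdas is enough to build a flat first-order-style representation:</p>
      <preformat><![CDATA[
```python
# Hypothetical lambda-expressions for three CCG categories, composing a
# semantic representation by plain function application.
bomb = lambda x: f"bomb({x})"                     # category N
a = lambda noun: lambda verb: (                   # category NP/N (determiner)
    f"exists x.({noun('x')} & {verb('x')})")
exploded = lambda x: f"explode({x})"              # category S\NP

# "A bomb exploded": determiner applies to the noun, then to the verb.
print(a(bomb)(exploded))
# exists x.(bomb(x) & explode(x))
```
]]></preformat>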
      <p>In the same year the first challenge on recognising textual entailment (RTE) was
organised. This is basically a competition for implemented systems to detect whether
one (small) text entails another (small) text. To give an impression of the task, consider
an example of a positive and an example of a negative entailment pair (where
T is the text, and H the hypothesis, using the terminology of the RTE):
Example: 115 (TRUE)
T: On Friday evening, a car bomb exploded outside a Shiite mosque in Iskandariyah, 30 miles
south of the capital, killing seven people and wounding 10, doctors said on condition of
anonymity.</p>
      <p>H: A bomb exploded outside a mosque.</p>
      <p>Example: 117 (FALSE)
T: The release of its report led to calls for a complete ivory trade ban, and at the seventh
conference in 1989, the African Elephant was moved to appendix one of the treaty.
H: The ban on ivory trade has been effective in protecting the elephant from extinction.</p>
      <p>In the RTE challenge a participating system is given a set of entailment pairs and
has to decide, for each T-H pair, whether T entails H or not. This turned out to be a
very hard task. The baseline (randomly guessing) already gives a score of 50%, as half
of the dataset corresponds to true entailments, and the other half to false ones. The best
systems on the RTE-1 campaign achieved a score approaching 60%.</p>
      <p>With the CCG parser and semantics at my disposal, I decided to implement a system
that used logical inference to approach the RTE challenge. The overall idea, given
the available tools, was straightforward: produce a DRT representation for T and H,
translate these to first-order logic, and then use an off-the-shelf prover to check whether
T ′ → H′. We used vampire as theorem prover [RV02], motivated by its performance
at the recent CASCs.</p>
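      <p>Assuming TPTP-style syntax, assembling the entailment check T′ → H′ as a prover problem can be sketched as follows; the formulas and the background axiom are illustrative, not the system’s actual output:</p>
      <preformat><![CDATA[
```python
def rte_problem(t_fol, h_fol, axioms=()):
    """Assemble a TPTP problem asking whether T' (plus background
    knowledge) entails H'; the resulting string can be handed to a
    first-order prover such as vampire."""
    lines = [f"fof(bk{i}, axiom, {ax})." for i, ax in enumerate(axioms)]
    lines.append(f"fof(text, axiom, {t_fol}).")
    lines.append(f"fof(hypothesis, conjecture, {h_fol}).")
    return "\n".join(lines)

# Roughly example 115: T describes an exploding car bomb outside a mosque,
# H asks whether a bomb exploded.
problem = rte_problem(
    "? [X] : (car_bomb(X) & explode(X) & ? [Y] : (mosque(Y) & outside(X, Y)))",
    "? [X] : (bomb(X) & explode(X))",
    axioms=["! [X] : (car_bomb(X) => bomb(X))"],  # WordNet-style axiom
)
print(problem)
```
]]></preformat>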
      <p>At first, the performance was limited. The system performed quite well on cases such
as 115 above, but unfortunately the RTE challenge doesn’t contain many of these. Most
of the examples require a lot of background knowledge to draw the correct inference. I
used WordNet (an electronic dictionary) to compute some of these background axioms
automatically. Yet still there were knowledge gaps. It would be nice to have a theorem
prover that would be able to say that it “almost found a proof”, instead of just saying
“yes” or “(probably) no”.</p>
      <p>Back to model building. Finite model builders, such as mace [McC98], say more
than just “yes” or “no”: they also give you a finite model if there is one. As this model
is a clearly defined mathematical entity, it is ideal to simulate the “almost found a proof”
scenario. That is, by generating minimal models for T ′ and for T ′ ∧ H′, we hypothesised
that comparing the number of entities of the two models would give us a useful handle
on estimating entailment. If the difference is relatively small, it is likely that T entails
H. Otherwise it is not. (To deal with negation, one also has to calculate the models for
¬T ′ and ¬(T ′ ∧ H′) and compare the model sizes. To deal with disjunction, one has to
do this for all minimal models for T and H — but as of now it is unclear to me how to
implement the latter in a neat way...)</p>
      <p>This turned out to work well — not as well as one would hope (because it is hard
to get the right background knowledge), but significantly better than the baseline. We
used standard machine learning techniques to estimate the thresholds of the model sizes.
We used both the size of the domain as well as the number of instantiated relations in
the model. It turned out that mace was quite slow for longer sentences. We tried
paradox [CS03], which is faster, but it does not always return minimal models (with
respect to the predicate extensions). To overcome this, we used paradox to calculate
the domain size, and then called mace to generate the model given that domain size.</p>
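      <p>The model-size comparison can be sketched as a simple feature plus threshold. The threshold value here is purely illustrative; in the actual system the thresholds over domain size and instantiated relations were estimated by machine learning on RTE training data:</p>
      <preformat><![CDATA[
```python
def entailment_score(size_t, size_th):
    """Relative growth of the minimal model when H' is added to T'.
    Values near 0 suggest H adds little that is new, i.e. entailment."""
    return (size_th - size_t) / size_t

def predict(size_t, size_th, threshold=0.1):
    """Illustrative threshold; a learned value would be used in practice."""
    return entailment_score(size_t, size_th) <= threshold

# T' has a 12-entity minimal model; adding H' grows it to 13 vs. 19:
print(predict(12, 13))  # True  -> likely entailment
print(predict(12, 19))  # False -> likely no entailment
```
]]></preformat>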
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>First-order inference tools, such as automated theorem provers and model builders, can
be successfully used in natural language understanding applications. Obviously, there
are limitations, but in many interesting applications these limitations play a subordinate
role. Whether computerised (logical) reasoning will ever become part of mainstream
research in natural language processing is questionable, though. We will have to see to
what extent statistical approaches (currently dominating computational linguistics) can
be applied to natural language understanding tasks. Meanwhile, collaboration between
researchers working in automated deduction and computational linguistics should be
stimulated to get a better understanding of the boundaries of applying automated
reasoning to natural language understanding.</p>
      <p>[Bau00] Peter Baumgartner. FDPLL – A First-Order Davis-Putnam-Logemann-Loveland
Procedure. In David McAllester, editor, CADE-17 – The 17th International Conference
on Automated Deduction, volume 1831 of Lecture Notes in Artificial Intelligence,
pages 200–219. Springer, 2000.</p>
      <p>[BCS+04] J. Bos, S. Clark, M. Steedman, J.R. Curran, and J. Hockenmaier.
Wide-Coverage Semantic Representations from a CCG Parser. In Proceedings of
the 20th International Conference on Computational Linguistics (COLING
’04), Geneva, Switzerland, 2004.</p>
      <p>[BO02] Johan Bos and Tetsushi Oka. An Inference-based Approach to Dialogue
System Design. In Shu-Chuan Tseng, editor, COLING 2002: Proceedings
of the 19th International Conference on Computational Linguistics, pages
113–119, Taipei, Taiwan, 2002.</p>
      <p>[Bos01] Johan Bos. DORIS 2001: Underspecification, Resolution and Inference for
Discourse Representation Structures. In Patrick Blackburn and Michael
Kohlhase, editors, ICoS-3, Inference in Computational Semantics, pages
117–124, 2001.</p>
      <p>[Bos05] Johan Bos. Towards Wide-Coverage Semantic Interpretation. In
Proceedings of the Sixth International Workshop on Computational Semantics
(IWCS-6), pages 42–53, 2005.</p>
      <p>[CC04] S. Clark and J.R. Curran. Parsing the WSJ using CCG and Log-Linear
Models. In Proceedings of the 42nd Annual Meeting of the Association for
Computational Linguistics (ACL ’04), Barcelona, Spain, 2004.</p>
      <p>[CS03] K. Claessen and N. Sörensson. New Techniques that Improve MACE-style
Model Finding. In Model Computation – Principles, Algorithms,
Applications (CADE-19 Workshop), Miami, Florida, USA, 2003.</p>
      <p>[DN98] Hans de Nivelle. A Resolution Decision Procedure for the Guarded
Fragment. In Automated Deduction – CADE-15: 15th International Conference
on Automated Deduction, pages 191–204. Springer-Verlag, Berlin/Heidelberg, 1998.</p>
      <p>[FK99] Andreas Franke and Michael Kohlhase. System Description: MathWeb,
an Agent-based Communication Layer for Distributed Automated Theorem
Proving. In CADE-16, 1999.</p>
      <p>J. Hockenmaier and M. Steedman. Generative Models for Statistical
Parsing with Combinatory Categorial Grammar. In Proceedings of 40th Annual
Meeting of the Association for Computational Linguistics, Philadelphia,
PA, 2002.</p>
      <p>[KR93] H. Kamp and U. Reyle. From Discourse to Logic: An Introduction to
Modeltheoretic Semantics of Natural Language, Formal Logic and DRT.
Kluwer, Dordrecht, 1993.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [BBKdN01]
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Blackburn</surname>
          </string-name>
          , Johan Bos,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Kohlhase</surname>
          </string-name>
          , and Hans de Nivelle.
          <article-title>Inference and Computational Semantics</article-title>
          . In Harry Bunt, Reinhard Muskens, and Elias Thijsse, editors,
          <source>Computing Meaning</source>
          Vol.
          <volume>2</volume>
          , pages
          <fpage>11</fpage>
          -
          <lpage>28</lpage>
          . Kluwer,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [KW99]
          <string-name>
            <given-names>Karsten</given-names>
            <surname>Konrad</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.A.</given-names>
            <surname>Wolfram</surname>
          </string-name>
          .
          <article-title>System description: Kimba, a model generator for many-valued first-order logics</article-title>
          .
          <source>In 16th International Conference on Automated Deduction CADE-16</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [McC98]
          <string-name>
            <given-names>W.</given-names>
            <surname>McCune</surname>
          </string-name>
          .
          <article-title>Automatic Proofs and Counterexamples for Some Ortholattice Identities</article-title>
          .
          <source>Information Processing Letters</source>
          ,
          <volume>65</volume>
          (
          <issue>6</issue>
          ):
          <fpage>285</fpage>
          -
          <lpage>291</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [MP96] W. McCune and R. Padmanabhan.
          <source>Automated Deduction in Equational Logic and Cubic Curves</source>
          . Springer-Verlag,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [RV02] A. Riazanov and A. Voronkov.
          <article-title>The Design and Implementation of Vampire</article-title>
          .
          <source>AI Communications</source>
          ,
          <volume>15</volume>
          (
          <issue>2-3</issue>
          ),
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [VdS92] Rob van der Sandt.
          <article-title>Presupposition Projection as Anaphora Resolution</article-title>
          .
          <source>Journal of Semantics</source>
          ,
          <volume>9</volume>
          :
          <fpage>333</fpage>
          -
          <lpage>377</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [WAB+99]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Weidenbach</surname>
          </string-name>
          , Bijan Afshordel, Uwe Brahm, Christian Cohrs, Thorsten Engel, Enno Keen,
          <string-name>
            <given-names>Christian</given-names>
            <surname>Theobalt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Dalibor</given-names>
            <surname>Topic</surname>
          </string-name>
          .
          <article-title>System Description: SPASS Version 1.0.0</article-title>
          . In Harald Ganzinger, editor,
          <source>16th International Conference on Automated Deduction, CADE-16</source>
          , volume
          <volume>1632</volume>
          <source>of LNAI</source>
          , pages
          <fpage>314</fpage>
          -
          <lpage>318</lpage>
          . Springer-Verlag, Berlin,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>