<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Two-Level Approach to Generate Synthetic Argumentation Reports</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Patrick Saint-Dizier</string-name>
          <email>stdizier@irit.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IRIT-CNRS</institution>
          ,
          <addr-line>route de Narbonne, Toulouse</addr-line>
          ,
          <country>France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <fpage>14</fpage>
      <lpage>21</lpage>
      <abstract>
        <p>Given a controversial issue, a major challenge in argument mining is to organize the mined arguments into a synthesis that is readable, synthetic enough and relevant for various types of users. We build on the Qualia structure of the Generative Lexicon (GL), a kind of lexical and knowledge repository, which we have enhanced in different ways and associated with inferences and language patterns, to show how to construct a synthesis that outlines the typical elements found in arguments. We propose a two-level approach: a synthesis of the arguments that have been mined, and navigation facilities that give access to the argument contents for more details.</p>
      </abstract>
      <kwd-group>
        <kwd>Generative Lexicon</kwd>
        <kwd>Rhetoric</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Computer systems organization → Natural language
processing; Argument mining; Knowledge representation;</p>
    </sec>
    <sec id="sec-2">
      <title>1 AIMS AND CHALLENGES</title>
      <p>One of the main goals of argument mining is, given a
controversial issue, to identify in a set of texts the arguments
for or against that issue. These arguments act as supports or
attacks of the issue. Arguments may also attack or support
the arguments which support or attack that controversial
issue in order to reinforce or cancel out their impact.
Arguments are difficult to identify, in particular when they are
not adjacent to the controversial issue, possibly not in the
same text, because their linguistic, conceptual or referential
links to that issue are rarely explicit.</p>
      <p>
        For example, given the controversial issue Vaccine against
Ebola is necessary, identifying the argumentative link with
statements such as Ebola adjuvant is toxic, Ebola vaccine
production is costly, or 7 people died during Ebola vaccine tests
cannot be done solely on the basis of linguistic data; it
requires domain knowledge. Furthermore, a knowledge-based
analysis of the third statement shows that it is irrelevant or
neutral w.r.t. the issue
        <xref ref-type="bibr" rid="ref12 ref13">(Saint-Dizier 2016)</xref>
        .
      </p>
    </sec>
    <sec id="sec-3">
      <title>1.1 Argument Mining Challenges</title>
      <p>
        Argument mining is an emerging research area which
introduces new challenges in natural language processing and
generation. Argument mining research applies to written
texts, e.g.
        <xref ref-type="bibr" rid="ref8">(Mochales Palau et al., 2009)</xref>
        ,
        <xref ref-type="bibr" rid="ref6">(Kirschner et al.,
2015)</xref>
        , for example for opinion analysis, e.g.
        <xref ref-type="bibr" rid="ref15">(Villalba et al.,
2012)</xref>
        , mediation analysis
        <xref ref-type="bibr" rid="ref5">(Janier et al. 2015)</xref>
        or transcribed
argumentative dialog analysis, e.g.
        <xref ref-type="bibr" rid="ref1">(Budzynska et al., 2014)</xref>
        ,
        <xref ref-type="bibr" rid="ref14">(Swanson et al., 2015)</xref>
        . The NLP techniques
relevant for argument mining from annotated structures are
analyzed in e.g.
        <xref ref-type="bibr" rid="ref10">(Peldszus et al. 2016)</xref>
        . Annotated corpora
are now available, e.g. the AIFDB dialog corpora or
        <xref ref-type="bibr" rid="ref16">(Walker
et al., 2012)</xref>
        . These corpora are very useful to understand
how argumentation is realized in texts, e.g. to identify
argumentative discourse units (ADUs), linguistic cues
        <xref ref-type="bibr" rid="ref9">(Nguyen et
al., 2015)</xref>
        , and argumentation strategies, in a concrete way,
possibly in association with abstract argumentation schemes,
as shown in e.g.
        <xref ref-type="bibr" rid="ref3">(Feng et al., 2011)</xref>
        . Finally, reasoning
aspects related to argumentation analysis are developed in e.g.
        <xref ref-type="bibr" rid="ref4">(Fiedler et al., 2007)</xref>
        and
        <xref ref-type="bibr" rid="ref17">(Winterstein, 2012)</xref>
        from a formal
semantics perspective.
      </p>
      <p>In opinion analysis, the benefits of argument mining are
not only to identify the customers’ satisfaction level, but
also to characterize why customers are happy or unhappy.
Abstracting over arguments makes it possible to construct summaries
and to induce customer preferences or value systems (e.g. low
fares are preferred to location or quality of welcome for
some categories of hotel customers).</p>
      <p>
        In
        <xref ref-type="bibr" rid="ref10 ref12 ref13">(Saint-Dizier 2016a)</xref>
        , a corpus analysis identifies the
types of knowledge and inferences that are required to develop
argument mining. It is briefly reported in this paper. We then
showed, on the basis of a set of examples, that the
Generative Lexicon (GL) could be an appropriate model,
sufficiently expressive, to characterize the types of knowledge,
inferences and lexical data that are required to accurately
identify arguments related to an issue.
      </p>
    </sec>
    <sec id="sec-4">
      <title>1.2 Natural Language Summarization</title>
      <p>
        In natural language generation, the main projects on
argument generation were developed as early as
        <xref ref-type="bibr" rid="ref18">(Zuckerman et
al. 2000)</xref>
        and
        <xref ref-type="bibr" rid="ref4">(Fiedler 2007)</xref>
        . While there are currently
several research efforts to develop argument mining, very little
has been done recently to produce a synthesis of the mined
arguments that is readable, synthetic enough and relevant
for various types of users. This includes identifying the main
features for or against a controversial issue, but also tasks
such as eliminating duplicates, fallacies or ad hominem
statements and identifying arguments which attack or support
each other, besides the controversial issue.
      </p>
      <p>
        In
        <xref ref-type="bibr" rid="ref12 ref13">(Saint-Dizier 2016b)</xref>
        , we show how arguments that have
been mined can be organized in hierarchically structured
clusters so that readers can navigate over and within sets of
arguments according to the conceptual organization proposed
by the Generative Lexicon. This approach turned out not to
be synthetic enough, since over 100 arguments can be mined
for a given issue, making the perception of the main attacks
and supports quite difficult. However, this initial approach
allows the construction of an argument database useful to
readers who wish to access the exact form of arguments
that have been mined.
      </p>
      <p>The present contribution focuses on the next stage, aiming
at producing a synthesis that is short and efficient, where the
concepts present in the GL Qualia structures are used to
abstract over arguments while keeping the structure of those
clusters of arguments, which are accessible via links from the
synthesis. The argument cluster system is accessed to get
more precise information.</p>
      <p>
        This contribution to natural language argumentation
synthesis is not really a summarization task, as e.g. developed
in
        <xref ref-type="bibr" rid="ref7">(Mani et al. 1999)</xref>
        . In our approach, no text or document
is reduced to produce a summary. The synthesis that is
proposed is simply a two-level re-organization task that involves
forms of clustering. From that perspective, it could be viewed
as a preliminary step to a summarization procedure. A real
summarization task would involve constructing summaries
for each cluster of arguments, but this is beyond the present
research.
      </p>
      <p>In terms of feature classification and relevance, the
concepts used in the Qualia structure of the Generative Lexicon
are defined a priori, similarly to the features evaluated in
most opinion analysis systems. They are used as entry points
to the re-organization and to the cluster system. A
challenging point is that these concepts must correspond
as much as possible to the user perception of the domain to
which the issue belongs.</p>
    </sec>
    <sec id="sec-5">
      <title>1.3 Paper Structure</title>
      <p>In this paper, for the sake of understanding, we first
summarize the results elaborated in our previous contributions; we
then develop the synthesis production model. This two-level
approach, a synthesis of the arguments that have been mined
associated with navigation facilities that give access to
the argument contents for more details, seems to
be an efficient approach for readers who first want to get the
essentials of the argumentation.</p>
    </sec>
    <sec id="sec-6">
      <title>2 MINING ARGUMENTS: THE NEED OF KNOWLEDGE</title>
    </sec>
    <sec id="sec-8">
      <title>2.1 Corpus Analysis: the need of knowledge</title>
      <p>To explore and characterize the forms of knowledge that are
required to develop argument mining in texts, we constructed
and annotated four corpora based on four independent
controversial issues. The texts considered are extracts from various
sources, e.g.: newspaper articles and blogs from associations.
Issues deal with:
(1) Ebola vaccination,
(2) women’s situation in India,
(3) nuclear plants and
(4) organic agriculture.</p>
      <p>The total corpus includes 51 texts, a total of 24,500 words for
122 different arguments. From our manual analysis, the
following argument polarities are observed: attacks: 51 occurrences,
supports: 32, argumentative concessions: 17, argumentative
contrasts: 18 and undetermined: 4.</p>
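      <p>As a quick consistency check, the polarity counts above can be tallied and turned into shares of the corpus. The sketch below only uses the figures reported in this section; the dictionary keys are labels chosen for the illustration.</p>
      <preformat><![CDATA[
```python
# Polarity counts reported for the manually annotated arguments.
polarity_counts = {
    "attack": 51,
    "support": 32,
    "concession": 17,
    "contrast": 18,
    "undetermined": 4,
}

# The counts should add up to the number of distinct arguments in the corpus.
total = sum(polarity_counts.values())

# Share of each polarity, as a percentage rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in polarity_counts.items()}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```
]]></preformat>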
      <p>Our analysis shows that for 95 arguments (78%), some
form of knowledge is involved to establish an argumentative
relation with an issue. An important result is that the
number of concepts involved is not very large: 121 concepts for
95 arguments over 4 domains. These concepts are mainly
related to purposes, functions, parts, properties, creation and
development of the concepts in the issues. These are relatively
well defined and implemented in the Qualia structure of the
Generative Lexicon, which is the framework adopted in our
modeling.
      </p>
    </sec>
    <sec id="sec-10">
      <title>2.2 An introduction to the Generative Lexicon</title>
      <p>
        The Generative Lexicon (GL)
        <xref ref-type="bibr" rid="ref11">(Pustejovsky, 1995)</xref>
        is an
attempt to structure lexical semantics knowledge in conjunction
with domain knowledge. In the GL, the Qualia structure of an
entity is both a lexical and knowledge repository composed
of four fields called roles:
• the constitutive role describes the various parts
of the entity and its physical properties, it may
include subfields such as material, parts, shape, etc.
• the formal role describes what distinguishes the
entity from other objects,
• the telic role describes the entity functions, uses,
roles and purposes,
• the agentive role describes the origin of the entity,
how it was created or produced.
      </p>
      <p>To illustrate this conceptual organization, let us consider the
controversial issue (1):
The vaccine against Ebola is necessary.</p>
      <p>The main concepts in the Qualia structure of the head term
of (1), vaccine are organized as follows:</p>
      <sec id="sec-10-1">
        <title>Vaccine(X):</title>
        <p>constitutive: ⟨active principle, adjuvant⟩,
formal: ⟨medicine, artefact⟩,
telic: [main: protect_from(X,Y,D), avoid(X,dissemination(D)),
means: inject(Z,X,Y)],
agentive: ⟨develop(T,X), test(T,X), sell(T,X)⟩</p>
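        <p>One possible way to hold such a structure in a program is a record with one list per role. The sketch below is an illustration only: the class name Qualia, the encoding of terms as plain strings and the reduced set of terms (limited to those named in the running text) are assumptions, not the representation used in our system.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Qualia:
    """Qualia structure of a concept: four roles, each holding a list of terms."""
    concept: str
    constitutive: list = field(default_factory=list)  # parts, physical properties
    formal: list = field(default_factory=list)        # what distinguishes the entity
    telic: list = field(default_factory=list)         # functions, uses, purposes
    agentive: list = field(default_factory=list)      # origin, creation, production

    def roles(self):
        """Return a mapping from role name to terms, skipping empty roles."""
        names = ("constitutive", "formal", "telic", "agentive")
        return {n: getattr(self, n) for n in names if getattr(self, n)}


# Terms are encoded as plain strings (an assumption of this sketch).
vaccine = Qualia(
    concept="vaccine(X)",
    constitutive=["adjuvant"],
    formal=["medicine"],
    telic=["protect_from(X,Y,D)", "avoid(X,dissemination(D))"],
    agentive=["develop(T,X)", "test(T,X)"],
)
ebola = Qualia(concept="ebola", formal=["virus", "disease"])

print(vaccine.roles())
print(ebola.roles())
```
]]></preformat>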
      </sec>
      <sec id="sec-10-2">
        <title>The Qualia structure of Ebola is: Ebola:</title>
        <p>formal: ⟨virus, disease⟩,
telic: [infect(E1,ebola,P) ⇒ get_sick(E2,P) ⇒ ◇die(E3,P) ∧ E1 ≤ E2 ≤ E3]</p>
        <p>The terms, predicates or constants, found in the different
roles of any Qualia are defined on the basis of a domain
ontology, when it exists, or via bootstrapping techniques on
the web, if none exists for this domain. Qualia structures
can be hierarchically organized, as in any ontology. Vaccine is
a kind of medicine; it therefore inherits its properties, i.e.
the predicates present in medicine, unless some blocking is
formulated. Similarly, Ebola is a kind of disease, therefore it
inherits the properties of a disease. This rich organization
greatly simplifies the description of Qualias. Some Qualia
structure resources are available as payware at ELRA, from
the SIMPLE EEC project.</p>
        <p>Finally, from the two above Qualias and via formula
expansion, the formal representation of the controversial issue
is:
□(protect_from(X,Y, (infect(E1,ebola,Y) ⇒ get_sick(E2,Y)
⇒ ◇die(E3,Y))) ∧ avoid(X,dissemination(ebola))).</p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>2.3 Using Qualias for Argument Mining</title>
      <p>
        Originally, the Qualia structure was designed to characterize
sense variations around a prototypical one, and the large
number of potential combinations of NP arguments with
predicates, in particular verbs. This was implemented via
a mechanism called type coercion. In
        <xref ref-type="bibr" rid="ref11">(Pustejovsky 1995)</xref>
        ,
the Qualia structure manipulates atomic terms associated as
lists to one of the four Qualia roles. This Qualia structure,
in our view, is a specific interpretation of a more global
typology of object descriptions, realized in various manners
since Aristotle.
      </p>
      <p>
        In our approach, we view the Qualia structure as a means to
structure knowledge associated with concepts in a functional
way, via telicity (and subtypes of telicity), various types of
functional and structural parts, and the way an object was
created. This view allows us to have complex structures such
as formulas, modalities, etc. instead of just the atomic concepts
of the original Qualia. Manipulating such structures is more
complex, but corresponds better to reality. Reasoning
with these complex forms is addressed in
        <xref ref-type="bibr" rid="ref10 ref12 ref13">(Saint-Dizier 2016a)</xref>
        .
In the present paper, we propose an argument synthesis based
on atomic concepts, which may be isolated concepts in roles
or parts of formulas.
      </p>
      <p>Other types of resources such as FrameNet, WordNet or
VerbNet do not contain the information found in Qualias,
which is essential for argument mining. These latter resources
are structured around predicative forms and mainly describe
the type of arguments and adjuncts predicates can take and
how they are combined. VerbNet introduces semantic
representations based on primitives which may be of interest
for our approach as a way to normalize the complex
representations we have implemented and, possibly, the atomic
concepts themselves.</p>
      <p>In terms of data completeness, it is clear that Qualia
descriptions will never be comprehensive knowledge repositories
for a given concept, with all its facets. In our approach, due
to a lack of existing resources, Qualias are mostly described
manually. Even via the use of bootstrapping techniques, it is
clear that the Qualia of a concept C (e.g. vaccine) essentially
contains the most typical features (encoded via concepts,
which themselves can originate Qualias). An incremental
automatic acquisition of Qualia features would be crucial and
helpful, but this raises complex problems such as consistency
or granularity management.
      </p>
    </sec>
    <sec id="sec-15">
      <title>3 A NETWORK OF QUALIAS TO CHARACTERIZE THE GENERATIVE EXPANSION OF ARGUMENTS</title>
      <p>Before generating any argument synthesis, it is necessary to
organize the set of concepts at stake in these arguments, in
particular those which are supported or attacked w.r.t. the
controversial issue.</p>
      <p>Our observations show that arguments attack or support
(1) specific concepts found in the Qualia of the head terms in
the controversial issue (called root concepts) or (2) concepts
directly derived from these root concepts, via their Qualia.
In particular, concepts related to various types of parts of
the concept, purposes, functions and uses of the concept are
frequently found in arguments, whatever their polarity. For
example, arguments can attack properties or purposes of the
adjuvant, which is a part of a vaccine or the way a vaccine
avoids dissemination of a disease. Besides the telic role, the
agentive is also a crucial role since, for example, arguments
often attack the way a vaccine has been tested, or its purchase
cost.</p>
      <p>From these observations, a network of Qualias can be
defined to organize the concepts and knowledge structures
involved in the arguments. This network is, for the time
being, limited to three levels because derived concepts must
remain functionally close to the root concepts to have a
certain argumentative weight. However, some arguments,
quite remote from the main concepts of the issue may have a
T 1ij00,X ]}.</p>
      <p>In the case of issue (1), nodes form the set T, which
corresponds to the terms in the Qualias of vaccine(X) and Ebola,
some of which are terminal and others non-terminal.
- Step 2: similarly, the terms from the Qualias of step 1
introduce new nodes into the network together with their
own Qualia when they are non-terminal concepts. They form
the set T1, derived from T.
- Step 3: the same operation is carried out on T1 to produce
T2.
- Final step: production of T3. The set of concepts involved
is: {T ∪ T1 ∪ T2 ∪ T3}.</p>
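      <p>The stepwise construction of T, T1, T2 and T3 can be sketched as a bounded level-by-level expansion. The code below is a minimal illustration under stated assumptions: Qualias are reduced to a dictionary from concept name to role contents, the function name expand_network is hypothetical, and the entry for protocol is invented to show a third level.</p>
      <preformat><![CDATA[
```python
def expand_network(head_terms, qualias, levels=3):
    """Return the list [T, T1, ..., T<levels>] of concept sets.

    qualias maps a non-terminal concept to {role: [terms]}; a concept
    without an entry is treated as terminal.
    """
    def terms_of(concepts):
        found = set()
        for concept in concepts:
            for terms in qualias.get(concept, {}).values():
                found.update(terms)
        return found

    current = terms_of(head_terms)   # the set T
    seen = set(current)
    sets = [current]
    for _ in range(levels):          # T1, T2, T3
        derived = terms_of(current) - seen
        sets.append(derived)
        seen |= derived
        current = derived
    return sets


# Toy Qualias; the entry for "protocol" is hypothetical.
toy_qualias = {
    "vaccine": {"constitutive": ["adjuvant"], "formal": ["medicine"],
                "agentive": ["test"]},
    "ebola": {"formal": ["virus", "disease"]},
    "test": {"constitutive": ["data", "protocol"], "telic": ["evaluate"]},
    "protocol": {"formal": ["procedure"]},
}

T, T1, T2, T3 = expand_network(["vaccine", "ebola"], toy_qualias)
all_concepts = T | T1 | T2 | T3
print(sorted(all_concepts))
```
]]></preformat>
      <p>Bounding the loop by a fixed number of levels mirrors the choice of keeping derived concepts functionally close to the root concepts.</p>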
      <p>
        This network of Qualias forms the backbone of the
argument mining system. It develops the
argumentative generative expansion of the controversial issue.
It is also the organization principle, expressed in
terms of relatedness, that guides the generation of a synthesis
where the different facets of the Qualias it contains are the
structuring principles
        <xref ref-type="bibr" rid="ref12 ref13">(Saint-Dizier 2016b)</xref>
        . Natural language
words or expressions that lexicalize each concept can be
associated with each network node.
      </p>
      <p>An important issue is to evaluate if and how this network
defines a kind of ’transitive closure’ that would characterize
the typical and most frequent concepts that appear in
arguments that support or attack an issue. Obviously, unexpected
arguments may arise with concepts not in this network,
probably with a lower frequency and recurrence.</p>
      <p>The total number of concepts at stake in arguments for
an average size issue, such as issues (1) and (3), is about 40
concepts, with non-homogeneous usages. A rough estimate
indicates that about 80% of the arguments related to an issue
can be recognized on the basis of these concepts.</p>
      <p>The ’transitive closure’ induced by this network is
obviously not perfect, but quite efficient. The arguments which
are not found are rather unexpected, but of much interest.
For example, arguments such as: vaccination prevents
bioterrorism, vaccination raises ethical and racial problems are</p>
      <p>The concepts in the formal and telic roles (medicine,
chemicals, dilute(Y,X1), inject(X1,P)) originate new Qualias; these
are considered at step 2. Natural language terms are
associated with these concepts, e.g.: medicine, chemicals, inject,
injection, dilute, dilution.</p>
      <p>Similarly, test(T,X), in the agentive role of vaccine(X),
applied to vaccines (and medicines more generally),
originates a node in T, and additional nodes in T1, T2 from its
non-terminal concepts:
Test(T,X):
formal: ⟨scientific act⟩,
constitutive: ⟨parts of a test: data, protocol⟩,
telic: [main: evaluate(T,protection(X,Y,A)),
evaluate(T,side-effects(X,Y,A))],
agentive: ⟨elaborate(T,X)⟩</p>
      <p>Then, arguments may attack or support concepts present
in test, such as the evaluation of the protection or the test
protocol that has been used.</p>
      <p>there is no medical alternative to vaccine, etc.</p>
      <p>Attacks:
there is a limited number of cases and deaths compared to
other diseases;
7 vaccinated people died in Monrovia,
there are limited risks of contamination,
there is a large ignorance of contamination forms,
competent staff is hard to find and P4 lab is really difficult
to develop;
vaccine toxicity has been shown,
vaccine may have high side-effects,</p>
      <p>Concessions or Contrasts:
some side-effects;
production and development costs are high;
vaccine is not yet available;
a systematic vaccination raises ethical and freedom problems.</p>
      <p>The type of synthesis we propose reduces these expressions
to an evaluation of the main concepts, as found in the Qualia
structure, as developed in section 3. The number of arguments
for and against each concept is given to outline the balance
between each tendency. This however remains a tendency
because this number of arguments depends on how many
texts have been processed and how many arguments have
been mined. The comprehensive list of arguments is stored
in clusters and is accessible via navigation links from the
concepts in the synthesis (section 4.3).</p>
    </sec>
    <sec id="sec-16">
      <title>4.2 Synthesis Input Data: annotated arguments</title>
      <p>
        The output of the mining system, which is the starting point
of the synthesis construction, includes the following attributes
        <xref ref-type="bibr" rid="ref12 ref13">(Saint-Dizier 2016)</xref>
        , associated with each argument:
• the argument identifier (an integer), in our first
experiment, all arguments attack or support the
controversial issue and no other arguments,
• the text span involved that delimits the argument
compound and its kernel, which ranges from a few
words to a paragraph. In the synthesis, only the
kernel of the argument is considered,
• the polarity of the argument w.r.t. the issue:
support or attack. Additional intermediate values
(argumentative concessions and contrasts) could be
added in the future,
• the concepts involved, to identify the argument:
list of the main concepts from the Qualias used in
the mining process. Only the concepts found in the
main argument section are considered. Those in
adjoined discourse structures will be considered for
higher-level synthesis in a later stage, to identify, e.g.
restrictions.
• the strength of the argument, based on linguistic
marks found in the argument,
• the discourse structures in the compound,
associated with the argument kernel, as processed by our
discourse analysis platform TextCoop.
      </p>
      <p>The ’ConceptsInvolved’ attribute is structured from the
root node, as a kind of path, so that the concept that is
involved is clearly identified. This attribute may contain an
ordered list of paths if several concepts are involved. A typical
path is a sequence:
root-concept/(Role/Concept)*,
where the Concept is a predicate or a constant of a Qualia
structure found under the role ’Role’.</p>
      <p>For example, the concept of ’protocol’ is defined as follows:
vaccine(X)/agentive/test(T,X)/constitutive/protocol.
since protocol is a concept associated with the constitutive
role of the concept ’test’. Besides a clear identification of the
concept ’protocol’, this path can be used as (1) a way to
structure a synthesis and (2) a way to provide some
explanation of why an utterance is an argument by outlining its
relation(s) with the root concept.</p>
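      <p>A path of the form root-concept/(Role/Concept)* is straightforward to decompose. The following sketch, with the hypothetical helper parse_concept_path, splits a path into its root and its (role, concept) steps; it assumes that concept terms never contain a slash.</p>
      <preformat><![CDATA[
```python
def parse_concept_path(path):
    """Split 'root/(Role/Concept)*' into the root and a list of (role, concept) pairs."""
    parts = [p.strip() for p in path.split("/")]
    root, rest = parts[0], parts[1:]
    if len(rest) % 2 != 0:
        raise ValueError(f"dangling role or concept in path: {path!r}")
    # Roles sit at even positions of the remainder, concepts at odd positions.
    steps = list(zip(rest[0::2], rest[1::2]))
    return root, steps


root, steps = parse_concept_path(
    "vaccine(X)/agentive/test(T,X)/constitutive/protocol"
)
# The leaf concept is the one the argument actually bears on.
leaf = steps[-1][1] if steps else root
print(root, steps, leaf)
```
]]></preformat>
      <p>Reading the steps from left to right gives the chain of Qualia roles that relates the leaf concept to the root concept, which is the explanation mentioned in point (2).</p>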
      <p>Argument 11:
Even if the vaccine seems 100% efficient and without any
side effects on the tested population, it is necessary to wait
for more conclusive tests before making large vaccination
campaigns. The national authority of Guinea has approved
the continuation of the tests on targeted populations.
is composed of an argument kernel (it is necessary to wait for
more conclusive tests before making large vaccination
campaigns) modified by two discourse structures. This argument
is tagged as follows:
&lt;argument Id= 11,
polarity= attack
conceptsInvolved= ’vaccine(X)/agentive/test(T,X)’
strength= moderate &gt;
&lt;concession&gt; Even if the vaccine seems 100% efficient and
without any side effects on the tested population, &lt;/concession&gt;
&lt;main_arg&gt; it is necessary to wait for more conclusive tests before
making large vaccination campaigns. &lt;/main_arg&gt;
&lt;elaboration&gt; The national authority of Guinea has approved the
continuation of the tests on targeted populations. &lt;/elaboration&gt;
&lt;/argument&gt;.</p>
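      <p>For downstream processing, the attributes of such an annotation can be gathered in a simple record. The sketch below encodes Argument 11; the class name MinedArgument and its field names are choices made for this illustration, not the internal format of the mining system.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class MinedArgument:
    arg_id: int
    polarity: str             # "support" or "attack"
    concepts_involved: list   # ordered list of concept paths
    strength: str             # e.g. "moderate"
    kernel: str               # text of the argument kernel
    discourse: dict = field(default_factory=dict)  # structure name -> text span


arg11 = MinedArgument(
    arg_id=11,
    polarity="attack",
    concepts_involved=["vaccine(X)/agentive/test(T,X)"],
    strength="moderate",
    kernel="it is necessary to wait for more conclusive tests "
           "before making large vaccination campaigns.",
    discourse={
        "concession": "Even if the vaccine seems 100% efficient and "
                      "without any side effects on the tested population,",
        "elaboration": "The national authority of Guinea has approved the "
                       "continuation of the tests on targeted populations.",
    },
)

print(arg11.arg_id, arg11.polarity, arg11.concepts_involved)
```
]]></preformat>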
      <p>At this stage no meta-data is considered such as the date
of the argument or the author status. This notation was
defined independently of any ongoing task such as CoNLL15.</p>
    </sec>
    <sec id="sec-17">
      <title>4.3 Example of an argumentation synthesis</title>
      <p>Let us now characterize the form of the synthesis. For issue
(1) the synthesis of the examples given in 4.1 is organized as
follows, starting with the concepts which appear in the issue
(root concepts) and then considering those, more remote,
which appear in the derived concepts constructed by the
network of concepts. Each line of the synthesis is produced
via a predefined language pattern. Between parentheses, the
total number of occurrences of arguments mined in texts for
that concept is given as an indication. This number is also a
link that points to the arguments that have been mined in
their original textual form. For each line, the positive facet
is presented first, followed by the negative one when they
exist, independently of the occurrence frequency, in order to
preserve a certain homogeneity in the reading:</p>
      <p>Vaccine protection is good (3), bad (5).</p>
      <p>Vaccine avoids (5), does not avoid (3) dissemination.</p>
      <p>Vaccine is difficult (3) to develop.</p>
      <p>Vaccine is (4) expensive.</p>
      <p>Vaccine is not (1) available.</p>
      <p>Ebola is (5) a dangerous disease.</p>
      <p>Humans may die (1) from Ebola.</p>
      <p>Tests of the vaccine show no (2), high (4) side-effects.</p>
      <p>Other arguments (4).</p>
    </sec>
    <sec id="sec-19">
      <title>5 ARGUMENT SYNTHESIS GENERATION</title>
      <p>Given the input data and the output forms presented in 4.3,
let us now develop the grammatical and lexical environment
that allows the generation of this synthesis. The ordering
of the synthesis is based on the path mentioned in the
’conceptsInvolved’ attribute of each argument. Arguments are
sorted on the basis of this attribute.</p>
    </sec>
    <sec id="sec-20">
      <title>5.1 The lexico-grammatical generation system</title>
      <p>The synthesis of arguments is based on abstract linguistic
patterns defined as follows:
(1) [HeadConcept, Be/Predicate, Evaluative,
AttributeLexicalization]
or:
(2) [HeadConcept, Be/Predicate,
AttributeLexicalization, Evaluative].</p>
      <p>The symbol HeadConcept is the lexicalization of the
rightmost (or leaf) concept in the attribute ’conceptsInvolved’.
For example, in:
conceptsInvolved= ’vaccine(X)/agentive/test(T,X)’
the rightmost concept is ’test(T,X)’; its lexicalization, stored
in a lexicon, is ’test’. For example:
word([test], noun, abstract, test(X,Y)).</p>
      <p>The function lexicalization(Word, Predicate) extracts the
lexical item from the appropriate lexical entry. Finally, when
the path given in ’conceptsInvolved’ is long (two concepts or more),
a lexicalization of the whole path is produced, e.g. for the
example above: ’test of the vaccine’ instead of ’test’ alone,
using the basic compound NP pattern:
[A of B] or
[A of B of C]
where A, B and C are concepts of the path:
’C/role/B/role’/A.</p>
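      <p>The compound NP pattern can be illustrated by a few lines of code that read the concepts of a path from the leaf back to the root. This is a sketch under assumptions: the lexicon is a hypothetical dictionary keyed by predicate name, and the determiner in "of the" follows the example ’test of the vaccine’.</p>
      <preformat><![CDATA[
```python
# Hypothetical lexicon mapping a predicate name to its noun, in the spirit of
# the entry word([test], noun, abstract, test(X,Y)).
LEXICON = {"vaccine": "vaccine", "test": "test", "protocol": "protocol"}


def functor(term):
    """Return the predicate name of a term, e.g. 'test(T,X)' gives 'test'."""
    return term.split("(")[0].strip()


def lexicalize_path(path, lexicon=LEXICON):
    """Lexicalize a conceptsInvolved path as a compound NP, leaf concept first."""
    segments = path.split("/")
    concepts = segments[0::2]  # even positions hold concepts, odd ones hold roles
    nouns = [lexicon[functor(c)] for c in reversed(concepts)]
    return " of the ".join(nouns)


print(lexicalize_path("vaccine(X)/agentive/ test(T,X)"))
```
]]></preformat>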
      <p>The symbol Be/Predicate entails the lexicalization of the
main predicate of the sentence. It is either the neutral ’be’
(is, are) or a specific lexicalization if the attribute name that
is considered includes a higher-level predicate such as prevent,
evaluate, allow, avoid as shown in the Qualia structure above.
When there are supports and attacks, this verb appears as
such and is then modified by a negation so that supports and
attacks can be differentiated in the synthesis, using the
following patterns:
Supports:
[is/are/Verb, (Stats)],</p>
      <sec id="sec-20-1">
        <title>Attacks:</title>
        <p>[is not/ are not/do not Verb/ does not Verb, (Stats)]
Supports and attacks:
[is/are/Verb, (Stats1), is not/ are not/do not Verb/
does not Verb, (Stats2)]</p>
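        <p>The three patterns can be sketched as a small function that emits the affirmative form, the negated form, or both, each followed by its count. The function names are hypothetical, a singular subject is assumed, and the third-person form is obtained by naively appending -s, which is only adequate for regular verbs such as avoid.</p>
        <preformat><![CDATA[
```python
def predicate_phrase(verb, supports=0, attacks=0):
    """Build the Be/Predicate part of a synthesis line for a singular subject.

    verb is 'be' or the base form of a regular verb such as 'avoid'.
    """
    if verb == "be":
        positive, negative = "is", "is not"
    else:
        positive, negative = verb + "s", "does not " + verb  # naive 3rd person
    parts = []
    if supports:
        parts.append(f"{positive} ({supports})")
    if attacks:
        parts.append(f"{negative} ({attacks})")
    return ", ".join(parts)


def synthesis_line(head, verb, complement, supports=0, attacks=0):
    """Assemble head concept, predicate phrase and complement into one line."""
    phrase = predicate_phrase(verb, supports, attacks)
    return f"{head.capitalize()} {phrase} {complement}."


print(synthesis_line("vaccine", "avoid", "dissemination", supports=5, attacks=3))
print(synthesis_line("vaccine", "be", "available", attacks=1))
```
]]></preformat>
        <p>The two calls reproduce the corresponding lines of the example synthesis of section 4.3.</p>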
        <p>The symbol (Stats) simply indicates the number of
arguments that have been mined with the ’conceptsInvolved’ path
considered. These statistics are indicative since they depend
on the volume of text that has been mined. They also do not
account for the strength of the argument. (Stats) is a link
to the set of mined arguments that the reader may wish to
inspect. These are stored in a cluster (section 4.2) and sorted
from the argument(s) that have the highest strength to those
that have the lowest one.</p>
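        <p>Clustering by path, counting and strength-based ordering can be sketched as follows. The dictionary-based argument records, the function names and the three-value strength scale are assumptions of this illustration; only the value moderate appears in the annotation example.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict

# Hypothetical ordering of strength values, strongest first when sorted.
STRENGTH_RANK = {"high": 2, "moderate": 1, "low": 0}


def build_clusters(arguments):
    """Group arguments by concept path and sort each cluster by decreasing strength."""
    clusters = defaultdict(list)
    for arg in arguments:
        clusters[arg["path"]].append(arg)
    for args in clusters.values():
        args.sort(key=lambda a: STRENGTH_RANK[a["strength"]], reverse=True)
    return dict(clusters)


def stats(cluster):
    """Return (number of supports, number of attacks) for one cluster."""
    supports = sum(1 for a in cluster if a["polarity"] == "support")
    attacks = sum(1 for a in cluster if a["polarity"] == "attack")
    return supports, attacks


path = "vaccine(X)/agentive/test(T,X)"
toy_arguments = [
    {"id": 1, "path": path, "polarity": "attack", "strength": "low"},
    {"id": 2, "path": path, "polarity": "support", "strength": "high"},
    {"id": 3, "path": path, "polarity": "attack", "strength": "moderate"},
]

clusters = build_clusters(toy_arguments)
print([a["id"] for a in clusters[path]], stats(clusters[path]))
```
]]></preformat>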
        <p>
          The symbol Evaluative is an evaluative expression, often
a scalar adjective, modified by a negation depending on its
polarity, the existence of an antonym and the polarity of the
arguments to represent. Adjectives, as well as nouns, have
their semantic characteristics stored in their lexical entry.
The lexicalization of Evaluative is defined as follows:
(1) by default the values good / bad for products and attitudes
and easy/di cult for processes. However, these adjectives
are not very accurate and specific values are preferred.
(2) by an adjective found in one of the arguments of the
cluster. The adjective must be prototypical. For that
purpose, we use a resource we defined for opinion analysis, where
about 500 of the most standard adjectives are organized in
non-branching proportional series
          <xref ref-type="bibr" rid="ref2">(Cruse 1986)</xref>
          . Each series
corresponds to a precise conceptual dimension such as shape,
cost, temperature, difficulty, peace, availability, etc. Most
series are composed of a few positively and negatively oriented
terms and possibly a neutral point; terms are structured by
a partial order. Other series correspond to boolean
adjectives and are simply composed of two elements. For example,
starting from the most negative term:
temperature: frozen - cold - mild - (warm, hot) - boiling.
prototypical: cold / warm, neutral: mild.
toxicity: poisonous - dangerous - neutral - recommended -
beneficial. prototypical: dangerous / beneficial, neutral:
neutral.
cost: expensive - (reasonable, appropriate) - cheap.
In the cluster of arguments being processed, the adjectives
used as evaluators are collected and their inclusion in one or
more non-branching proportional series is investigated. The
series that is the most frequently referred to is kept, and
both the positively and negatively oriented prototypical adjectives
are used in the lexicalization.
        </p>
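        <p>The selection of the evaluative adjectives can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the system: the dictionary layout, the function name select_evaluatives and the sample adjectives are hypothetical, and the term lists are a reading of the two examples given above.</p>
        <preformat>
```python
from collections import Counter

# Series are listed from the most negative to the most positive term.
# The terms follow the examples in the text; the data layout is an
# assumption made for this sketch.
SERIES = {
    "temperature": {
        "terms": ["frozen", "cold", "mild", "warm", "hot", "boiling"],
        "prototypical": ("cold", "warm"),  # (negative, positive)
    },
    "toxicity": {
        "terms": ["poisonous", "dangerous", "neutral",
                  "recommended", "beneficial"],
        "prototypical": ("dangerous", "beneficial"),
    },
}

# Default values for products and attitudes, case (1) in the text.
DEFAULT_PAIR = ("bad", "good")


def select_evaluatives(cluster_adjectives):
    """Return (series name, (negative, positive)) for a cluster.

    Counts how often each series is referred to by the adjectives of
    the cluster and keeps the prototypical pair of the most frequent
    series. Falls back on the default pair when no series matches.
    """
    counts = Counter()
    for adjective in cluster_adjectives:
        for name, series in SERIES.items():
            if adjective in series["terms"]:
                counts[name] += 1
    if not counts:
        return None, DEFAULT_PAIR
    best, _ = counts.most_common(1)[0]
    return best, SERIES[best]["prototypical"]


# Invented sample: adjectives collected from a cluster of arguments.
adjectives = ["dangerous", "poisonous", "cold", "dangerous", "useful"]
print(select_evaluatives(adjectives))
# prints ('toxicity', ('dangerous', 'beneficial'))
```
        </preformat>
        <p>In this sketch an adjective that belongs to no series, such as useful, is simply ignored, and ties between series are broken by order of first occurrence. Both choices are arbitrary.</p>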
        <p>The symbol AttributeLexicalization is the direct lexical
item that corresponds to the concept. In our approach, this
lexical item is stored in a lexicon where lexical entries include
lexical items and their associated logical representations, as
described above. When the attribute is propositional, the
same strategy is used, except that an expression is produced
via a pattern instead of a single item. This is the case, for
example, with ’develop(T,X)’, which gets the realization to
develop.</p>
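        <p>A possible reading of this lexicalization step is sketched below. It is only an assumption-laden illustration: the lexicon entries, the regular expression used to recognize propositional attributes and the function name lexicalize_attribute are hypothetical. Only the realization of 'develop(T,X)' as to develop comes from the text.</p>
        <preformat>
```python
import re

# Hypothetical lexicon: concept name -> lexical item.
# These entries are invented for the example.
LEXICON = {"cost": "cost", "side_effect": "side-effects"}

# Assumption: a propositional attribute has the form
# 'predicate(Arg1,Arg2,...)'.
PROPOSITION = re.compile(r"^(\w+)\((.*)\)$")


def lexicalize_attribute(attribute):
    """Return a surface form for an attribute concept."""
    match = PROPOSITION.match(attribute)
    if match:
        # Pattern-based realization: infinitive of the predicate name,
        # e.g. 'develop(T,X)' -> "to develop".
        return "to " + match.group(1)
    # Non-propositional concept: direct lexicon lookup, falling back
    # on the concept name itself when no entry exists.
    return LEXICON.get(attribute, attribute)


print(lexicalize_attribute("develop(T,X)"))  # prints to develop
print(lexicalize_attribute("side_effect"))   # prints side-effects
```
        </preformat>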
        <p>This synthesis generation system is quite simple at the
moment. Besides domain lexical entries related to the concepts
in the Qualias (between 50 and 100 lexical entries depending
on the issue), the system currently uses 22 patterns that
produce the constructions presented above. These
patterns are stable for the type of issues we consider, which
are simple, with arguments which are direct and concern a
single concept of the network. This generation system needs
further investigation for more abstract issues or complex
situations such as controversial dialogs. In this first stage, only
the relations between the controversial issue and arguments
have been investigated. In a ’real’ argumentation, arguments
may attack or support other arguments instead of the issue.
It is not clear at the moment whether the same generation
procedure can be used. Probably, to keep the synthesis
readable, additional devices such as navigation links would be
needed to indicate that an argument is supported or attacked
by others.</p>
        <p>The system described here is relatively simple to implement.
A first implementation has been realized with the logic-based
platform &lt;TextCoop&gt; we developed for discourse analysis.
This platform can also be used for language generation since
it is fully declarative and partly reversible. However, the
strategy used in &lt;TextCoop&gt; needs to be revised so that the
simplest structure is generated first: the strategy currently
used in &lt;TextCoop&gt; is a priori oriented towards
language parsing.</p>
      </sec>
    </sec>
    <sec id="sec-21">
      <title>Features of an Evaluation</title>
      <p>This approach to argument synthesis generation is relatively
simple and straightforward. The two levels, a synthesis and
links to the exact arguments stored as clusters, seem to
be a good compromise between proliferation of data and
overgeneralization via a few synthetic lines. Elaborating
an appropriate level of synthesis and the way to realize it
linguistically requires considerable experimentation and tuning.</p>
      <p>A direct evaluation of this system must be carried out along
at least the following main features:
• the overall linguistic adequacy of the generation
system, based on the patterns presented in the previous
section. These patterns produce short sentences that
readers can understand; their linguistic adequacy
and overall language quality must be evaluated and
possibly tuned.
• the types of domains and controversial issues for
which this system is adequate, from very concrete
to more abstract, and for various amounts of
arguments, from just a few to several hundred, including
duplicates. Several experiments show that Qualia
structures can be specified quite straightforwardly
for concrete areas; this is less easy for areas which
manipulate abstract or very general purpose
concepts. An evaluation of the limits of the approach
and the means to overcome difficulties must be
carried out.
• the load in linguistic and conceptual resource
description for each domain where arguments are mined.
This includes essentially Qualia structures, lexical
entries and associated resources such as non-branching
proportional series, and a few generation patterns. Qualia
structures are often related to a domain ontology.
These resources are not very large for a domain, but
they nevertheless require some manual effort.
Another aspect is the management of the coherence of
the resources when they are specified at different
levels (e.g. lexical, conceptual).
• the adequacy of the conceptual model, here the
Qualia structure. It is necessary to show that it
is indeed sufficiently accurate in a large number of
domains.</p>
      <p>A higher level evaluation concerns the adequacy of this
synthesis for professionals who want to access an
argument synthesis in which arguments correspond to their view
and analysis of the domain. For example, in the hotel and
restaurant domains, the features at stake are well identified
in consumer evaluation. For more abstract or less common
areas such as the issues developed in this paper, there is a
need to make sure that the concepts developed in the Qualias
correspond to the vision of professional users, otherwise
such a system will not be of much practical relevance.</p>
      <p>We also feel there is no unique form of synthesis: several
forms of synthesis could be foreseen that would depend on the
reader’s interests and profile. A real evaluation of the system
presented here requires the development of adequate protocols
to measure the relevance of various forms of synthesis (more
abstract, more or less concise, using various types of concepts,
with appropriate lexicalizations). This can only be done with
large and diverse populations of users over several domains.</p>
    </sec>
    <sec id="sec-22">
      <title>CONCLUSION AND PERSPECTIVES</title>
      <p>Given a controversial issue, argument mining from texts in
natural language is extremely challenging: besides
linguistic aspects, domain knowledge is often required, together
with appropriate forms of inference, to identify arguments.
A major challenge in argument mining is to organize the
arguments which have been mined to generate a synthesis
that is readable, synthetic enough and relevant for various
types of users.</p>
      <p>Based on the Generative Lexicon (GL) Qualia structure,
which is a kind of lexical and knowledge repository, we have
shown how to construct a synthesis that captures the typical
elements found in arguments and their polarity. We propose
a two-level approach: a synthesis of the arguments that have
been mined and, associated with the elements of this
synthesis, navigation facilities that give access to the argument
contents in order to get more details.</p>
      <p>The work presented in this paper is a first, exploratory
experiment. Constructing an argument synthesis for a large
diversity of issues and in various contexts (dialogs, consumer
opinion expression, etc.), where arguments may also attack
each other, is a complex task. The type of synthesis that
would be really useful to the public and to professionals
requires close cooperation with opinion analysts and
related professionals. Additional features of arguments such as
reliability, strength, validity, persuasion, etc. should also be
incorporated at some future stage.</p>
      <p>The next steps of our work include:
• the development of other issues and the annotation
of related arguments by at least two annotators, which
would entail a further validation of our model,
• the development of a larger argument mining system
based on knowledge, in particular as structured in
the Qualia,
• the development of tools that would contribute to
the creation of Qualias from texts (we have some
ongoing work on this crucial dimension),
• the development of adequate and relevant evaluation
protocols that would analyze the adequacy of the
type of synthesis that is produced, both as such and w.r.t.
users’ expectations and implicit model of the domain.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>K.</given-names> <surname>Budzynska</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Janier</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Reed</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Saint-Dizier</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Stede</surname></string-name>
          and
          <string-name><given-names>O.</given-names> <surname>Yakorska</surname></string-name>.
          <year>2014</year>.
          <article-title>A model for processing illocutionary structures and argumentation in debates</article-title>.
          <source>In Proc. LREC</source>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>A.</given-names> <surname>Cruse</surname></string-name>.
          <year>1986</year>.
          <source>Lexical Semantics</source>, Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>V. W.</given-names> <surname>Feng</surname></string-name>
          and
          <string-name><given-names>G.</given-names> <surname>Hirst</surname></string-name>.
          <year>2011</year>.
          <article-title>Classifying arguments by scheme</article-title>.
          <source>In Proceedings of the 49th ACL: Human Language Technologies</source>, Portland, USA.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>A.</given-names> <surname>Fiedler</surname></string-name>
          and
          <string-name><given-names>H.</given-names> <surname>Horacek</surname></string-name>.
          <year>2007</year>.
          <article-title>Argumentation within deductive reasoning</article-title>.
          <source>International Journal of Intelligent Systems</source>,
          <volume>22</volume>(<issue>1</issue>):<fpage>49</fpage>-<lpage>70</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>M.</given-names> <surname>Janier</surname></string-name>
          and
          <string-name><given-names>C.</given-names> <surname>Reed</surname></string-name>.
          <year>2015</year>.
          <article-title>Towards a Theory of Close Analysis for Dispute Mediation Discourse</article-title>.
          <source>Journal of Argumentation</source>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>C.</given-names> <surname>Kirschner</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Eckle-Kohler</surname></string-name>
          and
          <string-name><given-names>I.</given-names> <surname>Gurevych</surname></string-name>.
          <year>2015</year>.
          <article-title>Linking the Thoughts: Analysis of Argumentation Structures in Scientific Publications</article-title>.
          <source>In Proceedings of the 2nd Workshop on Argumentation Mining</source>, Denver.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>I.</given-names> <surname>Mani</surname></string-name>
          and
          <string-name><given-names>M.</given-names> <surname>Maybury</surname></string-name>.
          <year>1999</year>.
          <source>Advances in Automatic Text Summarization</source>, MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>R.</given-names> <surname>Mochales Palau</surname></string-name>
          and
          <string-name><given-names>M.F.</given-names> <surname>Moens</surname></string-name>.
          <year>2009</year>.
          <article-title>Argumentation mining: the detection, classification and structure of arguments in text</article-title>.
          <source>Twelfth international ICAIL'09</source>, Barcelona.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><given-names>H.</given-names> <surname>Nguyen</surname></string-name>
          and
          <string-name><given-names>D.</given-names> <surname>Litman</surname></string-name>.
          <year>2015</year>.
          <article-title>Extracting Argument and Domain Words for Identifying Argument Components in Texts</article-title>.
          <source>In Proceedings of the 2nd Workshop on Argumentation Mining</source>, Denver.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><given-names>A.</given-names> <surname>Peldszus</surname></string-name>
          and
          <string-name><given-names>M.</given-names> <surname>Stede</surname></string-name>.
          <year>2016</year>.
          <article-title>From argument diagrams to argumentation mining in texts: a survey</article-title>.
          <source>International Journal of Cognitive Informatics and Natural Intelligence (IJCINI)</source>.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><given-names>J.</given-names> <surname>Pustejovsky</surname></string-name>.
          <year>1995</year>.
          <source>The Generative Lexicon</source>, MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><given-names>P.</given-names> <surname>Saint-Dizier</surname></string-name>.
          <year>2016</year>.
          <article-title>Argument Mining: the bottleneck of knowledge and lexical resources</article-title>.
          <source>In Proc. LREC</source>, Portoroz.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name><given-names>P.</given-names> <surname>Saint-Dizier</surname></string-name>.
          <year>2016b</year>.
          <article-title>Challenges of Argument Mining: Generating an Argument Synthesis based on the Qualia Structure</article-title>.
          <source>In Proceedings of INLG</source>, Edinburgh.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name><given-names>R.</given-names> <surname>Swanson</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Ecker</surname></string-name>
          and
          <string-name><given-names>M.</given-names> <surname>Walker</surname></string-name>.
          <year>2015</year>.
          <article-title>Argument Mining: Extracting Arguments from Online Dialogue</article-title>.
          <source>In Proc. SIGDIAL</source>.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name><given-names>M.G.</given-names> <surname>Villalba</surname></string-name>
          and
          <string-name><given-names>P.</given-names> <surname>Saint-Dizier</surname></string-name>.
          <year>2012</year>.
          <article-title>Some Facets of Argument Mining for Opinion Analysis</article-title>.
          <source>COMMA</source>, Vienna, IOS Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name><given-names>M.</given-names> <surname>Walker</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Anand</surname></string-name>,
          <string-name><given-names>J.E.</given-names> <surname>Fox Tree</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Abbott</surname></string-name>
          and
          <string-name><given-names>J.</given-names> <surname>King</surname></string-name>.
          <year>2012</year>.
          <article-title>A Corpus for Research on Deliberation and Debate</article-title>.
          <source>In Proc. LREC</source>, Istanbul.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name><given-names>G.</given-names> <surname>Winterstein</surname></string-name>.
          <year>2012</year>.
          <article-title>What but-sentences argue for: An argumentative analysis of 'but'</article-title>.
          <source>Lingua</source>, <volume>122</volume>.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name><given-names>I.</given-names> <surname>Zuckerman</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>McConachy</surname></string-name>
          and
          <string-name><given-names>K.</given-names> <surname>Korb</surname></string-name>.
          <year>2000</year>.
          <article-title>Using Argumentation Strategies in Automatic Argument Generation</article-title>.
          <source>In Proc. INLG</source>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>