A Two-Level Approach to Generate Synthetic Argumentation
                            Reports
                                                      Patrick Saint-Dizier
                                                           IRIT-CNRS
                                                      118 route de Narbonne
                                                      Toulouse 31062, France
                                                          stdizier@irit.fr
ABSTRACT                                                            texts, e.g. (Mochales Palau et al., 2009), (Kirschner et al.,
Given a controversial issue, a major challenge in argument          2015), for example for opinion analysis, e.g. (Villalba et al.,
mining is to organize the arguments which have been mined           2012), mediation analysis (Janier et al. 2015) or transcribed
to generate a synthesis that is readable, synthetic enough and      argumentative dialog analysis, e.g. (Budzynska et al., 2014),
relevant for various types of users. Based on the Generative        (Swanson et al., 2015). The NLP techniques
Lexicon (GL) Qualia structure, which is a kind of lexical           relevant for argument mining from annotated structures are
and knowledge repository, that we have enhanced in different        analyzed in e.g. (Peldszus et al. 2016). Annotated corpora
manners and associated with inferences and language pat-            are now available, e.g. the AIFDB dialog corpora or (Walker
terns, we show how to construct a synthesis that outlines the       et al., 2012). These corpora are very useful to understand
typical elements found in arguments. We propose a two-level         how argumentation is realized in texts, e.g. to identify argu-
approach: a synthesis of the arguments that have been mined         mentative discourse units (ADUs), linguistic cues (Nguyen et
and navigation facilities that give access to the argument          al., 2015), and argumentation strategies, in a concrete way,
contents in order to get more details.                              possibly in association with abstract argumentation schemes,
                                                                    as shown in e.g. (Feng et al., 2011). Finally, reasoning as-
CCS CONCEPTS                                                        pects related to argumentation analysis are developed in e.g.
                                                                    (Fiedler et al., 2007) and (Winterstein, 2012) from a formal
• Computer systems organization → Natural language
                                                                    semantics perspective.
processing; Argument mining; Knowledge representation;
                                                                       In opinion analysis, the benefits of argument mining are
                                                                    not only to identify the customers' satisfaction level, but
KEYWORDS
                                                                    also to characterize why customers are happy or unhappy.
Generative Lexicon, Rhetoric                                        Abstracting over arguments makes it possible to construct summaries
                                                                    and to induce customer preferences or value systems (e.g. low
1     AIMS AND CHALLENGES                                           fares are preferred to localization or quality of welcome for
                                                                    some categories of hotel customers).
One of the main goals of argument mining is, given a con-
                                                                       In (Saint-Dizier 2016a), a corpus analysis identifies the
troversial issue, to identify in a set of texts the arguments
                                                                    type of knowledge and inferences that are required to develop
for or against that issue. These arguments act as supports or
                                                                    argument mining. It is briefly reported in this paper. Then,
attacks of the issue. Arguments may also attack or support
                                                                    we have shown, on the basis of a set of examples, that the
the arguments which support or attack that controversial
                                                                    Generative Lexicon (GL) could be an appropriate model,
issue in order to reinforce or cancel out their impact. Argu-
                                                                    sufficiently expressive, to characterize the types of knowledge,
ments are difficult to identify, in particular when they are
                                                                    inferences and lexical data that are required to accurately
not adjacent to the controversial issue, possibly not in the
                                                                    identify arguments related to an issue.
same text, because their linguistic, conceptual or referential
links to that issue are rarely explicit.
   For example, given the controversial issue: Vaccine against
                                                                    1.2    Natural Language Summarization
Ebola is necessary, identifying the argumentative link with
statements such as Ebola adjuvant is toxic, Ebola vaccine           In natural language generation, the main projects on
production is costly, or 7 people died during Ebola vaccine tests   argument generation were developed as early as (Zuckerman et
cannot be realized solely on the basis of linguistic data, but      al. 2000) and (Fiedler 2007). While there are currently
requires domain knowledge. Furthermore, a knowledge-based           several research efforts to develop argument mining, very little
analysis of the third statement shows that it is irrelevant or      has been done recently to produce a synthesis of the mined
neutral w.r.t. the issue (Saint-Dizier 2016).                       arguments that is readable, synthetic enough and relevant
                                                                    for various types of users. This includes identifying the main
1.1        Argument Mining Challenges                               features for or against a controversial issue, but also tasks
                                                                    such as eliminating duplicates, fallacies or ad hominem statements
Argument mining is an emerging research area which introduces
                                                                    and identifying arguments which attack or support
new challenges in natural language processing and
                                                                    each other, besides the controversial issue.
generation. Argument mining research applies to written


      14                                                            18th Workshop on Computational Models of Natural Argument
                                                                                Floris Bex, Floriana Grasso, Nancy Green (eds)
                                                                                                    16th July 2017, London, UK
   In (Saint-Dizier 2016b), we show how arguments that have           sources, e.g.: newspaper articles and blogs from associations.
been mined can be organized in hierarchically structured              Issues deal with:
clusters so that readers can navigate over and within sets of         (1) Ebola vaccination,
arguments according to the conceptual organization proposed           (2) women’s situation in India,
by the Generative Lexicon. This approach turned out not to            (3) nuclear plants and
be synthetic enough, since over 100 arguments can be mined            (4) organic agriculture.
for a given issue, making the perception of the main attacks          The total corpus includes 51 texts, a total of 24500 words for
and supports quite difficult. However, this initial approach          122 different arguments. From our manual analysis, the following
allows the construction of an argument database useful to             argument polarities are observed: attacks: 51 occurrences,
readers who wish to access the exact form of arguments                supports: 32, argumentative concessions: 17, argumentative
that have been mined.                                                 contrasts: 18 and undetermined: 4.
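
As a quick consistency check on the corpus figures reported in this section, the following sketch tallies the polarity counts and recomputes the share of arguments that require some form of knowledge. The variable names are ours and purely illustrative; only the numbers come from the corpus description.

```python
# Polarity counts reported for the annotated corpus.
polarity_counts = {
    "attack": 51,
    "support": 32,
    "concession": 17,
    "contrast": 18,
    "undetermined": 4,
}

# The counts should add up to the number of different arguments.
total_arguments = sum(polarity_counts.values())

# Arguments for which some form of knowledge is needed (reported value).
knowledge_based = 95
knowledge_share = round(100 * knowledge_based / total_arguments)

print(total_arguments, knowledge_share)
```

The tally agrees with the totals given in the text.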
   The present contribution focuses on the next stage, aiming            Our analysis shows that for 95 arguments (78%), some
at producing a synthesis that is short and efficient where the        form of knowledge is involved to establish an argumentative
concepts present in the GL Qualia structures are used to              relation with an issue. An important result is that the num-
abstract over arguments while keeping the structure of those          ber of concepts involved is not very large: 121 concepts for
clusters of arguments which are accessible via links from the         95 arguments over 4 domains. These concepts are mainly
synthesis. The argument cluster system is accessed to get             related to purposes, functions, parts, properties, creation and
more precise information.                                             development of the concepts in the issues. These are relatively
   This contribution to natural language argumentation syn-           well defined and implemented in the Qualia structure of the
thesis is not really a summarization task, as e.g. developed          Generative Lexicon, which is the framework adopted in our
in (Mani et al. 1999). In our approach, no text or document           modeling.
is reduced to produce a summary. The synthesis that is pro-
posed is simply a two level re-organization task that involves
forms of clustering. From that perspective, it could be viewed
as a preliminary step to a summarization procedure. A real
summarization task would involve constructing summaries               2.2    An introduction to the Generative
for each cluster of arguments, but this is beyond the present                Lexicon
research.                                                             The Generative Lexicon (GL) (Pustejovsky, 1995) is an attempt
   In terms of feature classification and relevance, the concepts     to structure lexical semantics knowledge in conjunction
used in the Qualia structure of the Generative Lexicon                with domain knowledge. In the GL, the Qualia structure of an
are defined a priori, similarly to the features evaluated in          entity is both a lexical and knowledge repository composed
most opinion analysis systems. They are used as entry points          of four fields called roles:
to the re-organization and to the cluster system. A challeng-
ing point is that these concepts must obviously correspond
as much as possible to the user perception of the domain to
which the issue belongs.                                                    • the constitutive role describes the various parts
                                                                              of the entity and its physical properties, it may in-
1.3    Paper Structure                                                        clude subfields such as material, parts, shape, etc.

In this paper, for the sake of understanding, we first summarize            • the formal role describes what distinguishes the
the results elaborated in our previous contributions; we                      entity from other objects,
then develop the synthesis production model. This two-level
approach, a synthesis of the arguments that have been mined                 • the telic role describes the entity functions, uses,
associated with navigation facilities that give access to                     roles and purposes,
the argument contents in order to get more details, seems to
be an efficient approach for readers who first want to get the              • the agentive role describes the origin of the entity,
essentials of the argumentation.                                              how it was created or produced.
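
To make the four roles concrete, a Qualia structure can be pictured as a simple record with one field per role. The sketch below is only an illustration: the class name `Qualia`, its field layout and the use of plain strings for predicates are our assumptions, not part of the Generative Lexicon. The entries are those of the vaccine example used in this paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Qualia:
    """Hypothetical container for the four Qualia roles of one concept."""
    concept: str
    constitutive: List[str] = field(default_factory=list)  # parts, material
    formal: List[str] = field(default_factory=list)        # kind of entity
    telic: List[str] = field(default_factory=list)         # functions, purposes
    agentive: List[str] = field(default_factory=list)      # origin, creation

    def roles(self) -> Dict[str, List[str]]:
        """Return the non-empty roles as a {role name: terms} mapping."""
        names = ("constitutive", "formal", "telic", "agentive")
        return {n: getattr(self, n) for n in names if getattr(self, n)}


# Predicates are kept as plain strings (a simplification of ours).
vaccine = Qualia(
    concept="vaccine(X)",
    constitutive=["active principle", "adjuvant"],
    formal=["medicine", "artifact"],
    telic=["protect_from(X,Y,D)", "avoid(X,dissemination(D))", "inject(Z,X,Y)"],
    agentive=["develop(T,X)", "test(T,X)", "sell(T,X)"],
)

for role, terms in vaccine.roles().items():
    print(f"{role}: {', '.join(terms)}")
```

Sub-roles such as main and means in the telic role are flattened here; a nested mapping would be needed to keep them.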

2  MINING ARGUMENTS: THE NEED
   OF KNOWLEDGE
                                                                      To illustrate this conceptual organization, let us consider the
2.1 Corpus Analysis: the need of                                      controversial issue (1):
    knowledge                                                         The vaccine against Ebola is necessary.
To explore and characterize the forms of knowledge that are           The main concepts in the Qualia structure of the head term
required to develop argument mining in texts, we constructed          of (1), vaccine are organized as follows:
and annotated four corpora based on four independent contro-
versial issues. The texts considered are extracts from various
Vaccine(X):                                                             complex, but better corresponds to the reality. Reasoning
[ constitutive: [active principle, adjuvant],                           with these complex forms is addressed in (Saint-Dizier 2016a).
  telic: [ main:  protect_from(X,Y,D),                                  In the present paper, we propose an argument synthesis based
                  avoid(X,dissemination(D)),                            on atomic concepts, which may be isolated concepts in roles
           means: inject(Z,X,Y) ],                                      or parts of formulas.
  formal: [medicine, artifact],                                            Other types of resources such as FrameNet, WordNet or
  agentive: [develop(T,X), test(T,X), sell(T,X)] ]                      VerbNet do not contain the information found in Qualias,
                                                                        which is essential for argument mining. These latter resources
                                                                        are structured around predicative forms and mainly describe
                                                                        the type of arguments and adjuncts predicates can take and
                                                                        how they are combined. VerbNet introduces semantic
                                                                        representations based on primitives which may be of interest
   The Qualia structure of Ebola is:                                    for our approach as a way to normalize the complex
Ebola:                                                                  representations we have implemented and, possibly, the atomic
[ formal: [virus, disease],                                             concepts themselves.
  telic:  [infect(E1,ebola,P) ⇒ get_sick(E2,P)                             In terms of data completeness, it is clear that Qualia
           ⇒ ◇die(E3,P) ∧ E1 ≤ E2 ≤ E3] ]                               descriptions will never be comprehensive knowledge repositories
                                                                        for a given concept, with all its facets. In our approach, due
                                                                        to a lack of existing resources, Qualias are mostly described
    The terms, predicates or constants, found in the different          manually. Even via the use of bootstrapping techniques, it is
roles of any Qualia are defined on the basis of a domain                clear that the Qualia of a concept C (e.g. vaccine) essentially
ontology, when it exists, or via bootstrapping techniques on            contains the most typical features (encoded via concepts,
the web, if it doesn’t exist for this domain. Qualia structures         which themselves can originate Qualias). An incremental
can be hierarchically organized, as in any ontology. Vaccine is         automatic acquisition of Qualia features would be crucial and
a kind of medicine, it therefore inherits the properties, i.e.          helpful, but this raises complex problems such as consistency
the predicates present in medicine, unless some blocking is             or granularity management.
formulated. Similarly, Ebola is a kind of disease, therefore it
inherits the properties of a disease. This rich organization
greatly simplifies the description of Qualias. Some Qualia
                                                                        3    A NETWORK OF QUALIAS TO
structure resources are available as payware at ELRA, from                   CHARACTERIZE THE
the SIMPLE EEC project.                                                      GENERATIVE EXPANSION OF
    Finally, from the two above Qualias and via formula ex-                  ARGUMENTS
pansion, the formal representation of the controversial issue           Before generating any argument synthesis, it is necessary to
is:                                                                     organize the set of concepts at stake in these arguments, in
□(protect_from(X,Y,(infect(E1,ebola,Y) ⇒ get_sick(E2,Y)                 particular those which are supported or attacked w.r.t. the
⇒ ◇die(E3,Y))) ∧ avoid(X,dissemination(ebola))).                        controversial issue.
                                                                           Our observations show that arguments attack or support
                                                                        (1) specific concepts found in the Qualia of the head terms in
2.3        Using Qualias for Argument Mining                            the controversial issue (called root concepts) or (2) concepts
Originally, the Qualia structure was designed to characterize           directly derived from these root concepts, via their Qualia.
sense variations around a prototypical one, and the large               In particular, concepts related to various types of parts of
number of potential combinations of NP arguments with                   the concept, purposes, functions and uses of the concept are
predicates, in particular verbs. This was implemented via               frequently found in arguments, whatever their polarity. For
a mechanism called type coercion. In (Pustejovsky 1995),                example, arguments can attack properties or purposes of the
the Qualia structure manipulates atomic terms associated as             adjuvant, which is a part of a vaccine or the way a vaccine
lists to one of the four qualia roles. This Qualia structure,           avoids dissemination of a disease. Besides the telic role, the
in our view, is a specific interpretation of a more global              agentive is also a crucial role since, for example, arguments
typology of object descriptions, realized in various manners            often attack the way a vaccine has been tested, or its purchase
from Aristotle.                                                         cost.
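
The formula expansion that yields the formal representation of issue (1) in section 2.2 can be sketched as a substitution over terms. This is a minimal illustration under our own assumptions: terms are encoded as nested tuples, the operator names `=>`, `and`, `possibly` and `necessarily` stand for implication, conjunction, possibility and necessity, and the helper names `substitute` and `render` are hypothetical. The temporal ordering constraint of the Ebola Qualia is left out.

```python
INFIX = {"=>": " => ", "and": " and "}


def substitute(term, bindings):
    """Replace variables by their bindings throughout a nested term."""
    if isinstance(term, str):
        return bindings.get(term, term)
    functor, *args = term
    return (functor, *(substitute(a, bindings) for a in args))


def render(term):
    """Print a nested term in the predicate notation used in the text."""
    if isinstance(term, str):
        return term
    functor, *args = term
    if functor in INFIX:
        return "(" + INFIX[functor].join(render(a) for a in args) + ")"
    return f"{functor}({','.join(render(a) for a in args)})"


# Telic role of Ebola and main telic predicates of vaccine(X).
ebola_telic = ("=>", ("infect", "E1", "ebola", "P"),
                     ("get_sick", "E2", "P"),
                     ("possibly", ("die", "E3", "P")))
vaccine_telic_main = [("protect_from", "X", "Y", "D"),
                      ("avoid", "X", ("dissemination", "D"))]

# The person P affected by Ebola is the person Y protected by the vaccine.
disease_effects = substitute(ebola_telic, {"P": "Y"})
protect = substitute(vaccine_telic_main[0], {"D": disease_effects})
avoid = substitute(vaccine_telic_main[1], {"D": "ebola"})

# "is necessary" in the issue is rendered by the necessity operator.
issue = ("necessarily", ("and", protect, avoid))
print(render(issue))
```

Note that the variable D is bound to the effects of the disease in the first conjunct and to the constant ebola in the second, as in the formula given in the text.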
   In our approach, we view the Qualia structure as a means to             From these observations, a network of Qualias can be
structure knowledge associated with concepts in a functional            defined to organize the concepts and knowledge structures
way, via telicity (and subtypes of telicity), various types of          involved in the arguments. This network is, for the time
functional and structural parts, and the way an object was              being, limited to three levels because derived concepts must
created. This view allows us to have complex structures such            remain functionally close to the root concepts to have a
as formulas, modalities, etc. instead of just the atomic concepts       certain argumentative weight. However, some arguments,
of the original Qualia. Manipulating such structures is more            quite remote from the main concepts of the issue may have a
strong weight because of the hot concepts they include, e.g.          found. These arguments are stored in a specific cluster called
vaccination prevents bio-terrorism.                                  ’Other’, so that they can be accessed in the synthesis.
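
The fallback to the 'Other' cluster can be pictured with a deliberately naive keyword matcher. This sketch is an assumption on our part: the mining system relies on knowledge and inference rather than on string matching, and the `LEXICALIZATIONS` table and the `assign_cluster` function are hypothetical names. The word lists reuse the lexicalizations mentioned in this section where available; the others are marked as assumed.

```python
import re

# Concept -> words that may lexicalize it in an argument.
LEXICALIZATIONS = {
    "active principle": ["active principle", "stem cell"],
    "adjuvant": ["adjuvant"],          # assumed word list
    "inject": ["inject", "injection"],
    "dilute": ["dilute", "dilution"],
    "test": ["test", "tests"],         # assumed word list
}


def assign_cluster(argument):
    """Return the first concept lexicalized in the argument, else 'Other'."""
    text = argument.lower()
    for concept, words in LEXICALIZATIONS.items():
        for word in words:
            # Whole-word match so that 'test' does not fire inside 'contest'.
            if re.search(r"\b" + re.escape(word) + r"\b", text):
                return concept
    return "Other"


for statement in ("Ebola adjuvant is toxic",
                  "it is necessary to wait for more conclusive tests",
                  "vaccination prevents bio-terrorism"):
    print(statement, "->", assign_cluster(statement))
```

An argument that matches no concept of the network is kept rather than discarded, which is the point of the 'Other' cluster.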
   A Qualia Q_i describes major features of a concept such as            Let us now illustrate the construction of this network. For
vaccine(X), it can be formally defined as follows:                    example, from ‘vaccine’, two nodes are candidates:
Q_i : [ R_X : T^i_{j,X} ], where:                                     {active principle, adjuvant}.
- R_X denotes the four roles:                                         Assuming that, e.g. active principle is a terminal concept,
X ∈ {formal, constitutive, agentive, telic}                           and adjuvant a non-terminal one, then, active principle is
and possibly sub-roles,                                               associated with words such as ‘active principle, stem cell’.
- T^i_{j,X} is a term which is a formula, a predicate or a            ‘Adjuvant’ being non-terminal, its Qualia is included into the
constant T_j in the role X of Q_i.                                    network at step 1:
   A network of Qualias is then defined as follows:                   Adjuvant(Y,X1):
- nodes are of two types: [terminal concept] (no associated           [ formal: [medicine, chemicals],
Qualia) or [non-terminal concept, associated Qualia],                   telic:  [dilute(Y,X1), allow(inject(X1,P))] ]
- the root is the semantic representation of the controversial        The concepts in the formal and telic roles (medicine,
issue and the related Qualias Q_i,                                    chemicals, dilute(Y,X1), inject(X1,P)) originate new Qualias;
- Step 1: the first level of the network is composed of the           these are considered at step 2. Natural language terms are
nodes which correspond to the terms T^i_{j,X} in the roles of         associated with these concepts, e.g.: medicine, chemicals,
the Qualias Q_i. The result of this step is the set T of              inject, injection, dilute, dilution.
terminal nodes { T^i_{j,X} } and non-terminal nodes                      Similarly, test(T,X), in the agentive role of vaccine(X),
{ T^i_{j,X}, Q'_{i'} : [ R_X : T1^{i'}_{j',X} ] }.                    applied to vaccines (and medicines more generally), originates
In the case of issue (1), nodes form the set T which                  a node in T, and additional nodes in T1, T2 from its
corresponds to the terms in the Qualias of vaccine(X) and             non-terminal concepts:
Ebola, some of which are terminal and others non-terminal.            Test(T,X):
- Step 2: similarly, the terms T1^{i'}_{j',X} from the Q'_{i'}        [ constitutive: [parts of a test: data, protocol],
of step 1 introduce new nodes into the network together with            telic: [ main: evaluate(T,protection(X,Y,A)),
their own Qualia when they are non-terminal concepts. They                             evaluate(T,side-effects(X,Y,A)) ],
form the set T1, derived from T.                                        formal: [scientific act],
- Step 3: the same operation is carried out on T1 to produce            agentive: [elaborate(T,X)] ]
T2.
- Final step: production of T3. The set of concepts involved
is: {T ∪ T1 ∪ T2 ∪ T3}.
   This network of Qualias forms the backbone of the argument
mining system. This network develops the argumentative
generative expansion of the controversial issue.
This network is also the organization principle, expressed in
terms of relatedness, that guides the generation of a synthesis          Then, arguments may attack or support concepts present
where the different facets of the Qualias it contains are the         in test, such as the evaluation of the protection or the test
structuring principles (Saint-Dizier 2016b). Natural language         protocol that has been used.
words or expressions that lexicalize each concept can be
associated with each network node.                                    4 GENERATING AN
   An important issue is to evaluate if and how this network
                                                                           ARGUMENTATIVE REPORT FROM
defines a kind of ’transitive closure’ that would characterize
the typical and most frequent concepts that appear in argu-                A CONTROVERSIAL ISSUE
ments that support or attack an issue. Obviously, unexpected          4.1 Main arguments to include in the
arguments may arise with concepts not in this network, prob-                  synthesis
ably with a lower frequency and recurrence.
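
The stepwise construction of the network described in this section (steps 1 to 3 and the final step) amounts to a bounded breadth-first expansion from the root concepts. The following sketch is a simplified illustration, not the implemented system: the `LEXICON` table and the `build_network` function are hypothetical, only predicate names are kept (their arguments are dropped), and a concept is treated as terminal when it has no entry in the table.

```python
# Hypothetical mini-lexicon: concept -> {role: [concept names]}.
LEXICON = {
    "vaccine": {
        "constitutive": ["active principle", "adjuvant"],
        "telic": ["protect_from", "avoid", "inject"],
        "formal": ["medicine", "artifact"],
        "agentive": ["develop", "test", "sell"],
    },
    "adjuvant": {
        "formal": ["medicine", "chemicals"],
        "telic": ["dilute", "inject"],
    },
    "test": {
        "constitutive": ["data", "protocol"],
        "telic": ["evaluate"],
        "formal": ["scientific act"],
        "agentive": ["elaborate"],
    },
}


def build_network(roots, lexicon, steps=4):
    """Expand the root concepts level by level and return [T, T1, T2, T3]."""
    seen = set(roots)
    frontier = list(roots)
    levels = []
    for _ in range(steps):
        level = []
        for concept in frontier:
            # A concept without an entry in the lexicon is terminal.
            for role, terms in lexicon.get(concept, {}).items():
                for term in terms:
                    if term not in seen:
                        seen.add(term)
                        level.append((term, f"{concept}/{role}"))
        levels.append(level)
        frontier = [term for term, _ in level]
    return levels


T, T1, T2, T3 = build_network(["vaccine"], LEXICON)
print([name for name, _ in T1])
```

Each node keeps the concept and role it was derived from, so the link back to the root concept stays available. Concepts already reached at an earlier level, such as medicine or inject, are not added again.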
                                                                      Let us consider the arguments found in issue (1) that must
   The total number of concepts at stake in arguments for
                                                                      be included in a synthesis. Arguments mainly attack or sup-
an average size issue, such as issues (1) and (3), is about 40
                                                                      port salient features of the main concepts of the issue or
concepts, with non-homogeneous usages. A rough estimate
                                                                      closely related ones by means of various forms of evaluative
indicates that about 80% of the arguments related to an issue
                                                                      expressions. Among 50 non-overlapping arguments, the main
can be recognized on the basis of these concepts.
                                                                      arguments associated with issue (1) are, omitting associated
   The ’transitive closure’ induced by this network is obvi-
                                                                      discourse structures:
ously not perfect, but quite efficient. The arguments which
                                                                      Supports:
are not found are rather unexpected, but of much interest.
                                                                      vaccine protection is very good;
For example, arguments such as: vaccination prevents
                                                                     Ebola is a dangerous disease;
bio-terrorism, vaccination raises ethical and racial problems are
                                                                      there are high contamination risks;
                                                                      vaccine has limited side-effects,
there is no medical alternative to the vaccine, etc.                         The ’ConceptsInvolved’ attribute is structured from the
Attacks:                                                                  root node, as a kind of path, so that the concept that is
there is a limited number of cases and deaths compared to                 involved is clearly identified. This attribute may contain an
other diseases;                                                           ordered list of paths if several concepts are involved. A typical
7 vaccinated people died in Monrovia,                                     path is a sequence:
there are limited risks of contamination,                                 root-concept/(Role/Concept)*,
there is a large ignorance of contamination forms,                        where the Concept is a predicate or a constant of a Qualia
competent staff is hard to find and P4 lab is really difficult            structure found under the role ’Role’.
to develop;                                                               For example, the concept of ’protocol’ is defined as follows:
vaccine toxicity has been shown,                                          vaccine(X)/agentive/test(T,X)/constitutive/protocol.
vaccine may have high side-effects,                                       since protocol is a concept associated with the constitutive
Concessions or Contrasts:                                                 role of the concept ’test’. Besides a clear identification of the
some side-effects;                                                        concept ’protocol’, this path can be used as (1) a way to
production and development costs are high;                                structure a synthesis and (2) a way to provide some expla-
vaccine is not yet available;                                             nation of why an utterance is an argument by outlining its
a systematic vaccination raises ethical and freedom problems.             relation(s) with the root concept.
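
The path notation root-concept/(Role/Concept)* lends itself to a small parser, which also shows how a path can serve as an explanation. The sketch below is an illustration under our own assumptions: the function names `parse_concept_path` and `explain` are hypothetical, sub-roles are not handled, and the wording of the generated explanation is ours.

```python
ROLES = {"formal", "constitutive", "agentive", "telic"}


def parse_concept_path(path):
    """Split 'root/(role/concept)*' into the root and (role, concept) pairs."""
    parts = [p.strip() for p in path.split("/")]
    root, rest = parts[0], parts[1:]
    if not root or len(rest) % 2 != 0:
        raise ValueError(f"malformed path: {path!r}")
    steps = []
    for role, concept in zip(rest[0::2], rest[1::2]):
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        steps.append((role, concept))
    return root, steps


def explain(path):
    """Spell out how the last concept of a path relates to the root."""
    root, steps = parse_concept_path(path)
    current = root
    sentences = []
    for role, concept in steps:
        sentences.append(f"{concept} is in the {role} role of {current}")
        current = concept
    return "; ".join(sentences)


print(explain("vaccine(X)/agentive/test(T,X)/constitutive/protocol"))
```

Splitting on "/" is sufficient here because the predicates used in the Qualias do not contain that character.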
   The type of synthesis we propose reduces these expressions                Argument 11:
to an evaluation of the main concepts, as found in the Qualia             Even if the vaccine seems 100% efficient and without any
structure, as developed in section 3. The number of arguments             side e↵ects on the tested population, it is necessary to wait
for and against each concept is given to outline the balance              for more conclusive tests before making large vaccination
between each tendency. This however remains a tendency                    campaigns. The national authority of Guinea has approved
because this number of arguments depends on how many                      the continuation of the tests on targeted populations.
texts have been processed and how may arguments have                      is composed of an argument kernel (it is necessary to wait for
been mined. The comprehensive list of arguments is stored                 more conclusive tests before making large vaccination cam-
in clusters and are accessible via navigation links from the              paigns) modified by two discourse structures. This argument
concepts in the synthesis (section 4.3).                                  is tagged as follows:
                                                                          <argument Id= 11,
                                                                          polarity= attack
4.2        Synthesis Input Data: annotated                                conceptsInvolved= ’vaccine(X)/agentive/ test(T,X)’
           arguments                                                      strength= moderate >
The output of the mining system, which is the starting point              <concession> Even if the vaccine seems 100% efficient and with-
of the synthesis construction, includes the following attributes          out any side e↵ects on the tested population, < /concession>
(Saint-Dizier 2016), associated with each argument:                       <main arg> it is necessary to wait for more conclusive tests before
                                                                          making large vaccination campaigns. < /main arg>
       • the argument identifier (an integer), in our first ex-           <elaboration> The national authority of Guinea has approved the
         periment, all arguments attack or support the con-               continuation of the tests on targeted populations. </elaboration>
         troversial issue and no other arguments,                         < /argument>.
       • the text span involved that delimits the argument                   At this stage no meta-data is considered such as the date
         compound and its kernel, which ranges from a few                 of the argument or the author status. This notation was
         words to a paragraph. In the synthesis, only the                 defined independently of any ongoing task such as ConLL15.
         kernel of the argument is considered,
       • the polarity of the argument w.r.t. the issue:
         support or attack. Additional intermediate values                4.3    Example of a argumentation synthesis
         (argumentative concessions and contrasts) could be               Let us now characterize the form of the synthesis. For issue
         added in the future,                                             (1) the synthesis of the examples given in 4.1 is organized as
       • the concepts involved, to identify the argument:                 follows, starting by the concepts which appear in the issue
         list of the main concepts from the Qualias used in               (root concepts) and then considering those, more remote,
         the mining process. Only the concepts found in the               which appear in the derived concepts constructed by the
         main argument section are considered. Those in ad-               network of concepts. Each line of the synthesis is produced
         joined discourse structures will be considered for               via a predefined language pattern. Between parenthesis, the
         higher-level synthesis in a later stage, to identify, e.g.       total number of occurrences of arguments mined in texts for
         restrictions.                                                    that concept is given as an indication. This number is also a
       • the strength of the argument, based on linguistic                link that points to the arguments that have been mined in
         marks found in the argument,                                     their original textual form. For each line, the positive facet
       • the discourse structures in the compound, associ-                is presented first, followed by the negative one when they
         ated with the argument kernel, as processed by our               exist, independently of the occurrence frequency, in order to
         discourse analysis platform TextCoop.                            preserve a certain homogeneity in the reading:


      18                                                                  18th Workshop on Computational Models of Natural Argument
                                                                                      Floris Bex, Floriana Grasso, Nancy Green (eds)
                                                                                                          16th July 2017, London, UK
Vaccine protection is good (3), bad (5).
Vaccine avoids (5), does not avoid (3) dissemination.
Vaccine is difficult (3) to develop.
Vaccine is (4) expensive.
Vaccine is not (1) available.
Ebola is (5) a dangerous disease.
Humans may die (1) from Ebola.
Tests of the vaccine show no (2), high (4) side-effects.
Other arguments (4).

5 ARGUMENT SYNTHESIS GENERATION
Given the input data and the output forms presented in 4.3, let us now develop the grammatical
and lexical environment that allows the generation of this synthesis. The ordering of the
synthesis is based on the path mentioned in the ’conceptsInvolved’ attribute of each argument.
Arguments are sorted on the basis of this attribute.

5.1 The lexico-grammatical generation system
The synthesis of arguments is based on abstract linguistic patterns defined as follows:
(1) [HeadConcept, Be/Predicate, Evaluative, AttributeLexicalization].
 or:
(2) [HeadConcept, Be/Predicate, AttributeLexicalization, Evaluative].
    The symbol HeadConcept is the lexicalization of the rightmost (or leaf) concept in the
attribute ’conceptsInvolved’. For example, in:
 conceptsInvolved= ’vaccine(X)/agentive/ test(T,X)’
the rightmost concept is ’test(T,X)’; its lexicalization, stored in a lexicon, is ’test’. For
example:
word([test], noun, abstract, test(X,Y)).
The function lexicalization(Word, Predicate) extracts the lexical item from the appropriate
lexical entry. Finally, when the path given in ’conceptsInvolved’ is long (two or more
concepts), a lexicalization of the whole path is produced, e.g. for the example above: ’test
of the vaccine’ instead of ’test’ alone, using the basic compound NP pattern:
 [A of B] or
 [A of B of C]
where A, B and C are concepts of the path:
’C/role/B/role/A’.
    The symbol Be/Predicate entails the lexicalization of the main predicate of the sentence.
It is either the neutral ’be’ (is, are) or a specific lexicalization if the attribute name
that is considered includes a higher-level predicate such as prevent, evaluate, allow, avoid
as shown in the Qualia structure above. When there are supports and attacks, this verb appears
as such and is then modified by a negation so that supports and attacks can be differentiated
in the synthesis, using the following patterns:
 Supports:
 [is/are/Verb, (Stats)],
 Attacks:
 [is not/ are not/do not Verb/ does not Verb, (Stats)]
 Supports and attacks:
 [is/are/Verb, (Stats1), is not/ are not/do not Verb/ does not Verb, (Stats2)]
    The symbol (Stats) simply indicates the number of arguments that have been mined with the
’conceptsInvolved’ path considered. These statistics are indicative since they depend on the
volume of text that has been mined. They also do not account for the strength of the argument.
(Stats) is a link to the set of mined arguments that the reader may wish to inspect. These are
stored in a cluster (section 4.2) and sorted from the argument(s) that have the highest
strength to those that have the lowest one.
    The symbol Evaluative is an evaluative expression, often a scalar adjective, modified by a
negation depending on its polarity, the existence of an antonym and the polarity of the
arguments to represent. Adjectives, as well as nouns, have their semantic characteristics
stored in their lexical entry. The lexicalization of Evaluative is defined as follows:
(1) by default the values good / bad for products and attitudes and easy / difficult for
processes. However, these adjectives are not very accurate and specific values are preferred.
(2) by an adjective found in one of the arguments of the cluster. The adjective must be
prototypical. For that purpose, we use a resource we defined for opinion analysis, where about
500 of the most standard adjectives are organized in non-branching proportional series (Cruse
1986). Each series corresponds to a precise conceptual dimension such as shape, cost,
temperature, difficulty, peace, availability, etc. Most series are composed of a few
positively and negatively oriented terms and possibly a neutral point; terms are structured
with a partial order. Other series correspond to boolean adjectives and are simply composed of
two elements. For example, starting from the most negative term:
temperature: frozen - cold - mild - (warm, hot) - boiling.
prototypical: cold / warm, neutral: mild.
toxicity: poisonous - dangerous - neutral - recommended - beneficial.
prototypical: dangerous / beneficial, neutral: neutral.
cost: expensive - (reasonable, appropriate) - cheap.
In the cluster of arguments being processed the adjectives used as evaluators are collected
and their inclusion in one or more non-branching proportional series is investigated. The
series that is the most frequently referred to is kept, and both the positively and negatively
oriented typical adjectives are used in the lexicalization.
    The symbol AttributeLexicalization is the direct lexical item that corresponds to the
concept. In our approach this lexical item is stored in a lexicon where lexical entries
include lexical items and their associated logical representations, as described above. When
the attribute is propositional, the same strategy is used; in that case, an expression is
produced via a pattern instead of a single item. This is the case for example with
’develop(T,X)’ which gets the realization to develop.
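The pattern-based realization of section 5.1 can be illustrated with a minimal sketch. It is
written in Python for readability and is not the TextCoop implementation. All names
(Argument, LEXICON, STRENGTH_RANK, cluster_by_path, lexicalize_path, realize) are
hypothetical, the weak/moderate/strong strength scale is an assumption (only ’moderate’
appears in the example of section 4.2), the determiner ’the’ inserted in the [A of B] pattern
is a simplification, and argument 901 and the attribute ’conclusive’ are invented for
illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed strength scale; the paper only shows the value 'moderate'.
STRENGTH_RANK = {"weak": 0, "moderate": 1, "strong": 2}

# Toy lexicon, in the spirit of word([test], noun, abstract, test(X,Y)).
LEXICON = {"test": "test", "vaccine": "vaccine", "protocol": "protocol"}


@dataclass
class Argument:
    id: int
    polarity: str           # "support" or "attack"
    concepts_involved: str  # path root-concept/(Role/Concept)*
    strength: str
    kernel: str


def concepts_of(path):
    """Return the concept names of a path, root first (roles are skipped)."""
    parts = [p.strip() for p in path.split("/")]
    return [p.split("(")[0] for p in parts[0::2]]


def lexicalize_path(path):
    """Compound NP pattern [A of B of C], starting from the leaf concept."""
    words = [LEXICON[c] for c in reversed(concepts_of(path))]
    return " of the ".join(words)


def cluster_by_path(arguments):
    """Group arguments by path; each cluster is sorted strongest first."""
    clusters = defaultdict(list)
    for arg in arguments:
        clusters[arg.concepts_involved.replace(" ", "")].append(arg)
    for cluster in clusters.values():
        cluster.sort(key=lambda a: STRENGTH_RANK[a.strength], reverse=True)
    return dict(sorted(clusters.items()))


def counts(cluster):
    """Return (number of supports, number of attacks) for a cluster."""
    n_support = sum(a.polarity == "support" for a in cluster)
    return n_support, len(cluster) - n_support


def realize(head, verb, neg_verb, n_support, n_attack, attribute):
    """Instantiate the Supports / Attacks / Supports-and-attacks patterns."""
    if n_support and n_attack:
        vp = f"{verb} ({n_support}), {neg_verb} ({n_attack})"
    elif n_support:
        vp = f"{verb} ({n_support})"
    else:
        vp = f"{neg_verb} ({n_attack})"
    return f"{head.capitalize()} {vp} {attribute}."


arguments = [
    # Argument 11 comes from section 4.2; argument 901 is a made-up placeholder.
    Argument(11, "attack", "vaccine(X)/agentive/ test(T,X)", "moderate",
             "it is necessary to wait for more conclusive tests"),
    Argument(901, "support", "vaccine(X)/agentive/test(T,X)", "strong",
             "<hypothetical kernel>"),
]

clusters = cluster_by_path(arguments)
lines = []
for path, cluster in clusters.items():
    n_sup, n_att = counts(cluster)
    lines.append(realize(lexicalize_path(path), "is", "is not",
                         n_sup, n_att, "conclusive"))
print("\n".join(lines))
```

In this sketch the counts play the role of (Stats); in a complete system each count would
also link to its cluster, which is why the clusters are kept sorted by decreasing strength.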
   This synthesis generation system is quite simple at the moment. Besides domain lexical
entries, related to the concepts in the Qualias (between 50 and 100 lexical entries depending
on the issue), the system currently uses 22 patterns that make it possible to produce the
constructions presented above. These patterns are stable for the type of issues we consider,
which are simple, with arguments which are direct and concern a single concept of the network.
This generation system needs further investigation for more abstract issues or complex
situations such as controversial dialogs. In this first stage, only the relations between the
controversial issue and arguments have been investigated. In a ’real’ argumentation, arguments
may attack or support other arguments instead of the issue. It is not clear at the moment
whether the same generation procedure can be used. Probably, to keep the synthesis readable,
additional devices such as navigation links would be needed to indicate that an argument gets
supports or attacks from others.
   The system described here is relatively simple to implement. A first implementation has
been realized with the logic-based platform <TextCoop> we developed for discourse analysis.
This platform can also be used for language generation since it is fully declarative and
partly reversible. However, the strategy used in <TextCoop> needs to be revised so that the
simplest structure is generated first. The strategy that is used in <TextCoop> is indeed a
priori oriented towards language parsing.

5.2 Features of an Evaluation
This approach to argument synthesis generation is relatively simple and straightforward. The
two levels, synthesis and links to the exact arguments stored as clusters, seem to be a good
compromise between proliferation of data and overgeneralization via a few synthetic lines.
Elaborating an appropriate level of synthesis and the way to realize it linguistically needs a
lot of experimentation and tuning.
   A direct evaluation of this system must be realized along at least the following main
features:

       • the overall linguistic adequacy of the generation system, based on the patterns
         presented in the previous section. These patterns produce short sentences that
         readers can understand; their linguistic adequacy and overall language quality must
         be evaluated and possibly tuned.

       • the types of domains and controversial issues for which this system is adequate,
         from very concrete to more abstract, and for various amounts of arguments, from just
         a few to several hundred, including duplicates. Several experiments show that Qualia
         structures can quite straightforwardly be specified for concrete areas; this is less
         easy for areas which manipulate abstract or very general purpose concepts. An
         evaluation of the limits of the approach and the means to overcome difficulties must
         be carried out.

       • the load in linguistic and conceptual resource description for each domain where
         arguments are mined. This includes essentially Qualia structures, lexical entries
         and associated resources such as non-branching proportional series, a few generation
         patterns. Qualia structures are often related to a domain ontology. These resources
         are not very large for a domain, but they nevertheless require some manual effort.
         Another aspect is the management of the coherence of the resources when they are
         specified at different levels (e.g. lexical, conceptual).

       • the adequacy of the conceptual model, here the Qualia structure. It is necessary to
         show that it is indeed sufficiently accurate in a large number of domains.

   A higher level evaluation concerns the adequacy of this synthesis for professionals who
want to access an argument synthesis, where arguments do correspond to their view and analysis
of the domain. For example, in the hotel and restaurant domains, the features at stake are
well identified in consumer evaluation. For more abstract or less common areas such as the
issues developed in this paper, there is a need to make sure that the concepts developed in
the Qualias do correspond to the vision of professional users, otherwise such a system will
not be of much practical relevance.
   We also feel there is no unique form of synthesis: several forms of synthesis could be
foreseen that would depend on the reader’s interests and profile. A real evaluation of the
system presented here requires the development of adequate protocols to measure the relevance
of various forms of synthesis (more abstract, more or less concise, using various types of
concepts, with appropriate lexicalizations). This can only be done with large and diverse
populations of users over several domains.

6 CONCLUSION AND PERSPECTIVES
Given a controversial issue, argument mining from texts in natural language is extremely
challenging: besides linguistic aspects, domain knowledge is often required together with
appropriate forms of inferences to identify arguments. A major challenge in argument mining is
to organize the arguments which have been mined to generate a synthesis that is readable,
synthetic enough and relevant for various types of users.
   Based on the Generative Lexicon (GL) Qualia structure, which is a kind of lexical and
knowledge repository, we have shown how to construct a synthesis that captures the typical
elements found in arguments and their polarity. We propose a two-level approach: a synthesis
of the arguments that have been mined and, associated with the elements of this synthesis,
navigation facilities that give access to the argument contents in order to get more details.
   The work presented in this paper is a first, exploratory experiment. Constructing an
argument synthesis from a large diversity of issues, and in various contexts (dialogs,
consumer opinion expression, etc.), where arguments may also attack each other, is a complex
task. The type of synthesis that would be really useful to the public and to professionals
requires a close cooperation with opinion analysts and related professionals. Additional
features of arguments such as reliability, strength, validity, persuasion, etc. should also be
incorporated at some future stage.
   The next steps of our work include:
      • the development of other issues and the annotation of related arguments by at least
        two annotators; this would entail a further validation of our model,
      • the development of a larger argument mining system based on knowledge, in particular
        as structured in the Qualia,
      • the development of tools that would contribute to the creation of Qualias from texts;
        we have some ongoing work on this very crucial dimension,
      • the development of adequate and relevant evaluation protocols that would analyze the
        adequacy of the type of synthesis that is produced, as such and w.r.t. users’
        expectations and implicit model of the domain.

REFERENCES
 [1] K. Budzynska, M. Janier, C. Reed, P. Saint-Dizier, M. Stede and O. Yakorska. 2014. A
     model for processing illocutionary structures and argumentation in debates. In Proc.
     LREC, 2014.
 [2] A. Cruse. 1986. Lexical Semantics, Cambridge University Press.
 [3] V. W. Feng and G. Hirst. 2011. Classifying arguments by scheme. In Proceedings of the
     49th ACL: Human Language Technologies, Portland, USA.
 [4] A. Fiedler and H. Horacek. 2007. Argumentation within deductive reasoning. International
     Journal of Intelligent Systems, 22(1):49-70.
 [5] M. Janier and C. Reed. 2015. Towards a Theory of Close Analysis for Dispute Mediation
     Discourse, Journal of Argumentation.
 [6] C. Kirschner, J. Eckle-Kohler and I. Gurevych. 2015. Linking the Thoughts: Analysis of
     Argumentation Structures in Scientific Publications. In Proceedings of the 2nd Workshop
     on Argumentation Mining, Denver.
 [7] I. Mani and M. Maybury. 1999. Advances in Automatic Text Summarization, MIT Press.
 [8] R. Mochales Palau and M.F. Moens. 2009. Argumentation mining: the detection,
     classification and structure of arguments in text. Twelfth international ICAIL’09,
     Barcelona.
 [9] H. Nguyen and D. Litman. 2015. Extracting Argument and Domain Words for Identifying
     Argument Components in Texts. In Proc. of the 2nd Workshop on Argumentation Mining,
     Denver.
[10] A. Peldszus and M. Stede. 2016. From argument diagrams to argumentation mining in texts:
     a survey. International Journal of Cognitive Informatics and Natural Intelligence
     (IJCINI).
[11] J. Pustejovsky. 1995. The Generative Lexicon, MIT Press.
[12] P. Saint-Dizier. 2016. Argument Mining: the bottleneck of knowledge and lexical
     resources, LREC, Portoroz.
[13] P. Saint-Dizier. 2016b. Challenges of Argument Mining: Generating an Argument Synthesis
     based on the Qualia Structure, proceedings of INLG, Edinburgh.
[14] R. Swanson, B. Ecker and M. Walker. 2015. Argument Mining: Extracting Arguments from
     Online Dialogue, in Proc. SIGDIAL.
[15] M.G. Villalba and P. Saint-Dizier. 2012. Some Facets of Argument Mining for Opinion
     Analysis, COMMA, Vienna, IOS Publishing.
[16] M. Walker, P. Anand, J.E. Fox Tree, R. Abbott and J. King. 2012. A Corpus for Research
     on Deliberation and Debate. Proc. of LREC, Istanbul.
[17] G. Winterstein. 2012. What but-sentences argue for: An argumentative analysis of ’but’,
     in Lingua 122.
[18] I. Zuckerman, R. McConachy and K. Korb. 2000. Using Argumentation Strategies in
     Automatic Argument Generation, INLG.