A Two-Level Approach to Generate Synthetic Argumentation Reports

Patrick Saint-Dizier
IRIT-CNRS, 118 route de Narbonne, Toulouse 31062, France
stdizier@irit.fr

18th Workshop on Computational Models of Natural Argument, Floris Bex, Floriana Grasso, Nancy Green (eds), 16th July 2017, London, UK

ABSTRACT
Given a controversial issue, a major challenge in argument mining is to organize the arguments which have been mined in order to generate a synthesis that is readable, synthetic enough and relevant for various types of users. Based on the Generative Lexicon (GL) Qualia structure, which is a kind of lexical and knowledge repository that we have enhanced in different ways and associated with inferences and language patterns, we show how to construct a synthesis that outlines the typical elements found in arguments. We propose a two-level approach: a synthesis of the arguments that have been mined, and navigation facilities that give access to the argument contents in order to get more details.

CCS CONCEPTS
• Computer systems organization → Natural language processing; Argument mining; Knowledge representation;

KEYWORDS
Generative Lexicon, Rhetoric

1 AIMS AND CHALLENGES
One of the main goals of argument mining is, given a controversial issue, to identify in a set of texts the arguments for or against that issue. These arguments act as supports or attacks of the issue. Arguments may also attack or support the arguments which support or attack that controversial issue in order to reinforce or cancel out their impact. Arguments are difficult to identify, in particular when they are not adjacent to the controversial issue, possibly not in the same text, because their linguistic, conceptual or referential links to that issue are rarely explicit.
For example, given the controversial issue: Vaccine against Ebola is necessary, identifying the argumentative link with statements such as Ebola adjuvant is toxic, Ebola vaccine production is costly, or 7 people died during Ebola vaccine tests cannot be realized solely on the basis of linguistic data, but requires domain knowledge. Furthermore, a knowledge-based analysis of the third statement shows that it is irrelevant or neutral w.r.t. the issue (Saint-Dizier 2016).

1.1 Argument Mining Challenges
Argument mining is an emerging research area which introduces new challenges in natural language processing and generation. Argument mining research applies to written texts, e.g. (Mochales Palau et al., 2009), (Kirschner et al., 2015), for example for opinion analysis, e.g. (Villalba et al., 2012), mediation analysis (Janier et al. 2015) or transcribed argumentative dialog analysis, e.g. (Budzynska et al., 2014), (Swanson et al., 2015). The NLP techniques relevant for argument mining from annotated structures are analyzed in e.g. (Peldszus et al. 2016). Annotated corpora are now available, e.g. the AIFDB dialog corpora or (Walker et al., 2012). These corpora are very useful to understand how argumentation is realized in texts, e.g. to identify argumentative discourse units (ADUs), linguistic cues (Nguyen et al., 2015), and argumentation strategies, in a concrete way, possibly in association with abstract argumentation schemes, as shown in e.g. (Feng et al., 2011). Finally, reasoning aspects related to argumentation analysis are developed in e.g. (Fiedler et al., 2007) and (Winterstein, 2012) from a formal semantics perspective.
In opinion analysis, the benefits of argument mining are not only to identify the customer satisfaction level, but also to characterize why customers are happy or unhappy. Abstracting over arguments makes it possible to construct summaries and to induce customer preferences or value systems (e.g. low fares are preferred to location or quality of welcome for some categories of hotel customers).
In (Saint-Dizier 2016a), a corpus analysis identifies the type of knowledge and inferences that are required to develop argument mining. It is briefly reported in this paper. Then, we have shown, on the basis of a set of examples, that the Generative Lexicon (GL) could be an appropriate model, sufficiently expressive, to characterize the types of knowledge, inferences and lexical data that are required to accurately identify arguments related to an issue.

1.2 Natural Language Summarization
In natural language generation, the main projects on argument generation were developed as early as (Zuckerman et al. 2000) and (Fiedler 2007). While there are currently several research efforts to develop argument mining, very little has been done recently to produce a synthesis of the mined arguments that is readable, synthetic enough and relevant for various types of users. This includes identifying the main features for or against a controversial issue, but also tasks such as eliminating duplicates, fallacies or ad hominem statements and identifying arguments which attack or support each other, besides the controversial issue.
In (Saint-Dizier 2016b), we show how arguments that have been mined can be organized in hierarchically structured clusters so that readers can navigate over and within sets of arguments according to the conceptual organization proposed by the Generative Lexicon.

sources, e.g.: newspaper articles and blogs from associations. Issues deal with:
(1) Ebola vaccination,
(2) women's situation in India,
(3) nuclear plants and
(4) organic agriculture.
The total corpus includes 51 texts, a total of 24500 words for 122 different arguments. From our manual analysis, the following argument polarities are observed: attacks: 51 occurrences, supports: 32, argumentative concessions: 17, argumentative contrasts: 18 and undetermined: 4.
Our analysis shows that for 95 arguments (78%), some form of knowledge is involved to establish an argumentative relation with an issue. An important result is that the number of concepts involved is not very large: 121 concepts for 95 arguments over 4 domains. These concepts are mainly related to purposes, functions, parts, properties, creation and development of the concepts in the issues. These are relatively well defined and implemented in the Qualia structure of the Generative Lexicon, which is the framework adopted in our modeling.

This approach turned out not to be synthetic enough, since over 100 arguments can be mined for a given issue, making the perception of the main attacks and supports quite difficult. However, this initial approach allows the construction of an argument database useful to readers who wish to access the exact form of arguments that have been mined.
The present contribution focuses on the next stage, aiming at producing a synthesis that is short and efficient, where the concepts present in the GL Qualia structures are used to abstract over arguments while keeping the structure of those clusters of arguments, which are accessible via links from the synthesis. The argument cluster system is accessed to get more precise information.
This contribution to natural language argumentation synthesis is not really a summarization task, as e.g. developed in (Mani et al. 1999). In our approach, no text or document is reduced to produce a summary. The synthesis that is proposed is simply a two-level re-organization task that involves forms of clustering.
From that perspective, it could be viewed as a preliminary step to a summarization procedure. A real summarization task would involve constructing summaries for each cluster of arguments, but this is beyond the present research.
In terms of feature classification and relevance, the concepts used in the Qualia structure of the Generative Lexicon are defined a priori, similarly to the features evaluated in most opinion analysis systems. They are used as entry points to the re-organization and to the cluster system. A challenging point is that these concepts must correspond as much as possible to the user perception of the domain to which the issue belongs.

1.3 Paper Structure
In this paper, for the sake of understanding, we first summarize the results elaborated in our previous contributions, then we develop the synthesis production model. This two-level approach, a synthesis of the arguments that have been mined associated with navigation facilities that give access to the argument contents in order to get more details, seems to be an efficient approach for readers who want first to get the essentials of the argumentation.

2.2 An introduction to the Generative Lexicon
The Generative Lexicon (GL) (Pustejovsky, 1995) is an attempt to structure lexical semantics knowledge in conjunction with domain knowledge. In the GL, the Qualia structure of an entity is both a lexical and knowledge repository composed of four fields called roles:
• the constitutive role describes the various parts of the entity and its physical properties, it may include subfields such as material, parts, shape, etc.
• the formal role describes what distinguishes the entity from other objects,
• the telic role describes the entity functions, uses, roles and purposes,
• the agentive role describes the origin of the entity, how it was created or produced.
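The four roles listed above can be pictured as a small record type. The following is a minimal sketch, not the author's implementation: the class name `Qualia`, the constant `ROLES` and the method `filled_roles` are hypothetical, and the underscore spelling of predicates such as `protect_from` is an assumption. The entries are the ones the paper gives for the head term vaccine of issue (1).

```python
from dataclasses import dataclass, field

# The four Qualia roles of the Generative Lexicon, as named in the paper.
ROLES = ("constitutive", "formal", "telic", "agentive")


@dataclass
class Qualia:
    """Illustrative container (hypothetical) for the four roles of one concept."""
    concept: str
    constitutive: list[str] = field(default_factory=list)  # parts, physical properties
    formal: list[str] = field(default_factory=list)        # what distinguishes the entity
    telic: list[str] = field(default_factory=list)         # functions, uses, purposes
    agentive: list[str] = field(default_factory=list)      # origin, creation, production

    def filled_roles(self) -> dict[str, list[str]]:
        """Return only the roles that contain at least one term."""
        return {role: getattr(self, role) for role in ROLES if getattr(self, role)}


# Entries taken from the paper's Qualia for vaccine(X); terms are kept as plain strings.
vaccine = Qualia(
    concept="vaccine(X)",
    constitutive=["active principle", "adjuvant"],
    formal=["medicine", "artifact"],
    telic=["protect_from(X,Y,D)", "avoid(X,dissemination(D))", "inject(Z,X,Y)"],
    agentive=["develop(T,X)", "test(T,X)", "sell(T,X)"],
)
```

Keeping terms as strings is a simplification: the paper allows formulas and modalities in roles, which would call for a structured term type.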
complex, but better corresponds to the reality. Reasoning with these complex forms is addressed in (Saint-Dizier 2016a). In the present paper, we propose an argument synthesis based on atomic concepts, which may be isolated concepts in roles or part of formula.
Other types of resources such as FrameNet, WordNet or VerbNet do not contain the information found in Qualias, which is essential for argument mining. These latter resources are structured around predicative forms and mainly describe the type of arguments and adjuncts predicates can take and how they are combined. VerbNet introduces semantic representations based on primitives which may be of interest for our approach as a way to normalize the complex representations we have implemented and, possibly, the atomic concepts themselves.
In terms of data completeness, it is clear that Qualia descriptions will never be comprehensive knowledge repositories for a given concept, with all its facets. In our approach, due to a lack of existing resources, Qualias are mostly described manually. Even via the use of bootstrapping techniques, it is clear that the Qualia of a concept C (e.g. vaccine) essentially contains the most typical features (encoded via concepts, which themselves can originate Qualias). An incremental automatic acquisition of Qualia features would be crucial and helpful, but this raises complex problems such as consistency or granularity management.

3 A NETWORK OF QUALIAS TO CHARACTERIZE THE GENERATIVE EXPANSION OF ARGUMENTS
Before generating any argument synthesis, it is necessary to organize the set of concepts at stake in these arguments, in particular those which are supported or attacked w.r.t. the controversial issue.

2 MINING ARGUMENTS: THE NEED OF KNOWLEDGE
2.1 Corpus Analysis: the need of knowledge
To explore and characterize the forms of knowledge that are required to develop argument mining in texts, we constructed and annotated four corpora based on four independent controversial issues. The texts considered are extracts from various

To illustrate this conceptual organization, let us consider the controversial issue (1):
The vaccine against Ebola is necessary.
The main concepts in the Qualia structure of the head term of (1), vaccine, are organized as follows:
Vaccine(X):
  constitutive: [active principle, adjuvant],
  telic: [main: protect_from(X,Y,D), avoid(X,dissemination(D)), means: inject(Z,X,Y)],
  formal: [medicine, artifact],
  agentive: [develop(T,X), test(T,X), sell(T,X)]
The Qualia structure of Ebola is:
Ebola:
  formal: [virus, disease],
  telic: [$\mathit{infect}(E_1, \mathit{ebola}, P) \Rightarrow \mathit{get\_sick}(E_2, P) \Rightarrow \Diamond\,\mathit{die}(E_3, P) \wedge E_1 \le E_2 \le E_3$]
The terms, predicates or constants, found in the different roles of any Qualia are defined on the basis of a domain ontology, when it exists, or via bootstrapping techniques on the web, if it doesn't exist for this domain. Qualia structures can be hierarchically organized, as in any ontology. Vaccine is a kind of medicine, it therefore inherits the properties, i.e. the predicates present in medicine, unless some blocking is formulated. Similarly, Ebola is a kind of disease, therefore it inherits the properties of a disease. This rich organization greatly simplifies the description of Qualias. Some Qualia structure resources are available as payware at ELRA, from the SIMPLE EEC project.
Finally, from the two above Qualias and via formula expansion, the formal representation of the controversial issue is:
$\Box\,(\mathit{protect\_from}(X, Y, (\mathit{infect}(E_1, \mathit{ebola}, Y) \Rightarrow \mathit{get\_sick}(E_2, Y) \Rightarrow \Diamond\,\mathit{die}(E_3, Y))) \wedge \mathit{avoid}(X, \mathit{dissemination}(\mathit{ebola})))$.
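The formula expansion that yields the representation of issue (1) can be imitated with plain string substitution. This is a rough sketch under stated assumptions, not the paper's mechanism: `substitute` and `expand_issue` are hypothetical helpers, `[]` and `<>` are ASCII stand-ins for the necessity and possibility operators, `=>` and `&` stand for implication and conjunction, and the temporal ordering of the events is left out, as in the paper's final formula.

```python
import re


def substitute(term: str, var: str, value: str) -> str:
    """Replace every whole-word occurrence of variable `var` in `term` by `value`."""
    return re.sub(rf"\b{re.escape(var)}\b", lambda _match: value, term)


def expand_issue(protect: str, avoid: str, disease: str, disease_telic: str) -> str:
    """Build the issue representation from the vaccine telic terms and the disease telic formula."""
    # Unify the patient variable P of the disease Qualia with Y of the vaccine Qualia.
    telic_for_y = substitute(disease_telic, "P", "Y")
    # In protect_from, the disease variable D is expanded into the disease telic formula.
    left = substitute(protect, "D", f"({telic_for_y})")
    # In avoid, D is simply instantiated by the disease constant.
    right = substitute(avoid, "D", disease)
    # "is necessary" in the issue is rendered by the necessity operator.
    return f"[]({left} & {right})"


issue_1 = expand_issue(
    protect="protect_from(X,Y,D)",
    avoid="avoid(X,dissemination(D))",
    disease="ebola",
    disease_telic="infect(E1,ebola,P) => get_sick(E2,P) => <>die(E3,P)",
)
```

A real system would operate on parsed terms with proper unification rather than on strings; the string version only shows where each substitution takes place.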
2.3 Using Qualias for Argument Mining
Originally, the Qualia structure was designed to characterize sense variations around a prototypical one, and the large number of potential combinations of NP arguments with predicates, in particular verbs. This was implemented via a mechanism called type coercion. In (Pustejovsky 1995), the Qualia structure manipulates atomic terms associated as lists to one of the four qualia roles. This Qualia structure, in our view, is a specific interpretation of a more global typology of object descriptions, realized in various manners from Aristotle.
In our approach, we view the Qualia structure as a means to structure knowledge associated with concepts in a functional way, via telicity (and subtypes of telicity), various types of functional and structural parts, and the way an object was created. This view allows us to have complex structures such as formula, modalities, etc.

Our observations show that arguments attack or support (1) specific concepts found in the Qualia of the head terms in the controversial issue (called root concepts) or (2) concepts directly derived from these root concepts, via their Qualia. In particular, concepts related to various types of parts of the concept, purposes, functions and uses of the concept are frequently found in arguments, whatever their polarity. For example, arguments can attack properties or purposes of the adjuvant, which is a part of a vaccine, or the way a vaccine avoids dissemination of a disease. Besides the telic role, the agentive is also a crucial role since, for example, arguments often attack the way a vaccine has been tested, or its purchase cost.
From these observations, a network of Qualias can be defined to organize the concepts and knowledge structures involved in the arguments. This network is, for the time being, limited to three levels because derived concepts must remain functionally close to the root concepts to have a
certain argumentative weight. However, some arguments, quite remote from the main concepts of the issue, may have a strong weight because of the hot concepts they include, e.g. vaccination prevents bio-terrorism.
A Qualia $Q_i$ describes major features of a concept such as vaccine(X), it can be formally defined as follows:
$Q_i : [R_X : T_j^{i,X}]$, where:
- $R_X$ denotes the four roles: $X \in \{\mathit{formal}, \mathit{constitutive}, \mathit{agentive}, \mathit{telic}\}$ and possibly sub-roles,
- $T_j^{i,X}$ is a term which is a formula, a predicate or a constant $T_j$ in the role $X$ of $Q_i$.

instead of just the atomic concepts of the original Qualia. Manipulating such structures is more

found. These arguments are stored in a specific cluster called 'Other', so that they can be accessed in the synthesis.
Let us now illustrate the construction of this network. For example, from 'vaccine', two nodes are candidates:
{active principle, adjuvant}.
Assuming that, e.g. active principle is a terminal concept, and adjuvant a non-terminal one, then, active principle is associated with words such as 'active principle, stem cell'.
'Adjuvant' being non-terminal, its Qualia is included into the network at step 1:
Adjuvant(Y,X1):
  formal: [medicine, chemicals],
  telic: [dilute(Y,X1), allow(inject(X1,P))]
The concepts in the formal and telic roles (medicine, chemicals, dilute(Y,X1), inject(X1,P)) originate new Qualias, these are considered at step 2. Natural language terms are associated to these concepts, e.g.: medicine, chemicals, inject, injection, dilute, dilution.
Similarly, test(T,X), in the agentive role of vaccine(X), applied to vaccines (and medicines more generally), originates a node in T, and additional nodes in T1, T2 from its non-terminal concepts:
Test(T,X):
  constitutive: [parts of a test: data, protocol],
  telic: [main: evaluate(T,protection(X,Y,A)), evaluate(T,side-effects(X,Y,A))],
  formal: [scientific act],
  agentive: [elaborate(T,X)]
Then, arguments may attack or support concepts present in test, such as the evaluation of the protection or the test protocol that has been used.

4 GENERATING AN ARGUMENTATIVE REPORT FROM A CONTROVERSIAL ISSUE
4.1 Main arguments to include in the synthesis
Let us consider the arguments found in issue (1) that must be included in a synthesis. Arguments mainly attack or support salient features of the main concepts of the issue or closely related ones by means of various forms of evaluative expressions. Among 50 non-overlapping arguments, the main arguments associated with issue (1) are, omitting associated discourse structures:

A network of Qualias is then defined as follows:
- nodes are of two types: [terminal concept] (no associated Qualia) or [non terminal concept, associated Qualia],
- the root is the semantic representation of the controversial issue and the related Qualias $Q_i$,
- Step 1: the first level of the network is composed of the nodes which correspond to the terms $T_j^{i,X}$ in the roles of the Qualias $Q_i$. The result of this step is the set T of terminal nodes $\{T_j^{i,X}\}$ and non terminal nodes $\{T_j^{i,X}, Q'_{i'} : [R_X : T1_{j'}^{i',X}]\}$. In the case of issue (1), nodes form the set T which corresponds to the terms in the Qualias of vaccine(X) and Ebola, some of which are terminal and others non-terminal.
- Step 2: similarly, the terms $T1_{j'}^{i',X}$ from the $Q'_{i'}$ of step 1 introduce new nodes into the network together with their own Qualia when they are non-terminal concepts. They form the set T1, derived from T.
- Step 3: the same operation is carried out on T1 to produce T2.
- Final step: production of T3. The set of concepts involved is: $\{T \cup T1 \cup T2 \cup T3\}$.
This network of Qualias forms the backbone of the argument mining system. This network develops the argumentative generative expansion of the controversial issue.
This network is also the organization principle, expressed in terms of relatedness, that guides the generation of a synthesis where the different facets of the Qualias it contains are the structuring principles (Saint-Dizier 2016b). Natural language words or expressions that lexicalize each concept can be associated with each network node.
An important issue is to evaluate if and how this network defines a kind of 'transitive closure' that would characterize the typical and most frequent concepts that appear in arguments that support or attack an issue. Obviously, unexpected arguments may arise with concepts not in this network, probably with a lower frequency and recurrence.
The total number of concepts at stake in arguments for an average size issue, such as issues (1) and (3), is about 40 concepts, with non-homogeneous usages. A rough estimate indicates that about 80% of the arguments related to an issue can be recognized on the basis of these concepts.
The 'transitive closure' induced by this network is obviously not perfect, but quite efficient.
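The stepwise construction of the sets T, T1, T2 and T3 amounts to a bounded breadth-first expansion over Qualias. The sketch below is only an illustration under simplifying assumptions: the function `build_network` and the dictionary `QUALIAS` are hypothetical, concepts are reduced to bare names without arguments, and only the vaccine, adjuvant and test Qualias shown in the paper are encoded, so every other concept behaves as terminal.

```python
# Simplified Qualias (role -> terms), reduced to bare concept names.
QUALIAS = {
    "vaccine": {
        "constitutive": ["active principle", "adjuvant"],
        "telic": ["protect_from", "avoid dissemination", "inject"],
        "formal": ["medicine", "artifact"],
        "agentive": ["develop", "test", "sell"],
    },
    "adjuvant": {
        "formal": ["medicine", "chemicals"],
        "telic": ["dilute", "inject"],
    },
    "test": {
        "constitutive": ["data", "protocol"],
        "telic": ["evaluate protection", "evaluate side-effects"],
        "formal": ["scientific act"],
        "agentive": ["elaborate"],
    },
}


def build_network(roots, qualias, steps=4):
    """Return the successive levels [T, T1, T2, T3] derived from the root concepts."""
    levels = []
    seen = set(roots)
    frontier = list(roots)
    for _ in range(steps):
        level = []
        for concept in frontier:
            # A terminal concept has no Qualia and contributes no new node.
            for terms in qualias.get(concept, {}).values():
                for term in terms:
                    if term not in seen:  # each concept enters the network once
                        seen.add(term)
                        level.append(term)
        levels.append(level)
        frontier = level
    return levels


T, T1, T2, T3 = build_network(["vaccine"], QUALIAS)
```

With this toy data, `T` holds the terms of the vaccine Qualia, `T1` the new terms contributed by adjuvant and test, and the later levels are empty because no further Qualia is described.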
The arguments which are not found are rather unexpected, but of much interest. For example, arguments such as: vaccination prevents bio-terrorism, vaccination raises ethical and racial problems are

Supports:
vaccine protection is very good;
Ebola is a dangerous disease;
there are high contamination risks;
vaccine has limited side-effects,
there is no medical alternative to vaccine, etc.
Attacks:
there is a limited number of cases and deaths compared to other diseases;
7 vaccinated people died in Monrovia,
there are limited risks of contamination,
there is a large ignorance of contamination forms,
competent staff is hard to find and P4 lab is really difficult to develop;
vaccine toxicity has been shown,
vaccine may have high side-effects,
Concessions or Contrasts:
some side-effects;
production and development costs are high;
vaccine is not yet available;
a systematic vaccination raises ethical and freedom problems.

The 'conceptsInvolved' attribute is structured from the root node, as a kind of path, so that the concept that is involved is clearly identified. This attribute may contain an ordered list of paths if several concepts are involved. A typical path is a sequence:
root-concept/(Role/Concept)*,
where the Concept is a predicate or a constant of a Qualia structure found under the role 'Role'.
For example, the concept of 'protocol' is defined as follows:
vaccine(X)/agentive/test(T,X)/constitutive/protocol.
since protocol is a concept associated with the constitutive role of the concept 'test'. Besides a clear identification of the concept 'protocol', this path can be used as (1) a way to structure a synthesis and (2) a way to provide some explanation of why an utterance is an argument by outlining its relation(s) with the root concept.
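A path of the form root-concept/(Role/Concept)* can be computed by a depth-first search through the Qualias. The following sketch is an assumption-laden illustration rather than the system's code: `find_path` and `PATH_QUALIAS` are hypothetical names, and only the two Qualias needed for the 'protocol' example are encoded.

```python
# Qualias keyed by the predicate notation used in the paper (role -> terms).
PATH_QUALIAS = {
    "vaccine(X)": {
        "constitutive": ["active principle", "adjuvant"],
        "formal": ["medicine", "artifact"],
        "agentive": ["develop(T,X)", "test(T,X)", "sell(T,X)"],
    },
    "test(T,X)": {
        "constitutive": ["data", "protocol"],
        "formal": ["scientific act"],
        "agentive": ["elaborate(T,X)"],
    },
}


def find_path(root, target, qualias, max_depth=3):
    """Return 'root/Role/Concept/...' leading to `target`, or None if it is not reachable."""

    def search(concept, path, depth):
        if depth > max_depth:  # the network is limited to a few levels
            return None
        for role, terms in qualias.get(concept, {}).items():
            for term in terms:
                new_path = path + [role, term]
                if term == target:
                    return new_path
                found = search(term, new_path, depth + 1)
                if found:
                    return found
        return None

    result = search(root, [root], 1)
    return "/".join(result) if result else None


protocol_path = find_path("vaccine(X)", "protocol", PATH_QUALIAS)
# protocol_path is "vaccine(X)/agentive/test(T,X)/constitutive/protocol"
```

Splitting such a path on "/" gives back the alternating sequence of concepts and roles, which is what allows the path to serve both as a sorting key and as an explanation of the link to the root concept.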
The type of synthesis we propose reduces these expressions to an evaluation of the main concepts, as found in the Qualia structure, as developed in section 3. The number of arguments for and against each concept is given to outline the balance between each tendency. This however remains a tendency because this number of arguments depends on how many texts have been processed and how many arguments have been mined. The comprehensive list of arguments is stored in clusters and is accessible via navigation links from the concepts in the synthesis (section 4.3).
The output of the mining system, which is the starting point of the synthesis construction, includes the following attributes (Saint-Dizier 2016), associated with each argument:

Argument 11:
Even if the vaccine seems 100% efficient and without any side effects on the tested population, it is necessary to wait for more conclusive tests before making large vaccination campaigns. The national authority of Guinea has approved the continuation of the tests on targeted populations.
is composed of an argument kernel (it is necessary to wait for more conclusive tests before making large vaccination campaigns) modified by two discourse structures. This argument is tagged as follows:
Even if the vaccine seems 100% efficient and without any side effects on the tested population, </concession>
it is necessary to wait for more conclusive tests before making large vaccination campaigns. </main arg>
The national authority of Guinea has approved the continuation of the tests on targeted populations.
</argument>.
At this stage no meta-data is considered such as the date of the argument or the author status. This notation was defined independently of any ongoing task such as CoNLL15.

4.3 Example of an argumentation synthesis
Let us now characterize the form of the synthesis. For issue (1) the synthesis of the examples given in 4.1 is organized as follows, starting with the concepts which appear in the issue (root concepts) and then considering those, more remote, which appear in the derived concepts constructed by the network of concepts. Each line of the synthesis is produced via a predefined language pattern. In parentheses, the total number of occurrences of arguments mined in texts for that concept is given as an indication. This number is also a link that points to the arguments that have been mined in their original textual form.

• the argument identifier (an integer), in our first experiment, all arguments attack or support the controversial issue and no other arguments,
• the text span involved that delimits the argument compound and its kernel, which ranges from a few words to a paragraph. In the synthesis, only the kernel of the argument is considered,
• the polarity of the argument w.r.t. the issue: support or attack. Additional intermediate values (argumentative concessions and contrasts) could be added in the future,
• the concepts involved, to identify the argument: list of the main concepts from the Qualias used in the mining process. Only the concepts found in the main argument section are considered. Those in adjoined discourse structures will be considered for higher-level synthesis in a later stage, to identify, e.g. restrictions.
• the strength of the argument, based on linguistic marks found in the argument,
• the discourse structures in the compound, associated with the argument kernel, as processed by our discourse analysis platform TextCoop.

For each line, the positive facet is presented first, followed by the negative one when they exist, independently of the occurrence frequency, in order to preserve a certain homogeneity in the reading:
Vaccine protection is good (3), bad (5).
Vaccine avoids (5), does not avoid (3) dissemination.
Vaccine is difficult (3) to develop.
Vaccine is (4) expensive.
Vaccine is not (1) available.
Ebola is (5) a dangerous disease.
Humans may die (1) from Ebola.
Tests of the vaccine show no (2), high (4) side-effects.
Other arguments (4).

5 ARGUMENT SYNTHESIS GENERATION
Given the input data and the output forms presented in 4.3, let us now develop the grammatical and lexical environment that allows the generation of this synthesis. The ordering of the synthesis is based on the path mentioned in the 'conceptsInvolved' attribute of each argument.

Attacks:
[is not/ are not/do not Verb/ does not Verb, (Stats)]
Supports and attacks:
[is/are/Verb, (Stats1), is not/ are not/do not Verb/ does not Verb, (Stats2)]
The symbol (Stats) simply indicates the number of arguments that have been mined with the 'conceptsInvolved' path considered. These statistics are indicative since they depend on the volume of text that has been mined. They also do not account for the strength of the argument. (Stats) is a link to the set of mined arguments that the reader may wish to inspect. These are stored in a cluster (section 4.2) and sorted from the argument(s) that have the highest strength to those that have the lowest one.
The symbol Evaluative is an evaluative expression, often a scalar adjective, modified by a negation depending on its polarity, the existence of an antonym and the polarity of the
arguments to represent. Adjectives, as well as nouns, have their semantic characteristics stored in their lexical entry. The lexicalization of Evaluative is defined as follows:
(1) by default the values good / bad for products and attitudes and easy / difficult for processes. However, these adjectives are not very accurate and specific values are preferred.
(2) by an adjective found in one of the arguments of the cluster. The adjective must be prototypical. For that purpose, we use a resource we defined for opinion analysis, where about 500 of the most standard adjectives are organized in non-branching proportional series (Cruse 1986). Each series corresponds to a precise conceptual dimension such as shape, cost, temperature, difficulty, peace, availability, etc. Most series are composed of a few positive and negatively oriented terms and possibly a neutral point, terms are structured with a partial order. Other series correspond to boolean adjectives and are simply composed of two elements. For example, starting from the most negative term:
temperature: frozen - cold - mild - (warm, hot) - boiling. prototypical: cold / warm, neutral: mild.

Arguments are sorted on the basis of this attribute.

5.1 The lexico-grammatical generation system
The synthesis of arguments is based on abstract linguistic patterns defined as follows:
(1) [HeadConcept, Be/Predicate, Evaluative, AttributeLexicalization].
or:
(2) [HeadConcept, Be/Predicate, AttributeLexicalization, Evaluative].
The symbol HeadConcept is the lexicalization of the rightmost (or leaf) concept in the attribute 'conceptsInvolved'. For example, in:
conceptsInvolved= 'vaccine(X)/agentive/test(T,X)'
the rightmost concept is 'test(T,X)', its lexicalization, stored in a lexicon, is 'test'. For example:
word([test], noun, abstract, test(X,Y)).
The function lexicalization(Word, Predicate) extracts the lexical item from the appropriate lexical entry.
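The lookup of HeadConcept can be sketched as follows. This is a Python analogue, under stated assumptions, of the Prolog-style lexicon entry shown above, not the platform's code: `LEXICON`, `functor`, `head_concept` and this `lexicalization` signature are hypothetical, the 'vaccine' entry and its 'concrete' class are invented for illustration, and entries are matched on predicate name and arity, so test(T,X) matches test(X,Y).

```python
# Lexical entries mirroring word(Words, Category, SemanticClass, Predicate).
LEXICON = [
    (["test"], "noun", "abstract", "test(X,Y)"),       # entry given in the paper
    (["vaccine"], "noun", "concrete", "vaccine(X)"),   # hypothetical entry
]


def functor(term: str) -> tuple[str, int]:
    """Return (name, arity) of a term; assumes no commas inside nested arguments."""
    name, paren, args = term.partition("(")
    if not paren:
        return name, 0  # a constant such as 'protocol'
    return name, args.rstrip(")").count(",") + 1


def head_concept(concepts_involved: str) -> str:
    """Return the rightmost (leaf) concept of a 'conceptsInvolved' path."""
    return concepts_involved.split("/")[-1]


def lexicalization(predicate: str):
    """Return the lexical item whose entry has the same predicate name and arity."""
    for words, _category, _semantic_class, entry_predicate in LEXICON:
        if functor(entry_predicate) == functor(predicate):
            return " ".join(words)
    return None


head = lexicalization(head_concept("vaccine(X)/agentive/test(T,X)"))
```

Matching on name and arity is a stand-in for the unification a logic-based platform would perform directly on the predicate.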
Finally, when the path given in 'conceptsInvolved' is long (from 2 concepts), a lexicalization of the whole path is produced, e.g. for the example above: 'test of the vaccine' instead of 'test' alone, using the basic compound NP pattern:
[A of B] or
[A of B of C]
where A, B and C are concepts of the path:
'C/role/B/role/A'.
The symbol Be/Predicate entails the lexicalization of the main predicate of the sentence. It is either the neutral 'be' (is, are) or a specific lexicalization if the attribute name that is considered includes a higher-level predicate such as prevent, evaluate, allow, avoid as shown in the Qualia structure above. When there are supports and attacks, this verb appears as such and is then modified by a negation so that supports and attacks can be differentiated in the synthesis, using the following patterns:
Supports:

toxicity: poisonous - dangerous - neutral - recommended - beneficial. prototypical: dangerous / beneficial, neutral: neutral.
cost: expensive - (reasonable, appropriate) - cheap.
In the cluster of arguments being processed the adjectives used as evaluators are collected and their inclusion in one or more non-branching proportional series is investigated. The series that is the most frequently referred to is kept, and both the positively and negatively oriented typical adjectives are used in the lexicalization.
The symbol AttributeLexicalization is the direct lexical item that corresponds to the concept. In our approach this lexical item is stored in a lexicon where lexical entries include lexical items and their associated logical representations, as described above. When the attribute is propositional, the same strategy is used, in that case, an expression is produced via a pattern instead of a single item. This is the case for example with 'develop(T,X)' which gets the realization to develop.
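The choice of the evaluative adjectives for a cluster (collect the adjectives, find the series most frequently referred to, keep its prototypical pair) can be sketched as below. The names `SERIES` and `select_evaluatives` are hypothetical; the temperature and toxicity data come from the paper, while the prototypical pair for cost (expensive / cheap) is an assumption, since the paper does not state it. The fallback is the default pair the paper mentions for products and attitudes.

```python
from collections import Counter

# Non-branching proportional series, ordered from the most negative term.
SERIES = {
    "temperature": {
        "terms": ["frozen", "cold", "mild", "warm", "hot", "boiling"],
        "prototypical": ("cold", "warm"),
    },
    "toxicity": {
        "terms": ["poisonous", "dangerous", "neutral", "recommended", "beneficial"],
        "prototypical": ("dangerous", "beneficial"),
    },
    "cost": {
        "terms": ["expensive", "reasonable", "appropriate", "cheap"],
        "prototypical": ("expensive", "cheap"),  # assumed, not given in the paper
    },
}


def select_evaluatives(adjectives, series=SERIES, default=("bad", "good")):
    """Return the (negative, positive) adjectives of the most referenced series."""
    counts = Counter()
    for adjective in adjectives:
        for name, entry in series.items():
            if adjective in entry["terms"]:
                counts[name] += 1
    if not counts:
        return default  # no series matched: fall back to the default values
    best_series, _count = counts.most_common(1)[0]
    return series[best_series]["prototypical"]


cost_pair = select_evaluatives(["expensive", "cheap", "dangerous"])
```

Ties between series are resolved here by first occurrence, which is an arbitrary choice; the paper does not say how ties are handled.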
[is/are/Verb, (Stats)],

This synthesis generation system is quite simple at the moment. Besides domain lexical entries, related to the concepts in the Qualias (between 50 and 100 lexical entries depending on the issue), the system currently uses 22 patterns that produce the constructions presented above. These patterns are stable for the type of issues we consider, which are simple, with arguments which are direct and concern a single concept of the network. This generation system needs further investigation for more abstract issues or complex situations such as controversial dialogs. In this first stage, only the relations between the controversial issue and arguments have been investigated. In a 'real' argumentation, arguments may attack or support other arguments instead of the issue. It is not clear at the moment whether the same generation procedure can be used. Probably, to keep the synthesis readable, additional devices such as navigation links would be needed to indicate that an argument gets supports or attacks from others.

and the means to overcome difficulties must be carried out.
• the load in linguistic and conceptual resource description for each domain where arguments are mined. This includes essentially Qualia structures, lexical entries and associated resources such as non-branching proportional series, a few generation patterns. Qualia structures are often related to a domain ontology. These resources are not very large for a domain, but they nevertheless require some manual effort. Another aspect is the management of the coherence of the resources when they are specified at different levels (e.g. lexical, conceptual).
• the adequacy of the conceptual model, here the Qualia structure. It is necessary to show that it
is indeed sufficiently accurate in a large number of The system described here is relatively simple to implement. domains. A first implementation has been realized with the logic-based platform we developed for discourse analysis. A higher level evaluation concerns the adequacy of this This platform can also be used for language generation since synthesis for professionals who want to access to an argu- it is fully declarative and partly reversible. However, the ment synthesis, where arguments do correspond to their view strategy used in needs to be revised so that the and analysis of the domain. For example, in the hotel and simplest structure is generated first. The parsing strategy that restaurant domains, the features at stake are well identified is used in is indeed a priori oriented towards in consumer evaluation. For more abstract or less common language parsing. areas such as the issues developed in this paper, there is a need to make sure that the concepts developed in the Qualias do correspond to the vision of professional users, otherwise 5.2 Features of an Evaluation such a system will not be of much practical relevance. We also feel there is no unique form of synthesis: several This approach to argument synthesis generation is relatively forms of synthesis could be foreseen that would depend on the simple and straightforward. The two levels: synthesis and reader’s interests and profile. A real evaluation of the system links to the exact arguments stored as clusters seems to presented here requires the development of adequate protocols be a good compromise between proliferation of data and to measure the relevance of various forms of synthesis (more over generalization via a few synthetic lines. Elaborating abstract, more or less concise, using various types of concepts, an appropriate level of synthesis and the way to realize it with appropriate lexicalizations). This can only be done with linguistically needs a lot of experimentations and tunings. 
large and diverse populations of users over several domains. A direct evaluation of this system must be realized along at least the following main features: 6 CONCLUSION AND PERSPECTIVES • the overall linguistic adequacy of the generation sys- Given a controversial issue, argument mining from texts in tem, based on the patterns presented in the previous natural language is extremely challenging: besides linguis- section. These patterns produce short sentences that tic aspects, domain knowledge is often required together readers can understand, their linguistic adequacy with appropriate forms of inferences to identify arguments. and language overall quality must be evaluated and A major challenge in argument mining is to organize the possibly tuned. arguments which have been mined to generate a synthesis that is readable, synthetic enough and relevant for various • the types of domains and controversial issues for types of users. which this system is adequate, from very concrete Based on the Generative Lexicon (GL) Qualia structure, to more abstract, and for various amounts of argu- which is a kind of lexical and knowledge repository, we have ments, from just a few to several hundreds, including shown how to construct a synthesis that captures the typical duplicates. Several experiments show that Qualia elements found in arguments and their polarity. We propose structures can quite straightforwardly be specified a two-level approach: a synthesis of the arguments that have for concrete areas, this is less easy for areas which been mined and, associated with the elements of this synthe- manipulate abstract or very general purpose con- sis, navigation facilities that allow to access the argument cepts. An evaluation of the limits of the approach contents in order to get more details. 
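To make the two-level approach more concrete, the generation steps described in the previous section (compound NP pattern [A of B], Be/Predicate with negation for attacks, optional Stats, selection of the most frequently referred proportional series) can be sketched as follows. This is only a minimal illustration under our own assumptions and not the implemented system, which relies on a logic-based platform and 22 patterns: the names LEXICON, SERIES, most_referred_series, lexicalize_path, predicate, SynthesisLine and synthesize are hypothetical, the overall sentence shape (concept, predicate, attribute, statistics) is assumed, and the verb inflection is deliberately naive.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical lexicon: concept or attribute name -> lexical item.
LEXICON = {"vaccine": "the vaccine"}

# The two non-branching proportional series quoted in the text.
SERIES = {
    "toxicity": ["poisonous", "dangerous", "neutral", "recommended", "beneficial"],
    "cost": ["expensive", "reasonable", "appropriate", "cheap"],
}


def most_referred_series(adjectives):
    """Return the name of the series most often referred to in a cluster."""
    counts = Counter(
        name
        for adj in adjectives
        for name, members in SERIES.items()
        if adj in members
    )
    return counts.most_common(1)[0][0] if counts else None


def lexicalize_path(path):
    """Realize a concept path [C, B, A] as 'A of B of C' (head concept last)."""
    return " of ".join(LEXICON.get(c, c) for c in reversed(path))


def predicate(verb=None, plural=False, negated=False):
    """Realize Be/Predicate: neutral 'be' or a specific verb, negated for attacks."""
    if verb is None:
        base = "are" if plural else "is"
        return f"{base} not" if negated else base
    if negated:
        return f"{'do' if plural else 'does'} not {verb}"
    return verb if plural else f"{verb}s"  # naive third-person inflection


@dataclass
class SynthesisLine:
    text: str        # level 1: the synthetic sentence
    cluster_id: str  # level 2: key of the cluster that stores the mined arguments


def synthesize(path, attribute, cluster_id, verb=None, plural=False,
               attack=False, count=None):
    """Build one synthesis line and keep the link to its argument cluster."""
    text = " ".join([
        lexicalize_path(path),
        predicate(verb, plural, attack),
        LEXICON.get(attribute, attribute),
    ])
    if count is not None:  # optional (Stats) slot
        text += f" ({count} argument{'s' if count != 1 else ''})"
    return SynthesisLine(text, cluster_id)


line = synthesize(["vaccine", "test"], "expensive", "cluster-1", count=3)
print(line.text)  # test of the vaccine is expensive (3 arguments)
```

Keeping the cluster key next to each generated sentence is what lets a reader navigate from a line of the synthesis to the arguments it abstracts over.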
The work presented in this paper is a first, exploratory experiment. Constructing an argument synthesis from a large diversity of issues and in various contexts (dialogs, consumer opinion expression, etc.), where arguments may also attack each other, is a complex task. The type of synthesis that would be really useful to the public and to professionals requires a close cooperation with opinion analysts and related professionals. Additional features of arguments such as reliability, strength, validity, persuasion, etc. should also be incorporated at some future stage.

The next steps of our work include:

• the development of other issues and the annotation of related arguments by at least two annotators, which would entail a further validation of our model,

• the development of a larger argument mining system based on knowledge, in particular as structured in the Qualia,

• the development of tools that would contribute to the creation of Qualias from texts; we have some ongoing work on this very crucial dimension,

• the development of adequate and relevant evaluation protocols that would analyze the adequacy of the type of synthesis that is produced, as such and w.r.t. users' expectations and implicit model of the domain.

REFERENCES

[1] K. Budzynska, M. Janier, C. Reed, P. Saint-Dizier, M. Stede and O. Yakorska. 2014. A model for processing illocutionary structures and argumentation in debates. In Proc. LREC.
[2] A. Cruse. 1986. Lexical Semantics. Cambridge University Press.
[3] V. W. Feng and G. Hirst. 2011. Classifying arguments by scheme. In Proceedings of the 49th ACL: Human Language Technologies, Portland, USA.
[4] A. Fiedler and H. Horacek. 2007. Argumentation within deductive reasoning. International Journal of Intelligent Systems, 22(1):49-70.
[5] M. Janier and C. Reed. 2015. Towards a Theory of Close Analysis for Dispute Mediation Discourse. Journal of Argumentation.
[6] C. Kirschner, J. Eckle-Kohler and I. Gurevych. 2015. Linking the Thoughts: Analysis of Argumentation Structures in Scientific Publications. In Proceedings of the 2nd Workshop on Argumentation Mining, Denver.
[7] I. Mani and M. Maybury. 1999. Advances in Automatic Text Summarization. MIT Press.
[8] R. Mochales Palau and M.F. Moens. 2009. Argumentation mining: the detection, classification and structure of arguments in text. Twelfth international ICAIL'09, Barcelona.
[9] H. Nguyen and D. Litman. 2015. Extracting Argument and Domain Words for Identifying Argument Components in Texts. In Proc. of the 2nd Workshop on Argumentation Mining, Denver.
[10] A. Peldszus and M. Stede. 2016. From argument diagrams to argumentation mining in texts: a survey. International Journal of Cognitive Informatics and Natural Intelligence (IJCINI).
[11] J. Pustejovsky. 1995. The Generative Lexicon. MIT Press.
[12] P. Saint-Dizier. 2016a. Argument Mining: the bottleneck of knowledge and lexical resources. LREC, Portoroz.
[13] P. Saint-Dizier. 2016b. Challenges of Argument Mining: Generating an Argument Synthesis based on the Qualia Structure. Proceedings of INLG, Edinburgh.
[14] R. Swanson, B. Ecker and M. Walker. 2015. Argument Mining: Extracting Arguments from Online Dialogue. In Proc. SIGDIAL.
[15] M.G. Villalba and P. Saint-Dizier. 2012. Some Facets of Argument Mining for Opinion Analysis. COMMA, Vienna, IOS Publishing.
[16] M. Walker, P. Anand, J.E. Fox Tree, R. Abbott and J. King. 2012. A Corpus for Research on Deliberation and Debate. Proc. of LREC, Istanbul.
[17] G. Winterstein. 2012. What but-sentences argue for: An argumentative analysis of 'but'. Lingua 122.
[18] I. Zuckerman, R. McConachy and K. Korb. 2000. Using Argumentation Strategies in Automatic Argument Generation. INLG.