A Pilot Study in Using Argumentation Frameworks for Online Debates

Federico CERUTTI (a), Alexis PALMER (b), Ariel ROSENFELD (c), Jan ŠNAJDER (d) and Francesca TONI (e)

(a) Cardiff University, U.K.
(b) Universität Heidelberg, Germany
(c) Bar-Ilan University, Israel
(d) University of Zagreb, Croatia
(e) Imperial College London, U.K.

           Abstract. We describe a pilot study in using argumentation frameworks obtained
           from an online debate to evaluate positions expressed in the debate. This pilot study
           aims to explore the richness of Computational Argumentation methods and
           techniques for evaluating arguments when reasoning with the output of Argument
           Mining. It uses a hand-generated graphical representation of the debate as an
           intermediate representation that is richer than any existing argumentation framework
           and from which argumentation frameworks can be extracted. The intermediate representation can
           provide insights for benchmark sets derived from online debates.
           Keywords. Argumentation frameworks comparison; Benchmarks; Argument
           Mining


1. Introduction

Computational Argumentation (CompArg) is a branch of AI aiming at providing com-
putational models of argumentation; see [6], [7], and [21] for overviews. In its simplest
form, CompArg amounts to characterising and determining (dialectically) acceptable
sets of arguments in any given Abstract Argumentation Framework [14], consisting sim-
ply of a set of abstract entities (the arguments) and a binary relation of attack between
arguments. Several other forms of CompArg have been proposed and deployed in appli-
cations, including the Argumentation Framework with Recursive Attacks (AFRA) [3,4],
allowing attacks to be in turn the object of other attacks, and Quantitative Argumentation
Debate (QuAD) Frameworks [5,19], allowing graded (numerical) acceptability statuses
of arguments [5,19,8].
     This paper describes a pilot study in comparing those different frameworks, building
on top of an Argument Mining (ArgMin) exercise. ArgMin is an emerging field aiming to
automatically extract argumentation structures from natural language texts; see [17,18,
16] for overviews. To this end, it heavily relies on Natural Language Processing (NLP)
to detect the argumentative discourse structure in text and recognize the components of
an argument and relations between them.
     Despite extensive theoretical investigation of the semantic intertranslatability of
frameworks [10], to our knowledge there have been no attempts to compare different
frameworks with respect to real-world tasks. With this aim, we considered a pipeline
approach, where the output of ArgMin provides an input to tools developed within
CompArg to determine the dialectical acceptability and/or strength of opinions in debates.
    The starting point of the experiment was an excerpt from an online for/against debate
taken from www.createdebate.com. The excerpt is given in Table 1 (see footnote 1). We then:
    1. mapped the debate onto a hand-annotated graphical representation, identifying
       annotations dynamically as demanded by the features of the debate and the opin-
       ions expressed therein; this resulted in a rich annotation scheme with five types
       of nodes and six types of edges;
    2. mapped the hand-annotated graphical representation onto an Abstract Argumen-
       tation Framework and determined the dialectical acceptability of opinions in the
       debate by determining the grounded labelling [2] of the arguments in the frame-
       work;
    3. mapped the hand-annotated graphical representation onto a QuAD Framework
       and used the Arg&Dec tool ([1], www.arganddec.com) to determine the di-
       alectical strength of opinions in the debate, as well as to rank the two answers
       (yes/no) to the debated question;
    4. mapped the hand-annotated representation onto an Argumentation Framework
       with Recursive Attacks and determined the dialectical acceptability of opinions
       by using the grounded extension [4];
    5. compared the results obtained with those frameworks.
The pilot study raises a number of questions, from both the ArgMin and CompArg per-
spectives. In particular, for CompArg:
    • whether our hand-annotated graphical representation can be used as a tool for
      producing cross-framework benchmarks;
    • whether the Argumentation Frameworks and tools considered are sufficiently
      general to serve as a target for reasoning automatically with debates;
    • whether other existing Argumentation Frameworks and tools may be more suit-
      able for the task at hand.
The paper is organised as follows. Section 2 first presents in full the debate used as
a starting point for our experiment, then continues with the hand-annotated graphical
representation of the debate. We additionally discuss the relationships between user-
generated annotations to the dialogue and those from expert annotators. Section 3 shows
the mapping onto Abstract Argumentation, Section 4 onto QuAD, and Section 5 onto
AFRA. In Section 6 we conclude.


2. A Graphical Analysis of the Input Debate

We identified statements in the dialogue, as well as the relationships between them, by
means of a graph-based representation. Although this representation has been influenced
by other works, notably the Argument Interchange Format (AIF) [13,20], and Inference

  1 For the full debate see http://goo.gl/DZuRdg


Debate question:
Should contraception be covered by health insurance?

#1 Intangible
Noes because that’s not something you need.

#2 sweetspice16 (disputes #1)
You probably shouldn’t make that blanket statement, without any qualifiers or exceptions. For many
women, birth control pills are very important and are necessary to daily life.

#3 Cartman (disputes #2)
What about Viagra, should that be covered by health insurance?

#4 Sitara (disputes #1)
Women have the right to choose what to do with their bodies.

#5 ThePlague (disputes #4)
It is true that women have the right to choose what they wish to do with their bodies, but they have
absolutely no power to force insurance companies to pay for them. That should be left up to the
insurance company, and not the woman.

#6 sweetspice16 (disputes #5)
Oh please. Contraception doesn’t have to be to prevent pregnancy either. I nearly went broke
paying for birth control pills and I was on them because of severe issues. But insurance doesn’t
have to pay for it even then. Men don’t need erections but Viagra is covered in case a patient has
other issues. Birth control should be covered too: no matter what, just in case.

#7 ThePlague (disputes #6)
That is a much more logical argument that the user Sitara. Allow me to continue. I agree with
you. I do not think Viagra should be covered by insurance though and thus do not believe that
contraception should be provided by insurance companies. It should not be covered due to it’s
initial purpose, to prevent pregnancy, hence the name "birth control".

#8 Sitara (disputes #7)
Wrong. I am presenting a very logical argument.

#9 ThePlague (disputes #8)
You cannot follow the purpose of the debate. She used a conparative argument. Viagra is covered
yet contraception is not? That is a much more solid argument since it follows the premise of this
debate. Your argument is over women’s right to choose contraception. This debate doesn’t call for
that.

#10 Sitara (disputes #9)
I have presented a logical argument. I told you why contraception should and will be covered by
insurance, but you choose to ignore logic. Do stop wasting my time.

#11 ThePlague (disputes #10)
Contraception should be required because it is a women’s right to choose? If a murderer wishes
to purchase a weapon to use for mass slaughter will you favor his decision as well since he has
the right to choose what he wants? No. The company has the right to deny service to him and thus
can do the same with contraception. You cannot favor the liberty of women without favoring the
liberty of a business.

#12 Sitara (disputes #11)
 Logical fallacy. Contraception is not comparable to murder.
                              Table 1. An excerpt from an online debate.
Anchoring Theory [11], it has been developed in an ad-hoc fashion driven by a linguistic
viewpoint. A comparative study with other approaches is left for future work.
    In particular, we considered five types of nodes:
    • question nodes;
    • answer nodes;
    • standard statements;
    • partial statements—statements with missing premises or conclusions, i.e., en-
      thymemes;
    • distractor statements—statements that are dialectically irrelevant, albeit on topic.
    We linked the nodes using six types of edges, each taking one of several different
possible values, as follows:
    • answer-to-question, from one answer node to a question node (directed edges);
    • standard-explicit, from one standard statement node to another, or to a distractor
      node, a partial statement node, or an answer node (directed edges), with possible
      values attack/support/neither;
    • standard-implicit, from one partial statement to any statement or answer node
      (directed edges), with possible values attack/support;
    • meta, from any statement to any statement (directed edges), with possible values
      attack/support;
    • node-to-edge, from standard statements to edges (directed), with possible values
      attack/support;
    • expansion, amongst any statements (undirected edges).
This analysis resulted in the graph shown in Fig. 1, with Q denoting the question node,
and Y and N denoting the answer nodes. Moreover, we label each node in the graph with
the identifier of the statements made in the debate (e.g., “#2” in Table 1). Some identifiers
in Fig. 1 have a superscript (i.e., 2) to indicate that they actually represent multiple (i.e.,
two) statements.
     For example, we made the following mapping choices in deriving the graph:
    • #4 is a partial statement as it lacks an explicit conclusion;
    • #3 is a distractor statement as it is a sort of distraction from the main point of the
      debate, although still “on topic”, and could be interpreted as intended to promote
      conflict; edges onto distractor statements are neither attacks nor supports;
    • #2 and #6 form an expansion statement because #6 fills in some of the details
      omitted in #2;
    • #12 criticises #11 at the dialectical (meta) level as well as the content (standard)
      level;
    • #8 criticises the attack by #7 on #4.
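
As an illustration only (it is not part of the original annotation exercise), the scheme above can be sketched as a small Python data model. The class and field names are hypothetical, the "?" marks of Fig. 1 are not modelled, and the node types assigned to #7 and #8 in the usage lines are assumptions made for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class NodeType(Enum):
    QUESTION = "question"
    ANSWER = "answer"
    STANDARD = "standard statement"
    PARTIAL = "partial statement"        # enthymeme
    DISTRACTOR = "distractor statement"  # on topic, dialectically irrelevant


class EdgeType(Enum):
    ANSWER_TO_QUESTION = "answer-to-question"
    STANDARD_EXPLICIT = "standard-explicit"
    STANDARD_IMPLICIT = "standard-implicit"
    META = "meta"
    NODE_TO_EDGE = "node-to-edge"
    EXPANSION = "expansion"              # undirected in the scheme


class Polarity(Enum):
    ATTACK = "attack"
    SUPPORT = "support"
    NEITHER = "neither"


# Values each edge type may take, as listed in the scheme (None = no value).
ALLOWED = {
    EdgeType.ANSWER_TO_QUESTION: {None},
    EdgeType.STANDARD_EXPLICIT: {Polarity.ATTACK, Polarity.SUPPORT, Polarity.NEITHER},
    EdgeType.STANDARD_IMPLICIT: {Polarity.ATTACK, Polarity.SUPPORT},
    EdgeType.META: {Polarity.ATTACK, Polarity.SUPPORT},
    EdgeType.NODE_TO_EDGE: {Polarity.ATTACK, Polarity.SUPPORT},
    EdgeType.EXPANSION: {None},
}


@dataclass(frozen=True)
class Node:
    ident: str
    kind: NodeType


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Union[Node, "Edge"]  # only node-to-edge edges may point at an edge
    kind: EdgeType
    value: Optional[Polarity] = None

    def __post_init__(self):
        if self.value not in ALLOWED[self.kind]:
            raise ValueError(f"{self.kind.value} edge cannot take value {self.value}")
        if isinstance(self.target, Edge) != (self.kind is EdgeType.NODE_TO_EDGE):
            raise ValueError("only node-to-edge edges target another edge")


# Usage: "#8 criticises the attack by #7 on #4" (node types of #7, #8 assumed).
s4 = Node("#4", NodeType.PARTIAL)
s7 = Node("#7", NodeType.STANDARD)
s8 = Node("#8", NodeType.STANDARD)
attack_7_4 = Edge(s7, s4, EdgeType.META, Polarity.ATTACK)
critique = Edge(s8, attack_7_4, EdgeType.NODE_TO_EDGE, Polarity.ATTACK)
```

Letting an edge be the target of another edge is what later makes the mapping onto frameworks with recursive attacks possible, whereas a flat mapping has to drop such edges.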
Comments on the Annotations. Once a debate has started in the system, users may posit
arguments in the form of short textual posts as seen in Table 1. However, as shown in
the expert annotation presented in Fig. 1, some of these posts contain more than a single
argument, which poses the challenge of splitting posts into atomic arguments.

Figure 1. Output of the initial graphical analysis. Dotted nodes represent partial statements, crossed dotted
nodes represent distractor statements, straight lines represent answer-to-question edges, solid black arrows
represent standard-explicit edges, dotted black arrows represent standard-implicit edges, green arrows represent
meta edges, blue arrows represent node-to-edge edges, and red rounded boxes represent expansion edges. A +
(resp. −) next to an arrow identifies support (resp. dispute); a ? is used when the nature of the relation between
the two statements is unclear.

     Furthermore, each user of the debate platform is required to explicitly define how
her posts correspond and relate to the existing posts that were already presented in the
debate. Specifically, the user is required to choose whether her post supports, disputes,
or clarifies an existing post. We consider these as non-expert annotations of the relations
between the presented posts.
     In the debate on which we focus, all posts were annotated as dispute posts by their
authors. Namely, all posts were annotated as disputing other posts that had already been
presented in the debate. However, in our post factum annotation in Fig. 1, we show that
sometimes it is the support and clarification relations that were actually intended.
     In some cases, dispute annotations can be interpreted as attack annotations. For ex-
ample, post #2 was designated as disputing post #1 by its author and indeed argument
#2 attacks argument #1 (see Fig. 2). Yet, this is not always the case. For example, post
#6 was designated as disputing post #5; however, we did not find any significant relation
between these posts.
     Overall, it is our opinion that having non-expert annotations generated by debaters
can be useful as a rough starting point for expert or automated annotation of the rela-
tion between arguments. Nevertheless, one needs to keep in mind that these non-expert
annotations are biased and imperfect.


3. From the Graphical Analysis to an Abstract Argumentation Framework

We mapped the hand-annotated graphical representation given in Fig. 1 onto an Abstract
Argumentation Framework in order to determine the dialectical acceptability of opinions
in the debate. To this aim, we had to identify the two main components of an Abstract
Argumentation Framework, namely the set of arguments and the set of attacks. In fact,
an Abstract Argumentation Framework [14] is composed of a set of arguments whose
nature is left unspecified, and of a binary relation of attack among them. Therefore, an
Abstract Argumentation Framework can be represented as a directed graph, where nodes
identify arguments and edges identify attacks.
     Since the notion of argument is now overloaded with different meanings, in this and
in the following section, argument stands for formal abstract argument, i.e., an element
of a mathematical theory of computational argumentation. We refer to the pieces of texts
considered in the annotation process as statements.

3.1. Identification of Arguments

To compute the dialectical acceptability of opinions in the debate, it was necessary both
to include the two possible outcomes of the dialogue—i.e., whether a player would an-
swer Yes (Y) or No (N) to the question—and to link arguments to the statements put
forward in the dialogue. In particular, we needed to identify atomic statements—as each
player might put forward multiple atomic statements in a single claim. We then aggre-
gated atomic statements into arguments.

3.1.1. Identification of Atomic Statements
The first step is to identify the atomic statements in the dialogue. According to Fig. 1,
nodes #5 and #7 contain two statements each:
    • #5a: It is true that women have the right to choose what they wish to do with their
      bodies,. . . ;
    • #5b: . . . but they have absolutely no power to force insurance companies to pay
      for them. That should be left up to the insurance company, and not the woman.;
    • #7a: That is a much more logical argument that the user Sitara. Allow me to
      continue. I agree with you.;
    • #7b: I do not think Viagra should be covered by insurance though and thus do not
      believe that contraception should be provided by insurance companies. It should
      not be covered due to it’s initial purpose, to prevent pregnancy, hence the name
      “birth control”.
The other statements require no further analysis and thus are treated as atomic.

3.1.2. Aggregation of Atomic Statements into Arguments
We then aggregated atomic statements into arguments by exploiting both implicit and
explicit support links, as well as expansion links. Therefore, #2 and #6 together form
the argument #2#6, and similarly #4 together with #8 and #5a forms the argument #4#8#5a. However, expansion and
support play different roles: an expansion should be interpreted as a single argument that
spans multiple atomic statements. Support should rather be seen as a combination of two
sub-arguments. We chose to also represent sub-arguments in the Abstract Argumentation
Framework, and thus #4#8 should be considered as an additional argument, as well as
#5a alone.

3.2. Identification of Attacks

To simplify the discussion, we assumed that the arguments Y and N are mutually exclusive,
and thus attack each other. Therefore an implicit or explicit support for a positive
answer to the question (respectively a negative answer to the question) is transformed
into an attack on the negative answer (respectively the positive answer). Since both #5b
and #1 support (cf. Fig. 1) the negative answer to the question, they now both attack the
Y argument. Similarly, #4 supports a positive answer to the question and thus it attacks
the N argument.

Figure 2. Abstract Argumentation Framework derived from the analysis in Fig. 1. This figure also shows the
grounded labelling for this Framework: green nodes (solid, strong border) are IN, red nodes (no border) are
OUT, and gray nodes (dotted border) are UNDEC.

     Moreover, we also considered attacks derived from the attacking links, either explicit
or implicit, depicted in Fig. 1. Therefore, the argument #2#6 attacks #1, and similarly the
argument #4#8 (and clearly its super-argument comprising #4, #8 and #5a) attacks #1.
     Finally, #7b attacks the argument #2#6, while #7a is a self-defeating argument that
also undermines #7b.
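
The transformation of supports for an answer into attacks on the opposite answer can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name and the data layout are hypothetical, and only the supports explicitly mentioned in this section are listed.

```python
def supports_to_attacks(answer_supports, opposite):
    """Rewrite 'X supports answer A' as 'X attacks the answer opposite to A'.

    answer_supports: iterable of (argument, supported answer) pairs.
    opposite: dict mapping each answer to its mutually exclusive alternative.
    """
    attacks = set()
    # Mutually exclusive answers attack each other.
    for answer, other in opposite.items():
        attacks.add((answer, other))
    # A support for one answer becomes an attack on the other.
    for argument, answer in answer_supports:
        attacks.add((argument, opposite[answer]))
    return attacks


opposite = {"Y": "N", "N": "Y"}
answer_supports = [("#5b", "N"), ("#1", "N"), ("#4", "Y")]
attacks = supports_to_attacks(answer_supports, opposite)
```

With these inputs, #5b and #1 end up attacking Y, and #4 ends up attacking N, in addition to the mutual attack between Y and N.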

3.3. Relevance to the Dialogue and Filtering

The analysis depicted in Fig. 1 requires a language much richer than just abstract ar-
guments and attacks. The links marked with a question mark as well as those denoting
meta-information are rather complicated to represent in the abstract formalism. In this
pilot study we chose to ignore them instead of enforcing a specific semantics that—in
our opinion—is still unclear.
      Similarly, Fig. 1 includes edges pointing to other edges, potentially implying other
sorts of meta-information. Although there are proposals for encompassing recursive at-
tacks on Abstract Argumentation Frameworks [4], for the sake of this work we chose
once again to rely only on Dung’s original proposal which does not allow such cases—
i.e., attacks are only between arguments.
      Consequently, #3, #9, and #10 become unconnected arguments. Similarly, #12 at-
tacks #11, but together they are detached from the rest of the graph. Since they cannot
have any effect whatsoever on the dialectical acceptability of opinions for this dialogue,
in particular they cannot influence the acceptability of arguments Y or N, we chose to
filter them out from the final Abstract Argumentation Framework depicted in Fig. 2.
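
One way to automate this filtering step is to keep only the arguments that are connected, ignoring edge direction, to one of the answer arguments. The sketch below makes that assumption explicit. The function name is hypothetical and the attack relation is only a fragment chosen for illustration, not the full framework of Fig. 2.

```python
from collections import deque


def connected_to_answers(arguments, attacks, answers):
    """Return the arguments weakly connected to at least one answer argument."""
    neighbours = {a: set() for a in arguments}
    for source, target in attacks:
        # Ignore direction: we only ask whether two arguments are linked at all.
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen = set(answers)
    queue = deque(answers)
    while queue:  # breadth-first search from the answers
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


arguments = ["Y", "N", "#1", "#2#6", "#5b", "#3", "#9", "#10", "#11", "#12"]
attacks = [("Y", "N"), ("N", "Y"), ("#5b", "Y"), ("#1", "Y"),
           ("#2#6", "#1"), ("#12", "#11")]
kept = connected_to_answers(arguments, attacks, ["Y", "N"])
```

Here #3, #9, #10, #11 and #12 are dropped. Note that this criterion would also drop any isolated sub-argument, so arguments that are to be retained for other reasons would need to be exempted explicitly.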

3.4. Dialectical Acceptability of Arguments

Once the Abstract Argumentation Framework depicted in Fig. 2 is obtained, we can eval-
uate the dialectical acceptability of each argument by identifying positions—i.e., sets
of arguments—that together stand against critiques and form a coherent point of view.
In [14] several criteria are proposed for such a task, and each criterion identifies a specific
position, or extension using Dung’s terminology, given an Abstract Argumentation
Framework. Those criteria can also be expressed in terms of labellings; an exhaustive discussion of this
topic is beyond the scope of this paper, and interested readers are referred to [2]. In short, in a
complete labelling, an argument is labelled IN if all its attackers are OUT (which clearly
includes the case that the argument is unattacked), OUT if at least one of its attackers is
labelled IN, and UNDEC otherwise. Complete labellings are in one-to-one correspondence
with complete extensions, each extension being the set of IN arguments of its labelling [2].
Therefore, the unique complete labelling of this Abstract Argumentation Framework, which is
depicted in Fig. 2, also identifies the grounded extension (the minimal complete extension
w.r.t. set inclusion) as well as the unique preferred extension (preferred extensions being
the maximal complete extensions w.r.t. set inclusion) [14].
     Both Y and N are OUT as a combined effect of #5b and the argument comprising
#4, #8, and #5a. Although this result is inconclusive, it allows participants in the dialogue
to focus their attention strategically. Indeed, participants supporting the Yes answer
should focus on arguing against #5b, as it is the only argument
undermining the Y argument.
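
The grounded labelling can be computed by a simple fixpoint iteration over the IN and OUT rules stated above. The sketch below is illustrative only. The attack relation is our reading of Sections 3.1 to 3.3 (in particular, we assume that the attack on N originates from the argument #4#8#5a and that #5a alone is isolated); it is not copied from Fig. 2.

```python
def grounded_labelling(arguments, attacks):
    """Label each argument IN, OUT or UNDEC under grounded semantics."""
    attackers = {a: set() for a in arguments}
    for source, target in attacks:
        attackers[target].add(source)
    label = {}
    changed = True
    while changed:
        changed = False
        for arg in arguments:
            if arg in label:
                continue
            if all(label.get(att) == "OUT" for att in attackers[arg]):
                label[arg] = "IN"    # every attacker is OUT (or there is none)
                changed = True
            elif any(label.get(att) == "IN" for att in attackers[arg]):
                label[arg] = "OUT"   # at least one attacker is IN
                changed = True
    # Whatever could not be decided stays UNDEC.
    return {arg: label.get(arg, "UNDEC") for arg in arguments}


arguments = ["Y", "N", "#1", "#2#6", "#4#8", "#4#8#5a", "#5a", "#5b", "#7a", "#7b"]
attacks = [
    ("Y", "N"), ("N", "Y"),            # mutually exclusive answers
    ("#5b", "Y"), ("#1", "Y"),         # supports for No, rewritten as attacks on Y
    ("#4#8#5a", "N"),                  # support for Yes, rewritten as an attack on N
    ("#2#6", "#1"), ("#4#8", "#1"), ("#4#8#5a", "#1"),
    ("#7b", "#2#6"),
    ("#7a", "#7a"), ("#7a", "#7b"),    # #7a is self-defeating and undermines #7b
]
labels = grounded_labelling(arguments, attacks)
```

Under these assumptions both Y and N come out as OUT, #5b is IN, and the chain #7a, #7b, #2#6 remains UNDEC because of the self-attack on #7a.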


4. From the Graphical Analysis to a QuAD Framework

4.1. Identification of Arguments and Attacks

As a next step, we mapped the hand-annotated graphical representation given in Fig. 1
onto a QuAD Framework [5], and input this into the Arg&Dec tool (see footnote 2) to determine the
dialectical strength of opinions in the debate, as well as to rank the two answers (Yes/No)
to the debated question.
     We followed the same approach described in Section 3.1 to identify arguments. Un-
like Abstract Argumentation Frameworks, though, QuAD Frameworks allow both attack
and support relationships between arguments to be represented explicitly, by assigning
“types” to arguments (as pros or cons or answers). Thus, arguments #4#8 and #5a can
keep their separate identities in the resulting QuAD Framework, and #5a is no longer
“isolated”. Moreover, QuAD Frameworks, when visualised as graphs, are acyclic, with
the result that neither the mutual attack between the Y and N arguments nor the self-
attack by argument #7a can be represented directly in the resulting QuAD Framework
(see Fig. 3, where pros arguments are indicated with ‘+’, cons arguments are indicated as
‘-’ and answer arguments are indicated by a blue light-bulb/mushroom). Note that, since
arguments have a single “type” in QuAD Frameworks, if an argument simultaneously
attacks one argument and supports another, it (and all its descendants, if any) needs to be
duplicated, as in the case of argument #4#8 in Fig. 3.
     Note also that converting the original graphical analysis in Fig. 1 to the QuAD
Framework in Fig. 3 required simplifications similar to those for converting to Abstract
Argumentation (Section 3.3).

4.2. Dialectical Strength of Arguments

In QuAD Frameworks, arguments are assigned a dialectical strength, from which, in par-
ticular, a ranking amongst answer arguments is determined. Note that ranking answers
amounts to seeing them as “incompatible”; thus the lack of mutual attacks between an-
swers is not a genuine limitation of QuAD Frameworks.
  2 www.arganddec.com


Figure 3. QuAD Framework derived from the analysis depicted in Fig. 1, as depicted in Arg&Dec
(www.arganddec.com).


     In order to determine the strength of arguments in QuAD Frameworks, they need to
have a base score to start with (seen as an intrinsic strength, prior to any debate about
the arguments). Note that the self-attacking argument #7a in the original Fig. 1 can be
thought of as having a base score of 0, because of the self-attack, amounting to its com-
puted strength being also 0 (by using, for example, the methods for computing strength
in [5,19]). This renders the argument ineffective [5] and justifies its exclusion from the
QuAD Framework in Fig. 3.
     We experiment with two different policies for assigning base scores to the arguments
included in Fig. 3, leading to different rankings of the answers using the method in [5]:

    1. All arguments have a medium strength (0.5) to start with; this choice results in
       Yes being ranked higher than No (with computed strengths, respectively, 0.875
       and 0.796875);
    2. All arguments have a medium strength (0.5) to start with except

        • Argument #2#6, with a base score close to the maximum allowed (1), by virtue
          of the supporting meta edge from argument #7;
        • Argument #4#8, with a base score close to the minimum allowed (0), by virtue
          of the attacking meta edge from argument #7.

        Choosing base scores 0.9 for #2#6 and 0.1 for #4#8 results in No being ranked
        higher than Yes (with computed strengths, respectively, of 0.811875 and 0.775).

Thus, the use of base scores in QuAD Frameworks can accommodate information (e.g.,
meta edges) that plays no role in Abstract Argumentation Frameworks. Moreover, the use
of dialectical strength instead of dialectical acceptability of arguments can help to
discriminate better amongst arguments, but the outcome is highly sensitive to the choice of
base scores: here, changing the base scores of two arguments reverses the ranking of the
two answers.
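
The effect of the two base-score policies can be illustrated with a short sketch. Two assumptions are made here and should be checked against [5] and Fig. 3: the strength aggregation is the one we recall from the QuAD literature (attackers scale the base score towards 0, supporters scale it towards 1, and the two results are averaged when both are present), and the graph structure is our reconstruction from the text, with the duplicated leaf #4#8 represented by a single entry. The function and variable names are hypothetical.

```python
from math import prod


def quad_strength(arg, base, attackers, supporters):
    """Strength of `arg` in an acyclic graph of pros and cons (recursive)."""
    att = [quad_strength(a, base, attackers, supporters) for a in attackers.get(arg, [])]
    sup = [quad_strength(s, base, attackers, supporters) for s in supporters.get(arg, [])]
    v0 = base[arg]
    v_att = v0 * prod(1 - v for v in att)            # each attacker shrinks the score
    v_sup = 1 - (1 - v0) * prod(1 - v for v in sup)  # each supporter shrinks the gap to 1
    if att and sup:
        return (v_att + v_sup) / 2
    if att:
        return v_att
    if sup:
        return v_sup
    return v0


# Assumed structure: pros of Y, pros of N, and the cons below #1.
supporters = {"Y": ["#4#8", "#5a"], "N": ["#5b", "#1"]}
attackers = {"#1": ["#2#6", "#4#8"], "#2#6": ["#7b"]}
names = ["Y", "N", "#1", "#2#6", "#4#8", "#5a", "#5b", "#7b"]

uniform = {name: 0.5 for name in names}              # policy 1
skewed = dict(uniform, **{"#2#6": 0.9, "#4#8": 0.1})  # policy 2

yes_1 = quad_strength("Y", uniform, attackers, supporters)
no_1 = quad_strength("N", uniform, attackers, supporters)
yes_2 = quad_strength("Y", skewed, attackers, supporters)
no_2 = quad_strength("N", skewed, attackers, supporters)
```

Under these assumptions the sketch yields the strengths reported above: 0.875 for Yes and 0.796875 for No under the first policy, and 0.775 for Yes and 0.811875 for No under the second (up to floating-point rounding).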

Figure 4. AFRA derived from the analysis depicted in Fig. 1. In green (solid, strong border), the grounded
extension (restricted to the set of arguments) for this Framework.

5. From the Graphical Analysis to an AFRA

5.1. Identification of Arguments and Attacks

As the final step, we mapped the hand-annotated graphical representation (Fig. 1) onto
AFRA [4], and input this into the Aspartix [15] tool (see footnote 3) to determine its grounded extension.
     We followed the same approach described in Section 3.1 to identify arguments (see footnote 4).
Unlike Abstract Argumentation Frameworks, though, AFRA allows attacks to be in turn
the object of other attacks. Therefore, we are now able to represent the attacks from #5²
on the attacks from #4#8#5a on #1 (and similarly from #4#8 on #1) and from
#4#8#5a on N. The resulting framework is depicted in Fig. 4. Please note that an attack on
a support, such as the one from #5² against the support from #4 to Y (Fig. 1), becomes an
attack on the attack from #4#8#5a against N (Fig. 2, see discussion in Section 3.2).

5.2. Dialectical Strength of Arguments

The semantic notions of AFRA are derived from those that apply to Dung’s Abstract
Argumentation Framework [14]. The main difference is that attacks also participate
as active actors and thus can also be part of an extension. In particular,
the grounded extension of the AFRA depicted in Fig. 4 is {N, #5b, #4#8#5a, #4#8, #5a,
α, […], ζ, θ, […]}. Thus, in this representation, the No answer is accepted.
     Fig. 4 also depicts the restriction of the grounded extension to the set of arguments
only. The ζ attack, in particular, is pivotal in defending the argument N from the attack
it received from #4#8#5a, which is instead effective when using only Dung’s framework
(Section 3.1).
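
To illustrate how attacks take part in the evaluation, the sketch below computes a grounded extension by flattening an AFRA into an ordinary attack graph over arguments and attacks. It assumes the usual AFRA notion of defeat as we understand it from [4]: an attack defeats its target and, indirectly, every attack whose source is that target. The attack names, the choice of #5b as the source of the attacks on attacks, and the function name are assumptions of this sketch. It does not use Aspartix.

```python
def afra_grounded(arguments, attacks):
    """attacks: name -> (source argument, target argument or attack name)."""
    nodes = list(arguments) + list(attacks)
    defeaters = {n: set() for n in nodes}
    for name, (_, target) in attacks.items():
        defeaters[target].add(name)              # direct defeat
        for other, (source, _) in attacks.items():
            if source == target:
                defeaters[other].add(name)       # indirect defeat via the source
    label = {}
    changed = True
    while changed:                               # grounded fixpoint
        changed = False
        for n in nodes:
            if n in label:
                continue
            if all(label.get(d) == "OUT" for d in defeaters[n]):
                label[n] = "IN"
                changed = True
            elif any(label.get(d) == "IN" for d in defeaters[n]):
                label[n] = "OUT"
                changed = True
    return {n for n in nodes if label.get(n) == "IN"}


arguments = ["Y", "N", "#1", "#2#6", "#4#8", "#4#8#5a", "#5a", "#5b", "#7a", "#7b"]
attacks = {
    "Y->N": ("Y", "N"), "N->Y": ("N", "Y"),
    "#5b->Y": ("#5b", "Y"), "#1->Y": ("#1", "Y"),
    "#4#8#5a->N": ("#4#8#5a", "N"),
    "#4#8#5a->#1": ("#4#8#5a", "#1"), "#4#8->#1": ("#4#8", "#1"),
    "#2#6->#1": ("#2#6", "#1"), "#7b->#2#6": ("#7b", "#2#6"),
    "#7a->#7a": ("#7a", "#7a"), "#7a->#7b": ("#7a", "#7b"),
    # Attacks on attacks (source assumed to be #5b).
    "#5b->(#4#8#5a->N)": ("#5b", "#4#8#5a->N"),
    "#5b->(#4#8#5a->#1)": ("#5b", "#4#8#5a->#1"),
    "#5b->(#4#8->#1)": ("#5b", "#4#8->#1"),
}
extension = afra_grounded(arguments, attacks)
accepted_arguments = extension & set(arguments)
accepted_attacks = extension & set(attacks)
```

Under these assumptions the accepted arguments are N, #5b, #4#8#5a, #4#8 and #5a, together with five attacks, which agrees in size and in its argument part with the grounded extension given above. The attack on the attack from #4#8#5a to N is what reinstates N.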


6. Conclusion

In this paper we discuss a pilot study for comparing different argumentation frameworks
on the basis of the same annotation resulting from an analysis of an online debate. The
analysis suggests that the information captured by the original annotation scheme (Fig. 1)
is much richer than what can be represented in some of the current state-of-the-art frameworks
and tools. We also lack a ground truth (for assessing which position debated is
strongest) to assess which tool is better equipped for the task of analysing the specific
dialogue we considered in this pilot study.

  3 https://www.dbai.tuwien.ac.at/proj/argumentation/systempage/
  4 Although AFRA allows more interactions to be represented than Dung’s AF, e.g., #11 attacking the support
from #4 to Y, we chose to consider the same set of arguments identified in Section 3 in order to facilitate the
comparison among the different formalisms.

     In the case at hand, including more elements of the original annotation schema in
the formal analysis, as in the case of AFRA, leads to a less undecided
outcome of the dialogue. Moreover, the use of graded semantics as in QuAD
allows a much more fine-grained analysis and shows how initial assumptions on the base
score of each argument might have a noticeable effect on the outcome of the dialogue.
     Apart from highlighting differences between Abstract Argumentation, QuAD, and
AFRA, this pilot study shows how the proposed annotation scheme (Fig. 1) seems well
equipped to represent the complexity of online debates, and that it could be used to
produce a set of benchmarks for a variety of frameworks. In fact, most, if not all, of
the process described in Sections 3, 4, and 5 can be easily automated. The foremost
open issues are determining the arguments (cf. Section 3.1.1) and the relationships among
them, especially considering that non-expert annotations are of little help (cf. Section 2).
     This pilot study may also help in linking the two research areas of Argument Mining
(ArgMin) and of Computational Argumentation (CompArg). In particular, we showed
that the output of a potential ArgMin process—namely, the graphical analysis in Fig. 1—
may become the input to tools developed in the CompArg community for determining
the dialectical acceptability or strength of opinions in debates. Moreover, this mapping
may provide valuable feedback to debaters, for example, to inform strategies regarding
which aspects to focus on in order to modify the outcome of debates, or to make deci-
sions based on debates. Concretely, we mapped a naturally-occurring multi-party debate
from a debate website onto a hand-annotated graphical representation, and then: (a) onto
an Abstract Argumentation Framework to determine the dialectical acceptability of opin-
ions [14]; (b) onto a QuAD Framework [5] to determine the dialectical strength of opin-
ions using the Arg&Dec tool [1]; and (c) onto an AFRA Framework [4] to encompass
more elements of the original analysis (Fig. 1).
     Future work will include evaluating other frameworks proposed in CompArg, e.g.,
ADF [9] or GRAPPA [10], for representing debates at the level of detail required by
the annotations described in Fig. 1. Moreover, as the investigation of human perception and
behavior in argumentative interactions is becoming more prominent in argumentation
research [12,22], we plan a more thorough investigation of how
non-expert annotations made by human debaters can be used by automatic tools.
     This pilot study also raises a number of questions for the ArgMin community, while
at the same time shedding some light on the applicability of the approach taken. For
instance, it would be interesting to study whether any of the existing NLP methods and tools
could be deployed to support the automatic generation of the initial graphical
representation and annotation scheme. Moreover, it would be interesting to study other debates
to ascertain whether the annotation scheme we identified generalises.


Acknowledgements

The input debate was suggested by Adam Wyner and Ivan Habernal, as part of the
Dagstuhl seminar on “Natural Language Argumentation: Mining, Processing, and Rea-
soning over Textual Arguments”. We also thank Ivan Habernal for helpful feedback dur-
ing the preliminary graphical analysis of the debate described in Section 2.
References

 [1] M. Aurisicchio, P. Baroni, D. Pellegrini, and F. Toni. Comparing and integrating argumentation-based
     with matrix-based decision support in Arg&Dec. In Theory and Applications of Formal Argumentation -
     Third International Workshop, TAFA 2015, Buenos Aires, Argentina, July 25-26, 2015, Revised Selected
     Papers, pages 1–20, 2015.
 [2] P. Baroni, M. Caminada, and M. Giacomin. An introduction to argumentation semantics. Knowledge
     Engineering Review, 26(4):365–410, 2011.
 [3] P. Baroni, F. Cerutti, M. Giacomin, and G. Guida. Encompassing Attacks to Attacks in Abstract
     Argumentation Frameworks. In C. Sossai and G. Chemello, editors, ECSQARU 2009, pages 2–7. LNAI,
     Springer-Verlag, 2009.
 [4] P. Baroni, F. Cerutti, M. Giacomin, and G. Guida. AFRA: Argumentation framework with recursive
     attacks. International Journal of Approximate Reasoning (Special Issue Tenth European Conference on
     Symbolic and Quantitative Approaches to Reasoning with Uncertainty - ECSQARU 2009), 52(1):19–37,
     2011.
 [5] P. Baroni, M. Romano, F. Toni, M. Aurisicchio, and G. Bertanza. Automatic evaluation of design
     alternatives with quantitative argumentation. Argument & Computation, 6(1):24–49, 2015. Special
     issue: Applications of logical approaches to argumentation.
 [6] T. J. M. Bench-Capon and P. E. Dunne. Argumentation in artificial intelligence. Artif. Intell., 171(10-
     15):619–641, 2007.
 [7] P. Besnard and A. Hunter. Elements of Argumentation. The MIT Press, 2008.
 [8] E. Bonzon, J. Delobelle, S. Konieczny, and N. Maudet. A Comparative Study of Ranking-based Seman-
     tics for Abstract Argumentation. In Proceedings of the 30th AAAI Conference on Artificial Intelligence
     (AAAI’16), pages 914–920, 2016.
 [9] G. Brewka, S. Ellmauthaler, H. Strass, J. P. Wallner, and S. Woltran. Abstract dialectical frameworks
     revisited. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence IJCAI
     2013, pages 803–809. AAAI Press, August 2013.
[10] G. Brewka and S. Woltran. GRAPPA: A Semantical Framework for Graph-Based Argument Processing.
     In 21st European Conference on Artificial Intelligence, pages 153–158, 2014.
[11] K. Budzynska and C. Reed. Whence inference? Technical report, University of Dundee, 2011.
[12] F. Cerutti, N. Tintarev, and N. Oren. Formal arguments, preferences, and natural language interfaces to
     humans: an empirical evaluation. In ECAI, pages 207–212, 2014.
[13] C. I. Chesñevar, J. McGinnis, S. Modgil, I. Rahwan, C. Reed, G. R. Simari, M. South, G. A. W.
     Vreeswijk, and S. Willmott. Towards an argument interchange format. The Knowledge Engineering
     Review, 21(04):293, December 2006.
[14] P. M. Dung. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning,
     logic programming and n-person games. Artificial Intelligence, 77(2):321–357, 1995.
[15] U. Egly, S. A. Gaggl, and S. Woltran. Answer-Set Programming Encodings for Argumentation
     Frameworks. Technical Report DBAI-TR-2008-62, Technische Universität Wien, 2008.
[16] M. Lippi and P. Torroni. Argument Mining: A Machine Learning Perspective. In The 2015 International
     Workshop on Theory and Applications of Formal Argument, 2015.
[17] M-F. Moens. Argumentation mining: Where are we now, where do we want to be and how do we get
     there? In Post-proceedings of the forum for information retrieval evaluation (FIRE 2013), 2014.
[18] A. Peldszus and M. Stede. From argument diagrams to argumentation mining in texts: A survey. Int. J.
     Cogn. Inform. Nat. Intell., 7(1):1–31, January 2013.
[19] A. Rago, F. Toni, M. Aurisicchio, and P. Baroni. Discontinuity-free decision support with quantitative
     argumentation debates. In Principles of Knowledge Representation and Reasoning: Proceedings of the
     Fifteenth International Conference, KR 2016, Cape Town, South Africa, 2016.
[20] I. Rahwan, B. Banihashemi, C. Reed, D. Walton, and S. Abdallah. Representing and classifying argu-
     ments on the Semantic Web. The Knowledge Engineering Review, 26(04):487–511, November 2011.
[21] I. Rahwan and G. R. Simari. Argumentation in Artificial Intelligence. Springer, 2009.
[22] A. Rosenfeld and S. Kraus. Providing arguments in discussions based on the prediction of human
     argumentative behavior. In AAAI, pages 1320–1327, 2015.

