<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Benno Stein</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Khalid Al-Khatib</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yamen Ajjour</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roxanne El Baff</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Henning Wachsmuth</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Philipp Cimiano</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Paderborn University, Department of Computer Science</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Bielefeld University, AG Semantic Computing</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Bauhaus-Universität Weimar, Faculty of Media, Webis Group</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces the Same Side Stance Classification problem and reports on the outcome of a related shared task, which has been collocated with the Sixth Workshop on Argument Mining at the ACL 2019 in Florence.1 We have proposed this task as a variant of the well-known stance classification task: Instead of predicting for a single argument whether it has a positive or negative stance towards a given topic, same side classification 'merely' involves the prediction of whether two given arguments share the same stance. The paper in hand provides the rationale for proposing this task, overviews important related work, describes the developed datasets, and reports on the results along with the main methods of the nine submitted systems. We draw conclusions from these results with respect to the suitability of the task as a proxy for measuring progress in the field of argument mining.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Identifying (i.e., classifying) the stance of an
argument towards a particular topic is a fundamental
task in computational argumentation and argument
mining. The stance of an argument as considered
here is a two-valued function: it can either be “pro”
a topic (meaning, “yes, I agree”), or “con” the topic
(“no, I do not agree”).</p>
      <p>Here we propose a related though simpler task,
called same side stance classification (later also
referred to as sameside). Same side stance
classification deals with the problem of classifying two
arguments as to whether they (a) share the same
stance or (b) have a different stance towards the
topic in question.</p>
      <p>As an example, consider the following two
arguments on the topic “gay marriage”, which
obviously are on the same side.</p>
      <p>Argument 1. Marriage is a commitment to
love and care for your spouse till death. This
is what is heard in all wedding vows. Gays can
clearly qualify for marriage according to these
vows, and any definition of marriage deduced
from these vows.</p>
      <p>Argument 2. Gay Marriage should be
legalized since denying some people the option to
marry is discriminatory and creates a second
class of citizens.</p>
      <p>Argument 3 below, however, is neither on the
side of Argument 1 nor on the side of Argument 2.</p>
      <p>Argument 3. Marriage is the institution that
forms and upholds for society, its values and
symbols are related to procreation. To change
the definition of marriage to include same-sex
couples would destroy its function, because it
could no longer represent the inherently
procreative relationship of opposite-sex pair-bonding.</p>
      <p>Same side stance classification is simpler than
the “classical” stance classification problem, or at
most equally complex: solving the latter implies
solving the former as well.</p>
      <p>Aside from the difference in problem complexity
a second aspect renders same side stance
classification a relevant task of its own right: Stance
classification, by definition, requires knowledge about
the topic that an argument is meant to address, i.e.,
stance classifiers must be trained for a particular
topic and hence cannot be reliably applied to other
(i.e., across) topics. In contrast, a same side stance
classifier does not necessarily need to distinguish
between topic-specific pro- and con-vocabulary;
“merely” the argument similarity within a stance
needs to be assessed. Consequently, same side
stance classification is likely to be solvable
independently of a topic or a domain—so to speak, in
a topic-agnostic fashion. Since topic agnosticity
is a big step towards application robustness and
flexibility, we believe that the development of
technologies that tackle this task has game-changing
potential.</p>
      <p>Last but not least, same side stance classification
has a number of useful and important applications
related to both argumentation analytics and
information retrieval, including but not limited to the
following:</p>
      <p>Measuring the strength of bias within an
argumentative utterance (analytics).</p>
      <p>Structuring a discussion (analytics).</p>
      <p>Finding out who or what is challenging in a
discussion (analytics, retrieval).</p>
      <p>Filtering wrongly labeled arguments in a large
argument corpus, without relying on
knowledge of a topic or a domain (retrieval).</p>
    </sec>
    <sec id="sec-4">
      <p>To initiate research on same side stance
classification, we carried out a first respective shared
task in collocation with the Sixth Workshop on
Argument Mining at ACL 2019. We report on this
shared task and its results in the paper in hand.</p>
      <p>The remainder is organized as follows. Section 2
formalizes the same side stance classification task
and relates it to other problems in the field.
Section 3 points to relevant research and suggested
readings related to stance classification. Section 4
describes the dataset and the experiment settings of
the shared task. Section 5 reports on the systems of
the nine participating teams and their effectiveness.
Section 6 concludes with the lessons learned and
the planned follow-up research.</p>
      <sec id="sec-4-1">
        <title>Argument Decision Problems</title>
        <p>The same side stance classification task, sameside,
is a decision task in the field of computational
argumentation. As outlined in Section 1, mastering this
task is beneficial in the context of argumentation
analytics and information retrieval. This section
provides a succinct formalization of the problem.</p>
        <p>The syntax of the argument model underlying
sameside is rather simple but well-accepted: An
argument consists of a conclusion, c, and a set (a
conjunction) of premises, P.</p>
        <p>Both premises and conclusions are considered
as propositions to which a truth value can be
assigned. For this purpose an interpretation function,
I, which maps from premises and conclusion to
{0, 1} can be stated. Based on I the premises
P and the conclusion c can be connected
semantically. Recall in this regard the classical notion of
entailment, which bases the concept of logical
consequence on all possible interpretation functions:
Given two propositional formulas φ, ψ, then φ
entails ψ (denoted as φ ⊨ ψ) if and only if for all I
holds:</p>
        <p>I(φ) = 1 implies I(ψ) = 1
(1)</p>
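        <p>For intuition, the universal quantification over interpretation functions in (1) can be checked by brute force for small propositional formulas. The following sketch models formulas as Python predicates over an assignment of atoms; the names are illustrative, not part of the paper.</p>
```python
from itertools import product

# Brute-force check of classical entailment (1): phi entails psi iff
# every interpretation I that makes phi true also makes psi true.
# Formulas are modeled as predicates over a dict mapping atoms to 0/1.
def entails(phi, psi, atoms):
    for values in product([0, 1], repeat=len(atoms)):
        I = dict(zip(atoms, values))
        if phi(I) and not psi(I):
            return False
    return True
```
        <p>For example, "p and q" entails "p", whereas "p or q" does not.</p>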
        <p>However, for our argument model (and for
argumentation in natural language in general) this
notion of entailment is not applicable: human
language cannot be stuffed entirely into logical
formulas; the detection of semantically equivalent
argument units (which is necessary to transform
formulas whose atoms correspond to argument units)
belongs to the hardest NLP problems; truth
entailment in natural language is not restricted to a
recursive evaluation of truth values but comes in many
different flavors such as argument from authority,
analogical argument, or inductive argument; and
so forth.</p>
        <p>
          In any way, argumentation theory speaks of
acceptability rather than truth, since truth is
often unknown or not accessible
          <xref ref-type="bibr" rid="ref22 ref23">(Wachsmuth et al.,
2017a)</xref>
          . The acceptability of an argument is
subjective, which we capture as follows. Given an
interpretation function I, propositional premises P,
and a propositional conclusion c, then (c, P) is an
acceptable argument if and only if holds:
        </p>
        <p>I(∧p∈P p) = 1 and I(c) = 1
(2)</p>
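        <p>Under a single fixed interpretation I, condition (2) reduces to a plain conjunction check. A minimal sketch, with I as a dict from propositions to 0/1 (the representation is an assumption of this sketch):</p>
```python
# Acceptability (2) under one fixed interpretation I: the argument
# (c, P) is acceptable iff all premises in P and the conclusion c
# are mapped to 1 by I.
def acceptable(c, P, I):
    return all(I[p] == 1 for p in P) and I[c] == 1
```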
        <p>Compared to the classical notion of entailment
the universality requirement regarding
interpretation functions is relaxed. In this vein, (c, P) may
be an argument for an individual, for a group, or
for all beholders—depending on the respective I.
Also, due to the aforementioned reasons, there is
no simple structural means2 that connects the
interpretation of c to the interpretation of P : For
participants in a debate the interpretation of the
premises may be identical, but their mental models
to determine the truth value of c, as well as the truth
value itself, can differ.</p>
        <p>The formalization of argument acceptability via
interpretation functions as introduced above
illustrates how a belief semantics for arguments can be
formalized. However, the identification and
classification of argument stance (as treated here as well
as treated by other researchers) does not depend on
individual interpretation functions. Arguments are
formulated purposefully with respect to a thesis,
which means that they are always dedicated to be
used either as pro or as con argument—independent
of the acceptability of a beholder.</p>
        <p>To formalize the interesting argument decision
problems we will consider a propositional thesis t, also
called the “main claim”, which encodes a particular
“side” of a controversial issue. E.g., when referring
to the introductory example, t may encode “Gay
marriage is a great achievement.”, but t may also
encode “Gay marriage cannot be tolerated.”.3</p>
        <p>Let A = {(c1, P1), (c2, P2), …, (cn, Pn)} be a
set of arguments related to t, then we are also given
an (implicitly defined) function σ, called “stance”,
which maps each argument A ∈ A either to pro
or to con: σ encodes for which side of a
controversial issue an argument is devised. A pro
argument supports t; likewise, a con argument attacks t.
Two arguments A1 and A2 have the same stance iff
σ(A1) = σ(A2).</p>
        <p>Using these definitions, the following decision
problems, among others, can be stated. Given are a
thesis t and a set of related arguments A.</p>
        <p>sameside. Decide for two arguments, A1, A2
in A whether or not they have the same stance.</p>
        <p>stance. Decide for an argument A in A
whether it has a pro or a con stance, i.e.,
whether σ(A) = pro or σ(A) = con.</p>
        <p>Algorithmic stance classification as treated here
means to learn the function σ from a set of
examples.</p>
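        <p>The relation between the two decision problems, noted in Section 1 (solving stance implies solving sameside), can be made explicit. A sketch, where the stance classifier is a hypothetical stand-in for a learned model:</p>
```python
# Reduction from sameside to stance: any classifier for the stance
# function (returning "pro" or "con") immediately yields a same side
# classifier by comparing the two predictions. The converse does not
# hold. `stance` is a placeholder for a learned classifier.
def same_side(a1, a2, stance):
    return stance(a1) == stance(a2)
```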
        <p>2Except for the trivial case where c ∈ P.</p>
        <p>3Given a thesis t we can consider its opposite as antithesis.</p>
        <p>
          We have first mentioned same side stance
classification as a potential task in the context of argument
search
          <xref ref-type="bibr" rid="ref2">(Ajjour et al., 2019)</xref>
          . Some related previous
research has been concerned with the agreement
of different texts on a given topic
          <xref ref-type="bibr" rid="ref14">(Menini et al.,
2017)</xref>
          . In computational argumentation, the task
is new to our knowledge, which is why we restrict
our view to the most related task in the following:
stance classification.
        </p>
        <p>
          Stance classification has drawn a wide interest
in the last decade. The problem has been studied
for various linguistic genres including online
debates
          <xref ref-type="bibr" rid="ref11 ref18 ref18 ref19">(Somasundaran and Wiebe, 2009; Hasan and
Ng, 2013; Ranade et al., 2013)</xref>
          , political debates
          <xref ref-type="bibr" rid="ref1 ref15 ref21 ref7">(Vilares and He, 2017)</xref>
          , tweets
          <xref ref-type="bibr" rid="ref1 ref15">(Addawood et al.,
2017; Mohammad et al., 2017)</xref>
          , and spontaneous
speech
          <xref ref-type="bibr" rid="ref13">(Levow et al., 2014)</xref>
          . Stance classification
approaches have been motivated by different goals,
such as fact checking
          <xref ref-type="bibr" rid="ref16 ref5 ref7">(Bourgonje et al., 2017; Baly
et al., 2018; Nadeem et al., 2019)</xref>
          , enthymeme
reconstruction
          <xref ref-type="bibr" rid="ref17">(Rajendran et al., 2016)</xref>
          , and
knowledge graph building
          <xref ref-type="bibr" rid="ref20">(Toledo-Ronen et al., 2016)</xref>
          .
The underlying methods concentrate on supervised
learning. Among these, Bar-Haim et al. (2017)
employ a support vector machine with multiple
linguistic features, similar to those used in
sentiment analysis. Iyyer et al. (2014) apply recursive
neural networks, Augenstein et al. (2016) use a
bidirectional LSTM, and Chen et al. (2018)
implement a hybrid neural attention model. Unlike
stance classification, the task we consider here
largely abstracts from the topic on which stance is
expressed.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Dataset and Experiments</title>
        <p>
          In the shared task we carried out, we have devised
two types of same side stance classification
experiments: within a single topic and across two topics.
The latter experiment type models the situation of
a domain transfer and addresses the question of
topic-agnostic classification. As topics we chose
“gay marriage” and “abortion”, and we sampled
the respective argument datasets from the corpus
underlying the argument search engine args.me
          <xref ref-type="bibr" rid="ref22 ref23">(Wachsmuth et al., 2017b)</xref>
          . The following
subsections provide details about the dataset construction
and the experiment setup.
        </p>
        <p>
          Because of its size and the balanced stance
distribution, the args.me corpus provides a rich source
for our experiments. At the time of the shared task
the corpus consisted of 387 606 arguments
collected from 59 637 debates; a detailed description
can be found in
          <xref ref-type="bibr" rid="ref2">(Ajjour et al., 2019)</xref>
          .4
        </p>
        <p>An argument in args.me is modeled as a
conclusion along with a set of supporting premises. In
addition, each premise is labeled with a stance,
indicating whether it is “pro” or “con” the conclusion.
The stances originate from the debates in which the
arguments are used. Debates can be started from
different viewpoints; for instance, one debate may
discuss the viewpoint “abortion should be
legalized” while another may discuss “abortion should
be banned”. Therefore, the stance of an argument
has to be interpreted in relation to the arguments in the
same debate. During the acquisition of the
data for the shared task we followed this constraint
by ensuring that the arguments of an argument pair
always stem from the same debate.</p>
        <p>The count of debates that treat “abortion” and
“gay marriage” is 1567 and 712 respectively. We
filtered out those arguments whose premises are
shorter than four words since they are often meta
statements such as “I win” or “I accept”. As a
result, we kept 9426 arguments on abortion and
4480 arguments on gay marriage for the task.
</p>
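        <p>The premise-length filter described above can be sketched as follows; the arguments are reduced here to a list of premise texts, which is an assumption of this sketch rather than the actual corpus schema.</p>
```python
# Keep an argument only if none of its premises is shorter than four
# words; very short premises are often meta statements such as
# "I win" or "I accept".
def keep_argument(premises):
    return all(len(p.split()) >= 4 for p in premises)
```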
        <sec id="sec-4-2-1">
          <title>Experiments</title>
          <p>Starting from the arguments in a debate, we
generated all possible argument pairs. An argument pair
was labeled as “Sameside” if both arguments are
either “pro” or “con” the viewpoint of the debate,
otherwise the pair was labeled as “Diffside”. Pairs
with identical arguments were removed.</p>
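          <p>The pair generation can be sketched as follows; arguments are reduced to (text, stance) tuples here, which is an assumption of this sketch, not the actual corpus schema.</p>
```python
from itertools import combinations

# Build all unordered argument pairs of one debate and label them:
# "Sameside" if both arguments take the same stance on the debate's
# viewpoint, "Diffside" otherwise. Pairs of identical arguments
# are dropped.
def generate_pairs(arguments):
    pairs = []
    for (t1, s1), (t2, s2) in combinations(arguments, 2):
        if t1 == t2:
            continue  # remove pairs with identical arguments
        label = "Sameside" if s1 == s2 else "Diffside"
        pairs.append((t1, t2, label))
    return pairs
```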
        </sec>
        <sec id="sec-4-2-2">
          <title>Within-Topic Experiments</title>
          <p>The within-topic experiments treat the two
topics “Abortion” and “Gay marriage” independently
of each other. The training sets each contain 67% of
the argument pairs of one topic, which were randomly
chosen. The test sets were formed from the remaining
33% for the respective topic. Among others, it was
ensured that a label for an argument pair in the test
set cannot be transitively deduced.5 Note in this
regard that the “same side” relation forms an
equivalence relation. See Table 1 for the
within-topic dataset statistics.</p>
          <p>4The entire args.me corpus can be accessed here:
https://webis.de/data.html#args-me</p>
        </sec>
        <sec id="sec-4-2-3">
          <title>Cross-Topics Experiment</title>
          <p>The cross-topics experiment provides a different
topic for training than the one used for testing. In
particular, the training set contains argument pairs
from the “abortion” debates only, while the test set
contains argument pairs from “gay marriage” debates
only. “Sameside” pairs and “Diffside” pairs are
balanced. See Table 2 for the cross-topics dataset
statistics.</p>
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>Submitted Systems and Results</title>
        <p>
          Overall, nine teams participated in the first shared
task on same side stance classification. This section
provides a brief overview of the systems that the
teams submitted, along with their results.
        </p>
        <p>
          Düsseldorf University The system submitted by
Düsseldorf University relies on a Siamese network
trained to predict the similarity of two arguments
on top of a small BERT
          <xref ref-type="bibr" rid="ref9">(Devlin et al., 2018)</xref>
          . As
the maximum token length for BERT is 512 tokens,
a relevance selection component is integrated that
ranks the sentences and cuts the ranked input
at 512 tokens. The system achieved an accuracy
of 60% on the within-topic task and 66% across
topics.
        </p>
        <p>5With transitive deduction we mean: SameSide(A1, A2) ∧ SameSide(A3, A2) ⊢ SameSide(A1, A3)</p>
        <p>IBM Research The system submitted by IBM
is based on a small vanilla BERT model and has
been first fine-tuned to perform standard binary
pro/con stance classification on data extracted from
the IBM Debater project. On top of this model,
another model is initialized and fine-tuned on the
same side classification task. The system obtained
results inverse to the ones of Düsseldorf University:
66% accuracy in the within-topic setting and 60% in
the cross-topics setting.</p>
        <p>Leipzig University The system submitted by
Leipzig University uses a pre-trained BERT model
that is fine-tuned on the same side stance
classification task. In addition, a binary classification layer
with one output and cross entropy loss function
is used instead of a multilabel classification layer.
To embed an argument, the first 254 tokens of an
argument are fed through the BERT model. Then,
the last 254 tokens of an argument are embedded.
The concatenation of both embeddings is fed into
the classification layer. The system achieved an
accuracy of 77% in the within-topic setting and
72% in the cross-topics setting.</p>
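        <p>The head-plus-tail encoding described above can be sketched as follows; `embed` is a placeholder for the BERT encoder and an assumption of this sketch.</p>
```python
# Embed the first 254 and the last 254 tokens of an argument
# separately, then concatenate the two embeddings before the
# classification layer. `embed` stands in for the BERT encoder.
def encode_argument(tokens, embed, head_len=254, tail_len=254):
    head = embed(tokens[:head_len])
    tail = embed(tokens[-tail_len:])
    return head + tail  # concatenation of both embeddings
```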
        <p>LMU The system submitted by the Ludwig
Maximilian University (LMU) relies on a vanilla
pre-trained BERT base model that is fine-tuned to the
shared task. The data is organized in a graph with
one graph per topic. Nodes represent arguments,
and edges are labeled with the confidence that the
associated arguments agree with each other. This
graph-based approach has the benefit that more
training data can be generated by a transitive
closure. Its accuracy was 55% in the within-topic
setting and 63% in the cross-topics setting.</p>
        <p>MLU Halle The system submitted by the
Martin-Luther-University (MLU) of Halle-Wittenberg
consists of three systems. The first system uses a
tree-based learning algorithm as classifier using
standard bag-of-words features. The second is a
rule-based approach that reduces the task to sentiment
classification, relying on rules defined over lists of
words with their polarity taken from a sentiment
lexicon. The third is a re-implementation of the
stance classification approach of Bar-Haim et al.
(2017). The best system achieves an accuracy of
54% in the within-topic setting and 50% in the
cross-topics setting.</p>
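        <p>Because the “same side” relation is an equivalence relation, additional labeled pairs can be deduced transitively, as the LMU system exploits. A minimal union-find sketch (illustrative, not any team's actual code):</p>
```python
# Union-find over arguments: each known same-side pair merges two
# components; two arguments are then deducibly on the same side iff
# they end up in the same component.
class SameSideClosure:
    def __init__(self):
        self.parent = {}

    def find(self, a):
        self.parent.setdefault(a, a)
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a

    def add_same_side(self, a1, a2):
        self.parent[self.find(a1)] = self.find(a2)

    def same_side(self, a1, a2):
        return self.find(a1) == self.find(a2)
```
        <p>E.g., after adding the same-side pairs (A1, A2) and (A3, A2), the pair (A1, A3) is deduced to be on the same side, matching the transitive deduction of footnote 5.</p>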
        <p>
          Paderborn University The system submitted by
Paderborn University relies on a Siamese Neural
Network to map arguments to a new space where
arguments with the same stance are closer to each
other, and other arguments are less close.
Arguments are represented by the contextual word
embeddings provided by the Flair library
          <xref ref-type="bibr" rid="ref3">(Akbik et al.,
2018)</xref>
          . A final sigmoid activation function produces
the output used for same side stance classification.
The system achieved an accuracy of 53% within
topics and 56% across topics.
        </p>
        <p>Trier University The system submitted by Trier
University relies on a pre-trained BERT base model
fine-tuned to the shared task. It was submitted
with different configurations. The best yielded an
accuracy of 77% in the within-topic setting and
73% in the cross-topics setting, the worst 56% and
53%, respectively.</p>
        <p>TU Darmstadt The system submitted by the TU
Darmstadt relies on a multi-task deep network on
the basis of the pre-trained large BERT model. The
network is trained on a number of pro/con stance
classification datasets in addition to the shared task
dataset. The system achieved an accuracy of 64%
in the within-topic setting and 63% in the
cross-topics setting.</p>
        <p>University of Potsdam The system submitted by
the University of Potsdam relies on bidirectional
LSTMs to encode the arguments. The embeddings
of both arguments are concatenated, multiplied in
an element-wise fashion, subtracted, and fed into
a two-layer MLP as a classification layer. The
system achieved 51% accuracy both within and
across topics.</p>
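        <p>The combination of the two argument encodings described above can be sketched on plain lists (the encoder and the MLP are omitted; names are illustrative):</p>
```python
# Potsdam-style pair features: concatenate the two encodings together
# with their element-wise product and difference; the result is fed
# to a two-layer MLP classifier (not shown here).
def combine(u, v):
    prod = [a * b for a, b in zip(u, v)]
    diff = [a - b for a, b in zip(u, v)]
    return u + v + prod + diff
```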
      </sec>
      <sec id="sec-4-4">
        <title>Discussion and Outlook</title>
        <p>The results of the shared task support a number
of interesting conclusions. First of all, the results
have validated our hypothesis that a topic-agnostic
approach to same side stance classification is
feasible. This is clearly conveyed by the fact that the
within-topic and the cross-topics setting seem to
be of similar complexity. Also, the differences in
accuracy between the two settings are mostly within
5–6 percentage points, additionally corroborating
the hypothesis.</p>
        <p>A second conclusion is that the effectiveness
of most systems clearly improves over a random
baseline, showing that the task is generally feasible.
At the same time, however, the results show that
there is potential for improvement.</p>
        <p>
          As for other tasks in the field of argumentation,
such as the Argument Reasoning Comprehension
Task, ARCT
          <xref ref-type="bibr" rid="ref10">(Habernal et al., 2018)</xref>
          , encoder-based
models seem to reach top results. In fact, all of the
top-5 performing systems on our task (Trier
University, Leipzig University, IBM Research, TU
Darmstadt, and Düsseldorf University) rely on a BERT
model. They differ mainly in the way the input
is encoded. As the length of input arguments
exceeds the maximum input length for BERT models,
the participants explored and proposed different
approaches, such as encoding the beginning and end
of the arguments separately and then
concatenating these encodings or implementing a relevance
ranking system to encode only the most relevant
sentences of the argument. In any case, the
encoding strategy seems to have a clear impact on the
results and thus deserves further investigation.
        </p>
        <p>For related tasks, e.g. the ARCT, it has been
found recently that encoder-based models seem to
pick up surface cues and artifacts of the dataset
and that they are not really able to learn a model
that shows deeper understanding of how arguments
work. Whether the same side stance classification
task likewise bears the potential for such artifacts
that can be picked up by a system remains open to
further investigation. It would be interesting to
examine which task the encoder-based models
actually learn to solve.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Aseel</given-names>
            <surname>Addawood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jodi</given-names>
            <surname>Schneider</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Masooda</given-names>
            <surname>Bashir</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Stance classification of Twitter debates: The encryption debate as a use case</article-title>
          .
          <source>In 8th International Conference on Social Media and Society</source>
          , ACM International Conference Proceeding Series. Association for Computing Machinery.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Yamen</given-names>
            <surname>Ajjour</surname>
          </string-name>
          , Henning Wachsmuth, Johannes Kiesel, Martin Potthast, Matthias Hagen, and
          <string-name>
            <given-names>Benno</given-names>
            <surname>Stein</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Data Acquisition for Argument Search: The args.me corpus</article-title>
          .
          <source>In 42nd German Conference on Artificial Intelligence (KI</source>
          <year>2019</year>
          ). Springer.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Alan</given-names>
            <surname>Akbik</surname>
          </string-name>
          , Duncan Blythe, and
          <string-name>
            <given-names>Roland</given-names>
            <surname>Vollgraf</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Contextual string embeddings for sequence labeling</article-title>
          .
          <source>In Proceedings of the 27th International Conference on Computational Linguistics</source>
          , pages
          <fpage>1638</fpage>
          -
          <lpage>1649</lpage>
          , Santa Fe, New Mexico, USA. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Isabelle</given-names>
            <surname>Augenstein</surname>
          </string-name>
          , Tim Rocktäschel, Andreas Vlachos, and
          <string-name>
            <given-names>Kalina</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Stance Detection with Bidirectional Conditional Encoding</article-title>
          .
          <source>In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>876</fpage>
          -
          <lpage>885</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Ramy</given-names>
            <surname>Baly</surname>
          </string-name>
          , Mitra Mohtarami, James Glass, Lluís Màrquez, Alessandro Moschitti, and
          <string-name>
            <given-names>Preslav</given-names>
            <surname>Nakov</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Integrating Stance Detection and Fact Checking in a Unified Corpus</article-title>
          .
          <source>In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>2</volume>
          (
          <issue>Short Papers)</issue>
          , pages
          <fpage>21</fpage>
          -
          <lpage>27</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Roy</given-names>
            <surname>Bar-Haim</surname>
          </string-name>
          , Indrajit Bhattacharya, Francesco Dinuzzo, Amrita Saha, and
          <string-name>
            <given-names>Noam</given-names>
            <surname>Slonim</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Stance Classification of Context-Dependent Claims</article-title>
          .
          <source>In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers</source>
          , pages
          <fpage>251</fpage>
          -
          <lpage>261</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Peter</given-names>
            <surname>Bourgonje</surname>
          </string-name>
          , Julian Moreno Schneider, and
          <string-name>
            <given-names>Georg</given-names>
            <surname>Rehm</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>From Clickbait to Fake News Detection: An Approach based on Detecting the Stance of Headlines to Articles</article-title>
          .
          <source>In Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism</source>
          , pages
          <fpage>84</fpage>
          -
          <lpage>89</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Di</given-names>
            <surname>Chen</surname>
          </string-name>
          , Jiachen Du, Lidong Bing, and
          <string-name>
            <given-names>Ruifeng</given-names>
            <surname>Xu</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Hybrid Neural Attention for Agreement/Disagreement Inference in Online Debates</article-title>
          .
          <source>In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>665</fpage>
          -
          <lpage>670</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kristina</given-names>
            <surname>Toutanova</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          . arXiv preprint arXiv:1810.04805.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Ivan</given-names>
            <surname>Habernal</surname>
          </string-name>
          , Henning Wachsmuth, Iryna Gurevych, and
          <string-name>
            <given-names>Benno</given-names>
            <surname>Stein</surname>
          </string-name>
          .
          <year>2018</year>
          . SemEval
          <article-title>-2018 task 12: The argument reasoning comprehension task</article-title>
          .
          <source>In Proceedings of The 12th International Workshop on Semantic Evaluation</source>
          , pages
          <fpage>763</fpage>
          -
          <lpage>772</lpage>
          , New Orleans, Louisiana. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Kazi Saidul</given-names>
            <surname>Hasan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vincent</given-names>
            <surname>Ng</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Stance Classification of Ideological Debates: Data, Models, Features, and Constraints</article-title>
          .
          <source>In Proceedings of the Sixth International Joint Conference on Natural Language Processing</source>
          , pages
          <fpage>1348</fpage>
          -
          <lpage>1356</lpage>
          . Asian Federation of Natural Language Processing.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Mohit</given-names>
            <surname>Iyyer</surname>
          </string-name>
          , Peter Enns,
          <string-name>
            <surname>Jordan</surname>
            Boyd-Graber, and
            <given-names>Philip</given-names>
          </string-name>
          <string-name>
            <surname>Resnik</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Political Ideology Detection Using Recursive Neural Networks</article-title>
          .
          <source>In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , pages
          <fpage>1113</fpage>
          -
          <lpage>1122</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>G.</given-names>
            <surname>Levow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Freeman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hrynkevich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ostendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Luan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Tran</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Recognition of stance strength and polarity in spontaneous speech</article-title>
          .
          <source>In 2014 IEEE Spoken Language Technology Workshop (SLT)</source>
          , pages
          <fpage>236</fpage>
          -
          <lpage>241</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Stefano</given-names>
            <surname>Menini</surname>
          </string-name>
          , Federico Nanni, Simone Paolo Ponzetto, and
          <string-name>
            <given-names>Sara</given-names>
            <surname>Tonelli</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Topic-based agreement and disagreement in US electoral manifestos</article-title>
          .
          <source>In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>2938</fpage>
          -
          <lpage>2944</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Saif M.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          , Parinaz Sobhani, and
          <string-name>
            <given-names>Svetlana</given-names>
            <surname>Kiritchenko</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Stance and Sentiment in Tweets</article-title>
          .
          <source>ACM Trans. Internet Technol</source>
          .,
          <volume>17</volume>
          (
          <issue>3</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Moin</given-names>
            <surname>Nadeem</surname>
          </string-name>
          , Wei Fang, Brian Xu,
          <string-name>
            <given-names>Mitra</given-names>
            <surname>Mohtarami</surname>
          </string-name>
          , and
          <string-name>
            <given-names>James</given-names>
            <surname>Glass</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>FAKTA: An Automatic End-to-End Fact Checking System</article-title>
          .
          <source>In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)</source>
          , pages
          <fpage>78</fpage>
          -
          <lpage>83</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Pavithra</given-names>
            <surname>Rajendran</surname>
          </string-name>
          , Danushka Bollegala, and
          <string-name>
            <given-names>Simon</given-names>
            <surname>Parsons</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Contextual stance classification of opinions: A step towards enthymeme reconstruction in online reviews</article-title>
          .
          <source>In Proceedings of the Third Workshop on Argument Mining (ArgMining2016)</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>39</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Sarvesh</given-names>
            <surname>Ranade</surname>
          </string-name>
          , Rajeev Sangal, and
          <string-name>
            <given-names>Radhika</given-names>
            <surname>Mamidi</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Stance Classification in Online Debates by Recognizing Users' Intentions</article-title>
          .
          <source>In Proceedings of the SIGDIAL 2013 Conference</source>
          , pages
          <fpage>61</fpage>
          -
          <lpage>69</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Swapna</given-names>
            <surname>Somasundaran</surname>
          </string-name>
          and
          <string-name>
            <given-names>Janyce</given-names>
            <surname>Wiebe</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Recognizing Stances in Online Debates</article-title>
          .
          <source>In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP</source>
          , pages
          <fpage>226</fpage>
          -
          <lpage>234</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Orith</given-names>
            <surname>Toledo-Ronen</surname>
          </string-name>
          , Roy Bar-Haim, and
          <string-name>
            <given-names>Noam</given-names>
            <surname>Slonim</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Expert Stance Graphs for Computational Argumentation</article-title>
          .
          <source>In Proceedings of the Third Workshop on Argument Mining (ArgMining2016)</source>
          , pages
          <fpage>119</fpage>
          -
          <lpage>123</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>David</given-names>
            <surname>Vilares</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yulan</given-names>
            <surname>He</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Detecting Perspectives in Political Debates</article-title>
          .
          <source>In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>1573</fpage>
          -
          <lpage>1582</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Henning</given-names>
            <surname>Wachsmuth</surname>
          </string-name>
          , Nona Naderi, Yufang Hou, Yonatan Bilu, Vinodkumar Prabhakaran, Tim Alberdingk Thijm, Graeme Hirst, and
          <string-name>
            <given-names>Benno</given-names>
            <surname>Stein</surname>
          </string-name>
          . 2017a.
          <article-title>Computational argumentation quality assessment in natural language</article-title>
          .
          <source>In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers</source>
          , pages
          <fpage>176</fpage>
          -
          <lpage>187</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Henning</given-names>
            <surname>Wachsmuth</surname>
          </string-name>
          , Martin Potthast,
          <string-name>
            <given-names>Khalid</given-names>
            <surname>Al-Khatib</surname>
          </string-name>
          , Yamen Ajjour, Jana Puschmann, Jiani Qu, Jonas Dorsch, Viorel Morari, Janek Bevendorff, and
          <string-name>
            <given-names>Benno</given-names>
            <surname>Stein</surname>
          </string-name>
          . 2017b.
          <article-title>Building an argument search engine for the web</article-title>
          .
          <source>In Proceedings of the 4th Workshop on Argument Mining</source>
          , pages
          <fpage>49</fpage>
          -
          <lpage>59</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>